
Splunk SAF Data Offboarding

Currently a 3-node multisite cluster. Possible solution: set the search and replication factors to 3 and 3, then pull the index files off one of the indexers onto a new instance. On the new instance, set up a multisite cluster with a single site and see if the indexed files can be read.

Splunk Enterprise 8.0.2 - "Managing Indexers and Clusters of Indexers" - Decommission a site in a multisite indexer cluster
Splunk Enterprise 7.0.3 - "Managing Indexers and Clusters of Indexers" - Multisite indexer cluster deployment

1 - cluster master
1 - indexer with search

/opt/splunkdata/hot/normal_primary/

indexes:
app_mscas
app_o365
dns
forescout
network
security
te


File paths
#Where in salt are the search / rep factors
salt/pillar/saf_variables.sls 
splunk:
  cluster_name: saf
  license_master: saf-splunk-cm
  idxc:
    label: "saf_index_cluster"
    pass4SymmKey: "$1$ekY601SK1y5wfbd2ogCNRIhn+gPeQ+UYKzY3MMAnPmmz"
    rep_factor: 2
    search_factor: 2

#where in splunk are the configs written to?
/opt/splunk/etc/system/local/server.conf

[clustering]
mode = master
multisite = true
replication_factor = 2
search_factor = 2
max_peer_sum_rep_load = 15
max_peer_rep_load = 15
max_peer_build_load = 6
summary_replication = true
site_search_factor = origin:1, total:2
site_replication_factor = origin:1,site1:1,site2:1,site3:1,total:3
available_sites = site1,site2,site3
cluster_label = saf_index_cluster

Steps

  1. In /opt/splunk/etc/system/local/server.conf, change site_search_factor to origin:1,site1:1,site2:1,site3:1,total:3. This ensures a searchable copy of every bucket on every site. Open question: should site_replication_factor be changed to origin:1,total:1? That would reduce the size of the index.
  2. Restart the CM (this applies the new site_search_factor)
  3. Send data to junk index (oneshot)
    • 3.1 /opt/splunk/bin/splunk add oneshot /opt/splunk/var/log/splunk/splunkd.log -sourcetype splunkd -index junk
  4. Stop one indexer and copy index to new cluster.
  5. On the new cluster, set up a CM and one indexer in a multisite cluster. The cluster master will double as a search head in the same site
  6. Setup new cluster to have site_mappings = default:site1
  7. Attempt to search on new cluster
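Steps 1 and 6 amount to the following server.conf deltas (a sketch only; stanza and setting names match the configs captured below, values per the plan above):

```ini
# On the existing CM (step 1): give every site a searchable copy
[clustering]
site_search_factor = origin:1,site1:1,site2:1,site3:1,total:3

# On the new CM (step 6): map buckets from the old sites onto site1
[clustering]
site_mappings = default:site1
```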

Made the new junk index on test saf. Number of events: 64,675; latest = 02/21/20 9:32:01 PM UTC; earliest = 02/19/20 2:32:57 PM UTC

Before copying the buckets, ensure they are ALL WARM buckets; HOT buckets may be deleted on startup.

#check on the buckets 
| dbinspect index=junk
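The dbinspect check needs a running cluster; at the filesystem level, hot buckets can also be spotted by name (directories starting with `hot_v1_`, versus `db_<newest>_<oldest>_<id>` for warm). A minimal sketch against throwaway fixture paths (the real buckets live under `$SPLUNK_DB/<index>/db`):

```shell
# Sketch: detect hot buckets by directory name before copying an index.
# Fixture paths below are fake; substitute the real $SPLUNK_DB/junk/db.
DB=$(mktemp -d)/junk/db
mkdir -p "$DB/db_1582300000_1582200000_3" "$DB/hot_v1_4"

# Hot buckets are named hot_v1_<id>; warm buckets db_<newest>_<oldest>_<id>.
hot=$(find "$DB" -maxdepth 1 -type d -name 'hot_v1_*' | wc -l)
echo "hot buckets: $hot"
if [ "$hot" -gt 0 ]; then
  echo "roll hot buckets to warm before copying"
fi
```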

uploaded brad_LAN key pair to AWS for new instances. 

vpc-041edac5e3ca49e4d
subnet-0ca93c00ac57c9ebf
sg-0d78af22d0afd0334

saf-offboarding-cm-deleteme
saf-offboarding-indexer-1-deleteme

CentOS 7 (x86_64) - with Updates HVM

t2.medium (2 CPU 4 GB RAM)
100 GB drive

msoc-default-instance-role

saf-offboarding-ssh security group <- delete this; not needed, just SSH from the bastion host

splunk version 7.0.3

#setup proxy for yum and wget
vi /etc/yum.conf
proxy=http://proxy.msoc.defpoint.local:80
yum install vim wget
vim /etc/wgetrc
http_proxy = http://proxy.msoc.defpoint.local:80
https_proxy = http://proxy.msoc.defpoint.local:80

#Download Splunk

wget -O splunk-7.0.3-fa31da744b51-linux-2.6-x86_64.rpm 'https://www.splunk.com/page/download_track?file=7.0.3/linux/splunk-7.0.3-fa31da744b51-linux-2.6-x86_64.rpm&ac=&wget=true&name=wget&platform=Linux&architecture=x86_64&version=7.0.3&product=splunk&typed=release'

#install it
yum localinstall splunk-7.0.3-fa31da744b51-linux-2.6-x86_64.rpm

#setup https
vim /opt/splunk/etc/system/local/web.conf

[settings]
enableSplunkWebSSL = 1

#start it
/opt/splunk/bin/splunk start --accept-license

#CM
https://10.1.2.170:8000/en-US/app/launcher/home

#Indexer
https://10.1.2.236:8000/en-US/app/launcher/home

#Change password for admin user
/opt/splunk/bin/splunk edit user admin -password Jtg0BS0nrAyD -auth admin:changeme

Turn on distributed search in the GUI

#on CM
/opt/splunk/etc/system/local/server.conf
[general]
site = site1

[clustering]
mode = master
multisite = true
replication_factor = 2
search_factor = 2
max_peer_sum_rep_load = 15
max_peer_rep_load = 15
max_peer_build_load = 6
summary_replication = true
site_search_factor = origin:1,site1:1,site2:1,site3:1,total:3
site_replication_factor = origin:1,site1:1,site2:1,site3:1,total:3
available_sites = site1,site2,site3
cluster_label = saf_index_cluster
pass4SymmKey = password
site_mappings = default:site1

#on IDX
/opt/splunk/etc/system/local/server.conf
[general]
site = site1

[clustering]
master_uri = https://10.1.2.170:8089
mode = slave
pass4SymmKey = password
[replication_port://9887]

ensure networking is allowed between the hosts

The indexer will show up in the Cluster master

#create this file on the indexer
/opt/splunk/etc/apps/saf_all_indexes/local/indexes.conf

[junk]
homePath       = $SPLUNK_DB/junk/db
coldPath       = $SPLUNK_DB/junk/colddb
thawedPath     = $SPLUNK_DB/junk/thaweddb

#copy the index over to the indexer
cp junk_index.tar.gz /opt/splunk/var/lib/splunk/
cd /opt/splunk/var/lib/splunk/
tar -xzvf junk_index.tar.gz
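A runnable sketch of the package/extract step using throwaway paths (in production the source is the indexer's `$SPLUNK_DB/junk` and the destination is `/opt/splunk/var/lib/splunk/`; ownership must end up as the splunk user — both assumptions to verify on the target):

```shell
# Sketch: package an index directory and extract it at the destination.
# SRC/DST are temp fixtures standing in for the real Splunk paths.
SRC=$(mktemp -d); DST=$(mktemp -d)
mkdir -p "$SRC/junk/db/db_1582300000_1582200000_3"

# tar from the parent so the archive contains the junk/ directory itself
tar -C "$SRC" -czf "$SRC/junk_index.tar.gz" junk/

tar -C "$DST" -xzf "$SRC/junk_index.tar.gz"
ls "$DST/junk/db"
# In production: chown -R splunk:splunk the extracted tree, then restart splunkd.
```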

###################################################################################
PROD testing Notes

SAF PROD Cluster testing with the te index.
The indexers do not have the space to move to search/rep factor 3/3. Duane suggests keeping the current 2/3 and letting the temp splunk cluster make the buckets searchable. According to the monitoring console:

te index gathered on Feb 26
total index size: 3.1 GB
total raw data size uncompressed: 10.37 GB
total events: 12,138,739
earliest event: 2019-05-17 20:40:00
latest event: 2020-02-26 16:43:32

| dbinspect index=te | stats count by splunk_server
count of buckets
indexer1: 105
indexer2: 103
indexer3: 104

| dbinspect index=te | search state=hot
currently 6 hot buckets

index=te | stats count   (run over All Time, Fast Mode)
6069419

size on disk
1.1 GB

size of tarball
490 MB

Allow instance to write to S3 bucket

{
    "Id": "Policy1582738262834",
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1582738229969",
            "Action": [
                "s3:PutObject"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::mdr-saf-off-boarding/*",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::477548533976:role/msoc-default-instance-role"
                ]
            }
        }
    ]
}
./aws s3 cp rst2odt.py s3://mdr-saf-off-boarding
./aws s3 cp /opt/splunkdata/hot/normal_primary/saf_te_index.tar.gz s3://mdr-saf-off-boarding

aws --profile=mdr-prod s3 presign s3://mdr-saf-off-boarding/saf_te_index.tar.gz --expires-in 604800
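It is worth confirming the archive survives the S3 round trip before deleting anything; a sha256 comparison sketch (a local `cp` stands in for the `aws s3 cp` upload/download):

```shell
# Sketch: compare checksums before upload and after download.
# cp stands in for the aws s3 cp round trip; filenames are illustrative.
T=$(mktemp -d)
dd if=/dev/zero of="$T/saf_te_index.tar.gz" bs=1024 count=16 2>/dev/null

before=$(sha256sum "$T/saf_te_index.tar.gz" | awk '{print $1}')
cp "$T/saf_te_index.tar.gz" "$T/downloaded.tar.gz"   # stand-in for s3 cp
after=$(sha256sum "$T/downloaded.tar.gz" | awk '{print $1}')

[ "$before" = "$after" ] && echo "checksums match"
```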

uploaded brad_LAN key pair to AWS for new instances.

vpc-0202aedf3d0417cd3
subnet-01bc9f77742ff132d
sg-03dcc0ecde42fc8c2, sg-077ca2baaca3d8d97

saf-offboarding-splunk-cm
saf-offboarding-splunk-indexer

CentOS 7 (x86_64) - with Updates HVM

t2.medium (2 CPU 4 GB RAM)
100 GB drive for te index test

msoc-default-instance-role

tag instances 
Client saf     

use the msoc_build key

#CM
ip-10-1-3-72

#indexer-1
ip-10-1-3-21

#indexer-2
ip-10-1-3-24

#indexer-3
ip-10-1-3-40

use virtualenv to grab awscli

export https_proxy=http://proxy.msoc.defpoint.local:80
sudo -E ./pip install awscli

./aws s3 cp s3://mdr-saf-off-boarding/saf_te_index.tar.gz /opt/splunk/var/lib/splunk/saf_te_index.tar.gz

Move the index definition to the CM — replicated buckets are not expanding into searchable buckets:

  1. rm -rf saf_all_indexes
  2. create it on the CM

    • 2.1 mkdir -p /opt/splunk/etc/master-apps/saf_all_indexes/local/
    • 2.2 vim /opt/splunk/etc/master-apps/saf_all_indexes/local/indexes.conf
      [te]
      homePath   = $SPLUNK_DB/te/db
      coldPath   = $SPLUNK_DB/te/colddb
      thawedPath = $SPLUNK_DB/te/thaweddb
      repFactor  = auto

    • 2.3 cluster bundle push

    • 2.3.1 /opt/splunk/bin/splunk list cluster-peers

    • 2.3.2 splunk validate cluster-bundle

    • 2.3.3 splunk apply cluster-bundle

################### #

Actual PROD offboarding!

# ##################

What indexes do we have? | rest /services/data/indexes/ | stats count by title

#estimate size and age

| rest /services/data/indexes/
| search title=app_mscas OR title=app_o365 OR title=dns OR title=forescout OR title=network OR title=security OR title=te
| eval indexSizeGB = if(currentDBSizeMB >= 1 AND totalEventCount >=1, currentDBSizeMB/1024, null())
| eval elapsedTime = now() - strptime(minTime,"%Y-%m-%dT%H:%M:%S%z")
| eval dataAge = ceiling(elapsedTime / 86400)
| stats sum(indexSizeGB) AS totalSize max(dataAge) as oldestDataAge by title
| eval totalSize = if(isnotnull(totalSize), round(totalSize, 2), 0)
| eval oldestDataAge = if(isnum(oldestDataAge), oldestDataAge, "N/A")
| rename title as "Index" totalSize as "Total Size (GB)" oldestDataAge as "Oldest Data Age (days)"

  1. adjust CM and push out new data retention limits per customer email
  2. allow indexers to prune old data
  3. stop splunk on one indexer
  4. tar up splunk directory
  5. upload to s3
  6. download from s3 to temp indexers and extract to ensure data is readable
  7. repeat for all indexes
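The tar step of the loop above can be sketched as below for all seven indexes (fixture directories stand in for the real index paths under /opt/splunkdata/hot/normal_primary/ on the stopped indexer):

```shell
# Sketch: archive each offboarded index in turn.
# Fixture dirs replace the real index dirs; run on a stopped indexer.
WORK=$(mktemp -d); cd "$WORK"
for idx in app_mscas app_o365 dns forescout network security te; do
  mkdir -p "$idx/db"
  tar -czf "saf_${idx}_index.tar.gz" "$idx/"
  # In production, follow each archive with an upload, e.g.:
  # bin/aws s3 cp "saf_${idx}_index.tar.gz" s3://mdr-saf-off-boarding/
done
ls saf_*_index.tar.gz | wc -l
```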

Prune data based on time.
Updated Time: 1/6/2020, 1:59:50 PM
Active Bundle ID: 73462849B9E88F1DB2B9C60643A06F67
Latest Bundle ID: 73462849B9E88F1DB2B9C60643A06F67
Previous Bundle ID: FF9104B61366E1841FEDB1AF2DE901C2

With compression: tar cvzf saf_myindex_index.tar.gz myindex/

Without compression: tar cvf /hubble.tar hubble/
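Note the tar `z` flag controls gzip compression, not encryption; neither form encrypts anything. A quick demo of the size difference on compressible data (temp fixtures only):

```shell
# Sketch: tar with (-z) and without gzip compression; neither encrypts.
T=$(mktemp -d)
mkdir -p "$T/myindex"
dd if=/dev/zero of="$T/myindex/data" bs=1024 count=256 2>/dev/null

tar -C "$T" -czf "$T/compressed.tar.gz" myindex/
tar -C "$T" -cf  "$T/plain.tar"         myindex/

gz=$(wc -c < "$T/compressed.tar.gz")
plain=$(wc -c < "$T/plain.tar")
echo "gzip: $gz bytes, plain: $plain bytes"
```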

Trying this: GitHub repo for s3-multipart-uploader

use virtualenv

bin/python s3-multipart-uploader-master/s3_multipart_uploader.py -h

bucket name mdr-saf-off-boarding

bin/aws s3 cp /opt/splunkdata/hot/saf_te_index.tar.gz s3://mdr-saf-off-boarding/saf_te_index.tar.gz

DID NOT NEED TO USE THE MULTIPART uploader!

aws --profile=mdr-prod s3 presign s3://mdr-saf-off-boarding/saf_app_mscas_index.tar.gz --expires-in 86400
aws --profile=mdr-prod s3 presign s3://mdr-saf-off-boarding/saf_app_o365_index.tar.gz --expires-in 86400
aws --profile=mdr-prod s3 presign s3://mdr-saf-off-boarding/saf_dns_index.tar.gz --expires-in 86400
aws --profile=mdr-prod s3 presign s3://mdr-saf-off-boarding/saf_forescout_index.tar.gz --expires-in 86400
aws --profile=mdr-prod s3 presign s3://mdr-saf-off-boarding/saf_network_index.tar --expires-in 86400
aws --profile=mdr-prod s3 presign s3://mdr-saf-off-boarding/saf_security_index.tar.gz --expires-in 86400
aws --profile=mdr-prod s3 presign s3://mdr-saf-off-boarding/saf_te_index.tar.gz --expires-in 86400
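The per-index presign commands above can be generated with a loop instead of repeated by hand (this only prints the commands; the mdr-prod profile and bucket name come from the notes above, and the network index is handled separately because it was uploaded as .tar rather than .tar.gz):

```shell
# Sketch: emit one presign command per offboarded index tarball.
indexes="app_mscas app_o365 dns forescout security te"
for idx in $indexes; do
  echo "aws --profile=mdr-prod s3 presign s3://mdr-saf-off-boarding/saf_${idx}_index.tar.gz --expires-in 86400"
done
# network was uploaded without the .gz extension (see above)
echo "aws --profile=mdr-prod s3 presign s3://mdr-saf-off-boarding/saf_network_index.tar --expires-in 86400"
```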