
Splunk AFS Thaw Request Notes

This documents the process for searching the frozen data that is stored in S3.

Plan:

  • charge time to CIRT Ops Support SPROJ.061 S&ID CIRT Ops Support_A
  • Don't use TF to manage the servers.
  • stand up multiple ( minimum 3? ) centos7 servers with large EBS disks
  • stand up one SH for the indexers
  • add EC2 instance policies with access to the S3 buckets
  • use aws s3 cp to pull the data down
    • Use the zztop.sh script ( see below ) to pull down data faster
    • Data going back to Jan 1, 2020 (1577836861)
  • Thaw data ( no license needed since it is not ingestion )
    • no need to thaw the data if it has all the needed parts in the bucket ( tsidx files )
  • Install AFS splunk apps on SH ( see the packaging sketch after this list )
    • Zip them up and upload them to S3
    • Download them from S3 on the new SH
  • Hand over to SOC for searching
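
A rough sketch of the app hand-off ( the app directory name afs_app and the staging bucket are placeholders, not the real AFS app list ):

# On the old SH: bundle the AFS apps and stage them in S3.
# "afs_app" and <staging-bucket> are placeholder names.
tar czf afs_apps.tgz -C /opt/splunk/etc/apps afs_app
aws s3 cp afs_apps.tgz s3://<staging-bucket>/afs_apps.tgz

# On the new SH: pull, unpack, and restart Splunk.
aws s3 cp s3://<staging-bucket>/afs_apps.tgz .
tar xzf afs_apps.tgz -C /opt/splunk/etc/apps
/opt/splunk/bin/splunk restart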

Assumptions:

  • The AWS account will be deleted when we are done.
  • No cluster master needed
  • Note: Data does not get replicated ( in the cluster ) from the thawed directory. So, if you thaw just a single copy of some bucket, instead of all the copies, only that single copy will reside in the cluster, in the thawed directory of the peer node where you placed it.

Build the VPC in the same region as the S3 data!

VPC info ( pick a CIDR that has not been used, just in case you need to use transit gateway ): afs-data-thaw 10.199.0.0/22

  • indexers: c5.4xlarge on-demand ( $7/day ), m5d.xlarge or larger
  • 1 TB EBS storage attached to instances
  • search head: m5a.xlarge
  • centos7 AMI
  • key: msoc-build
  • instance role: default-instance-role
  • naming scheme: afs-splunk-sh
  • encrypt EBS with default key
  • give the AWS IAM user both Administrator and IAMFullAccess to be able to launch instances!

Indexes

Needed indexes:
app_mscas           967.3 GB        done
app_o365            1.3 TB          done    afs-splunk-idx-1
av                  86.8 MB         done
azure               149.4 GB        done
ids                 17.4 GB         done
network_firewall    8.2 TB          done
network             99.4 GB         done
threat_metrics      8.0 GB          done    afs-splunk-idx-1
websense            771.1 GB        done
wineventlog         5.0 TB          done
zscaler             87.3 GB         done
Total               16.6 TB

Use the AWS console to calculate the total size of the folders in the S3 bucket. This helps determine how many indexers are needed.
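
The CLI can pull the same totals per index prefix if the console is slow ( a sketch; assumes the awscli on the instance already has access to the bucket ):

# Rough size check for one index folder; --summarize prints total objects and total size.
aws s3 ls s3://mdr-afs-prod-splunk-frozen/app_o365/ --recursive --summarize --human-readable | tail -2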

Permissions for S3 bucket

Steps in New AWS Account

  • add role and attach to EC2 instance
  • add policy to the role allowing KMS* and S3* permissions

Steps in Old AWS Account

  • modify the KMS key policy to allow the role in the New Account
  • modify the S3 bucket policy to allow the role in the New Account

New Account policy for role

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1610588870140",
      "Action": "kms:*",
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Sid": "Stmt1610588903413",
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
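
A sketch of wiring this up from the CLI in the New Account ( assumes the JSON above is saved as thaw-policy.json; the trust-policy file name and inline policy name are placeholders ):

# Create the role, attach the inline policy above, and expose it to EC2.
# ec2-trust.json and the policy name thaw-s3-kms are placeholder names.
aws iam create-role --role-name default-instance-role --assume-role-policy-document file://ec2-trust.json
aws iam put-role-policy --role-name default-instance-role --policy-name thaw-s3-kms --policy-document file://thaw-policy.json
aws iam create-instance-profile --instance-profile-name default-instance-role
aws iam add-role-to-instance-profile --instance-profile-name default-instance-role --role-name default-instance-role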

Changes for the Old Account KMS key policy

        {
            "Sid": "Allow use of the key",
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::948010823789:role/default-instance-role",
                    "arn:aws:iam::477548533976:role/mdr_powerusers",
                    "arn:aws:iam::477548533976:role/msoc-default-instance-role"
                ]
            },
            "Action": [
                "kms:ReEncrypt*",
                "kms:GenerateDataKey*",
                "kms:Encrypt",
                "kms:DescribeKey",
                "kms:Decrypt"
            ],
            "Resource": "*"
        }
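
Key policies are replaced whole, so the statement above has to be merged into the existing policy before pushing it back ( a sketch; <key-id> is a placeholder for the frozen-bucket KMS key ):

# Pull the current policy, merge in the "Allow use of the key" statement, then push it back.
aws kms get-key-policy --key-id <key-id> --policy-name default --query Policy --output text > key-policy.json
# ...edit key-policy.json to add the statement above...
aws kms put-key-policy --key-id <key-id> --policy-name default --policy file://key-policy.json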

Old Account S3 Bucket Policy

{
    "Version": "2012-10-17",
    "Id": "Policy1584399307003",
    "Statement": [
        {
            "Sid": "DownloadandUpload",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::948010823789:role/default-instance-role"
            },
            "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion",
                "s3:RestoreObject"
            ],
            "Resource": "arn:aws:s3:::mdr-afs-prod-splunk-frozen/*"
        },
        {
            "Sid": "ListBucket",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::948010823789:role/default-instance-role"
            },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::mdr-afs-prod-splunk-frozen"
        }
    ]
}
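
Applying the bucket policy from the Old Account ( assumes the JSON above is saved locally as bucket-policy.json ):

aws s3api put-bucket-policy --bucket mdr-afs-prod-splunk-frozen --policy file://bucket-policy.json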

Test permissions/access from the second account:

aws s3 ls s3://mdr-afs-prod-splunk-frozen

After the objects have been restored from Glacier, try to download some objects:

aws s3 cp s3://mdr-afs-prod-splunk-frozen/junk/frozendb/db_1598282957_1598264767_511_50F6EC26-9620-4CAA-802C-857CD78386CE/ . --recursive --force-glacier-transfer

Restore Glacier objects

https://infinityworks.com/insights/restoring-2-million-objects-from-glacier/
https://github.com/s3tools/s3cmd

s3cmd was the best option for restoring because it can pull the list of files for you and restore an entire directory recursively. At this time, s3cmd does not work with AssumeRole STS credentials, so don't run the command from a laptop; run it from a new instance that has the instance-role permissions.

Steps

  • ensure your awscli can access the S3 buckets
  • use s3cmd to restore objects

Plan

  • Restore for 30 days
  • Standard restore ( cheaper than Expedited )

Test access for file restore from Glacier:

aws s3 ls s3://mdr-afs-prod-splunk-frozen

warning: Skipping file s3://mdr-afs-prod-splunk-frozen/junk/frozendb/db_1598282957_1598264767_511_50F6EC26-9620-4CAA-802C-857CD78386CE/rawdata/slicesv2.dat. Object is of storage class GLACIER. Unable to perform download operations on GLACIER objects. You must restore the object to be able to perform the operation. See aws s3 download help for additional parameter options to ignore or force these transfers.

Need to restore the data from glacier for a period of time.

Restore tiers:

Expedited   $$$   1-5 minutes   ( objects less than 250 MB )
Standard    $$    3-5 hrs       ( default if not given )
Bulk        $     5-12 hrs

See s3cmd command down below! It is better than s3api command.

List objects

aws s3api list-objects-v2 --bucket mdr-afs-prod-splunk-frozen --prefix junk/frozendb/db_1598282957_1598264767_511_50F6EC26-9620-4CAA-802C-857CD78386CE --query "Contents[?StorageClass=='GLACIER']" --output text

Output the results to a file

aws s3api list-objects-v2 --bucket mdr-afs-prod-splunk-frozen --prefix junk/frozendb/db_1598282957_1598264767_511_50F6EC26-9620-4CAA-802C-857CD78386CE --query "Contents[?StorageClass=='GLACIER']" --output text | awk '{print $2}' > file.txt

Test access:

aws s3api restore-object --restore-request Days=2 --bucket mdr-afs-prod-splunk-frozen --key junk/frozendb/db_1598282957_1598264767_511_50F6EC26-9620-4CAA-802C-857CD78386CE/splunk-autogen-params.dat

aws s3api restore-object --restore-request Days=2 --bucket mdr-afs-prod-splunk-frozen --key junk/frozendb/db_1598282957_1598264767_511_50F6EC26-9620-4CAA-802C-857CD78386CE/rawdata/slicesv2.dat
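
To check whether a restore has actually finished before trying to download, head-object shows the Restore header ( ongoing-request flips to "false" once the temporary copy is ready ):

aws s3api head-object --bucket mdr-afs-prod-splunk-frozen --key junk/frozendb/db_1598282957_1598264767_511_50F6EC26-9620-4CAA-802C-857CD78386CE/rawdata/slicesv2.dat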

All in one command

aws s3api list-objects-v2 --bucket mdr-afs-prod-splunk-frozen --prefix junk/frozendb/db_1594951175_1594844814_53_94F7BD8A-9043-487B-8BD5-41AA54D7A925 --query "Contents[?StorageClass=='GLACIER']" --output text | awk '{print $2}' | xargs -L 1 aws s3api restore-object --restore-request '{ "Days" : 2, "GlacierJobParameters" : { "Tier":"Expedited" } }' --bucket mdr-afs-prod-splunk-frozen --key

If you see the error below, it just means AWS is kinda busy right now, try again later. It is not an error in your code, but your code needs to retry the request to ensure the request gets processed. This only happened with Expedited requests.

An error occurred (GlacierExpeditedRetrievalNotAvailable) when calling the RestoreObject operation (reached max retries: 2): Glacier expedited retrievals are currently not available, please try again later

#!/bin/sh
# Restore every key listed in file.txt ( Standard tier, 2 days ).
for x in $(cat file.txt); do
  echo "Start restoring the file $x"
  aws s3api restore-object --restore-request Days=2 --bucket mdr-afs-prod-splunk-frozen --key "$x"
  echo "Completed restoring the file $x"
done

Expedite that mother

#!/bin/sh
# Restore every key listed in file.txt at the selected tier.
TIER=Expedited
#TIER=Standard
#TIER=Bulk
DAYS=2
for x in $(cat file.txt); do
  echo "Start restoring the file $x"
  aws s3api restore-object --restore-request "{ \"Days\" : $DAYS, \"GlacierJobParameters\" : { \"Tier\":\"$TIER\" } }" --bucket mdr-afs-prod-splunk-frozen --key "$x"
  echo "Completed restoring the file $x"
done
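
Since Expedited capacity comes and goes, a retry wrapper built on the same command keeps resubmitting until AWS accepts the request ( a sketch; the 60 second sleep is arbitrary and it assumes no restore is already in progress for the key ):

#!/bin/sh
# Retry each Expedited restore until AWS stops returning
# GlacierExpeditedRetrievalNotAvailable.
for x in $(cat file.txt); do
  until aws s3api restore-object \
      --restore-request '{ "Days" : 2, "GlacierJobParameters" : { "Tier":"Expedited" } }' \
      --bucket mdr-afs-prod-splunk-frozen --key "$x"; do
    echo "Expedited capacity unavailable for $x, retrying in 60s"
    sleep 60
  done
done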

With s3cmd! Be sure to exclude rb_* in the command ( see the --exclude example below ); no need to restore the replicated buckets.

Just a bucket:

./s3cmd restore --restore-priority=expedited --restore-days=2 --recursive s3://mdr-afs-prod-splunk-frozen/junk/frozendb/db_1566830011_1562776263_316_BBE343D5-D0D2-4120-A307-8B35B5E48D95/

Whole index:

./s3cmd restore --restore-priority=standard --restore-days=30 --recursive s3://mdr-afs-prod-splunk-frozen/av/

Exclude rb_*

time ./s3cmd restore --restore-priority=standard --restore-days=30 --recursive --exclude="frozendb/rb_*" s3://mdr-afs-prod-splunk-frozen/av/

Distribute load to all servers via salt:

salt afs-splunk-idx-8 cmd.run '/root/s3cmd-2.1.0/s3cmd restore --restore-priority=standard --restore-days=30 --recursive s3://mdr-afs-prod-splunk-frozen/zscaler/' --async

Splunk

Server Prep

  • hostnamectl set-hostname afs-splunk-idx-2
  • install salt-master ( SH only ), salt-minion ( which includes python3 )

    rpm --import https://repo.saltstack.com/py3/redhat/7/x86_64/archive/3002.2/SALTSTACK-GPG-KEY.pub
    vi /etc/yum.repos.d/saltstack.repo

    [saltstack-repo]
    name=SaltStack repo for RHEL/CentOS 7 PY3
    baseurl=https://repo.saltstack.com/py3/redhat/7/$basearch/archive/3002.2
    enabled=1
    gpgcheck=1
    gpgkey=https://repo.saltstack.com/py3/redhat/7/$basearch/archive/3002.2/SALTSTACK-GPG-KEY.pub
    

    yum clean expire-cache
    yum install salt-minion -y
    sed -i 's/#master: salt/master: 10.199.0.83/' /etc/salt/minion
    systemctl start salt-minion
    systemctl enable salt-minion

  • run salt states
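
The exact states are not recorded in these notes; under the assumption that they live under the master's file_roots, accepting the new minions and applying everything looks roughly like:

# Run from the SH ( salt master ). State names are not captured here,
# so this just applies the full highstate.
salt-key -A -y
salt '*' test.ping
salt '*' state.apply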

Stupid Chrome hates the TLS certificate. Type thisisunsafe in the browser window to bypass the Chrome block for the self-signed cert.

Installation

run salt states

Data Pull with Salt

Use Duane's zztop.sh!!!!

cmd.run '/root/s3cmd-2.1.0/s3cmd get --recursive s3://mdr-afs-prod-splunk-frozen/threat_metrics/frozendb/ /opt/splunk/var/lib/splunk/threat_metrics/thaweddb/'

/root/s3cmd-2.1.0/s3cmd get --recursive s3://mdr-afs-prod-splunk-frozen/app_o365/frozendb/ /opt/splunk/var/lib/splunk/app_o365/thaweddb/

Thaw it out!

No need to thaw it out! The data was not fully frozen ( the tsidx files are still in the buckets ), so it does not need to be rebuilt.
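
If a thawed bucket ever did come back without its tsidx files, the standard Splunk rebuild would be needed ( not the case for this data set; the index and bucket names below are placeholders ):

# Only needed when a thawed bucket is missing its index files; restart splunkd afterwards.
/opt/splunk/bin/splunk rebuild /opt/splunk/var/lib/splunk/<index>/thaweddb/<db_bucket>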

GetObject

Pull the file after it has been restored:

aws s3 cp s3://mdr-afs-prod-splunk-frozen/junk/frozendb/db_1598282957_1598264767_511_50F6EC26-9620-4CAA-802C-857CD78386CE/splunk-autogen-params.dat here.dat

With s3cmd:

./s3cmd get --recursive s3://mdr-afs-prod-splunk-frozen/junk/frozendb/db_1598282957_1598264767_511_50F6EC26-9620-4CAA-802C-857CD78386CE/ /home/centos/test-dir-2/

BEST

STEPS

  • get file of all S3 objects
  • split file into 10 different files ( or number of indexers )
  • run file with zztop to download the files

Make a list of indexes:

aws s3 ls s3://mdr-afs-prod-splunk-frozen | awk '{ print $2 }' > foo1

Make a list of ALL buckets in each index:

for i in $(cat foo1 | egrep -v ^_); do aws s3 ls s3://mdr-afs-prod-splunk-frozen/${i}frozendb/ | egrep "db" | awk -v dir=$i '{ printf("s3://mdr-afs-prod-splunk-frozen/%sfrozendb/%s\n",dir,$2)}' ; done > bucketlist

Break up the list ( 10 indexers in this case ):

cat bucketlist | awk '{ x=NR%10 }{print >> "indexerlist"x}'

Create zztop.sh:

$ cat zztop.sh

#!/bin/bash
# Derive the thaweddb destination from the S3 URL:
# s3://<bucket>/<index>/frozendb/<db_bucket>/ -> /opt/splunk/var/lib/splunk/<index>/thaweddb/<db_bucket>
DEST=$( echo "$1" | awk -F/ '{ print "/opt/splunk/var/lib/splunk/"$4"/thaweddb/"$6 }' )
mkdir -p "$DEST"
/usr/local/aws-cli/v2/current/bin/aws s3 cp "$1" "$DEST" --recursive --force-glacier-transfer --no-progress
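
Spot-check it against a single bucket URL before fanning out with xargs ( the path below is one of the lines produced by the bucketlist command ):

./zztop.sh s3://mdr-afs-prod-splunk-frozen/junk/frozendb/db_1566830011_1562776263_316_BBE343D5-D0D2-4120-A307-8B35B5E48D95/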

Distribute files using salt

salt '*idx-2' cmd.run 'mkdir /root/s3cp'
salt '*idx-2' cp.get_file salt://s3cp/indexerlist0 /root/s3cp/indexerlist0
salt '*idx-2' cp.get_file salt://s3cp/zztop.sh /root/s3cp/zztop.sh
salt '*idx-2' cmd.run 'chmod +x /root/s3cp/zztop.sh'
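
The same four commands repeat for every indexer; a loop version ( the listN-to-indexer mapping here is illustrative only, the real pairing is in the table below ):

# Push one indexerlist file plus zztop.sh to each indexer via salt.
# The indexerlist0 -> idx-2 mapping is illustrative, not the exact pairing used.
for n in $(seq 0 9); do
  tgt="*idx-$((n + 2))"
  salt "$tgt" cmd.run 'mkdir -p /root/s3cp'
  salt "$tgt" cp.get_file salt://s3cp/indexerlist${n} /root/s3cp/indexerlist${n}
  salt "$tgt" cp.get_file salt://s3cp/zztop.sh /root/s3cp/zztop.sh
  salt "$tgt" cmd.run 'chmod +x /root/s3cp/zztop.sh'
done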

idx-2    indexerlist2    needs restart
idx-3    indexerlist1    needs restart
idx-4    indexerlist2    running
idx-5    indexerlist3    running
idx-6    indexerlist4    running
idx-7    indexerlist5    running
idx-8    indexerlist6    running
idx-9    indexerlist7    running
idx-10   indexerlist8    running
idx-11   indexerlist9    running
...

Distribute each list to an indexer and use the zztop script with egrep and xargs to download all the buckets.

tmux ( process multiple lines at the same time with the -P flag ):

egrep -h "*" indexerlist* | xargs -P 15 -n 1 ./zztop.sh
egrep -h "*" indexerlist* | head -1 | xargs -P 10 -n 1 ./zztop.sh