Use a calendar appointment to notify others when you are upgrading Splunk.
Naughton, Brandon <brandon.naughton@accenturefederal.com>; Williams, Colby <colby.williams@accenturefederal.com>; Waddle, Duane E. <duane.e.waddle@accenturefederal.com>; Damstra, Frederick T. <frederick.t.damstra@accenturefederal.com>; Reuther, John M. <john.m.reuther@accenturefederal.com>; Leonard, Wesley A. <wesley.a.leonard@accenturefederal.com>; Starcher, George <george.a.starcher@accenturefederal.com>; Rivas, Gregory A. <gregory.a.rivas@accenturefederal.com>; Jarrett, James M. <james.m.jarrett@accenturefederal.com>; Kerr, James <j.kerr@accenturefederal.com>
This is an FYI only. I plan on upgrading PROD Splunk during this time.
No need to notify the customer since this is a "behind the scenes" change. No customer-facing downtime.
Post to the Slack channels before you begin: xdr-patching, xdr-engineering, xdr-soc
Starting dc-c19 Splunk upgrade. Please plan on outages.
NOTE: The CM should be at the same or higher version than any Search Head connecting to it. Thus, upgrade the FM-shared-search, monitoring console, and qcompliance after upgrading all the Cluster Masters.
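The ordering rule in the note above can be checked before touching a search head. This is a minimal sketch, assuming GNU `sort -V` is available; the function name `version_ge` is hypothetical and not part of Splunk or Salt.

```shell
# version_ge A B: succeeds when version A is the same as or higher than B.
# Assumption: GNU sort with -V (version sort) is installed.
version_ge() {
    # After a version sort, the last line is the higher of the two versions.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Example use (values are placeholders):
#   version_ge "$cm_version" "$sh_version" || echo "Upgrade the CM first"
```

For example, `version_ge 8.2.3 8.2.2.1` succeeds, so a search head on 8.2.2.1 could connect to a CM on 8.2.3, but not the other way around.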
09/27/2021
Splunk Enterprise 8.2.2.1: https://d7wz6hmoaavd0.cloudfront.net/products/splunk/releases/8.2.2.1/linux/splunk-8.2.2.1-ae6821b7c64b-linux-2.6-x86_64.rpm
Universal Forwarder 8.2.2.1: https://d7wz6hmoaavd0.cloudfront.net/products/universalforwarder/releases/8.2.2.1/linux/splunkforwarder-8.2.2.1-ae6821b7c64b-linux-2.6-x86_64.rpm
Splunk Enterprise 8.2.3: https://download.splunk.com/products/splunk/releases/8.2.3/linux/splunk-8.2.3-cd0848707637-linux-2.6-x86_64.rpm?_ga=2.213141332.1340323660.1635200179-268321405.1634569782
Ensure there is a recent and persistent snapshot of the SH, HF, CM, etc. EBS volumes.
salt -C '( *sh* or *hf* ) and moose*' cmd.run 'systemctl stop splunk'
tar -cvzf /opt/splunk/opt-splunk-backup.tar.gz /opt/splunk
Update the profile, InstanceId, and tag to create snapshots of all volumes
aws --profile mdr-test-c2-gov ec2 create-snapshots --instance-specification 'InstanceId=i-02a546c0de3d20030,ExcludeBootVolume=false' --tag-specifications 'ResourceType=snapshot,Tags=[{Key=Name,Value=modelclient-splunk-hf-pre-upgrade-backup-8.0.5}]'
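Because the profile, InstanceId, and tag change for every host, it may help to generate the command rather than hand-edit it. The sketch below only prints the command for review and does not call AWS; the function name `snapshot_cmd` is hypothetical, and the flags are the ones already used above.

```shell
# snapshot_cmd PROFILE INSTANCE_ID NAME_TAG
# Prints (does not run) the create-snapshots command for one instance.
snapshot_cmd() {
    printf "aws --profile %s ec2 create-snapshots --instance-specification 'InstanceId=%s,ExcludeBootVolume=false' --tag-specifications 'ResourceType=snapshot,Tags=[{Key=Name,Value=%s}]'\n" "$1" "$2" "$3"
}

# Example use: review the printed command, then paste it into the terminal.
#   snapshot_cmd mdr-test-c2-gov i-02a546c0de3d20030 modelclient-splunk-hf-pre-upgrade-backup-8.0.5
```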
Before Splunk Upgrades
Upgrade ES 6.1.1/6.2.0 -> 6.6.2
The app failed to upload to the SH (the upload takes a long time). Modify etc/system/local/web.conf to allow large uploads.
[settings]
max_upload_size = 1024
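To confirm the setting landed in the right stanza before retrying the upload, a small reader for `.conf` files can help. This is a rough sketch, not a full parser: the function name `conf_value` is hypothetical, and it assumes simple `key = value` lines with no line continuations.

```shell
# conf_value FILE STANZA KEY
# Prints the value of KEY inside [STANZA] of a Splunk-style .conf file.
conf_value() {
    awk -v stanza="[$2]" -v key="$3" '
        # A stanza header switches matching on or off.
        /^\[/ { in_stanza = ($0 == stanza); next }
        in_stanza && index($0, "=") > 0 {
            k = substr($0, 1, index($0, "=") - 1)
            v = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim whitespace around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim whitespace around the value
            if (k == key) print v
        }
    ' "$1"
}

# Example use:
#   conf_value /opt/splunk/etc/system/local/web.conf settings max_upload_size
```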
Run the setup after the upgrade.
In CAASP, the app failed to upgrade with the error "invalid message type: 28" due to insufficient space in the /tmp dir. Be sure to have a minimum of 1.4 GB available in /tmp for the tarball to be extracted.
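The /tmp space requirement can be checked up front. A minimal sketch, assuming 1.4 GB is read as 1.4 * 1024 * 1024 KB (rounded down to 1468006 KB); the function names are hypothetical.

```shell
# Assumption: 1.4 GB treated as 1468006 KB.
REQUIRED_KB=1468006

# available_kb [PATH]: free space in KB on the filesystem holding PATH.
# df -P gives one line per filesystem; -k reports 1024-byte blocks.
available_kb() {
    df -Pk "${1:-/tmp}" | awk 'NR == 2 { print $4 }'
}

# enough_space AVAILABLE_KB REQUIRED_KB: succeeds when there is enough room.
enough_space() {
    [ "$1" -ge "$2" ]
}

# Example use:
#   enough_space "$(available_kb /tmp)" "$REQUIRED_KB" || echo "Free up /tmp before upgrading ES"
```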
Update the Salt pillar data to point to the new Splunk repo.
| rest /services/storage/passwords
Sensu and update the repo at the same time
cmd.run 'df -h'
salt -C 'moose* not moose-alsi*' saltutil.refresh_pillar
salt -C 'moose* not moose-alsi*' pillar.item yumrepos:splunk
state.sls splunk.update_repo
to update repo
Stop all servers at the same time
cmd.run 'systemctl stop splunk'
Upgrade CM, SH, HF, customer SH ( if applicable )
salt -C '( *cm* or *sh* or *hf* ) and moose*' pkg.upgrade name=splunk
Upgrade and Start Indexers
salt -C '*idx* and moose*' pkg.upgrade name=splunk
cmd.run 'systemctl start splunk'
cmd.run '/opt/splunk/bin/splunk version'
cmd.run '/opt/splunk/bin/splunk status'
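When the version command runs against many minions, it can be easier to pull out just the version numbers and count them. This is a sketch under the assumption that `splunk version` prints a line such as `Splunk 8.2.3 (build cd0848707637)`, possibly indented by Salt; the function name `splunk_version_of` is hypothetical.

```shell
# splunk_version_of: reads text on stdin and prints only the version numbers
# found in lines shaped like "Splunk ... X.Y.Z (build <hash>)".
# Assumption: that line format; other lines are ignored.
splunk_version_of() {
    sed -n 's/^ *Splunk.* \([0-9][0-9.]*\) (build [0-9a-f]*).*$/\1/p'
}

# Example use: count how many hosts report each version.
#   salt -C '*idx* and moose*' cmd.run '/opt/splunk/bin/splunk version' | splunk_version_of | sort | uniq -c
```

A single version in the output suggests every targeted host was upgraded; more than one points at a host that was missed.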
Start CM and SH and Cust-SH
salt -C '( *cm* or *sh* or *hf* ) and moose*' cmd.run 'systemctl start splunk'
Verify that Splunk Web is up, searches of the _internal index are working, and the CM shows three green checkmarks.
Upgrade fm-shared-search-0/splunk-mc-0/qcompliance-splunk-sh
- Migrate KV store storage engine to WiredTiger on the SHs ( where the KV store is used. )
- backup kvstore first!
- https://docs.splunk.com/Documentation/Splunk/8.2.2/Admin/BackupKVstore#Back_up_and_restore_the_KV_store_with_point_in_time_consistency
- Verify backup is there in /opt/splunk/var/lib/splunk/kvstorebackup
- https://docs.splunk.com/Documentation/Splunk/8.2.2/Admin/MigrateKVstore#Migrate_the_KV_store_after_an_upgrade_to_Splunk_Enterprise_8.1_or_higher_in_a_single-instance_deployment
- Upgrade apps slowly so Brandon can troubleshoot errors!
- Ensure 3 green checkmarks on the CM. Indexes _metrics and _introspection not being in `_cluster` prevents the 3 green checkmarks; update the CM bundle to include `_cluster`. See here: [Fixes for not replicating indexes?](https://github.xdr.accenturefederalcyber.com/mdr-engineering/msoc-afs-cm/pull/9)
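The backup verification step in the list above can be scripted so the WiredTiger migration is not started without a backup on disk. A minimal sketch, assuming GNU or BSD `find`; the function name `kvstore_backup_present` is hypothetical, and it only checks that the directory is non-empty, not that the archive is valid.

```shell
# kvstore_backup_present [DIR]
# Succeeds when DIR exists and contains at least one entry.
# The default path is the backup location named in the list above.
kvstore_backup_present() (
    dir="${1:-/opt/splunk/var/lib/splunk/kvstorebackup}"
    [ -d "$dir" ] && [ -n "$(find "$dir" -mindepth 1 -print -quit)" ]
)

# Example use:
#   kvstore_backup_present || echo "No KV store backup found; do not migrate yet"
```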
cmd.run 'yum clean all ; yum makecache fast'
pkg.upgrade name=splunkforwarder
cmd.run 'systemctl restart splunkuf'
state.sls internal_splunk_forwarder --output-diff test=false
salt 'minion*' cmd.run '/opt/splunkforwarder/bin/splunk version'
salt 'minion*' cmd.run '/opt/splunkforwarder/bin/splunk status'
salt -C 'dgi* not *com' test.ping
saltutil.refresh_pillar
pillar.item yumrepos:splunk
state.sls splunk.update_repo
cmd.run 'yum clean all ; yum makecache fast'
cmd.run 'df -h /opt'
cmd.run 'systemctl stop splunk'
cmd.run 'tar -czf /opt/opt-splunk-backup-8.0.5.tar.gz /opt/splunk'
cmd.run 'tar -czf /opt/syslog-ng/opt-splunk-backup-8.0.5.tar.gz /opt/splunk'
cmd.run 'ls -larth /opt'
pkg.upgrade name=splunk
cmd.run 'systemctl start splunk'
cmd.run '/opt/splunk/bin/splunk version'
cmd.run 'tail /opt/splunk/var/log/splunk/splunkd.log'
New Error: 11-09-2021 22:17:39.611 +0000 ERROR ExecProcessor [16242 ExecProcessor] - message from "/opt/splunk/bin/python3.7 /opt/splunk/etc/apps/splunk_secure_gateway/bin/ssg_enable_modular_input.py" Socket error communicating with splunkd (error=[Errno 111] Connection refused), path = https://127.0.0.1:9666//services/server/roles
The splunk_secure_gateway app got installed with 8.2. It is not used AFAIK and can be ignored.
<<<------------------ LEGACY -------------------->>>
08/11/2020
Software is located in Duane's OneDrive.
Upgrade AFS/NGA 7.0.3 -> 8.0.5
| rest /services/storage/passwords
salt afs* saltutil.refresh_pillar
salt afs* pillar.item yumrepos:splunk
state.sls splunk.new_install
to update repo; yes, it will restart Splunk. (ROOM FOR IMPROVEMENT: Make a new Salt state for the Splunk repo.)
cmd.run 'systemctl stop splunk'
pkg.upgrade name=splunk
state.sls splunk.new_install
to update repo
cmd.run 'systemctl stop splunk'
tar -cvzf /opt/splunk/opt-splunk-backup.tar.gz /opt/splunk
pkg.upgrade name=splunk
state.sls splunk.new_install
to update repo
cmd.run 'systemctl stop splunk'
pkg.upgrade name=splunk
cmd.run 'systemctl start splunk'
cmd.run '/opt/splunk/bin/splunk version'
cmd.run '/opt/splunk/bin/splunk status'
cmd.run 'systemctl start splunk'
state.sls splunk.new_install
to update repo
cmd.run 'systemctl stop splunk'
tar -cvzf /opt/splunk/opt-splunk-backup.tar.gz /opt/splunk
pkg.upgrade name=splunk
cmd.run 'systemctl start splunk'
Upgrade ES 5.0.1 -> 6.2.0
The app failed to upload to the SH (the upload takes a long time). Modify etc/system/local/web.conf to allow large uploads.
max_upload_size = 1024
See Matrix for other apps (upgrade apps slowly so Brandon can troubleshoot errors!)
Run the GeoIP DB update:
/usr/local/bin/maxmind-downloader.sh
(Prevents 3 green checkmarks on CM) Update the CM bundle to include _cluster. See here: Fixes for not replicating indexes? (index _metrics and _introspection not in _cluster)
NGA has an additional check on the Splunk HF IAM role for externalID. Be sure to add the "patch" back in. See here: Jira Ticket - MSOCI-623 - Splunk AWS TA doesn't support --external-id when assuming an IAM role. This is for the splunk_TA_aws app.
Upgrade Moose 7.2.1 -> 8.0.5 DONE!
Upgrade Covids 8.0.4 -> 8.0.5
Upgrade POP nodes