|
@@ -5,14 +5,17 @@
|
|
* Send email ~ 1 week before.
|
|
* Send email ~ 1 week before.
|
|
* Give a 15-minute warning in the Slack [#xdr-patching](https://afscyber.slack.com/archives/CJ462RRBM), [#xdr-content-aas](https://afscyber.slack.com/archives/C010NEX6X1N), [#xdr-soc Channel](https://afscyber.slack.com/archives/CFUP7STE2), [#xdr-engineering Channel](https://afscyber.slack.com/archives/CFTJSTGDB) channels, etc., before patching
|
|
* Give a 15-minute warning in the Slack [#xdr-patching](https://afscyber.slack.com/archives/CJ462RRBM), [#xdr-content-aas](https://afscyber.slack.com/archives/C010NEX6X1N), [#xdr-soc Channel](https://afscyber.slack.com/archives/CFUP7STE2), [#xdr-engineering Channel](https://afscyber.slack.com/archives/CFTJSTGDB) channels, etc., before patching
|
|
|
|
|
|
|
|
+---
|
|
## Patching Process
|
|
## Patching Process
|
|
|
|
|
|
[Day 1](#Day-1-Wednesday)
|
|
[Day 1](#Day-1-Wednesday)
|
|
[Day 2](#Day-2-Thursday)
|
|
[Day 2](#Day-2-Thursday)
|
|
-[Day 3](#Day-3-Monday)
|
|
|
|
|
|
+[Day 3](#Day-3-Monday)
|
|
[Day 3-afternoon](#Day-3-Monday-afternoon)
|
|
[Day 3-afternoon](#Day-3-Monday-afternoon)
|
|
[Day 4](#Day-4-Tuesday)
|
|
[Day 4](#Day-4-Tuesday)
|
|
|
|
|
|
|
|
+---
|
|
|
|
+
|
|
Each month the AWS `GovCloud(GC) TEST/PROD` environments must be patched to comply with FedRAMP requirements. This wiki page outlines the process for patching the environment.
|
|
Each month the AWS `GovCloud(GC) TEST/PROD` environments must be patched to comply with FedRAMP requirements. This wiki page outlines the process for patching the environment.
|
|
|
|
|
|
Email template to send out ahead of patching (alternatively, create a calendar event for patching), along with the email addresses of the individuals who should get the invite.
|
|
Email template to send out ahead of patching (alternatively, create a calendar event for patching), along with the email addresses of the individuals who should get the invite.
|
|
@@ -58,7 +61,7 @@ Tuesday <INSERT MONTH> 17:
|
|
|
|
|
|
The customer and user impact will be during the reboots so they will be done in batches to reduce our total downtime.
|
|
The customer and user impact will be during the reboots so they will be done in batches to reduce our total downtime.
|
|
```
|
|
```
|
|
-
|
|
|
|
|
|
+---
|
|
## Detailed Steps (Brad's patching)
|
|
## Detailed Steps (Brad's patching)
|
|
|
|
|
|
## HEY BRAD: READ ME!
|
|
## HEY BRAD: READ ME!
|
|
@@ -71,7 +74,7 @@ It's safe to run on `*` and will remove any RHEL registration (or warnings about
|
|
|
|
|
|
**Reminder** - The legacy `Reposerver` was shut down in late February 2021, so consider it a suspect if you have issues.
|
|
**Reminder** - The legacy `Reposerver` was shut down in late February 2021, so consider it a suspect if you have issues.
|
|
|
|
|
|
-
|
|
|
|
|
|
+---
|
|
### Day 1 (Wednesday)
|
|
### Day 1 (Wednesday)
|
|
|
|
|
|
Patch `GC TEST` first! This helps find problems in `TEST` and potential problems in `PROD`. `TEST` is shut down to save on costs:
|
|
Patch `GC TEST` first! This helps find problems in `TEST` and potential problems in `PROD`. `TEST` is shut down to save on costs:
|
|
@@ -99,13 +102,13 @@ FYI, patching today.
|
|
Starting with Moose and Internal infra patching within `GC TEST`. Check disk space for potential issues. Return here to start on PROD after TEST is patched.
|
|
Starting with Moose and Internal infra patching within `GC TEST`. Check disk space for potential issues. Return here to start on PROD after TEST is patched.
|
|
```
|
|
```
|
|
# Test connectivity between Salt Master and Minions
|
|
# Test connectivity between Salt Master and Minions
|
|
-salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* or vmray* or resolver-vmray* or qcompliance* or openvpn* )' test.ping --out=txt
|
|
|
|
|
|
+salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* or vmray* or resolver-vmray* )' test.ping --out=txt
|
|
|
|
|
|
# Fred's update for df -h - checks for disk utilization at the 80-90% area
|
|
# Fred's update for df -h - checks for disk utilization at the 80-90% area
|
|
-salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* or vmray* or resolver-vmray* or qcompliance* or openvpn* )' cmd.run 'df -h | egrep "[890][0-9]\%"'
|
|
|
|
|
|
+salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* or vmray* or resolver-vmray* )' cmd.run 'df -h | egrep "[890][0-9]\%"'
|
|
|
|
|
|
# Review packages that will be updated. Some packages are versionlocked (Collectd, Splunk, Teleport, etc.).
|
|
# Review packages that will be updated. Some packages are versionlocked (Collectd, Splunk, Teleport, etc.).
|
|
-salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* or vmray* or resolver-vmray* or qcompliance* or openvpn* )' cmd.run 'yum check-update'
|
|
|
|
|
|
+salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* or vmray* or resolver-vmray* )' cmd.run 'yum check-update'
|
|
```
|
|
```
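
The `egrep "[890][0-9]\%"` filter above matches on digit patterns rather than comparing numbers. As a rough alternative, here is a minimal sketch that compares the usage column numerically. The function name `disks_over` and the default threshold of 80 are assumptions for illustration, not part of the existing runbook, and it expects `df -P` (POSIX output format) on stdin.

```shell
# Hypothetical helper (not in the runbook): print mount points whose usage
# is at or above a threshold. Reads `df -P` style output from stdin.
disks_over() {
  threshold="${1:-80}"   # assumed default; adjust to taste
  awk -v t="$threshold" '
    NR > 1 {                      # skip the df header line
      use = $5                    # "Capacity" column, e.g. "85%"
      sub(/%/, "", use)           # strip the percent sign
      if (use + 0 >= t) print $6, $5
    }'
}
```

Run locally as `df -P | disks_over 80`. Mount points containing spaces are not handled by this sketch.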
|
|
|
|
|
|
<!-- ```
|
|
<!-- ```
|
|
@@ -116,13 +119,10 @@ salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib
|
|
salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* )' cmd.run 'df -h'
|
|
salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* )' cmd.run 'df -h'
|
|
|
|
|
|
``` -->
|
|
``` -->
|
|
-> :warning: **OpenVPN decommissioned on March 25, 2022; replaced with AWS VPN. Omit OpenVPN Commands.**
|
|
|
|
-
|
|
|
|
-See [AWS VPN Notes](AWS%20VPN%20NOTES.md)
|
|
|
|
|
|
|
|
-### Also, the phantom_repo pkg wants to upgrade, but we are not ready. Let's exclude that.
|
|
|
|
|
|
+### Also, the `phantom_repo` pkg wants to upgrade, but we are not ready. Let's exclude that.
|
|
```
|
|
```
|
|
-salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* or openvpn* or vmray* or resolver-vmray* or phantom-0* or qcompliance* )' pkg.upgrade
|
|
|
|
|
|
+salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* or vmray* or resolver-vmray* or phantom-0* )' pkg.upgrade
|
|
|
|
|
|
# update phantom, but exclude the phantom repo.
|
|
# update phantom, but exclude the phantom repo.
|
|
salt -C 'phantom-0*' pkg.upgrade disablerepo='["phantom-base",]'
|
|
salt -C 'phantom-0*' pkg.upgrade disablerepo='["phantom-base",]'
|
|
@@ -152,17 +152,13 @@ salt vmray* cmd.run 'systemctl start vmray-server vmray-worker'
|
|
5. Reboot the Server (later? or now?) wait until all servers get rebooted.
|
|
5. Reboot the Server (later? or now?) wait until all servers get rebooted.
|
|
```
|
|
```
|
|
|
|
|
|
-<!-- ```
|
|
|
|
-### Now Patch OpenVPN server and monitor during process in case any issues occur; ie, you get kicked off of VPN, etc.
|
|
|
|
-`salt -C 'openvpn*' pkg.upgrade`
|
|
|
|
-``` -->
|
|
|
|
-
|
|
|
|
#### What about threatq? Ask Duane! It needs special handling.
|
|
#### What about threatq? Ask Duane! It needs special handling.
|
|
|
|
|
|
### Run it again to make sure nothing got missed.
|
|
### Run it again to make sure nothing got missed.
|
|
```
|
|
```
|
|
-salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* or vmray* or resolver-vmray* or phantom-0* or qcompliance* )' pkg.upgrade
|
|
|
|
|
|
+salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or bas-* or ca-c19* or frtib* or dgi* or threatq* or vmray* or resolver-vmray* or phantom-0* )' pkg.upgrade
|
|
```
|
|
```
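
The same exclusion list is repeated in every compound target above, so dropping a host pattern means touching each command. One way to keep the list in a single place is a small helper that assembles the `-C` expression. This is only a sketch: `build_target` is a hypothetical name, and the patterns passed to it are whatever the runbook currently excludes.

```shell
# Hypothetical helper (not in the runbook): build a Salt compound target of
# the form "<base> not ( pat1 or pat2 ... )" from a list of minion globs.
build_target() {
  base="$1"
  shift
  expr=""
  for pat in "$@"; do
    if [ -z "$expr" ]; then
      expr="$pat"
    else
      expr="$expr or $pat"
    fi
  done
  printf '%s not ( %s )\n' "$base" "$expr"
}
```

For example, `salt -C "$(build_target '*' 'afs*' 'nga*' 'threatq*')" test.ping --out=txt` would expand to the same style of target used in the commands above. Quote each glob so the shell does not expand it locally.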
|
|
|
|
+---
|
|
|
|
|
|
> :warning: After upgrades check on Portal to make sure it is still up.
|
|
> :warning: After upgrades check on Portal to make sure it is still up.
|
|
|
|
|
|
@@ -176,9 +172,12 @@ date; salt 'customer-portal*' cmd.run 'systemctl restart docker'
|
|
|
|
|
|
Portal Notes are here for further Troubleshooting if necessary: [Portal Notes](Portal%20Notes.md)
|
|
Portal Notes are here for further Troubleshooting if necessary: [Portal Notes](Portal%20Notes.md)
|
|
|
|
|
|
|
|
+---
|
|
#### Patch CaaSP
|
|
#### Patch CaaSP
|
|
See [Patch CaaSP instructions](Patching%20Notes--CaaSP.md)
|
|
See [Patch CaaSP instructions](Patching%20Notes--CaaSP.md)
|
|
|
|
|
|
|
|
+---
|
|
|
|
+
|
|
#### Troubleshooting
|
|
#### Troubleshooting
|
|
|
|
|
|
Phantom error
|
|
Phantom error
|
|
@@ -232,13 +231,6 @@ yum install yum-utils
|
|
package-cleanup --oldkernels --count=1 -y
|
|
package-cleanup --oldkernels --count=1 -y
|
|
```
|
|
```
|
|
|
|
|
|
-<!-- ```
|
|
|
|
-If VPN server stops working,
|
|
|
|
-Try a stop and start of the VPN service ([OpenVPN Notes](OpenVPN%20Notes.md)). The private IP will probably change.
|
|
|
|
-
|
|
|
|
-``` -->
|
|
|
|
-
|
|
|
|
-
|
|
|
|
#### ISSUE: Salt-minion doesn't come back and has this error
|
|
#### ISSUE: Salt-minion doesn't come back and has this error
|
|
```
|
|
```
|
|
/usr/lib/dracut/modules.d/90kernel-modules/module-setup.sh: line 16: /lib/modules/3.10.0-957.21.3.el7.x86_64///lib/modules/3.10.0-957.21.3.el7.x86_64/kernel/sound/drivers/mpu401/snd-mpu401.ko.xz: No such file or directory
|
|
/usr/lib/dracut/modules.d/90kernel-modules/module-setup.sh: line 16: /lib/modules/3.10.0-957.21.3.el7.x86_64///lib/modules/3.10.0-957.21.3.el7.x86_64/kernel/sound/drivers/mpu401/snd-mpu401.ko.xz: No such file or directory
|
|
@@ -246,7 +238,7 @@ Try a stop and start of the VPN service ([OpenVPN Notes](OpenVPN%20Notes.md)). T
|
|
|
|
|
|
RESOLUTION: Manually reboot the OS; this is most likely due to a kernel upgrade.
|
|
RESOLUTION: Manually reboot the OS; this is most likely due to a kernel upgrade.
|
|
|
|
|
|
-
|
|
|
|
|
|
+---
|
|
### Day 2 (Thursday)
|
|
### Day 2 (Thursday)
|
|
|
|
|
|
#### Step 1 of 4 (Day 2): Reboot Internals
|
|
#### Step 1 of 4 (Day 2): Reboot Internals
|
|
@@ -287,15 +279,15 @@ watch "salt -C 'vault-3* or sensu*' test.ping --out=txt"
|
|
|
|
|
|
Reboot majority of servers in `GC Test`.
|
|
Reboot majority of servers in `GC Test`.
|
|
```
|
|
```
|
|
-salt -C '*com not ( modelclient-splunk-idx* or moose-splunk-idx* or resolver* or sensu* or threatq-* or vmray-* or vault-3* or openvpn* or qcompliance* or rhsso-0* )' test.ping --out=txt
|
|
|
|
-date; salt -C '*com not ( modelclient-splunk-idx* or moose-splunk-idx* or resolver* or sensu* or threatq-* or vmray-* or vault-3* or openvpn* or qcompliance* or rhsso-0* )' system.reboot --async
|
|
|
|
|
|
+salt -C '*com not ( modelclient-splunk-idx* or moose-splunk-idx* or resolver* or sensu* or threatq-* or vmray-* or vault-3* or rhsso-0* )' test.ping --out=txt
|
|
|
|
+date; salt -C '*com not ( modelclient-splunk-idx* or moose-splunk-idx* or resolver* or sensu* or threatq-* or vmray-* or vault-3* or rhsso-0* )' system.reboot --async
|
|
```
|
|
```
|
|
> :warning:
|
|
> :warning:
|
|
### You will lose connectivity to Salt Master
|
|
### You will lose connectivity to Salt Master
|
|
### Log back in and verify they are back up
|
|
### Log back in and verify they are back up
|
|
|
|
|
|
```
|
|
```
|
|
-watch "salt -C '*com not ( modelclient-splunk-idx* or moose-splunk-idx* or resolver* or sensu* or threatq-* or vmray-* or vault-3* or openvpn* or qcompliance* or rhsso-0* )' cmd.run 'uptime' --out=txt"
|
|
|
|
|
|
+watch "salt -C '*com not ( modelclient-splunk-idx* or moose-splunk-idx* or resolver* or sensu* or threatq-* or vmray-* or vault-3* or rhsso-0* )' cmd.run 'uptime' --out=txt"
|
|
```
|
|
```
|
|
|
|
|
|
Take care of the govcloud Resolvers one at a time. The vmray can be combined with one of the govcloud ones.
|
|
Take care of the govcloud Resolvers one at a time. The vmray can be combined with one of the govcloud ones.
|
|
@@ -315,10 +307,10 @@ salt -C '*com not ( modelclient-splunk-idx* or moose-splunk-idx* or threatq-* o
|
|
```
|
|
```
|
|
### Duane Section (feel free to bypass)
|
|
### Duane Section (feel free to bypass)
|
|
--
|
|
--
|
|
-I (Duane) did this a little different. Salt-master first, then openvpn, then everything but resolvers. Resolvers reboot one at a time.
|
|
|
|
|
|
+I (Duane) did this a little differently. Salt-master first, then everything but resolvers. Resolvers reboot one at a time.
|
|
|
|
|
|
```
|
|
```
|
|
-salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or openvpn* or qcomp* or salt-master* or moose-splunk-indexer-* or resolver* )' cmd.run 'shutdown -r now'
|
|
|
|
|
|
+salt -C '* not ( afs* or nga* or dc-c19* or la-c19* or qcomp* or salt-master* or moose-splunk-indexer-* or resolver* )' cmd.run 'shutdown -r now'
|
|
```
|
|
```
|
|
--
|
|
--
|
|
|
|
|
|
@@ -358,16 +350,16 @@ watch "salt -C 'vault-1*com or sensu*com' test.ping --out=txt"
|
|
|
|
|
|
Reboot majority of servers in GC.
|
|
Reboot majority of servers in GC.
|
|
```
|
|
```
|
|
-salt -C '*com not ( afs* or nga* or dc-c19* or la-c19* or dgi-* or moose-splunk-idx* or modelclient-splunk-idx* or bas-* or frtib* or ca-c19* or resolver* or vault-1*com or sensu*com or qcompliance* or vmray-worker* or openvpn* )' test.ping --out=txt
|
|
|
|
|
|
+salt -C '*com not ( afs* or nga* or dc-c19* or la-c19* or dgi-* or moose-splunk-idx* or modelclient-splunk-idx* or bas-* or frtib* or ca-c19* or resolver* or vault-1*com or sensu*com or vmray-worker* )' test.ping --out=txt
|
|
|
|
|
|
-date; salt -C '*com not ( afs* or nga* or dc-c19* or la-c19* or dgi-* or moose-splunk-idx* or modelclient-splunk-idx* or bas-* or frtib* or ca-c19* or resolver* or vault-1*com or sensu*com or qcompliance* or vmray-worker* or openvpn* )' system.reboot --async
|
|
|
|
|
|
+date; salt -C '*com not ( afs* or nga* or dc-c19* or la-c19* or dgi-* or moose-splunk-idx* or modelclient-splunk-idx* or bas-* or frtib* or ca-c19* or resolver* or vault-1*com or sensu*com or vmray-worker* )' system.reboot --async
|
|
```
|
|
```
|
|
> :warning:
|
|
> :warning:
|
|
### You will lose connectivity to Salt master
|
|
### You will lose connectivity to Salt master
|
|
### Log back in and verify they are back up
|
|
### Log back in and verify they are back up
|
|
|
|
|
|
```
|
|
```
|
|
-watch "salt -C '*accenturefederalcyber.com not ( afs* or nga* or dc-c19* or la-c19* or dgi-* or moose-splunk-idx* or modelclient-splunk-idx* or bas-* or frtib* or ca-c19* or resolver* or vault-1*com or sensu*com or qcompliance* )' cmd.run 'uptime' --out=txt"
|
|
|
|
|
|
+watch "salt -C '*accenturefederalcyber.com not ( afs* or nga* or dc-c19* or la-c19* or dgi-* or moose-splunk-idx* or modelclient-splunk-idx* or bas-* or frtib* or ca-c19* or resolver* or vault-1*com or sensu*com )' cmd.run 'uptime' --out=txt"
|
|
```
|
|
```
|
|
|
|
|
|
Take care of the resolvers one at a time and with the `GC Prod Salt Master`. Reboot one of each at the same time.
|
|
Take care of the resolvers one at a time and with the `GC Prod Salt Master`. Reboot one of each at the same time.
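
The "one at a time" reboot can be sketched as a loop that reboots a minion and waits for it to answer `test.ping` before moving to the next. This is an illustration under assumptions, not the established procedure: `rolling_reboot`, `SETTLE`, and `POLL` are hypothetical names, and it assumes that `test.ping --out=txt` prints `True` for a responsive minion.

```shell
# Hypothetical helper (not in the runbook): reboot minions one at a time,
# waiting for each to answer test.ping before starting the next.
# SETTLE and POLL are assumed tunables, in seconds.
rolling_reboot() {
  settle="${SETTLE:-60}"
  poll="${POLL:-15}"
  for minion in "$@"; do
    date
    salt "$minion" system.reboot --async
    sleep "$settle"   # give the host time to actually go down
    # Assumes --out=txt prints "<minion>: True" once the minion responds.
    until salt "$minion" test.ping --out=txt | grep -q 'True'; do
      sleep "$poll"
    done
    echo "$minion is back"
  done
}
```

Pass the actual resolver minion IDs as arguments. The loop has no timeout, so a minion that never returns will block it; in that case interrupt and fall back to the manual steps.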
|