Updates Patching notes and more!

Brad Poulton 4 years ago
commit 6f4aac7035
4 changed files with 66 additions and 18 deletions
  1. ClamAV notes.md (3 additions, 0 deletions)
  2. Patching Notes.md (58 additions, 16 deletions)
  3. ScaleFT Notes.md (1 addition, 1 deletion)
  4. Terraform Notes.md (4 additions, 1 deletion)

+ 3 - 0
ClamAV notes.md

@@ -1,5 +1,8 @@
 # ClamAV Notes
 
+Stop the ClamAV scanning service:
+`service clamd@scan stop`
+
 # clamscan vs clamdscan
 
 clamscan is the full scanner, clamdscan talks to the clam daemon who runs

+ 58 - 16
Patching Notes.md

@@ -32,7 +32,7 @@ Wednesday Dec 11:
 Thursday Dec 12:
 * Moose and Internal
   * Reboots
-* All Customer PoP
+* All Customer PoP/LCP
   * Patching (AM)
   * Reboots (PM)
 
@@ -53,6 +53,9 @@ The customer and user impact will be during the reboots so they will be done in
 ## Detailed Steps (brad's patching)
 
 ### Day 1 (Wednesday), step 1 of 1: Moose and Internal infrastructure - Patching
+
+Patch TEST first! This helps find problems in TEST and potential problems in PROD. 
+
 Post to slack:
 ```
 FYI, patching today. 
@@ -73,11 +76,39 @@ salt -C '* not ( afs* or saf* or nga* or ma-* or mo-* or dc-c19* or la-c19* )' c
 salt -C '* not ( afs* or saf* or nga* or ma-* or mo-* or dc-c19* or la-c19* )' cmd.run 'df -h | egrep "[890][0-9]\%"'
 #review packages that will be updated. some packages are versionlocked (Collectd, Splunk,etc.).
 salt -C '* not ( afs* or saf* or nga* or ma-* or mo-* or dc-c19* or la-c19* )' cmd.run 'yum check-update' 
-salt -C '* not ( afs* or saf* or nga* or ma-* or mo-* or dc-c19* or la-c19* )' pkg.upgrade
+#OpenVPN sometimes goes down during patching and needs a service restart, so patch the VPN after everything else. I am not sure which package causes the issue. Kernel? bind-utils?
+# Also, the phantom_repo pkg wants to upgrade, but we are not ready. Exclude that package to prevent errors.
+salt -C '* not ( afs* or saf* or nga* or ma-* or mo-* or dc-c19* or la-c19* or openvpn* )' pkg.upgrade exclude='phantom_repo'
+salt -C 'openvpn*' pkg.upgrade
+#Just to be sure, run it again to make sure nothing got missed. 
+salt -C '* not ( afs* or saf* or nga* or ma-* or mo-* or dc-c19* or la-c19* )' pkg.upgrade exclude='phantom_repo'
 ```
 
 > :warning: After upgrades check on Portal to make sure it is still up. 
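
The long exclusion list in the compound targets above is retyped for every command, which makes it easy to drop a pattern by accident. A minimal sketch of building the target string once, assuming bash; the function name `salt_target_excluding` is hypothetical and not part of the original notes:

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a Salt compound target of the form
#   * not ( pat1 or pat2 or ... )
# from the patterns given as arguments.
salt_target_excluding() {
  local joined="" pat
  for pat in "$@"; do
    # Add " or " between patterns, but not before the first one.
    joined="${joined:+$joined or }$pat"
  done
  printf '* not ( %s )\n' "$joined"
}

# Usage (commented out so that sourcing this file runs nothing):
#   base=(afs\* saf\* nga\* ma-\* mo-\* dc-c19\* la-c19\*)
#   salt -C "$(salt_target_excluding "${base[@]}" 'openvpn*')" pkg.upgrade exclude='phantom_repo'
#   salt -C 'openvpn*' pkg.upgrade
```

Quote the patterns when calling the helper so the shell does not expand the globs before Salt sees them.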
 
+Phantom error
+```
+phantom.msoc.defpoint.local:
+    ERROR: Problem encountered upgrading packages. Additional info follows:
+
+    changes:
+        ----------
+    result:
+        ----------
+        pid:
+            40718
+        retcode:
+            1
+        stderr:
+            Running scope as unit run-40718.scope.
+            Error in PREIN scriptlet in rpm package phantom_repo-4.9.39220-1.x86_64
+            phantom_repo-4.9.37880-1.x86_64 was supposed to be removed but is not!
+        stdout:
+            Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
+            Logging to /var/log/phantom/phantom_install_log
+            error: %pre(phantom_repo-4.9.39220-1.x86_64) scriptlet failed, exit status 7
+```
+
 #### Error: `error: unpacking of archive failed on file /usr/lib/python2.7/site-packages/urllib3/packages/ssl_match_hostname: cpio: rename failed`
 
 `salt ma-* cmd.run 'pip uninstall urllib3 -y'`
@@ -113,32 +144,43 @@ RESOLUTION: Manually reboot the OS, this is most likely due to a kernal upgrade.
 
 ### Day 2 (Thursday), step 1 of 4:  Reboot Internals 
 
+Don't forget to reboot TEST.
+
 Post to slack:
 ```
 FYI, patching today. 
 * In about 15 minutes: Reboots of moose and internal systems, including the VPN.
-* Following that, patching (but not rebooting) of all customer PoPs.
-* Then this afternoon, reboots of those those PoPs.
+* Following that, patching (but not rebooting) of all customer PoPs/LCPs.
+* Then this afternoon, reboots of those PoPs/LCPs.
 ```
 
 Be sure to select ALL events in sensu for silencing not just the first 25. 
 Sensu -> Entities -> Sort (name) -> Select Entity and Silence. This will silence both keepalive and other checks. 
 Some silenced events will not unsilence and will need to be manually unsilenced.
-*IDEA! restart the sensu server and the vault-3 server first. this helps with the clearing of the silenced entities.*
+*IDEA! restart the sensu server and the vault-3 server first. This helps with the clearing of the silenced entities.*
 
 ```
 salt -L 'vault-3.msoc.defpoint.local,sensu.msoc.defpoint.local' test.ping
 date; salt -L 'vault-3.msoc.defpoint.local,sensu.msoc.defpoint.local' system.reboot
 watch "salt -L 'vault-3.msoc.defpoint.local,sensu.msoc.defpoint.local' test.ping"
-salt -C '* not ( moose-splunk-indexer* or afs* or nga* or ma-* or mo-* or la-* or dc-* or vault-3* or sensu* )' test.ping --out=txt
-date; salt -C '* not ( moose-splunk-indexer* or afs* or nga* or ma-* or mo-* or la-* or dc-* or vault-3* or sensu* )' system.reboot
+salt -C '* not ( moose-splunk-indexer* or afs* or nga* or ma-* or mo-* or la-* or dc-* or vault-3* or sensu* or interconnect* or resolver* )' test.ping --out=txt
+date; salt -C '* not ( moose-splunk-indexer* or afs* or nga* or ma-* or mo-* or la-* or dc-* or vault-3* or sensu* or interconnect* or resolver* )' system.reboot
 #you will lose connectivity to openvpn and salt master
 #log back in and verify they are back up
-watch "salt -C '* not ( moose-splunk-indexer* or afs* or nga* or ma-* or mo-* or la-* or dc-* or vault-3* or sensu* )' cmd.run 'uptime' --out=txt"
+watch "salt -C '* not ( moose-splunk-indexer* or afs* or nga* or ma-* or mo-* or la-* or dc-* or vault-3* or sensu* or interconnect* or resolver* )' cmd.run 'uptime' --out=txt"
+#take care of the interconnects/resolvers one at a time.
+salt 'interconnect-0.pvt.xdr.accenturefederalcyber.com' test.ping 
+salt 'interconnect-0.pvt.xdr.accenturefederalcyber.com' system.reboot
+salt 'interconnect-1.pvt.xdr.accenturefederalcyber.com' test.ping 
+salt 'interconnect-1.pvt.xdr.accenturefederalcyber.com' system.reboot
+salt 'resolver-commercial.pvt.xdr.accenturefederalcyber.com' test.ping
+salt 'resolver-commercial.pvt.xdr.accenturefederalcyber.com' system.reboot
+salt 'resolver-govcloud.pvt.xdr.accenturefederalcyber.com' test.ping
+salt 'resolver-govcloud.pvt.xdr.accenturefederalcyber.com' system.reboot
 ```
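
The one-at-a-time reboots above can also be scripted so that the next host is not touched until the previous one answers again. This is only a sketch, assuming bash and that `test.ping --out=txt` prints `<minion>: True` for a responsive minion. The function names and the `SETTLE_SECONDS` and `POLL_SECONDS` variables are hypothetical, and the settle delay is a guess at how long a host needs before it actually goes down.

```shell
#!/usr/bin/env bash
# Sketch: reboot Salt minions strictly one at a time.
# SETTLE_SECONDS and POLL_SECONDS are hypothetical tunables, not from the notes.

minion_is_up() {
  # Succeeds only if the minion answered test.ping with True.
  salt "$1" test.ping --out=txt 2>/dev/null | grep -q ': True$'
}

reboot_one_at_a_time() {
  local minion
  for minion in "$@"; do
    if ! minion_is_up "$minion"; then
      echo "stopping: $minion did not answer test.ping" >&2
      return 1
    fi
    date
    salt "$minion" system.reboot
    # Give the host time to go down, otherwise the first poll may still
    # reach the old, not-yet-rebooted minion.
    sleep "${SETTLE_SECONDS:-60}"
    until minion_is_up "$minion"; do
      sleep "${POLL_SECONDS:-15}"
    done
    echo "$minion is back"
  done
}

# Usage (commented out so that sourcing this file reboots nothing):
#   reboot_one_at_a_time \
#     'interconnect-0.pvt.xdr.accenturefederalcyber.com' \
#     'interconnect-1.pvt.xdr.accenturefederalcyber.com'
```

The function stops at the first host that does not respond before its reboot, so a problem with one interconnect or resolver is not followed by a reboot of its peer.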
 
 I (Duane) did this a little differently.  Salt-master first, then openvpn, then everything but
-interconnects and resolvers.
+interconnects and resolvers. Interconnects and resolvers reboot one at a time.
 
 ```
 salt -C '* not ( afs* or saf* or nga* or ma-* or mo-* or dc-c19* or la-c19* or openvpn* or qcomp* or salt-master* or moose-splunk-indexer-* or interconnect* or resolver* )' cmd.run 'shutdown -r now'
@@ -168,7 +210,7 @@ Repeat the above patching steps for the additional indexers, waiting for 3 green
 # Do the second indexer
 salt -C 'moose-splunk-indexer-i-0b11e585de680b383.msoc.defpoint.local' test.ping --out=txt
 date; salt -C 'moose-splunk-indexer-i-0b11e585de680b383.msoc.defpoint.local' system.reboot
-#indexers take a while date; salt -C 'moose-splunk-indexer-i-00ca1da87a2abcd56.msoc.defpoint.local' system.rebootto restart
+#indexers take a while to restart
 watch "salt -C 'moose-splunk-indexer-i-0b11e585de680b383.msoc.defpoint.local' cmd.run 'uptime' --out=txt"
 
 # Do the third indexer
@@ -380,13 +422,13 @@ afs-splunk-syslog-4: {u'location': u'San Antonio'}
 
 salt -C 'afs-splunk-syslog*' grains.item location
 
-salt -L 'afs-splunk-syslog-6, afs-splunk-syslog-8' cmd.run 'uptime'
-date; salt -L 'afs-splunk-syslog-6, afs-splunk-syslog-8' system.reboot
-watch "salt -L 'afs-splunk-syslog-6, afs-splunk-syslog-8' test.ping"
+salt -L 'afs-splunk-syslog-3, afs-splunk-syslog-5, afs-splunk-syslog-7' cmd.run 'uptime'
+date; salt -L 'afs-splunk-syslog-3, afs-splunk-syslog-5, afs-splunk-syslog-7' system.reboot
+watch "salt -L 'afs-splunk-syslog-3, afs-splunk-syslog-5, afs-splunk-syslog-7' test.ping"
 
-salt -L 'afs-splunk-syslog-5, afs-splunk-syslog-7' cmd.run 'uptime'
-date; salt -L 'afs-splunk-syslog-5, afs-splunk-syslog-7' system.reboot
-watch "salt -L 'afs-splunk-syslog-5, afs-splunk-syslog-7' test.ping"
+salt -L 'afs-splunk-syslog-4, afs-splunk-syslog-6, afs-splunk-syslog-8' cmd.run 'uptime'
+date; salt -L 'afs-splunk-syslog-4, afs-splunk-syslog-6, afs-splunk-syslog-8' system.reboot
+watch "salt -L 'afs-splunk-syslog-4, afs-splunk-syslog-6, afs-splunk-syslog-8' test.ping"
 ```
 
 #### Verify logs are flowing

+ 1 - 1
ScaleFT Notes.md

@@ -45,7 +45,7 @@ Match exec "/usr/local/bin/sft resolve -q  %h" !User centos
 
 ssh using msoc_build key
 
-ssh -i msoc_build_fips centos@10.80.101.126
+ssh -i Documents/MDR/SSH\ Keys/msoc_build_fips centos@10.80.101.126
 
 How to use the msoc_build key and bastion
 1. add msoc_build key to your ssh agent `ssh-add msoc_build_fips`

+ 4 - 1
Terraform Notes.md

@@ -86,7 +86,10 @@ locals are variables that can refer to variables or other locals
 variables - expecting data from somewhere else.
 provider instance of the API
 
-
+Some files are symlinks.
+`ln -s ../common/variables.tf variables.tf`
+`ln -s ../amis.tf amis.tf`
+`ln -s ../../../../prod/aws-us-gov/mdr-prod-c2/090-instance-vault/README.md README.md`
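
Relative symlinks like these break silently when a directory is moved or copied. A small sketch for listing the symlinks under a directory and spotting dangling ones, assuming bash and a `find` that supports `-exec`; both function names are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting symlinks in a Terraform tree.

# Print every symlink under the given directory (default: .) with its target.
list_symlinks() {
  find "${1:-.}" -type l -exec sh -c 'printf "%s -> %s\n" "$1" "$(readlink "$1")"' _ {} \;
}

# Print only symlinks whose target does not exist.
# `test -e` follows the link, so it fails for a dangling one.
broken_symlinks() {
  find "${1:-.}" -type l ! -exec test -e {} \; -print
}

# Usage (commented out; run from the root of the Terraform repository):
#   list_symlinks .
#   broken_symlinks .
```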
 
 
 --------------------