
Sensu Upgrade Notes

Where the official Sensu Go code and documentation are located


:warning: We will use our XDR Internal Reposerver for all upgrade methods - See How to add a new package to the Reposerver

Sensu Go Upgrade History


Sensu Go Upgrade Process


We want to deploy the new code in iterations so that we can quickly abort the deployment if we run into any issues. Start with the GC TEST XDR infrastructure.

Begin with Moose and the internal infrastructure within GC TEST. After the deployment is verified and functional, let it bake for 24-48 hours before deploying to GC PROD.
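
As a rough sketch of this iterative approach (not part of the documented procedure), a small wrapper can run one command against each target group in order and stop at the first failure. The function name `staged_rollout` and the `SALT` override are hypothetical. The sketch also assumes that a non-zero exit status from `salt` signals a failed stage, which should be confirmed for the Salt version in use.

```shell
# Command used to reach the minions; override with SALT=echo for a dry run.
SALT="${SALT:-salt}"

# Usage: staged_rollout 'remote command' 'target 1' 'target 2' ...
# Runs the remote command on each compound target in order.
staged_rollout() {
    remote_cmd="$1"
    shift
    for target in "$@"; do
        echo "== $(date '+%F %T') target: ${target}"
        # Assumption: a non-zero exit status means this stage failed.
        if ! "$SALT" -C "$target" cmd.run "$remote_cmd"; then
            echo "ABORT: stage failed for target: ${target}" >&2
            return 1
        fi
    done
}

# Example (targets taken from the steps in this document):
# staged_rollout 'sensu-agent version' 'sensu*' '*-sh* and not *moose* and not fm-shared-search*'
```

Setting `SALT=echo` prints each Salt command instead of running it, which gives a quick rehearsal of the target order before touching any minion.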

  1. Download the latest packages for the Sensu backend, Sensu agent, and sensuctl (Sensu CLI) to the Repo server, then run yum clean all on the Sensu Backend server - See Reposerver notes.

  2. If needed, update Salt states to ensure they are up-to-date - Salt Upgrade Notes

    salt sensu* state.sls salt_minion.minion_upgrade --output-diff test=true
    

:warning: Remember to silence Sensu alerts before restarting services
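
A minimal sketch of silencing from the command line, assuming `sensuctl` is already configured against the backend. The function name, the subscription (`entity:example-host`), the 7200-second expiry, and the reason text are placeholders, not values from this environment.

```shell
# sensuctl binary; override with SENSUCTL=echo to preview the command.
SENSUCTL="${SENSUCTL:-sensuctl}"

# Usage: silence_subscription SUBSCRIPTION EXPIRE_SECONDS REASON
# Creates a silencing entry for every check on the given subscription.
silence_subscription() {
    "$SENSUCTL" silenced create \
        --subscription "$1" \
        --expire "$2" \
        --reason "$3"
}

# Example (placeholder values):
# silence_subscription 'entity:example-host' 7200 'Sensu Go upgrade'
```

An expiry is used so that a forgotten silence eventually lifts on its own; pick a duration that covers the whole maintenance window.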

  1. Sensu first: log in to the GC TEST Salt master and stop the Sensu services on the Sensu Backend server. Repeat the same process for GC PROD afterwards.

    date; salt sensu* cmd.run 'systemctl stop sensu-agent'
    date; salt sensu* cmd.run 'systemctl stop sensu-backend'
    
  2. Update Sensu Backend server

    date; salt sensu* cmd.run 'yum clean all && yum makecache fast'
    salt sensu* cmd.run 'yum --disablerepo="*" --enablerepo="msoc" list available'
    date; salt sensu* cmd.run 'yum update -y sensu-go-backend'
    date; salt sensu* cmd.run 'yum update -y sensu-go-cli'
    date; salt sensu* cmd.run 'yum update -y sensu-go-agent'
    date; salt sensu* cmd.run 'systemctl daemon-reload'
    
  3. Start the Sensu services again and check their status

    date; salt sensu* cmd.run 'systemctl start sensu-backend'
    salt sensu* cmd.run 'systemctl start sensu-agent'
    
    date; salt sensu* cmd.run 'systemctl status sensu-backend'
    salt sensu* cmd.run 'systemctl status sensu-agent'
    

    :warning: Did you silence Sensu alerts before restarting services?

  4. GC TEST first, GC PROD second: on the target servers, clean out the yum cache

    # XDR Infrastructure - be sure to note the different Salt minions to target between TEST and PROD
    salt -C '* not ( afs* or bas-* or ca-c19* or dc-c19* or dgi* or doed* or frtib* or la-c19* or ma-* or nga* or vmray* or sensu* )' test.ping --out=txt
    
    salt -C '* not ( afs* or bas-* or ca-c19* or dc-c19* or dgi* or doed* or frtib* or la-c19* or ma-* or nga* or vmray* or sensu* )' cmd.run 'sensu-agent version'
    
    date; salt -C '* not ( afs* or bas-* or ca-c19* or dc-c19* or dgi* or doed* or frtib* or la-c19* or ma-* or nga* or vmray* or sensu* )' cmd.run 'yum clean all && yum makecache fast'
    
    # From target servers; view the available packages
    salt -C '* not ( afs* or bas-* or ca-c19* or dc-c19* or dgi* or doed* or frtib* or la-c19* or ma-* or nga* or vmray* or sensu* )' cmd.run 'yum --disablerepo="*" --enablerepo="msoc" list available'
    
    #LCPs
    salt -C '* not *.local not *.pvt.xdr.accenturefederalcyber.com' test.ping --out=txt
    
    salt -C '* not *.local not *.pvt.xdr.accenturefederalcyber.com' cmd.run 'sensu-agent version'
    
    date; salt -C '* not *.local not *.pvt.xdr.accenturefederalcyber.com' cmd.run 'yum clean all && yum makecache fast'
    
    salt -C '* not *.local not *.pvt.xdr.accenturefederalcyber.com' cmd.run 'yum --disablerepo="*" --enablerepo="msoc" list available'
    
    #Customer Slices
    salt -C 'afs*local or afs*com or bas-*com or ca-c19*com or dc*com or dgi*com or doed-*com or frtib*com or la-*com or ma-*com or nga*com or nga*local' test.ping --out=txt
    
    salt -C 'afs*local or afs*com or bas-*com or ca-c19*com or dc*com or dgi*com or doed-*com or frtib*com or la-*com or ma-*com or nga*com or nga*local' cmd.run 'sensu-agent version'
    
    salt -C 'afs*local or afs*com or bas-*com or ca-c19*com or dc*com or dgi*com or doed-*com or frtib*com or la-*com or ma-*com or nga*com or nga*local' cmd.run 'yum clean all && yum makecache fast'
    
    salt -C 'afs*local or afs*com or bas-*com or ca-c19*com or dc*com or dgi*com or doed-*com or frtib*com or la-*com or ma-*com or nga*com or nga*local' cmd.run 'yum --disablerepo="*" --enablerepo="msoc" list available'
    
    
  5. Stop / update / reload daemon / start the agent on minions, chained as one command: systemctl stop sensu-agent && yum update -y sensu-go-agent && systemctl daemon-reload && systemctl start sensu-agent

    # XDR Infrastructure 
    date; salt -C '* not ( afs* or bas-* or ca-c19* or dc-c19* or dgi* or doed* or frtib* or la-c19* or ma-* or nga* or vmray* or sensu* )' cmd.run 'systemctl stop sensu-agent && yum update -y sensu-go-agent && systemctl daemon-reload && systemctl start sensu-agent'
    
    # LCPs
    date; salt -C '* not *.local not *.pvt.xdr.accenturefederalcyber.com' cmd.run 'systemctl stop sensu-agent && yum update -y sensu-go-agent && systemctl daemon-reload && systemctl start sensu-agent'
    
    # Customer Slices Search Heads Only
    salt -C '*-sh* and not *moose* and not fm-shared-search*' cmd.run 'sensu-agent version'
        
    date; salt -C '*-sh* and not *moose* and not fm-shared-search*' cmd.run 'systemctl stop sensu-agent && yum update -y sensu-go-agent && systemctl daemon-reload && systemctl start sensu-agent'
    
    # Customer Slices Cluster Masters and Heavy Forwarders 
    salt -C '( *splunk-cm* or *splunk-hf* ) not moose*' cmd.run 'sensu-agent version'
        
    date; salt -C '( *splunk-cm* or *splunk-hf* ) not moose*' cmd.run 'systemctl stop sensu-agent && yum update -y sensu-go-agent && systemctl daemon-reload && systemctl start sensu-agent'
    
    # Customer Slices Indexers
    # us-east-1a / us-gov-east-1a
    salt -C '*splunk-i* and ( G@ec2:placement:availability_zone:us-east-1a or G@ec2:placement:availability_zone:us-gov-east-1a ) not moose*' test.ping --out=txt
    
    date; salt -C '*splunk-i* and ( G@ec2:placement:availability_zone:us-east-1a or G@ec2:placement:availability_zone:us-gov-east-1a ) not moose*' cmd.run 'systemctl stop sensu-agent && yum update -y sensu-go-agent && systemctl daemon-reload && systemctl start sensu-agent'
    
    # us-east-1b / us-gov-east-1b
    salt -C '*splunk-i* and ( G@ec2:placement:availability_zone:us-east-1b or G@ec2:placement:availability_zone:us-gov-east-1b ) not moose*' test.ping --out=txt
        
    date; salt -C '*splunk-i* and ( G@ec2:placement:availability_zone:us-east-1b or G@ec2:placement:availability_zone:us-gov-east-1b ) not moose*' cmd.run 'systemctl stop sensu-agent && yum update -y sensu-go-agent && systemctl daemon-reload && systemctl start sensu-agent'
    
    # us-east-1c / us-gov-east-1c
    salt -C '*splunk-i* and ( G@ec2:placement:availability_zone:us-east-1c or G@ec2:placement:availability_zone:us-gov-east-1c ) not moose*' test.ping --out=txt
    
    date; salt -C '*splunk-i* and ( G@ec2:placement:availability_zone:us-east-1c or G@ec2:placement:availability_zone:us-gov-east-1c ) not moose*' cmd.run 'systemctl stop sensu-agent && yum update -y sensu-go-agent && systemctl daemon-reload && systemctl start sensu-agent'
    
    #For VMRAY - Ubuntu
    salt vmray* cmd.run 'sensu-agent version'
    
    salt vmray* cmd.run 'apt list --upgradable' --out=txt
    
    date; salt vmray* cmd.run 'systemctl stop sensu-agent && apt-get --only-upgrade install sensu-go-agent -y && apt autoremove -y' --output-diff
    
    date; salt vmray* cmd.run 'systemctl daemon-reload && systemctl restart sensu-agent' --output-diff
    
    
  6. Verify with this:

    salt '*' cmd.run 'sensu-agent version'
    salt -C '* not salt* not sensu* not jira*' cmd.run 'sensu-agent version'
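
To avoid reading the version output by eye, the verification can be filtered down to the minions that still need attention. This is a sketch only: the function name `outdated_agents` is hypothetical, it assumes the one-line-per-minion `--out=txt` format ("minion-id: output") already used in the steps above, and the version string passed in is a placeholder for whatever release was just deployed.

```shell
# Usage: salt ... cmd.run 'sensu-agent version' --out=txt | outdated_agents EXPECTED_VERSION
# Prints the ID of every minion whose output line does not contain
# EXPECTED_VERSION (this also catches minions that did not return).
outdated_agents() {
    grep -v -F -- "$1" | cut -d: -f1
}

# Example (6.2.5 is a placeholder version):
# salt -C '* not salt* not sensu* not jira*' cmd.run 'sensu-agent version' --out=txt | outdated_agents '6.2.5'
```

An empty result means every targeted minion reported the expected version.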
    

The Vault service often fails to come back up after a reboot; verify that it is running.

The following is borrowed from the Vault Upgrade instructions.

    # Check the status
    salt vault* cmd.run cmd='VAULT_SKIP_VERIFY=1 VAULT_ADDR=https://127.0.0.1 vault status'

    # If you see "connection refused", the Vault service is not running
    salt vault* cmd.run 'systemctl start vault'

    # Check the status
    salt vault* cmd.run cmd='VAULT_SKIP_VERIFY=1 VAULT_ADDR=https://127.0.0.1 vault status'

    vault-1.pvt.xdr.accenturefederalcyber.com:
        Key                      Value
        ---                      -----
        Recovery Seal Type       shamir
        Initialized              true
        Sealed                   false
        Total Recovery Shares    5
        Threshold                2
        Version                  1.9.3
        Storage Type             dynamodb
        Cluster Name             vault-cluster-b6aa0cd0
        Cluster ID               d0d778a9-b123-4a6a-7712-0b99d54f8a00
        HA Enabled               true
        HA Cluster               https://10.40.0.204:443
        HA Mode                  standby
        Active Node Address      https://vault.pvt.xdr.accenturefederalcyber.com
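
The status check above can also be reduced to a pass/fail result by looking at the `Sealed` row of the output. The helper below is a hypothetical sketch (the name `vault_is_unsealed` is not from the Vault instructions); it reads the Salt output on stdin and succeeds only if at least one `Sealed` row is present and every one of them reads `false`.

```shell
# Usage: salt 'vault*' cmd.run cmd='... vault status' | vault_is_unsealed
# Exit status 0: every node reported "Sealed false".
# Exit status 1: a node is sealed, or no status was returned at all
# (for example on "connection refused").
vault_is_unsealed() {
    awk '
        $1 == "Sealed" { seen++; if ($2 != "false") sealed++ }
        END { if (seen > 0 && sealed == 0) exit 0; exit 1 }
    '
}

# Example:
# salt 'vault*' cmd.run cmd='VAULT_SKIP_VERIFY=1 VAULT_ADDR=https://127.0.0.1 vault status' \
#     | vault_is_unsealed && echo "Vault is up and unsealed"
```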

Verify that the Vault UI is up: Vault Prod

:warning: Don't forget to un-silence Sensu.
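
A short sketch for lifting silences, again assuming `sensuctl` is configured against the backend. The function names are hypothetical, and the silence name in the example is a placeholder; Sensu names silencing entries as `subscription:check`, with `*` standing in for "all checks".

```shell
# sensuctl binary; override with SENSUCTL=echo to preview the commands.
SENSUCTL="${SENSUCTL:-sensuctl}"

# Show the silencing entries that are currently in place.
list_silences() {
    "$SENSUCTL" silenced list
}

# Usage: unsilence SILENCE_NAME
# Removes one silencing entry by name.
unsilence() {
    "$SENSUCTL" silenced delete "$1"
}

# Example (placeholder name):
# list_silences
# unsilence 'entity:example-host:*'
```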


Sensu Go caveats


In version 5.16 the default admin password was removed. The backend must now be initialized with sensu-backend init, with the admin username and password supplied as environment variables.

Sen$uP@ssw0rd!

    systemctl start sensu-backend
    export SENSU_BACKEND_CLUSTER_ADMIN_USERNAME=YOUR_USERNAME
    export SENSU_BACKEND_CLUSTER_ADMIN_PASSWORD=YOUR_PASSWORD
    sensu-backend init
    sensuctl create --file filename.json