
Salt Upgrade Notes

Places where code might need to be updated for a new version ( salt.repo )

Salt Upgrade Steps

Always upgrade the salt master first, then the minions
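Before touching any minion, it can help to confirm the master is already at (or above) the version you are about to push. A minimal sketch, assuming the dotted 3XXX.Y version format and GNU `sort -V`:

```shell
# newer_or_equal A B: succeeds if version A >= version B in dotted-numeric order.
newer_or_equal() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n1)" = "$1" ]
}

# Hypothetical use against a live master:
#   master_ver=$(salt-run --version | awk '{print $2}')
#   newer_or_equal "$master_ver" 3004.1 || echo "upgrade the master first"
```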

Dev Steps

Update the pillar in git: salt/pillar/dev/yumrepos.sls

salt 'salt*' cmd.run 'salt-run fileserver.update'
salt 'salt*' cmd.run 'salt-run git_pillar.update'
salt 'salt*' saltutil.refresh_pillar
salt 'salt*' pillar.get yumrepos:salt:version

Update salt master

salt 'salt*' cmd.run 'cat /etc/yum.repos.d/salt.repo'
salt 'salt*' state.sls os_modifications.repo_update_redhat --output-diff test=true
salt 'salt*' cmd.run 'cat /etc/yum.repos.d/salt.repo'
salt 'salt*' cmd.run 'yum clean all ; yum makecache fast'
salt 'salt*' cmd.run 'yum check-update | grep salt'
salt 'salt*' pkg.upgrade name=salt-master   # NOTE: this might upgrade the salt-minion at the same time. 
sudo systemctl start salt-minion
sudo salt 'salt*' state.sls salt_master.salt_posix_acl --output-diff
salt 'salt*' test.version

Update salt minions using minion_upgrade salt state

salt '*' saltutil.refresh_pillar
salt '*' pillar.get yumrepos:salt:version
salt sensu* state.sls salt_minion.minion_upgrade --output-diff test=true
salt sensu* test.version
salt vault* state.sls salt_minion.minion_upgrade --output-diff test=true
salt vault* test.version
# focus on just Redhat first?
salt -G 'os:RedHat' state.sls salt_minion.minion_upgrade --output-diff test=true
salt -G 'os:RedHat' test.version

# then debian based
salt -C '* not G@os:RedHat' state.sls salt_minion.minion_upgrade --output-diff test=true
salt -C '* not G@os:RedHat' test.version

Update salt minions without salt state (when the repo is already up-to-date)

salt sensu* cmd.run 'cat /etc/yum.repos.d/salt.repo'
salt sensu* state.sls os_modifications.repo_update_redhat --output-diff test=true
salt sensu* cmd.run 'cat /etc/yum.repos.d/salt.repo'
salt sensu* cmd.run 'yum clean all ; yum makecache fast'
salt sensu* cmd.run 'yum check-update | grep salt'
# systemd-run --scope detaches the upgrade from the salt-minion's cgroup so it survives the service restart triggered by the upgrade
salt sensu* cmd.run_bg 'systemd-run --scope yum update salt-minion -y && sleep 20 && systemctl daemon-reload && sleep 20 && systemctl start salt-minion'
salt sensu* test.version

Did you miss any?

salt -G saltversion:300X.X test.version
salt -C '* not G@saltversion:300X.X' test.version
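If the grain match misses anything, the raw version output can be filtered instead. A small sketch; the `minion: version` line format of `--out=txt` and the target value are assumptions:

```shell
# stragglers TARGET: reads "minion: version" lines on stdin and prints the
# minions whose reported version does not match TARGET.
stragglers() {
  awk -v t="$1" -F': ' '$2 != t { print $1 }'
}

# Hypothetical use:
#   salt --static --out=txt '*' test.version | stragglers 3004.1
```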

Ensure the vmray /etc/apt/sources.list.d/salt.list shows only one repo.

Repeat for PROD.

Salt Upgrade 3003.3 -> 3004.1

Upgrading the minion first will result in loss of connectivity

Salt Upgrade 3002.6 -> 3003.3

Salt Upgrade 3001.6 -> 3002.6 Notes

Next time try this: salt/fileroots/os_modifications/minion_upgrade.sls (move it to the salt folder or something).

Upgrade the salt master, then the minions.

Did you miss any?

salt -G saltversion:3002.6 test.ping

Repeat for PROD.

Salt Upgrade 3001.2 -> 3001.6 Notes

Places where code might need to be updated for a new version ( salt.repo )

For your reference:

Prep

In the dev environment, the salt minion failed to start up after the upgrade. Might need a cronjob on the LCP nodes.

Ensure the pillar has been updated to the correct version.

salt salt* cmd.run 'salt-run fileserver.update'
salt salt* pillar.get yumrepos:salt:version

Update repo

salt salt* cmd.run 'cat /etc/yum.repos.d/salt.repo'
salt salt* state.sls os_modifications.repo_update --output-diff test=true
salt salt* cmd.run 'cat /etc/yum.repos.d/salt.repo'
salt salt* cmd.run 'yum clean all ; yum makecache fast'
salt salt* cmd.run 'yum check-update | grep salt'
salt salt* pkg.upgrade name=salt-master
sudo salt salt* state.sls salt_master.salt_posix_acl --output-diff

Ack, the minions didn't come back! Stupid salt! Let's try something different:

salt salt* cmd.run 'cat /etc/yum.repos.d/salt.repo'
salt salt* state.sls os_modifications.repo_update --output-diff test=true
salt salt* cmd.run 'cat /etc/yum.repos.d/salt.repo'
salt salt* cmd.run 'yum clean all ; yum makecache fast'
salt salt* cmd.run 'yum check-update | grep salt'
salt salt* cmd.run_bg 'systemd-run --scope yum update salt-minion -y && sleep 240 && systemctl daemon-reload && sleep 20 && systemctl start salt-minion'

Did you miss any? salt -G saltversion:3001.3 test.ping

BAD DNS for the Splunk returner:

requests.packages.urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='moose-hec.xdr.accenturefederalcyber.com', port=8088): Max retries exceeded with url: /services/collector/event (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7fb058f0deb8>: Failed to establish a new connection: [Errno 110] Connection timed out',))

Salt Upgrade 2019 -> 3001 Notes

Places where code might need to be updated for a new version ( salt.repo )

Prep

  • update Pillars yumrepos:salt:version and yumrepos:salt:baseurl

On the master

  • update repo: salt salt* state.sls os_modifications.repo_update --output-diff
  • install gitpython on salt master for py3: pip3 install gitpython
  • salt salt-master* cmd.run 'yum clean all ; yum makecache fast'
  • salt salt* cmd.run 'yum check-update'
  • update salt: salt salt* pkg.upgrade name=salt-master
  • salt salt* state.sls salt_master.salt_posix_acl --output-diff
  • salt salt* cmd.run 'systemctl restart salt-master'
  • salt salt*com state.sls salt_master.salt_master_configs test=true

On the minions

  • update repo: salt salt* state.sls os_modifications.repo_update --output-diff
  • salt salt* cmd.run 'yum clean all ; yum makecache fast'
  • salt salt* cmd.run 'yum check-update'
  • update salt: salt salt* pkg.upgrade name=salt-minion
  • yum install python36-zmq  # might need that for some minions
  • watch 'salt salt* test.ping'
  • salt '*' cmd.run 'pip3 install boto'
  • salt '*' cmd.run 'pip3 install boto3'
  • salt '*' cmd.run 'pip3 install pyinotify'
  • salt '*' saltutil.sync_all
  • salt '*' saltutil.refresh_modules
  • salt '*' grains.get ec2:placement:availability_zone
  • salt '*' grains.get environment
  • RESTART to apply beacon inotify changes: salt '*' service.restart salt-minion
  • salt '*' cmd.run 'tail /var/log/salt/minion'

    salt sensu* pkg.upgrade name=salt-minion
    salt vault*local pkg.upgrade name=salt-minion
    salt moose*local pkg.upgrade name=salt-minion
    salt -C '* not ( moose* or afs* or nga* or ma-* or mo-* or la-* or dc-* or vault* or sensu* or interconnect* or resolver* or salt-master* )' pkg.upgrade name=salt-minion
    salt -C 'resol* or interc*' pkg.upgrade name=salt-minion
    

3001 Upgrade PROBLEMS

salt-call -ldebug --local grains.get ec2_info
salt-call -ldebug --local grains.get ec2_tags

boto and boto3 need to be installed for py3 for the ec2 grains:

pip3 install boto
pip3 install boto3
pip3 list installed | grep boto
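A quick way to confirm the interpreter salt actually uses can see those modules. A minimal sketch; the interpreter path and module list are assumptions:

```shell
# check_imports PYTHON MODULE...: prints "modules ok" only if every listed
# module imports cleanly under the given interpreter.
check_imports() {
  local py="$1"; shift
  "$py" -c "import $(IFS=,; echo "$*")" && echo "modules ok"
}

# Hypothetical use on a minion:
#   check_imports python3 boto boto3
```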

Push out the new grain that was updated for py3; it fixes the ec2:placement:availability_zone grain:

salt *local saltutil.sync_all
salt *com saltutil.sync_all
salt *local grains.get ec2:placement:availability_zone
salt *com grains.get ec2:placement:availability_zone

ISSUE:

[ERROR   ] Returner splunk.returner could not be loaded: 'splunk.returner' is not available.
SOLUTION: manually restart minion

ISSUE:

2020-11-23 18:13:09,719 [salt.beacons     :144 ][WARNING ][15141] Unable to process beacon inotify
cmd.run 'ls -larth /etc/salt/minion.d/beacons.conf'

ISSUE:

requests.packages.urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='iratemoses.mdr.defpoint.com', port=8088): Max retries exceeded with url: /services/collector/event (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f19e76c64a8>: Failed to establish a new connection: [Errno -2] Name or service not known',))

SOLUTION:

IGNORE: this was happening with the previous version of salt and python2.

ISSUE on reposerver:

2020-11-23 19:42:20,061 [salt.state       :328 ][ERROR   ][18267] Cron /usr/local/bin/repomirror-cron.sh for user root failed to commit with error
    "/tmp/__salt.tmp.9b64eos8":1: bad minute
    errors in crontab file, can't install.

SOLUTION:

bad cron file?

ISSUE:

[CRITICAL][1745] Pillar render error: Rendering SLS 'mailrelay' failed
2020-11-23 19:26:11,255 [salt.pillar      :889 ]

[CRITICAL][1745] Rendering SLS 'mailrelay' failed, render error:
Jinja variable 'salt.utils.context.NamespacedDictWrapper object' has no attribute 'ec2'

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/salt/utils/templates.py", line 400, in render_jinja_tmpl
    output = template.render(**decoded_context)
  File "/usr/lib/python3.6/site-packages/jinja2/environment.py", line 989, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/lib/python3.6/site-packages/jinja2/environment.py", line 754, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3.6/site-packages/jinja2/_compat.py", line 37, in reraise
    raise value.with_traceback(tb)
  File "<template>", line 1, in top-level template code
  File "/usr/lib/python3.6/site-packages/jinja2/environment.py", line 389, in getitem
    return obj[argument]
jinja2.exceptions.UndefinedError: 'salt.utils.context.NamespacedDictWrapper object' has no attribute 'ec2'

SOLUTION:

?
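One plausible fix, since the traceback shows the pillar assuming the ec2 grain exists: guard the lookup in the SLS jinja. A sketch only; the exact variable used by mailrelay.sls is an assumption:

```jinja
{# Fall back gracefully when the ec2 grain has not populated yet #}
{% set az = grains.get('ec2', {}).get('placement', {}).get('availability_zone', 'unknown') %}
```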

2019 Upgrade

Jira MSOCI-1164 ticket - Standardize salt version across infrastructure

Done when:

  • All salt minions are running same version (2018)
  • All server minions are pegged to specific version (that can be changed at upgrade time)
  • Remove yum locks for minion

Notes:

  • Packer installs 2019 repo (packer/scripts/add-saltstack-repo.sh & packer/scripts/provision-salt-minion.sh) , then os_modifications ( os_modifications.repo_update ) overwrites the repo with 2018. This leaves the salt minion stuck at the 2019 version without being able to upgrade. 
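The clash above is easy to spot mechanically: two repo files defining the same [section] id. A small sketch, assuming yum-style .repo files:

```shell
# dup_repo_ids DIR: print any [repo-id] section header that appears more than
# once across the .repo files in DIR -- duplicates like [salt-2018.3] are how
# minions ended up pinned to the wrong release.
dup_repo_ids() {
  grep -h '^\[' "$1"/*.repo | sort | uniq -d
}

# Hypothetical use:
#   dup_repo_ids /etc/yum.repos.d
```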

#salt master (two salt repo files)

/etc/yum.repos.d/salt.repo (salt/fileroots/os_modifications/minion_upgrade.sls)

[salt-2018.3]
name=SaltStack 2018.3 Release Channel for Python 2 RHEL/Centos $releasever
baseurl=https://repo.saltstack.com/yum/redhat/7/$basearch/2018.3
failovermethod=priority
enabled=1

/etc/yum.repos.d/salt-2018.3.repo

[salt-2018.3]
name=SaltStack 2018.3 Release Channel for Python 2 RHEL/Centos $releasever
baseurl=https://repo.saltstack.com/yum/redhat/7/$basearch/2018.3
failovermethod=priority
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/saltstack-signing-key, file:///etc/pki/rpm-gpg/centos7-signing-key

#reposerver.msoc.defpoint.local

/etc/yum.repos.d/salt.repo

[salt-2018.3]
name=SaltStack 2018.3 Release Channel for Python 2 RHEL/Centos $releasever
baseurl=https://repo.saltstack.com/yum/redhat/7/$basearch/2018.3
failovermethod=priority
enabled=1
gpgcheck=0

Two repo files in salt, both 2018.3; one has proxy=none, the other doesn't. The salt_rhel.repo is just for RHEL and the other is for CentOS.

  • salt/fileroots/os_modifications/files/salt.repo
    (salt/fileroots/os_modifications/repo_update.sls uses this file and it is actively pushed to CENTOS minions)

  • salt/fileroots/os_modifications/files/salt_rhel.repo
    (salt/fileroots/os_modifications/repo_update.sls uses this file and it is actively pushed to RHEL minions)

  • /etc/yum.repos.d/salt-2018.3.repo
    ( not sure how this file is being pushed. possibly pushed from Chris fixing stuff )

STEPS

  1. remove /etc/yum.repos.d/salt-2018.3.repo from test
    • 1.2 remove yum versionlock in test (if there are any; none found)
    • 1.3 yum clean all ; yum makecache fast
  2. use git to update os_modifications/files/salt_rhel.repo to 2019.2.2 (match salt master)
    • 2.1 use salt + repo to update minion to 2019.2.2
    • 2.5 salt minion cmd.run 'rm -rf /etc/yum.repos.d/salt-2018.3.repo'
    • 2.5.1 salt minion cmd.run 'ls /etc/yum.repos.d/salt*'
    • 2.6 salt salt-master* state.sls os_modifications.repo_update
    • 2.7 salt salt-master* cmd.run 'yum clean all ; yum makecache fast'
    • 2.8 salt minion cmd.run 'yum update salt-minion -y'
    • 2.9 salt minion cmd.run 'yum remove salt-repo -y'
  3. upgrade salt master to 2019.2.3 using repo files as a test
  4. upgrade salt minions to 2019.2.3 using repo files as a test
  5. push to prod.

PROBLEMS:

bastion.msoc.defpoint.local
error: unpacking of archive failed on file /var/log/salt: cpio: lsetfilecon
mailrelay.msoc.defpoint.local
pillar broken

PROD

  1. remove dup repos
    • 1.1 remove /etc/yum.repos.d/salt-2018.3.repo from the environment (looks like it was installed with an RPM)
    • 1.1.1 salt minion cmd.run 'yum remove salt-repo -y' (does not remove the proper salt.repo file)
    • 1.1.2 salt minion cmd.run 'rm -rf /etc/yum.repos.d/salt-2018.3.repo' (just to make sure)
    • 1.2 remove yum versionlocks: yum versionlock list
    • 1.2.1 salt minion cmd.run 'yum versionlock delete salt-minion'
    • 1.2.2 salt minion cmd.run 'yum versionlock delete salt'
    • 1.2.3 salt minion cmd.run 'yum versionlock delete salt-master'
  2. use salt + repo to update master/minion to 2019.2.2
    • 2.1 use git to update os_modifications/files/salt_rhel.repo to 2019.2.2, pinned to the minor release (match TEST) (https://repo.saltstack.com/yum/redhat/$releasever/$basearch/archive/2019.2.2)
    • 2.2 check for the environment grain (needed for the repo_update state file)
    • 2.2.1 salt minion grains.item environment
    • 2.3 salt salt-master* state.sls os_modifications.repo_update
    • 2.4 salt salt-master* cmd.run 'yum clean all ; yum makecache fast'
    • 2.4.5 salt minion cmd.run 'yum check-update | grep salt'
    • 2.5 salt minion cmd.run 'yum update salt-minion -y' OR salt minion pkg.upgrade name=salt-minion OR salt minion pkg.upgrade name=salt-minion fromrepo=salt-2019.2.4
    • 2.6 salt master cmd.run 'yum update salt-master -y'
  3. ensure the salt master and minions are at that minor version.
    • 3.1 salt '*' test.version
  4. upgrade test and prod to 2019.2.3 via repo files to ensure the upgrade process works properly.
  5. fix permissions on the master so non-root users can run salt (or run highstate)
    • 5.1 chmod 700 /etc/salt/master.d/
    • 5.2 then restart the master
  6. never upgrade salt again.
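For reference, a pinned repo file matching step 2.1 might look roughly like this (built from the archive baseurl above and the gpgkey path used elsewhere in these notes; treat it as a sketch, not the deployed file):

```ini
[salt-2019.2.2]
name=SaltStack 2019.2.2 Release Channel for Python 2 RHEL/Centos $releasever
baseurl=https://repo.saltstack.com/yum/redhat/$releasever/$basearch/archive/2019.2.2
failovermethod=priority
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/saltstack-signing-key
```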

PROBLEMS:

  • The pillar depends on a custom grain; the custom grain depends on specific python modules.
  • The moose servers seem to have python module issues.
  • These commands helped fix them (yum-installed python packages vs. pip-installed):

    ERROR: Could not get AWS connection: global name 'boto3' is not defined
    ERROR: ImportError: cannot import name certs
    pip list | grep requests
    yum list installed | grep requests
    sudo pip uninstall requests
    sudo pip uninstall urllib3
    sudo yum install python-urllib3
    sudo yum install python-requests
    pip install boto3 (this installs urllib3 via pip as a dependency!)
    pip install boto
    

slsutil.renderer salt://os_modifications/repo_update.sls
# if the grain is wrong via the salt master but correct with salt-call, restart the minion

salt moose* grains.item environment
cmd.run 'salt-call grains.get environment'
cmd.run 'salt-call -ldebug --local grains.get environment'
cmd.run 'salt-call -lerror --local grains.get environment'

Boto3 issue is actually a urllib3 issue?
pip -V
pip list | grep boto
pip list | grep urllib3

salt-call is different: it connects to python2.
/bin/bash: pip: command not found
salt 'moose*indexer*' cmd.run "salt-call cmd.run 'pip install boto3'"

Resolution steps:

  • Duane will remove /usr/local/bin/pip, which is pointing to python3
  • pip should be at /usr/bin/pip
  • yum --enablerepo=epel -y reinstall python2-pip

To fix, upgrade the urllib3 module:

  1. salt '*.local' cmd.run 'pip install --upgrade urllib3'
  2. restart salt-minion
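To confirm the upgrade actually landed for the interpreter the minion uses, something like this helps (the interpreter path is an assumption):

```shell
# modver PYTHON MODULE: print the module's reported version, or "unknown" if
# it does not expose __version__.
modver() {
  "$1" -c "import $2; print(getattr($2, '__version__', 'unknown'))"
}

# Hypothetical use on a minion:
#   modver /usr/bin/python urllib3
```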

Permissions issue? Run this command as root:
salt salt* state.sls salt_master.salt_posix_acl