
Initial commit.

Fred Damstra 8 years ago
Current commit
98f9b58ab6

+ 95 - 0
README.md

@@ -0,0 +1,95 @@
+# Playbook to Migrate ColdDB to the SplunkCold Filesystem
+Note: The playbooks now support multiple folders at once (see
+`migrate_indexers.yml`), so this README may be slightly out of date. Migrating
+multiple folders should be straightforward if you look at that playbook.
+
+## Ansible Method:
+### Step 1:
+Recommendation: Use Screen so you don't lose your session!
+
+`ansible-playbook install_rsync.yml --extra-vars="target=TARGETS"`
+`ansible-playbook rsync_colddb.yml --extra-vars="target=TARGETS folder=FOLDERNAME"`
+
+Watch progress in another window with:
+`watch --interval 30 'ansible TARGETS --sudo --sudo-user=splunk -m shell -a "du -h --summarize /opt/splunk/var/lib/splunk/FOLDER/colddb /opt/splunk/var/lib/splunkcold/FOLDER/colddb"'`
+
+
+### Step 2:
+Run this search over the year to date: `| tstats count where index=FOLDER by _time span=1d`. Keep this window open for comparison at the end.
+
+On the MN:
+```
+# Enable maintenance mode: 
+sudo -u splunk /opt/splunk/bin/splunk enable maintenance-mode
+# Backup indexes.conf
+sudo -u splunk cp /opt/splunk/etc/master-apps/_cluster/local/indexes.conf{,.20170725}
+# Edit indexes.conf
+sudo -u splunk vi /opt/splunk/etc/master-apps/_cluster/local/indexes.conf
+```
+If it doesn't exist, add the volume:
+```
+[volume:coldvol]
+path = /opt/splunk/var/lib/splunkcold
+```
+
+Modify the index you are working on and add:
+```
+coldPath = volume:coldvol/<indexname>/colddb
+```
+
+DO NOT apply the bundle. DO NOT let anybody /else/ apply the bundle.
+Transfer indexes.conf to the ansible master into 
+`<ansible_home>/os_modifications/roles/splunk_colddb_migration/files/indexes.conf`
+
+On the MN, run:
+`watch sudo -u splunk /opt/splunk/bin/splunk show cluster-status`
+
+### Step 3:
+For each indexer, run from ansible server:
+`ansible-playbook migrate_single_indexer.yml --extra-vars="target=IP folder=defaultdb"`
+**Check the cluster status before moving on to the next indexer! It takes a minute or two after Splunk starts before the indexer is operational again.**
+
+To verify you hit everybody, run:
+`ansible --sudo --sudo-user=splunk TARGETS -m shell -a "ls /opt/splunk/var/lib/splunk/FOLDER/colddb/"`. Every host should return a "No such file or directory" error, because the original `colddb` directory was renamed to `colddb.migrated`.
+
+### Step 4: Disable maintenance mode, apply cluster bundle:
+```
+sudo -u splunk /opt/splunk/bin/splunk show maintenance-mode
+sudo -u splunk /opt/splunk/bin/splunk disable maintenance-mode
+sudo -u splunk /opt/splunk/bin/splunk show cluster-bundle-status
+sudo -u splunk /opt/splunk/bin/splunk validate cluster-bundle
+sudo -u splunk /opt/splunk/bin/splunk show cluster-bundle-status
+sudo -u splunk /opt/splunk/bin/splunk apply cluster-bundle
+```
+
+### Step 5: Clean up the `/opt/splunk/var/lib/splunk/*/colddb.migrated` directories
+For the daring:
+`ansible TARGETS --sudo --sudo-user=splunk -m shell -a 'rm -rfv /opt/splunk/var/lib/splunk/FOLDERNAME/colddb.migrated'`
+
+####################################################################
+## Manual Method (Just for reference, use the ansible method above)
+1) Do a presync to minimize downtime (can be run multiple times before cutover):
+		a. sudo -u splunk mkdir -p /opt/splunk/var/lib/splunkcold/FOLDER/colddb
+		b. sudo -u splunk rsync -avz --delete /opt/splunk/var/lib/splunk/FOLDER/colddb/ /opt/splunk/var/lib/splunkcold/FOLDER/colddb/
+2) Update the master node:
+		a. sudo -u splunk /opt/splunk/bin/splunk enable maintenance-mode 
+		b. cp /opt/splunk/etc/master-apps/_cluster/local/indexes.conf{,.20170725}
+		c. vi /opt/splunk/etc/master-apps/_cluster/local/indexes.conf
+			 i. Add:
+			[volume:coldvol]
+			path = /opt/splunk/var/lib/splunkcold
+			 ii. Then update the coldPath for FOLDER to be volume:coldvol/indexname/colddb
+		d. Do NOT deploy the changes. Make sure EVERYBODY KNOWS, no touching the master node!
+3) On each indexer in turn:
+		a. sudo su - splunk
+		b. /opt/splunk/bin/splunk stop
+		c. rsync -avz --delete /opt/splunk/var/lib/splunk/FOLDER/colddb/ /opt/splunk/var/lib/splunkcold/FOLDER/colddb/
+		d. Manually copy the indexes.conf from the master node to /opt/splunk/etc/slave-apps/_cluster/local/indexes.conf
+		e. mv /opt/splunk/var/lib/splunk/FOLDER/colddb{,.20170725}
+		f. /opt/splunk/bin/splunk btool check
+		g. /opt/splunk/bin/splunk start
+4) After all indexers are completed, run a search:
+| tstats count where index=FOLDER by _time span=1d
+		a. Year to date. There should not be gaps.
+5) If everything checks out, turn off maintenance mode and apply the cluster bundle (if the indexes.conf copied to the indexers matches the master node's exactly, no bundle update will go out).
+
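
Note on the rsync steps in the manual method: the playbooks in this commit pass both paths with a trailing slash, and that detail matters. Without a trailing slash on the source, rsync copies the `colddb` directory itself into the destination and produces `colddb/colddb`. The sketch below only prints the presync command for one folder. `presync_cmd` and the two variables are hypothetical names for illustration and are not part of the commit.

```shell
#!/bin/bash
# Hypothetical helper (not part of this commit): print the presync command
# for one index folder, using the same paths as the README.
SPLUNK_DB=/opt/splunk/var/lib/splunk
SPLUNK_COLD=/opt/splunk/var/lib/splunkcold

presync_cmd() {
  local folder="$1"
  # The trailing slashes make rsync copy the *contents* of colddb into the
  # destination colddb instead of creating colddb/colddb.
  printf 'rsync -avz --delete %s/%s/colddb/ %s/%s/colddb/\n' \
    "$SPLUNK_DB" "$folder" "$SPLUNK_COLD" "$folder"
}

presync_cmd defaultdb
```

Printing the command first gives a chance to review the paths before running it as the splunk user.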

+ 1 - 0
files/.gitignore

@@ -0,0 +1 @@
+indexes.conf

+ 4 - 0
files/README.md

@@ -0,0 +1,4 @@
+# Instructions
+You will need to manually copy an indexes.conf into this folder
+for distribution to all peers.
+

+ 2 - 0
tasks/.gitignore

@@ -0,0 +1,2 @@
+batch.json
+

+ 5 - 0
tasks/example.json

@@ -0,0 +1,5 @@
+{ 
+  "target": "MyIndexers",
+  "delay": 90, 
+  "folders": ["colddbfolder1", "colddbfolder2"]
+}
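
Note on `tasks/example.json`: `tasks/.gitignore` excludes `batch.json`, which suggests the example is meant to be copied to `batch.json` and edited. As a rough sketch, the same structure could be generated from the shell. `make_batch_json` is a hypothetical helper, and it assumes target and folder names contain no characters that need JSON escaping.

```shell
#!/bin/bash
# Hypothetical helper (not part of this commit): emit the extra-vars JSON
# with the "target", "delay", and "folders" keys used by example.json.
# Assumption: names contain no quotes or backslashes, so no escaping is done.
make_batch_json() {
  local target="$1" delay="$2"
  shift 2
  local folders="" f
  for f in "$@"; do
    # Prepend ", " only when the list already has an entry.
    folders+="${folders:+, }\"$f\""
  done
  printf '{"target": "%s", "delay": %s, "folders": [%s]}\n' \
    "$target" "$delay" "$folders"
}

make_batch_json MyIndexers 90 colddbfolder1 colddbfolder2
```

Redirecting the output to `batch.json` would allow a run such as `ansible-playbook migrate_indexers.yml --extra-vars "@batch.json"`, which is the file form described in that playbook's header comment.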

+ 11 - 0
tasks/install_rsync.yml

@@ -0,0 +1,11 @@
+---
+- hosts: "{{ target }}"
+  become: true
+
+  tasks:
+  # Verify rsync is installed
+  - name: Ensure rsync is installed
+    package:
+      name: rsync
+      state: latest
+

+ 99 - 0
tasks/migrate_indexers.yml

@@ -0,0 +1,99 @@
+---
+# Perform actual migration of the servers
+#
+# PREREQUISITES:
+#       1) An initial rsync should have been performed (see rsync_colddb)
+#       2) Cluster should be in maintenance mode
+#
+# Specify extra vars for "target", "delay", and "folders", where
+# folders should be an array.
+#
+# You can do this on the command-line via JSON:
+#   ansible-playbook migrate_indexers.yml \
+#      --extra-vars='{"target": "10.10.10.10", "delay": 30, "folders": ["folder1", "folder2"]}'
+#
+# Or put that json in a file and do:
+#   ansible-playbook migrate_indexers.yml --extra-vars "@filename.json"
+#
+- hosts: "{{ target }}"
+  become: true
+  become_user: splunk
+  serial: 1
+
+  tasks:
+  # Verify variables
+  - name: Variable check - folders
+    fail: msg="Variable 'folders' is not defined or is invalid. Please see playbook for how to configure extra-vars."
+    when: (folders is not defined)
+
+  - name: Variable check - delay
+    fail: msg="Variable 'delay' is not defined or is invalid. Please see playbook for how to configure extra-vars."
+    when: (delay is not defined)
+
+  # Verify folder exists
+  - name: Ensure folder already exists
+    stat:
+      path: /opt/splunk/var/lib/splunkcold/{{ item }}/colddb
+    register: colddbpath
+    with_items: "{{ folders }}"
+
+  - name: Fail if the folder doesn't exist
+    fail: msg="One of the colddb folders does not exist."
+    when: not(item.stat.isdir is defined and item.stat.isdir)
+    with_items: "{{ colddbpath.results }}"
+
+  - debug:
+      msg: "Cold Paths exist. Good."
+    when: item.stat.isdir is defined and item.stat.isdir
+    with_items: "{{ colddbpath.results }}"
+   
+  # Verify migrated folder does not exist (DO NOT RUN TWICE!)
+  - name: Ensure migrated folder does not exist
+    stat:
+      path: /opt/splunk/var/lib/splunk/{{ item }}/colddb.migrated
+    register: colddbmigratedpath
+    with_items: "{{ folders }}"
+
+  - name: Fail if the migrated folder exists
+    fail: msg="One of the migrated folders already exists. (Already run on this index?)"
+    when: item.stat.isdir is defined and item.stat.isdir
+    with_items: "{{ colddbmigratedpath.results }}"
+
+  - debug:
+      msg: "Migrated Cold Paths do not exist. Good."
+    when: not(item.stat.isdir is defined and item.stat.isdir)
+    with_items: "{{ colddbmigratedpath.results }}"
+   
+  # Stop Splunk
+  - name: Stop Splunk
+    command: /opt/splunk/bin/splunk stop
+ 
+  - name: rsync cold data
+    command: rsync -avz --delete /opt/splunk/var/lib/splunk/{{ item }}/colddb/ /opt/splunk/var/lib/splunkcold/{{ item }}/colddb/
+    # Optional: run asynchronously for up to one week (604800 s), polling every 60 s.
+#    async: 604800
+#    poll: 60
+    register: rsync_result
+    with_items: "{{ folders }}"
+
+  - name: overwrite indexes.conf
+    copy:
+      src: ../files/indexes.conf
+      dest: /opt/splunk/etc/slave-apps/_cluster/local/indexes.conf
+      owner: splunk
+      group: splunk
+      mode: 0600
+
+  - name: Rename Colddb path
+    command: mv /opt/splunk/var/lib/splunk/{{ item }}/colddb /opt/splunk/var/lib/splunk/{{ item }}/colddb.migrated
+    with_items: "{{ folders }}"
+
+  - name: Btool Check for Good Measure
+    command: /opt/splunk/bin/splunk btool check
+
+  - name: start splunk
+    command: /opt/splunk/bin/splunk start
+
+  - name: sleep after
+    command: sleep {{delay}}
+

+ 80 - 0
tasks/migrate_single_indexer.yml

@@ -0,0 +1,80 @@
+---
+# Perform actual migration of a server.
+# PREREQUISITES:
+#       1) An initial rsync should have been performed (see rsync_colddb)
+#       2) Cluster should be in maintenance mode
+#
+# Specify extra vars for both "target" and "folder"
+# 
+# e.g.:
+#   ansible-playbook migrate_single_indexer.yml --extra-vars="target=10.10.10.10 folder=defaultdb"
+- hosts: "{{ target }}"
+  become: true
+  become_user: splunk
+  serial: 1
+
+  tasks:
+  # Verify folder is defined
+  - name: Variable check
+    fail: msg="Variable 'folder' is not defined or is invalid. Please run with --extra-vars=\"target=x folder=dbfolder\""
+    when: (folder is not defined)
+
+  # Verify folder exists
+  - name: Ensure folder already exists
+    stat:
+      path: /opt/splunk/var/lib/splunkcold/{{ folder }}/colddb
+    register: colddbpath
+
+  - name: Fail if the folder doesn't exist
+    fail: msg="The colddb folder does not exist."
+    when: not(colddbpath.stat.isdir is defined and colddbpath.stat.isdir)
+
+  - debug:
+      msg: "Cold Path exists. Good."
+    when: colddbpath.stat.isdir is defined and colddbpath.stat.isdir
+   
+  # Verify migrated folder does not exist (DO NOT RUN TWICE!)
+  - name: Ensure migrated folder does not exist
+    stat:
+      path: /opt/splunk/var/lib/splunk/{{ folder }}/colddb.migrated
+    register: colddbmigratedpath
+
+  - name: Fail if the migrated folder exists
+    fail: msg="The migrated folder already exists. (Already run on this index?)"
+    when: colddbmigratedpath.stat.isdir is defined and colddbmigratedpath.stat.isdir
+
+  - debug:
+      msg: "Migrated Cold Path does not exist. Good."
+    when: not(colddbmigratedpath.stat.isdir is defined and colddbmigratedpath.stat.isdir)
+   
+  # Stop Splunk
+  - name: Stop Splunk
+    command: /opt/splunk/bin/splunk stop
+ 
+  - name: rsync cold data
+    command: rsync -avz --delete /opt/splunk/var/lib/splunk/{{ folder }}/colddb/ /opt/splunk/var/lib/splunkcold/{{ folder }}/colddb/
+    # Optional: run asynchronously for up to one week (604800 s), polling every 60 s.
+#    async: 604800
+#    poll: 60
+    register: rsync_result
+
+  - name: overwrite indexes.conf
+    copy:
+      src: ../files/indexes.conf
+      dest: /opt/splunk/etc/slave-apps/_cluster/local/indexes.conf
+      owner: splunk
+      group: splunk
+      mode: 0600
+
+  - name: Rename Colddb path
+    command: mv /opt/splunk/var/lib/splunk/{{ folder }}/colddb /opt/splunk/var/lib/splunk/{{ folder }}/colddb.migrated
+
+  - name: Btool Check for Good Measure
+    command: /opt/splunk/bin/splunk btool check
+  
+  - name: start splunk
+    command: /opt/splunk/bin/splunk start
+
+  - name: sleep after
+    command: sleep 60
+
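
Note on the pre-flight tasks in `tasks/migrate_single_indexer.yml`: the two `stat`/`fail` pairs check that the new cold path already exists and that no `colddb.migrated` directory is left over from an earlier run. For anyone following the manual method, roughly the same checks could be done in plain bash. This is a sketch only. `preflight` is a hypothetical function, and the base directories are passed as arguments so that it can be tried against scratch directories first.

```shell
#!/bin/bash
# Hypothetical stand-alone version of the playbook's two pre-flight checks.
# Arguments: <db base dir> <cold base dir> <folder>
preflight() {
  local db_base="$1" cold_base="$2" folder="$3"
  # Check 1: the presync must already have created the new cold path.
  if [[ ! -d "$cold_base/$folder/colddb" ]]; then
    echo "The colddb folder does not exist." >&2
    return 1
  fi
  # Check 2: a leftover colddb.migrated means this folder was already migrated.
  if [[ -d "$db_base/$folder/colddb.migrated" ]]; then
    echo "The migrated folder already exists. (Already run on this index?)" >&2
    return 2
  fi
  echo "Pre-flight checks passed for $folder."
}
```

On an indexer this would be called as `preflight /opt/splunk/var/lib/splunk /opt/splunk/var/lib/splunkcold defaultdb`. A non-zero return value means Splunk should not be stopped yet.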

+ 34 - 0
tasks/rsync_colddb.yml

@@ -0,0 +1,34 @@
+---
+# Synchronize a folder to the new colddb path.
+# Specify extra vars for both "target" and "folder"
+# 
+# e.g.:
+#   ansible-playbook rsync_colddb.yml --extra-vars="target=AWS-Indexers folder=defaultdb"
+- hosts: "{{ target }}"
+  become: true
+  become_user: splunk
+
+  tasks:
+  # Verify folder is defined
+  - name: Variable check
+    fail: msg="Variable 'folder' is not defined or is invalid. Please run with --extra-vars=\"target=x folder=dbfolder\""
+    when: (folder is not defined)
+
+  # Verify folder exists
+  - name: Ensure folder exists
+    file:
+      path: /opt/splunk/var/lib/splunkcold/{{ folder }}/colddb
+      state: directory
+      mode: 0750
+    
+  - name: rsync cold data
+    command: rsync -avz --delete /opt/splunk/var/lib/splunk/{{ folder }}/colddb/ /opt/splunk/var/lib/splunkcold/{{ folder }}/colddb/
+    # Optional: run asynchronously for up to one week (604800 s), polling every 60 s.
+#    async: 604800
+#    poll: 60
+    register: rsync_result
+
+#  - debug: msg="{{ rsync_result.stdout}}"
+#  - debug: msg="{{ rsync_result.stderr}}"
+
+

+ 36 - 0
tasks/rsync_until_completed.sh

@@ -0,0 +1,36 @@
+#! /bin/bash
+#
+# rsync takes a long time and can timeout, so repeat
+# until it's done.
+# 
+# Recommendation: in another window/screen session, run the
+# "watch" command that this script prints when it starts.
+FOLDERS=(did-it-aws elis2-aws-pii)
+TARGET=AWS-Indexers
+SLEEPTIME=300 # Time to rest between retries
+FORKS=10 # Maximum number of parallel actions
+
+function join_by { local IFS="$1"; shift; echo "$*"; } # joins strings
+FOLDERS_SHELL=\{$(join_by , ${FOLDERS[@]})\}
+
+echo Recommendation: In another window/screen session, run:
+echo 	watch --interval=30 \"ansible $TARGET --sudo --sudo-user=splunk -m shell -a \'du --summarize -h /opt/splunk/var/lib/\{splunk,splunkcold\}/$FOLDERS_SHELL/colddb\*\'\"
+
+# Store our result
+result=1
+while [[ $result -ne 0 ]] 
+do
+  result=0
+  for folder in "${FOLDERS[@]}"
+  do
+    echo Synchronizing ${folder}...
+    time ansible-playbook rsync_colddb.yml --forks=$FORKS --extra-vars="target=$TARGET folder=${folder}"
+    result=$((result+$?))
+  done
+  if [[ $result -ne 0 ]]
+  then
+    echo Not finished. Resting for $SLEEPTIME seconds...
+    sleep $SLEEPTIME
+  fi
+done
+
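
Note on the loop in `tasks/rsync_until_completed.sh`: the script reruns the playbook for every folder until one full pass finishes with all exit codes at zero. The retry idea on its own can be sketched as a small function. `retry_until_success` is a hypothetical name, and unlike the script it retries one command rather than a whole list of folders.

```shell
#!/bin/bash
# Hypothetical generalisation of the retry loop: rerun a command until it
# exits 0, resting for a fixed number of seconds between attempts.
# Usage: retry_until_success <sleep seconds> <command> [args...]
retry_until_success() {
  local sleeptime="$1"
  shift
  local attempts=0
  until "$@"; do
    attempts=$((attempts + 1))
    echo "Not finished (attempt $attempts failed). Resting for $sleeptime seconds..." >&2
    sleep "$sleeptime"
  done
}
```

With the script's settings, a call could look like `retry_until_success 300 ansible-playbook rsync_colddb.yml --forks=10 --extra-vars="target=AWS-Indexers folder=did-it-aws"`. The function has no upper bound on attempts, the same as the original loop, so a permanently failing host keeps it running until interrupted.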