# Terragrunt Notes

aka "how to develop the terraform 12+ stuff"

## Local cache of providers

NOTE: This doesn't work well with provider locking in TF14+. I recommend you disable it if you've enabled it.

~~Helpful tip: speed up provider downloads with a plugin cache by adding the following to your `~/.bashrc`:~~

~~`export TF_PLUGIN_CACHE_DIR=~/.terraform.d/plugin-cache`~~

~~`[[ -d "$TF_PLUGIN_CACHE_DIR" ]] || mkdir -p "$TF_PLUGIN_CACHE_DIR"`~~

## Renaming Directories/Resources

General process:

1. Make sure everything's up to date.
1. Move the remote state.
1. Update the configuration.
1. Rename the directory.
1. Make sure terragrunt applies cleanly (the apply updates all the tags, so there are lots of changes to review).

For this example, I was renaming `010-standard-vpc` to `010-vpc-splunk` in `test/aws-us-gov/mdr-test-modelclient`.

```
cd 010-standard-vpc/

# clear out cache to make our lives easier
rm -rf .terragrunt-cache

# validate that we're on latest code
terragrunt-local apply

# Get the `bucket` and `key` values
cat `find . -name 'backend.tf'`

# In this example:
#   bucket = "afsxdr-terraform-state"
#   key    = "aws/test/aws-us-gov/mdr-test-modelclient/010-standard-vpc/terraform.tfstate"
aws --profile mdr-common-services-gov \
  s3 mv \
  s3://afsxdr-terraform-state/aws/test/aws-us-gov/mdr-test-modelclient/010-standard-vpc/terraform.tfstate \
  s3://afsxdr-terraform-state/aws/test/aws-us-gov/mdr-test-modelclient/010-vpc-splunk/terraform.tfstate

# move and rename
cd ..
git mv 010-standard-vpc 010-vpc-splunk
cd 010-vpc-splunk

# Apply again. NOTE: The only changes should be to the tags.
# Do not accept any other changes, or you will have extra resources.
rm -rf .terragrunt-cache
terragrunt-local apply
```

If you get:

```
Error refreshing state: state data in S3 does not have the expected content.
```

You forgot to rename the directory you're working in.

## GitFlow Notes

These notes will walk you through the Terragrunt git flow for making changes.

- Create your branch from the master branch.
- Update your local `xdr-terraform-live` repo with the expected new tag (so you don't forget to do it when you are done).
- Make changes to `xdr-terraform-modules`.
- Make changes to `xdr-terraform-live`.
- Increment the `ref=v0.x.x` in your `terragrunt.hcl`.
- Use `terragrunt-local` to try the changes.
- (Did you run the SAML command to log in?)
- Use `tgswitch` to change versions.
- Run `rm -rf .terragrunt-cache` to resolve "strange" errors.
- Commit the changes to your branch.
- Push the new branch to GitHub.
- Get the PR approved and merged.
- Tag master with the latest tag that is set in `terragrunt.hcl`.
- Verify it is working in TEST without `terragrunt-local`.
- Deploy to PROD.
- Delete the GitHub branch and close the Jira ticket.

## Destroy instances

```
TF_VAR_instance_termination_protection=false terragrunt apply
TF_VAR_instance_termination_protection=false terragrunt destroy
```

## tfswitch.toml

colby-williams taught me: use `cp -ar` to copy symlinks correctly.

    ln -s ../../../../.tfswitch.toml .
    ls -larth
    .tfswitch.toml -> ../../../../.tfswitch.toml

### 2021-04-29: State Issues

When running `terragrunt apply`, I got the following:

```
Initializing the backend...

Error refreshing state: state data in S3 does not have the expected content.

This may be caused by unusually long delays in S3 processing a previous state
update. Please wait for a minute or two and try again. If this problem
persists, and neither S3 nor DynamoDB are experiencing an outage, you may need
to manually verify the remote state and update the Digest value stored in the
DynamoDB table to the following value: ec9c9183a070f5ad59b9abd524810c06
```

The remote state looks uncorrupted:

```
cd ~/xdr-terraform-live/prod/aws-us-gov/mdr-prod-c2/160-splunk-indexer-cluster
find .terragrunt-cache -name 'backend.tf'

# Use the filename found and view the contents
cat .terragrunt-cache/tC_aGEvkrKzsZjSw0YQum-A6YL8/Ipji28Trjy_fymLhd4EZgtAe8xg/base/splunk_servers/indexer_cluster/backend.tf

# Use the bucket and key to form the s3 path, then download the state:
aws --profile mdr-common-services-gov s3 cp s3://afsxdr-terraform-state/aws/prod/aws-us-gov/mdr-prod-c2/160-splunk-indexer-cluster/terraform.tfstate .
less -iS terraform.tfstate
```

To fix:

1. In the AWS console, log into the `mdr-common-services-gov` account and open the DynamoDB service.
2. Go to Tables -> Items.
3. Change the dropdown to 'Query'.
4. Into `LockID=`, enter `afsxdr-terraform-state/aws/prod/aws-us-gov/mdr-prod-c2/160-splunk-indexer-cluster/terraform.tfstate-md5` (the bucket and key from above, with `-md5` appended).
5. Record the old digest: `9cb9cbfddaf100cfa8ae92ec79236175`
6. Replace it with the digest from the error message: `ec9c9183a070f5ad59b9abd524810c06`
7. Run `terragrunt refresh`.

## TF 0.14 / The State lock File

With tf14, terraform added the creation of a 'provider state lock file' (`.terraform.lock.hcl`) to prevent inadvertent drift of provider versions. This requires some additional management.

* On the first run of a module, create the provider lock file for multiple platforms by running `terragrunt-providers` (which is just a bash script that runs some cleanup and then runs `terragrunt providers lock -platform=darwin_amd64 -platform=linux_amd64 -platform=windows_amd64 -platform=linux_arm64`).
* If you need an extra provider, you should override the generation of `required_providers.tf` in your `terragrunt.hcl` file for the module.
  This override must include the providers from the root `terragrunt.hcl` that are used within your module. For an example, see `xdr-terraform-live/common/aws-us-gov/afs-mdr-common-services-gov/085-codebuild-ecr-customer-portal/terragrunt.hcl`.
* To regenerate or upgrade the locked providers, I guess you just delete the lock file?
* There are possible compatibility issues with `TF_PLUGIN_CACHE_DIR`. You can try disabling it if you have trouble getting hashes.

## Could not load plugin

If you get:

```
Substituting 'git@github.xdr.accenturefederalcyber.com:mdr-engineering/xdr-terraform-modules.git//base/sensu-configuration' with '../../../../../xdr-terraform-modules//base/sensu-configuration'
Acquiring state lock. This may take a few moments...
Releasing state lock. This may take a few moments...
╷
│ Error: Could not load plugin
│
│
│ Plugin reinitialization required. Please run "terraform init".
│
│ Plugins are external binaries that Terraform uses to access and manipulate
│ resources. The configuration provided requires plugins which can't be
│ located,
│ don't satisfy the version constraints, or are otherwise incompatible.
```

It means that you've run the module before with an earlier version of the plugin. To fix, run:

```
terragrunt init --upgrade
```

(Or `terragrunt-local` if appropriate.)

# Static Code Analysis

We can do some good static code analysis with some standard tools. On OS X, run:

```
brew install tflint tfsec checkov
```

These can be enabled/enforced during terragrunt runs via `xdr-terraform-live/terragrunt.hcl` in the section labeled `Apply Static Code Analysis`, which should be somewhat self-explanatory.

## Ignoring Findings

You can easily ignore findings from tflint, tfsec, or checkov by adding comments to the code.

### Terragrunt formatting

`terragrunt hclfmt`

This command will make changes to your files!

### tflint

Run these commands in the module folders.

`terraform fmt`

This command will make changes to your files!
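
Because `terraform fmt` rewrites files in place, a check-only pass can be handy before committing. The following is a minimal sketch, not part of the repo tooling: `fmt_check` and the `TERRAFORM_BIN` override are hypothetical names, and it assumes the standard `-check`, `-diff`, and `-recursive` flags of `terraform fmt`.

```shell
# Hypothetical helper: report formatting problems without modifying files.
# TERRAFORM_BIN is an assumed override for picking a specific terraform binary.
fmt_check() {
  local dir="${1:-.}"
  # -check: exit non-zero if any file needs formatting (nothing is written)
  # -diff: show what would change
  # -recursive: include subdirectories
  "${TERRAFORM_BIN:-terraform}" fmt -check -diff -recursive "$dir"
}
```

Run it from a module folder, for example `fmt_check .`; a non-zero exit status means at least one file would be reformatted.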

To ignore a finding from tflint, add a comment like the following (this one is from `xdr-terraform-modules/base/splunk_servers/indexer_cluster/elb-with-acks.tf`):

```
# tflint-ignore: aws_elb_invalid_subnet - Incorrectly errors out that these are invalid
```

### tfsec

Run these commands in the module folders.

`tfsec .`

For tfsec, look for the finding ID and add a comment with that ID after the final colon:

```
# tfsec:ignore:
```

This can be added before the line, at the end of the line, or before the module.

To ignore a finding globally, edit `xdr-terraform-live/terragrunt.hcl` and add the ID to the `ignored_tfsec` local variable.

### checkov

Run these commands in the module folders.

`checkov --framework terraform --quiet -d .`

`checkov --framework terraform --quiet -f filename1.tf -f filename2.tf`

For `checkov`, look for the check ID and add a comment:

```
#checkov:skip=CKV_AWS_[:optional comment]
```

For clarity, these should be added as close as possible to the affected resource or element.
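
The analyzer invocations from this section can be combined into one pass per module folder. This is a rough sketch under assumptions: `run_static_analysis` and the `DRY_RUN` switch are hypothetical names, not something defined in `xdr-terraform-live`, and the three tools are assumed to be on your `PATH`.

```shell
# Hypothetical helper: run tflint, tfsec, and checkov against one module folder.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_static_analysis() {
  local dir="${1:-.}"
  local status=0
  local cmd
  for cmd in "tflint" "tfsec ." "checkov --framework terraform --quiet -d ."; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "(cd $dir && $cmd)"
      continue
    fi
    # Run in a subshell so the caller's working directory is untouched.
    # $cmd is intentionally unquoted so it splits into command and arguments.
    ( cd "$dir" && $cmd ) || status=1
  done
  # Non-zero if any analyzer reported findings or failed to run.
  return "$status"
}
```

For example, `DRY_RUN=1 run_static_analysis base/splunk_servers/indexer_cluster` shows what would run, and dropping `DRY_RUN` executes all three tools even if an earlier one reports findings.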