Let's say you have your application template and a team of five people working on it. One Monday morning you decide to change a minor thing, such as the security group, and at the same time your colleague, sitting in the room next to you, decides to change the disk size for instances. Each confident that you are the only one running the terraform apply command at this moment, you both run terraform apply, push the changed state file to the git repository (or to remote storage like S3), and end up in a total disaster.
If your state file is stored in git, then you will hit a merge conflict: not too bad, you can try to resolve it by hand, and you can still see who changed what. If you use a remote backend for the state file, then things go south. Which state file is now inside the remote storage? And where did the changes from the other Terraform run go?
It is dangerous to work on the same state file in a team because there is no locking out of the box. You could pay for Atlas, which gives you this feature, but what if you don't want to pay for Atlas, for many obvious reasons? Well, there are a few (if not many) solutions to this problem.
The first one that we will take a look at is Terragrunt, yet another open source CLI wrapper for Terraform, hosted on GitHub. Terragrunt is a thin wrapper that supports locking for the Terraform state and enforces best practices. It solves two problems: it locks the state file so that two runs cannot modify it at the same time, and it takes care of configuring remote storage for the state.
Locking in Terragrunt is provided via DynamoDB, a NoSQL database service from AWS. Let's go ahead and install it.
Grab the latest version from GitHub Releases at https://github.com/gruntwork-io/terragrunt/releases and make it available in your $PATH. On Mac, you can install Terragrunt by running brew install terragrunt.
To start using it, create a .terragrunt file in the packt-terraform repository. This file uses the already familiar HCL (the same language Terraform templates are written in). It is needed to configure remote storage for your state file and locking. As we already have remote storage configured, carefully reuse the existing configuration inside this new file:
    lock = {
      backend = "dynamodb"
      config {
        state_file_id = "mighty_trousers"
      }
    }
    remote_state = {
      backend = "s3"
      config {
        bucket = "packt-terraform"
        key    = "mighty_trousers/terraform.tfstate"
        region = "eu-central-1"
      }
    }
Instead of using the terraform commands, you should use terragrunt now: terragrunt get/plan/apply/destroy/output. You will notice the difference when you run apply for the first time:
    [terragrunt] 2016/12/01 09:52:58 Reading Terragrunt config file at .terragrunt
    [terragrunt] 2016/12/01 09:52:58 Remote state is already configured for backend s3
    [terragrunt] 2016/12/01 09:52:58 Attempting to acquire lock for state file mighty_trousers in DynamoDB
    [terragrunt] 2016/12/01 09:52:58 Lock table terragrunt_locks does not exist in DynamoDB. Will need to create it just this first time.
    [terragrunt] 2016/12/01 09:52:58 Creating table terragrunt_locks in DynamoDB
    [terragrunt] 2016/12/01 09:52:59 Table terragrunt_locks is not yet in active state. Will check again after 10s.
    [terragrunt] 2016/12/01 09:53:09 Success! Table terragrunt_locks is now in active state.
    [terragrunt] 2016/12/01 09:53:09 Attempting to create lock item for state file mighty_trousers in DynamoDB table terragrunt_locks
    [terragrunt] 2016/12/01 09:53:10 Lock acquired!
Let's run an experiment and execute the terragrunt apply command twice. Open a new tab in your terminal, start the terragrunt apply command in the first one, then switch to the second one, and start it again. You won't be able to proceed because the first tab has already acquired a lock for the state file:
    [terragrunt] 2016/12/01 09:54:37 Reading Terragrunt config file at .terragrunt
    [terragrunt] 2016/12/01 09:54:37 Remote state is already configured for backend s3
    [terragrunt] 2016/12/01 09:54:37 Attempting to acquire lock for state file mighty_trousers in DynamoDB
    [terragrunt] 2016/12/01 09:54:38 Attempting to create lock item for state file mighty_trousers in DynamoDB table terragrunt_locks
    [terragrunt] 2016/12/01 09:54:39 Someone already has a lock on state file mighty_trousers! [email protected] acquired the lock on 2016-12-01 08:53:10.237944107 +0000 UTC.
    [terragrunt] 2016/12/01 09:54:39 Will try to acquire lock again in 10s.
This is a real game changer for your Terraform operations: no more conflicts, much more predictability! The locking in Terragrunt works by creating a new table in DynamoDB (if one does not exist yet) and then a new item in this table, named after the value of the state_file_id option. This item contains useful metadata about the lock, such as who created it and when. After the Terraform run is finished, the item is removed from the table, making the state file available for modifications again.
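The lock lifecycle described above (create an item only if it does not exist, record who holds it, delete it when done) can be sketched in a few lines of shell. This is not how Terragrunt itself is implemented: a local directory stands in for the DynamoDB item, and LOCK_ROOT, acquire_lock, and release_lock are hypothetical names chosen for this illustration only.

```shell
#!/bin/sh
# Sketch only: a local directory plays the role of the lock item.
# LOCK_ROOT, acquire_lock and release_lock are made-up names.
LOCK_ROOT="${LOCK_ROOT:-${TMPDIR:-/tmp}/tf_locks}"

# acquire_lock STATE_FILE_ID
# mkdir fails when the directory already exists, so only one caller
# can win, similar in spirit to a "create if absent" write.
acquire_lock() {
    [ -n "$1" ] || return 2
    mkdir -p "$LOCK_ROOT" || return 1
    if mkdir "$LOCK_ROOT/$1" 2>/dev/null; then
        # Record who took the lock and when (the lock metadata).
        printf '%s@%s %s\n' "$(id -un)" "$(uname -n)" \
            "$(date -u +%Y-%m-%dT%H:%M:%SZ)" > "$LOCK_ROOT/$1/owner"
        echo "Lock acquired!"
    else
        echo "Someone already has a lock on state file $1: $(cat "$LOCK_ROOT/$1/owner" 2>/dev/null)"
        return 1
    fi
}

# release_lock STATE_FILE_ID
# Removing the directory makes the state file available again.
release_lock() {
    [ -n "$1" ] || return 2
    rm -rf "${LOCK_ROOT:?}/$1"
}
```

Calling acquire_lock mighty_trousers twice in a row succeeds the first time and fails the second time until release_lock mighty_trousers is called. A local directory protects only processes on one machine; a team needs a shared store that offers the same create-if-absent guarantee.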
The process of locking is not really complicated. Terragrunt supports DynamoDB as a backend, but you could implement a similar solution yourself. For example, you could create a Makefile that wraps around Terraform the same way Terragrunt does and implement locking with a simple file on S3 or in some other way that is convenient for your organization. You would actually need to do this if you do not rely on AWS as your infrastructure provider.
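A home-grown wrapper of this kind could look roughly like the following shell function, which a Makefile target might call. It is a sketch under loud assumptions: a local lock file stands in for the file on S3, and run_locked, LOCK_FILE, RETRY_SECONDS, and MAX_TRIES are hypothetical names. A real shared store would have to offer an equivalent "create only if absent" operation for this to be safe.

```shell
#!/bin/sh
# Sketch only: run any command while holding a lock file.
# run_locked, LOCK_FILE, RETRY_SECONDS and MAX_TRIES are made-up names.
LOCK_FILE="${LOCK_FILE:-${TMPDIR:-/tmp}/mighty_trousers.lock}"
RETRY_SECONDS="${RETRY_SECONDS:-10}"
MAX_TRIES="${MAX_TRIES:-30}"

# run_locked COMMAND [ARGS...]
run_locked() {
    tries=0
    # With noclobber (set -C), ">" refuses to overwrite an existing
    # file, so the redirection succeeds for one caller at a time.
    until ( set -C; echo "$(id -un) $(date -u +%Y-%m-%dT%H:%M:%SZ)" > "$LOCK_FILE" ) 2>/dev/null; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$MAX_TRIES" ]; then
            echo "Giving up; lock held by: $(cat "$LOCK_FILE" 2>/dev/null)" >&2
            return 1
        fi
        echo "Lock is taken; will try again in ${RETRY_SECONDS}s." >&2
        sleep "$RETRY_SECONDS"
    done

    # Run the wrapped command, then release the lock whatever the outcome.
    status=0
    "$@" || status=$?
    rm -f "$LOCK_FILE"
    return "$status"
}
```

With this in place, run_locked terraform apply would wait for the lock, run Terraform, and remove the lock file afterwards. The sketch does not handle interruption: if the process is killed mid-run, the lock file stays behind and has to be removed by hand.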
With Terragrunt, you can also acquire the lock manually, for a longer period of time:
    $> terragrunt acquire-lock
    [terragrunt] 2016/12/01 11:16:46 Reading Terragrunt config file at .terragrunt
    Are you sure you want to acquire a long-term lock? (y/n) y
    [terragrunt] 2016/12/01 11:16:49 Acquiring long-term lock. To release the lock, use the release-lock command.
    [terragrunt] 2016/12/01 11:16:49 Attempting to acquire lock for state file mighty_trousers in DynamoDB
    [terragrunt] 2016/12/01 11:16:50 Attempting to create lock item for state file mighty_trousers in DynamoDB table terragrunt_locks
    [terragrunt] 2016/12/01 11:16:51 Lock acquired!
After you are done Terraforming, just run terragrunt release-lock.
Terragrunt is a lifesaver for any Terraform user; with a few simple features, it makes collaboration on Terraform-related work much more robust, predictable and production-ready. The company behind Terragrunt heavily relies on Terraform for its operations, so one could expect this tool to be further supported and improved with new features.
As mentioned, though, the locking features of Terragrunt work only if you are ready to use AWS DynamoDB, which might not always be the case. You could implement locking yourself and hope that it works well and that your team members follow all procedures as expected. But there is still the possibility of human error. We can keep it to a minimum by completely removing the right to run the terraform apply command from the operators' and developers' machines. How would it run, then? CI, of course.