Those who start with Infrastructure as Code (IaC for friends) in the cloud often find themselves having to choose which tool to use. Depending on the cloud provider, there are several options, so the choice is not always easy.
Focusing on AWS, for example, we have CloudFormation or the more recent AWS CDK, just to name a few. But if we wanted a tool that provides a flexible abstraction of resources and providers, the choice would almost certainly fall on Terraform.
However, many will object that Terraform stores state files locally and offers neither remote state management nor locking, so concurrent deployments could leave the infrastructure in an inconsistent state, causing errors and problems.
This is not entirely correct, at least not anymore. Since v0.9.0 (we are now at 0.12), Terraform can store the state file in AWS in an S3 bucket and lock the state using a DynamoDB table.
Today I’ll show you how. (A video in which the whole procedure is performed from start to finish is available at the bottom of the page)
But let’s take a step back. What is a Terraform state file and what does it contain?
When you run the “terraform apply” command, a file called terraform.tfstate is created, which contains a list of the infrastructure resources in JSON format. Before each operation, such as “terraform plan” or “terraform apply”, Terraform checks the file to learn the current state of the infrastructure and proceeds accordingly with the creation, modification, or deletion of one or more resources. This file is very important: if it were corrupted or deleted, Terraform would no longer know the state of the infrastructure, and it would either be unable to deploy the resources or leave the infrastructure in an inconsistent state. Because this file is saved locally by default, colleagues working in a team would not have the updated state of the infrastructure and would therefore run into errors during deployment.
Is there a solution?
Yes. It’s called a Terraform backend. In practice, the S3 backend stores the terraform.tfstate file in an S3 bucket and uses a DynamoDB table for state locking and consistency checking. This way, when you or a colleague run the “terraform plan” command, Terraform accesses the S3 bucket where terraform.tfstate is stored and compares it to what’s in your Terraform configuration to determine which changes need to be applied. At the same time, a lock is recorded in the DynamoDB table, so that if two or more colleagues try to change the infrastructure at the same time, there will be no concurrent updates that could lead to corruption of the file, data loss, or conflicts.
Let’s build it
The backend is configured with a short “backend” block in the Terraform code. But is it so easy?
Not exactly. The backend configuration assumes that the DynamoDB table and the bucket where the terraform.tfstate file will be saved have already been created.
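As a rough sketch, a backend block of this kind could look like the following. The bucket and table names are the ones used later in this article; the “key” and “region” values are assumptions chosen only for illustration, so adjust them to your setup.

```hcl
terraform {
  backend "s3" {
    # S3 bucket that stores the remote state file
    bucket = "angelo-terraform-state-backend"
    # Assumption: path of the state object inside the bucket
    key = "terraform.tfstate"
    # Assumption: replace with the region where the bucket lives
    region = "eu-west-1"
    # DynamoDB table used for state locking
    dynamodb_table = "terraform_state"
    # Ask S3 to encrypt the state file at rest
    encrypt = true
  }
}
```

Backend blocks cannot use variables or interpolation, which is why the values are written as literals here.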
Let’s proceed to create everything directly using Terraform.
Let’s create a file called “s3.tf” and paste the following code into it:
In this way we will create an S3 bucket called “angelo-terraform-state-backend” (you can call it whatever you want, but remember that S3 bucket names in AWS are global, which means you cannot use a name that someone else has already taken). In this case I decided to enable versioning so that every revision of the state file is stored and it is possible to roll back to an older version if something goes wrong. I decided to encrypt the contents of the bucket because the state file stores the infrastructure information, including sensitive data, in plain text. I also decided to enable object lock in order to prevent deletion or overwriting.
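A minimal sketch of what “s3.tf” could contain is shown below. It assumes an AWS provider version earlier than 4.0, where versioning, encryption, and object lock are inline blocks of “aws_s3_bucket”; newer provider versions move them into separate resources. The resource label “terraform_state” and the AES256 algorithm are illustrative choices, not taken from the original code.

```hcl
# Sketch for AWS provider < 4.0 (inline configuration blocks).
resource "aws_s3_bucket" "terraform_state" {
  # Bucket names are global: pick one nobody else is using
  bucket = "angelo-terraform-state-backend"

  # Keep every revision of the state file so we can roll back
  versioning {
    enabled = true
  }

  # Encrypt objects at rest (assumption: S3-managed keys, AES256)
  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "AES256"
      }
    }
  }

  # Enable S3 Object Lock on the bucket
  object_lock_configuration {
    object_lock_enabled = "Enabled"
  }
}
```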
Now let’s create a file called “dynamo.tf” and paste the following code:
In this way we will create a DynamoDB table called “terraform_state” (here too you can choose a different name; unlike the bucket, the table name is not global, so there can be multiple tables with the same name as long as they are not in the same region of the same account). The primary key used to lock the state in DynamoDB must be called LockID and must be of type “string” (S).
With these two files in place, we can now create another file to deploy, for example, a very simple VPC. Let’s create a file called “vpc.tf” and paste the following code:
Note that the “backend” code is commented out because, as mentioned earlier, it assumes that the S3 bucket and the DynamoDB table already exist.
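A possible “dynamo.tf” along these lines is sketched below. The table name and the LockID key come from the text above; the resource label and the read/write capacity values are assumptions picked only to keep the example small.

```hcl
resource "aws_dynamodb_table" "terraform_state_lock" {
  name = "terraform_state"

  # Assumption: minimal provisioned capacity, enough for occasional locks
  read_capacity  = 1
  write_capacity = 1

  # Terraform expects the partition key to be named exactly "LockID"
  hash_key = "LockID"

  attribute {
    name = "LockID"
    type = "S" # string
  }
}
```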
Many of the guides and tutorials I have seen pass the AWS credentials in plain text directly in the code. PLEASE DON’T DO IT. Instead, use the “shared_credentials_file” argument and, if you have several profiles as in my case, the “profile” argument.
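Putting these pieces together, “vpc.tf” could look roughly like the sketch below. The region, the credentials path, the profile name, the state key, and the CIDR block are all assumptions for illustration. The “shared_credentials_file” argument matches the Terraform 0.12 era discussed here; newer AWS provider releases use a “shared_credentials_files” list instead.

```hcl
# Backend left commented out until the bucket and the table exist.
# terraform {
#   backend "s3" {
#     bucket                  = "angelo-terraform-state-backend"
#     key                     = "terraform.tfstate"
#     region                  = "eu-west-1"
#     dynamodb_table          = "terraform_state"
#     encrypt                 = true
#     shared_credentials_file = "~/.aws/credentials"
#     profile                 = "default"
#   }
# }

provider "aws" {
  region = "eu-west-1" # assumption: use your own region

  # Read credentials from the shared credentials file instead of
  # hard-coding access keys in the configuration
  shared_credentials_file = "~/.aws/credentials"
  profile                 = "default" # assumption: your profile name
}

# A very simple VPC to have something to deploy
resource "aws_vpc" "vpc" {
  cidr_block = "10.0.0.0/16" # assumption: any private range works
}
```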
OK, everything is ready. Run the command “terraform plan”.
The command fails because Terraform first needs to download the plugins and binaries required to deploy the infrastructure for the provider in use. So we’re going to run the command “terraform init”.
Terraform recognized that the provider used in this tutorial is AWS and has automatically downloaded all the necessary plugins and binary files.
Let’s run the “terraform plan” command again.
This time we will receive the list of resources that will be created. After checking that everything is as we want, we can proceed with the deployment by running the “terraform apply” command and confirming by typing “yes” when prompted.
In this way we will have created the VPC, the bucket, and the DynamoDB table.
Now we can move on to creating the backend: go back to the vpc.tf file, uncomment the backend code, and save.
If we run the “terraform plan” command again to get an updated list of resources, we will receive an error, because the backend configuration has changed and Terraform needs to be reinitialized.
Let’s run the “terraform init” command again. Terraform will automatically detect that you already have a local state file and prompt you to copy it to the S3 bucket created earlier. Type “yes” and the file will be copied automatically.
From now on, every time we run the “terraform apply” command, Terraform will acquire the state lock, preventing concurrent deployments, and will release it as soon as it finishes, saving the state of the infrastructure in the state file stored in the bucket. Because versioning is enabled on the bucket, each save creates a new version, so we can roll back to a previous one if needed.
For those who hate reading long tutorials, I have created a video in which the whole procedure is performed from start to finish. You can find it down below.
All the code used is also available on my GitHub.