State file storage
Developing solo with a local state file is probably fine, so long as you keep backups of your local system. But as soon as more than one person needs to work with the same Terraform environment, you must move the state file somewhere more sophisticated.
Terraform has a backend feature that is configured as a local backend by default and does not appear in the Terraform configuration files. To change the state storage to a remote backend, you add configuration to your Terraform files defining where and how the state is to be stored. In most organizations AWS S3, Azure Blob Storage, or Terraform Cloud will be the likely choices for DevOps teams, as these solutions solve the following problems (a minimal backend example follows the list):
Automated loading. Terraform loads and writes back the state file on every run, which removes the need for operators to keep their own copy of the state file up to date.
Locking. Locks are handled by the storage solution, so two operators can never change the same state file at once. If the state file is already locked, the second operator receives an error saying so.
Secrets. Data is transported and stored encrypted, and access to the backend can be controlled with IAM roles and similar mechanisms far better than in a VCS.
Availability. AWS S3 and similar solutions are managed services with built-in durability and availability, so your important state file is well protected.
Versioning. By enabling versioning on the state file you can roll back to a previous configuration if an update creates problems.
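As a minimal sketch of what that backend configuration looks like, the block below targets Terraform Cloud; the organization and workspace names are placeholders you would replace with your own.

terraform {
  backend "remote" {
    hostname     = "app.terraform.io"
    organization = "example-org"        # placeholder organization name

    workspaces {
      name = "example-workspace"        # placeholder workspace name
    }
  }
}

The AWS S3 and Azure Blob Storage equivalents are shown later in this post.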
Don’t store state in the VCS
State files should not be saved to GitHub (or whichever VCS you use) because mistakes will happen. For example, you forget to pull down the latest version of the state and run a Terraform command; suddenly you are rolling back the deployed infrastructure or making unintended changes. Most VCS systems do not support locking, which means two people could change the same state file and corrupt it. And all secrets are written to state files in plain text, breaking a fundamental rule: DON'T SAVE SENSITIVE DATA IN A VCS.
Chicken or egg when creating a new storage account for state
Typically you have a single storage account holding multiple state files on a per AWS account or per Azure subscription basis, as this keeps things relatively simple. When you create the new cloud account you'll need to create the storage account first so you can store the state for your base infrastructure, such as the VPC.
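As a rough sketch of that layout, assuming a hypothetical shared bucket name, each Terraform configuration in the account points at the same bucket but uses its own key, so one storage account holds many state files:

# In the networking configuration's working directory
terraform {
  backend "s3" {
    bucket = "terraform-state-example-account"   # hypothetical shared state bucket
    key    = "networking/vpc/terraform.tfstate"  # state file for the VPC stack
    region = "us-east-2"
  }
}

# In the application configuration's working directory (a separate configuration)
terraform {
  backend "s3" {
    bucket = "terraform-state-example-account"
    key    = "apps/web/terraform.tfstate"        # different key, same bucket
    region = "us-east-2"
  }
}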
But there is a chicken-and-egg situation: you want to store Terraform state in a cloud storage account, yet you need Terraform to create that storage account. So to store your state in the cloud you go through two stages.
First, create the storage account with a local backend defined in your Terraform configuration file. This stores the state file in the Terraform working directory.
Then configure the remote backend in the same Terraform configuration file and point it at the newly created storage account. When you run terraform init, Terraform asks whether you want to move the state to the newly described remote backend instead of keeping it locally; answer yes and the state is copied to the cloud storage account.
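For stage one, a minimal sketch looks like the block below. The local backend is the default, so the block is optional, but writing it out explicitly makes the later switch to a remote backend obvious.

terraform {
  backend "local" {
    # State is written to the working directory until the remote storage exists
    path = "terraform.tfstate"
  }
}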
Use AWS S3
The following Terraform code examples create an AWS S3 bucket specifically to hold Terraform state and take into account the chicken-and-egg consideration explained above.
resource "aws_s3_bucket" "terraform_state" {
bucket = "terraform-state-grinntec-learn"
# Prevent accidental deletion of this bucket
lifecycle {
prevent_destroy = true
}
}
# Enable bucket versioning
# Each update creates a new version of the state file
resource "aws_s3_bucket_versioning" "enabled" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}
# Turn on server-side encryption
# Ensures data stored on S3 disk is encrypted
resource "aws_s3_bucket_server_side_encryption_configuration" "enabled" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}
# Block all public access to the S3 bucket
# Prevents making the S3 bucket publicly accessible
resource "aws_s3_bucket_public_access_block" "public_access" {
  bucket = aws_s3_bucket.terraform_state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
# Create a DynamoDB table for locking
# Key:value store with strongly consistent reads and conditional writes
# The primary key must be called 'LockID'
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
provider "aws" {
region = "us-east-2"
}
provider "aws" {
region = "us-east-2"
}
# This section tells Terraform to store the state file remotely
# When you first create the S3 bucket this block should be commented out (chicken and egg)
# Once the S3 bucket exists, uncomment this block and run:
#   terraform init
# Terraform will ask if you want to transfer the local state to the remote backend
terraform {
  backend "s3" {
    bucket         = "terraform-state-grinntec-learn"
    key            = "global/s3/terraform.tfstate"
    region         = "us-east-2"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}
Use Azure Blob Storage
To use Azure to host your state file, configure a storage account with a blob container, then pass the storage account access_key value to the Terraform environment at execution time so the state file can be stored there instead of locally.
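As a minimal sketch, assuming a hypothetical resource group, storage account, and container, the backend block looks like this; the access key can be supplied through the ARM_ACCESS_KEY environment variable rather than written into the configuration:

terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"         # hypothetical resource group
    storage_account_name = "tfstategrinntec"    # hypothetical storage account
    container_name       = "tfstate"            # hypothetical blob container
    key                  = "terraform.tfstate"  # name of the state blob
  }
}

With the access key exported as ARM_ACCESS_KEY, terraform init will prompt to migrate the existing local state into the container, mirroring the S3 flow above.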