Terraform state file demystified!

Parul
5 min read · Nov 22, 2022

Terraform state, why it matters, and the role of terraform refresh.

Image source: spacelift.io/

Whenever you create or update infrastructure on a remote platform [e.g. AWS] via Terraform, you must have noticed that Terraform automatically detects the resources it previously created and updates them accordingly. Terraform pulls off this magic by tracking the state of your infrastructure.

In this blog, I will explain the Terraform state file, why it is super important and how it works behind the scenes.

To keep the concepts simple, I will use the AWS provider and the creation of a simple EC2 instance via Terraform as the running example.

What is the Terraform state file?

When Terraform creates a resource in the cloud based on your .tf files, it records each created resource, along with all of its configuration [including default values], in a file called terraform.tfstate. This state file tracks the resources that Terraform is managing.

Let’s take the example of an EC2 instance with the following simple configuration:

ec2.tf file
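
The configuration itself appeared as an image in the original post. As a rough stand-in, here is a minimal sketch that is consistent with the state file shown further down: the resource type, name, AMI and instance type are taken from that state file, while the provider block and the us-east-2 region are assumptions inferred from the availability zone.

```hcl
# ec2.tf (illustrative reconstruction, not the author's original file)

# Assumption: region inferred from the "us-east-2c" availability zone
# recorded in the state file.
provider "aws" {
  region = "us-east-2"
}

# The resource type and name match the "type" and "name" fields
# that Terraform writes into terraform.tfstate.
resource "aws_instance" "firstEC2" {
  ami           = "ami-0beaa649c482330f7"
  instance_type = "t2.micro"
}
```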

When you run terraform apply, you can go to the AWS console and check that one EC2 instance was created. After the resource is created, you will see a terraform.tfstate file in your project directory [with the default local backend], next to your other .tf files. This is what the file looks like: [file truncated for readability]

{
  "version": 4,
  "terraform_version": "1.3.2",
  "serial": 2,
  "outputs": {},
  "resources": [
    {
      "mode": "managed",
      "type": "aws_instance",
      "name": "firstEC2",
      "provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
      "instances": [
        {
          "schema_version": 1,
          "attributes": {
            "ami": "ami-0beaa649c482330f7",
            "availability_zone": "us-east-2c",
            "id": "i-instanceid",
            "cpu_core_count": 1,
            "cpu_threads_per_core": 1,
            "instance_state": "running",
            "instance_type": "t2.micro",
            "monitoring": false
          }
        }
      ]
    }
  ]
}

Thus, the Terraform resource you declared in your .tf file gets mapped to the actual resource created in the real world.

How does Terraform auto-detect configuration changes?

Through the state file shown above, Terraform can uniquely identify each resource it created in your AWS account via its unique ID [in this case, the instance ID “i-instanceid” for the EC2 instance].

So, if you re-run the same Terraform script for EC2 and check for changes via terraform plan, Terraform reports that there is nothing to change:

No duplicate resource will be created!

If you update the instance type from t2.micro to t2.medium in your EC2 script, Terraform will refer to this state file. It compares what the code asks for [desired state] with what the state file records about the actual infrastructure [current state].
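
As a hedged illustration, assuming the instance is declared as aws_instance.firstEC2 [the type and name recorded in the state file above], the edit in the .tf file would be a single attribute change:

```hcl
# Same resource address as before, so Terraform matches it to the
# existing entry in terraform.tfstate instead of planning a new instance.
resource "aws_instance" "firstEC2" {
  ami = "ami-0beaa649c482330f7"

  # Desired state: previously "t2.micro", which is the value
  # the state file still records.
  instance_type = "t2.medium"
}
```

Only instance_type differs from what the state file records, so that is the only attribute Terraform needs to reconcile.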

Now, terraform plan in this case will report the instance type changing from t2.micro to t2.medium:

Changes in terraform script detected!

This is how Terraform manages to update your existing EC2 instance without creating a duplicate. You only need to update your Terraform scripts, and the entire lifecycle (create, update and delete) of existing and new resources is handled by Terraform.

What if the Terraform state file is deleted or modified?

Let’s say you delete the terraform.tfstate file from your project directory, or you clear the resources JSON in the state file as shown below:

terraform.tfstate file with no resources

Now, Terraform is no longer aware of the existing state of your infra. You end up with orphaned resources: they still exist on AWS, but Terraform no longer manages them.

So, if you re-run terraform apply for the above EC2 script [without destroying the existing EC2 instance from the AWS console], the terminal output will show a new EC2 instance being created.

On the EC2 dashboard in your AWS console, you will now see 2 instances running, which is not the desired state for our infrastructure.

Thus, it is very important to keep the Terraform state file intact and up to date, and to avoid editing it manually.

Can terraform refresh recover our state file changes?

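When you run the terraform refresh command, your state file is synced with your actual infrastructure on AWS. [In recent Terraform versions this command is deprecated in favour of terraform apply -refresh-only.] In current versions, terraform plan also performs a refresh before computing its diff, but only in memory; it does not write the result back to the state file.

But a refresh only detects “drift” for resources that Terraform already manages in its state file.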

For example,

  1. If you created an EC2 instance with the t2.micro instance type via the above Terraform script and then manually change the instance type to t2.medium from the EC2 dashboard in the AWS console, terraform refresh will detect this change and update the EC2 configuration in your state file with the new instance type.
  2. If you spin up an EC2 instance via a Terraform script and then manually delete it from your AWS console, terraform refresh will detect this too. As Terraform was already tracking this resource locally in the state file, it detects the change on AWS and updates the state file by removing this resource’s JSON from it.

But if you delete the local state file, or if you create a resource directly in the AWS console without Terraform [and have not used terraform import either], then terraform refresh will not detect it, because that resource is not in the state file.

This makes it very clear why persisting the Terraform state file is important.
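
If the state file is lost while the instance still exists, one way to bring the instance back under management is to import it rather than re-create it. Here is a minimal sketch, assuming Terraform 1.5 or later [where import blocks are available] and the placeholder instance ID from the state file above. On older versions, such as the 1.3.2 used in this post, the CLI equivalent is terraform import aws_instance.firstEC2 i-instanceid.

```hcl
# Requires Terraform 1.5+. The ID below is the placeholder used in this
# post; replace it with the real instance ID from the AWS console.
import {
  to = aws_instance.firstEC2
  id = "i-instanceid"
}

# The import target must have a matching resource block in the configuration.
resource "aws_instance" "firstEC2" {
  ami           = "ami-0beaa649c482330f7"
  instance_type = "t2.micro"
}
```

With this in place, the next terraform apply should record the existing instance in the state again instead of creating a second one.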

Conclusion

To summarise, we learnt about Terraform state files, why they matter, and why we should keep them intact and in sync with the resources on the actual remote platform.

This way, you can keep provisioning new resources to your cloud [e.g. an RDS instance or an S3 bucket on AWS] via Terraform, and Terraform will keep adding the details of these resources to the resources JSON in the state file.

That’s it for the scope of this blog.

In the next blog, I will cover some challenges of managing the state file that we often overlook as beginners, and how to manage it better for team collaboration.

Thanks for reading until here! Feel free to share your thoughts / suggestions in the comments below. And do share it across, give it some claps if you found this blog helpful. :)

Parul

Developer @ThoughtWorks | Generation Google Scholar