Refactoring Terraform

Overview

 

Terraform has become the de facto standard for building cloud infrastructure in an Infrastructure as Code (IaC) fashion. You can get started by having Terraform create all of your cloud resources from one big ‘tf’ file.

As your infrastructure grows bigger and more complex, it needs more frequent updates, and so does your infrastructure code. If everything goes well, a change is deployed only to the affected resource. However, if a single big tf file holds all your resources, then even when you only intend to change the max capacity of an Auto Scaling group (ASG) by updating one parameter and applying it to the existing infrastructure, a typo in an unrelated resource elsewhere in the file can accidentally bring down your whole infrastructure.

It is therefore advisable to break a big file down into smaller files using Terraform modules, a process known as refactoring. Your code is then better abstracted and can be reused in your other setups.
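As a rough illustration of what such a breakdown might look like, here is a minimal sketch that moves an ASG into its own module. The module path, variable names and values are all hypothetical, and the root configuration is assumed to declare var.subnet_ids and var.launch_template_id elsewhere.

```hcl
# modules/asg/variables.tf (hypothetical module inputs)
variable "name" { type = string }
variable "max_size" { type = number }
variable "subnet_ids" { type = list(string) }
variable "launch_template_id" { type = string }

# modules/asg/main.tf
resource "aws_autoscaling_group" "this" {
  name                = var.name
  min_size            = 1
  max_size            = var.max_size
  vpc_zone_identifier = var.subnet_ids

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }
}

# main.tf (root configuration): changing the ASG max capacity now
# means editing a single argument of this module call.
module "web_asg" {
  source             = "./modules/asg" # hypothetical local path
  name               = "web"
  max_size           = 4
  subnet_ids         = var.subnet_ids         # assumed to be declared in the root
  launch_template_id = var.launch_template_id # assumed to be declared in the root
}
```

The same module could then be called again, with different inputs, from another setup.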

I am not going into detail about how to refactor an existing infrastructure tf file, because that depends greatly on how your infrastructure is used. Some may break it down by region and then by service module, while others may prefer to categorize by service and then by region. There is no one-size-fits-all approach.

However, my suggestion is that the least dependent resources be placed in individual files. This not only minimizes the risk of accidental damage to unrelated resources when a change is applied, but also makes it possible to delegate resource ownership. It is not uncommon for a giant infrastructure to be shared by many teams, each with its own project schedule. Without delegation, every little code change has to wait for the master release before it gets deployed.

Even when an infrastructure is modularized, delivery speed suffers if resource dependencies are not well defined in the tf files. By delegating ownership of certain resources to the team that changes them most frequently, schedule blocking can be avoided.

If you decide to refactor your Terraform, the command ‘terraform state mv’ will be indispensable. Instead of destroying the existing infrastructure and applying a new one (which causes a service outage), you design the new layout in terms of folder structure, modules and placement of variable files, and then write the new tf files. When you run ‘terraform plan’ with the new tf files against the existing Terraform state file, the plan output tells you which old resources would be destroyed and which new resources would be added because of the new structure. Remember NOT to run ‘terraform apply’ at this stage, otherwise your existing infrastructure will be destroyed.

Going through the plan output, run ‘terraform state mv old_resource new_resource’ against your existing Terraform state file for each matching pair. After the move, Terraform knows that new_resource refers to the infrastructure previously tracked as old_resource. When you run ‘terraform plan’ against the new tf files again, that old_resource is no longer flagged for destruction. Repeat this “plan and mv” cycle until no more resources are reported to be destroyed or added.
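The “plan and mv” cycle might be scripted along these lines. This is only a sketch: the helper function names, the TF override variable, the backup file name and the resource addresses are hypothetical, and the real addresses come from your own plan output. Taking a copy of the state first is an extra precaution that is assumed here, not part of the steps described.

```shell
#!/bin/sh
# Sketch of the "plan and mv" cycle. Helper names are hypothetical.

# Allow the terraform binary to be overridden (assumption, handy for dry runs).
TF="${TF:-terraform}"

# Save a copy of the current state to the given file before changing it.
backup_state() {
    "$TF" state pull > "$1"
}

# Show what would still be destroyed or added. Do NOT apply at this stage.
review_plan() {
    "$TF" plan
}

# Record in the state that an existing resource now lives at a new address.
move_resource() {
    "$TF" state mv "$1" "$2"
}

# Example session (hypothetical addresses):
#   backup_state pre-refactor.tfstate
#   review_plan
#   move_resource aws_autoscaling_group.web module.web_asg.aws_autoscaling_group.this
#   review_plan   # repeat until nothing is reported to be destroyed or added
```

Wrapping the commands in small functions keeps each step explicit and makes it harder to slip in an accidental ‘terraform apply’ midway through the refactoring.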

Of course, to play safe, it is always better to try your new code in a non-production environment before applying it to your target environment.

Photo by Markus Spiske on Unsplash