Intro To Terraform
Infrastructure as Code is a Necessary Evil...Also, I thought terraforming was a thing we wanted to do to Mars one day?
I'll admit...for a large part of my career working on cloud services, I avoided Infrastructure as Code (IaC) like the plague. I found it not as exciting as building data pipelines, and I more or less viewed it as "Oh, that's another team's job." However, as time went on, I came to realize that I could get things done exponentially faster if I bit the bullet and learned IaC. With IaC, when you build out POCs, your work is much easier to reproduce for someone else who wants to follow along later. They don't have to find the test account you were using to create various cloud assets. They don't have to chase down the IAM roles and permissions you built out. They can just rip the IaC code as-is, make some minor modifications, and run it (usually). Thankfully, IaC, in my humble opinion, is one of the easiest languages there is to learn.
So What Exactly is "IaC"?
IaC (Infrastructure as Code) is what it sounds like. You use code to create assets/resources in the cloud. This can be things such as:
Cloud Storage Buckets
Cloud Data Warehouses
IAM roles
IAM Policies
Cloud Bucket Replication Rules
Data Pipeline Jobs
And that's just to name a handful. When you look at the Terraform registry for the AWS and GCP providers, they have API'd the majority of their services, which allows you to create them with code.
But Why Should I Bother With It?
Why not just log into the cloud console via the web browser and create your stuff there with the nice little GUI? Simply put, a few things:
You want your assets to be reproducible by others
You want to scale asset creation and management across multiple departments in an organization
You want backups and audit trails of the assets you created
Those are just a few of the benefits off the top of my head. Additionally, IT security and governance requirements are usually a big reason why IaC is required these days.
Enough of the Intro...Let's Get Crackin'
Alright, so far we've given a high-level overview of IaC, but let's actually see this in action. For the remainder of this article, I will walk you through how to write Terraform code that creates 2 GCS buckets in separate regions. This is a typical pattern these days for HADR (High Availability Disaster Recovery): a bucket in a primary region paired with a replicated bucket in a secondary region. This article, however, will not set up any replication, as we want to keep it simple, and DR can get very complex.
The Folder Structure
I have found through working with various Terraform projects that the following folder structure works pretty well:
You have a parent folder called "terraform". In that folder, you typically have three files:
main.tf - this is the entry point/main script that gets fired when you tell terraform to go do things like create assets
locals.tf - this is where you can store static variables
provider.tf - this is where you specify the provider for terraform to use to create assets (e.g. Google Cloud/AWS/etc)
Additionally, a lot of Terraform projects will have a fourth file called "variables.tf". I don't have that in my project today, but that file allows you to create non-static (i.e. mutable) variables that can change while the Terraform scripts are running.
After those files, we have a subfolder that I call "modules". You can call it whatever you want, such as "Franks-Pizza-Shop". This folder will contain template scripts for the various assets you want to create, which is great when you want to scale out your IaC code and not repeat blocks of code over and over again.
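Putting that together (and including the gcs module files we'll walk through later), the layout looks roughly like this:

```
terraform/
├── main.tf
├── locals.tf
├── provider.tf
└── modules/
    └── gcs/
        ├── gcs.tf
        └── variables.tf
```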
Now that we have the folder structure squared away, let's look at these individual files...
The Provider File
Let's take a peek at the provider file first.
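The original post shows a screenshot of the file here. A minimal sketch of what that provider.tf might contain, based on the description below (the project ID and version constraint are placeholders I've assumed):

```hcl
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"   # assumed version constraint
    }
  }
}

# Default provider - primary region
provider "google" {
  project = "my-gcp-project"   # hypothetical project ID
  region  = "us-east1"
}

# Aliased provider - secondary region for the replicated bucket
provider "google" {
  alias   = "west"
  project = "my-gcp-project"   # hypothetical project ID
  region  = "us-west1"
}
```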
This file specifies that we are using Google (GCP) as our provider for Terraform to create assets in. You will notice that I've listed the "google" provider twice - once for the default us-east1 region and a second time for the us-west1 region. The second listing has an alias called "west". If you do not alias a provider, it is treated as the default provider.
You might ask - why do we need multiple listings of the same provider here, i.e. Google? That is because in order for us to create buckets or other assets in multiple regions, our code needs to be handed a provider that has the target region preset. If this project were only focused on creating assets in one region, then the secondary listing of the provider with the us-west1 region would not be necessary.
The Locals File
Now, let's take a look at our locals.tf file.
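Again, the post shows a screenshot here. Based on how the file is described below, a sketch of locals.tf might look something like this (the names and label values are my guesses):

```hcl
locals {
  # Base name used in main.tf to assemble region-specific bucket names
  # (GCS bucket names must be globally unique)
  bucket_base_name = "my-demo-bucket"

  # Primary and secondary regions for the HADR bucket pair
  primary_region   = "us-east1"
  secondary_region = "us-west1"

  # Labels applied to every asset for ownership tracking and FinOps reporting
  labels = {
    owner       = "matt"
    team        = "data-engineering"
    environment = "dev"
  }
}
```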
This file is where I can create static variables that I can reference in the main.tf script later. From an organizational perspective, it's important to label/tag the various assets you create via Terraform. This helps with cost management and knowing who created various things in the cloud, and it helps with overall FinOps monitoring/reporting. In GCP, they call them "labels". In AWS, they call them "tags". You will see later in the Terraform scripts where I add this list of labels to each asset I create.
Side Note - Google's labels are very finicky about what characters can go in them. You can't use symbols other than hyphens or underscores, from what I've noticed. This means for things like contacts, you can't put an "@" symbol in them and provide an email address. In AWS, that's not the case; AWS allows you to put more informative characters in your tags.
The Main File
This is where things start to get exciting. Now that we have our initial plumbing out of the way, let's see what the main.tf file looks like:
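Here again the post shows a screenshot. A sketch of what the two module blocks could look like, reusing the hypothetical locals from the locals.tf sketch above (the module names are mine):

```hcl
# Primary bucket, created with the default (us-east1) provider
module "gcs_bucket_primary" {
  source = "./modules/gcs"

  providers = {
    google = google
  }

  bucket_name = "${local.bucket_base_name}-${local.primary_region}"
  region      = local.primary_region
  labels      = local.labels
}

# Secondary bucket, created with the aliased (us-west1) provider
module "gcs_bucket_secondary" {
  source = "./modules/gcs"

  providers = {
    google = google.west
  }

  bucket_name = "${local.bucket_base_name}-${local.secondary_region}"
  region      = local.secondary_region
  labels      = local.labels
}
```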
Alright, in this file, you will see I have effectively 2 code blocks, a.k.a. modules. Each block invokes a template file, which for our example lives in the modules/gcs subfolder. This allows us to reuse our template code instead of repeating it multiple times. It also lets us easily create our GCS buckets in multiple regions, since each module can specify its own provider. You will see in the first module I simply specify "google" as the provider, but in the second module I specify "google.west", which indicates I want to create stuff in the west region. You will also see how I'm leveraging my locals.tf static variables: I'm passing in the region for each module to assemble my bucket name, and I'm also passing in the tags that I want on the buckets.
Pro Tip - When you execute the main.tf script, Terraform will run as many things as it can in parallel. Terraform is usually smart enough to sniff out dependencies and the order in which things need to be created. If you want to force Terraform to wait on something it can't infer on its own, you can add a depends_on block to the module.
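For example, if we wanted the secondary bucket module to wait for the primary one (not needed here, just to show the syntax against the hypothetical module names above):

```hcl
module "gcs_bucket_secondary" {
  source = "./modules/gcs"
  # ...same arguments as above...

  # Force this module to wait until the primary bucket module has finished
  depends_on = [module.gcs_bucket_primary]
}
```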
Now Let's Look at Our Template Code for GCS
Now we will look at our gcs.tf template file. And this is where things get fun:
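The screenshot isn't reproduced here, but based on the behavior described below (standard storage class, a 30-day delete rule, and no public access), the template might look roughly like this (the resource name "bucket" is my choice):

```hcl
resource "google_storage_bucket" "bucket" {
  name          = var.bucket_name
  location      = var.region
  storage_class = "STANDARD"
  labels        = var.labels

  # Lifecycle rule: delete objects older than 30 days
  lifecycle_rule {
    condition {
      age = 30
    }
    action {
      type = "Delete"
    }
  }

  # Disable public access to the bucket
  public_access_prevention    = "enforced"
  uniform_bucket_level_access = true
}
```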
This is our "template" file, a.k.a. our reusable code. In this file, Terraform creates a GCS bucket based on the arguments passed in from our main.tf file. You will notice that those variables are prefixed with "var.". In order to make this work, each template file also requires a file called "variables.tf". That file allows us to pass mutable values down to the template script. Our variables.tf file in our gcs folder is as simple as this:
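Assuming the argument names used in the main.tf sketch above, that variables.tf would look roughly like:

```hcl
variable "bucket_name" {
  type        = string
  description = "Name of the GCS bucket to create"
}

variable "region" {
  type        = string
  description = "Region to create the bucket in"
}

variable "labels" {
  type        = map(string)
  description = "Labels to apply to the bucket"
}
```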
Ok, so back to the gcs.tf file. We have a few things going on here:
We are passing in our bucket name, the tags, and our region.
We are setting the GCS storage class to standard
We are setting a lifecycle rule to delete items older than 30 days
We are disabling public access to the bucket (sorry cryptobros)
Side Note - Google's provider for Terraform makes it much easier to specify a region to create your bucket in, as the location is an actual argument on the google_storage_bucket resource. In AWS, the region is not an available parameter on the bucket resource, and thus the provider passed in via the module is what determines it. I left the provider-passing pattern in here anyway, as it does no harm to the Google code and can be useful for those needing to create buckets in multiple regions in AWS.
Now that we have walked through the Terraform code, how does one actually run it?
Running Terraform Code
If you don't have Terraform installed, I recommend you do so. I used Homebrew on my Mac to install it. Once you have it installed, from the terminal, navigate to your project's terraform folder and run this command:
terraform init
This tells Terraform to scan the terraform folder plus all corresponding subfolders and get itself geared up to create assets. When you run this, Terraform will download the necessary provider plugins based on the provider you specified in the provider.tf file.
Next, we will run in the terminal this command:
terraform plan
This is basically our unit test: Terraform checks that our code is syntactically sound, verifies that we have the permissions to do what we say we want to do, and shows a preview of the changes it would make.
Pro Tip - I'm using my GCP Application Default Credentials to authenticate to Google Cloud, which you can set up via the gcloud CLI. For an example of getting your default credentials loaded locally, you can see this post and look for the section on running the gcloud login. This avoids me hardcoding credentials somewhere. Terraform is smart enough to detect these creds in my environment and will use them to check and execute the Terraform code.
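For reference, assuming you have the gcloud CLI installed, the command that loads Application Default Credentials locally is:
gcloud auth application-default login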
Assuming the terraform plan runs and throws no errors, it's time to actually go create the assets. To do this, we will run this command:
terraform apply -auto-approve
When we execute that command in our terminal, Terraform will run our main script, which calls the gcs subfolder template code twice via the module blocks to create our 2 buckets. The "-auto-approve" flag tells the script to fully run without waiting for our permission. If we did not include that flag, once the code is actually ready to run, our terminal would prompt us to type the word "yes" to approve it making the changes.
So how does this look when we run it?
...Hot Diggity Dog! Alright, it says our buckets have been created. Let's go take a peek in GCS:
Well, there you go. We have our 2 buckets in separate regions. Now for the fun part: how do we undo this stuff quick and easy? Terraform has a command for that called "terraform destroy". We will run this to nuke the buckets:
terraform destroy -auto-approve
Alright, now those buckets are gone. Usually though, you don't want to destroy these assets in production, as you are creating them for yourself or others to use. The only time I use the terraform destroy command is if I'm building stuff in a test account that I later need to clean up, so I don't cause unnecessary spend for unused assets just hanging around out there in the cloud.
Conclusion
This article walked you through an introduction to Terraform and IaC. IMO, IaC can significantly up your game in the data engineering world and get you closer to that ever-elusive full stack, rainbow, rocket-powered unicorn of a developer.
Here's a link to the Terraform code we covered: TF Code
Thanks for reading,
Matt