Testing Infrastructure Code

[Clone and follow along in the project repository...]

If you're among the increasing majority who are using Terraform in some fashion, you have probably heard of Terratest. In the project's own words, Terratest is a Go library that provides patterns and helper functions for testing infrastructure. While useful, Terratest may seem daunting since it requires writing Go. If you've wanted to get started with Terratest but haven't had time to dig in, read on for a quick and painless intro!

First, why do we need Terratest at all? It could be viewed as another project dependency or complexity adding cognitive load. It's good to be mindful of such things. The key is accepting complexity when the added value (or reduced risk) is deemed a worthy tradeoff. In the case of Terratest, particularly in larger projects or shared modules, we trade some additional complexity for the peace of mind test coverage provides – specifically, that key features continue to work as projects evolve.

Getting Started

Terratest leverages Go's native testing library. The first thing that means is you need Go installed and properly configured. Luckily, installing Go has gotten easier over the years. Instead of downloading releases and following manual instructions (you can still do that if you prefer), it's likely just a brew install go away. Aside from the installation, you'll want to create a workspace (traditionally ${HOME}/go) and add some new environment variables to your shell profile. Here's mine:

➜ grep GO ~/.zshrc         
export GOPATH="${HOME}/go"
export GOROOT="$(brew --prefix golang)/libexec"
export PATH="${PATH}:${GOPATH}/bin:${GOROOT}/bin"
Minimal Go Environment

There is more we could discuss, or perhaps debate... such as a very brief introduction to Go modules. They'll be used to install dependencies, such as Terratest itself and the testify library typically used to make assertions. I'm going to avoid these rabbit holes, because you'll see enough examples below to get started. For modules in particular, we'll automate the few things needed for Terratest, and everything else has really been said before.

Another bootstrapping issue is project layout. This is a holy war like spaces vs tabs or mono vs poly repo. This project's repo will use the organization scheme below; feel free to experiment and pick what works best for you. There really is no right way, but there are a few common patterns you'll see when browsing community modules. Aside from the Terraform practice of breaking out discrete functionality as modules, tests are often placed in the ${PROJECT_ROOT}/test/src directory. Tests run against specific configurations in ${PROJECT_ROOT}/examples.

Here are the key parts we'll go with to get started... This is a collection of patterns I've found myself using lately, but avoid dogma and figure out what works for your team (for example, in a larger project you may want the application code in a separate repository, or prefer environment specifics organized by appropriately-named tfvars vs directories):

.
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── build
│   ├── Dockerfile
│   └── src
├── environments
│   └── dev
│       └── Makefile
├── examples
├── modules
│   ├── network
│   └── web
└── test
    ├── Makefile
    └── src
        ├── examples_complete_test.go
        └── etc.
Avoid dogma. Find a project structure that works for you.

We want to focus on Terratest vs Terraform, so our contrived example takes a number of shortcuts such as using the default VPC in your AWS account and Amazon's DNS. To keep it slightly more realistic, we'll break out a couple modules. You might use a community module to handle the network bits, an internal module from your network team that exposes custom network details, or add additional services to meet your requirements... This all plugs nicely into our example hierarchy. You can represent high level functionality as modules, which in turn may consume other modules for the heavy lifting, then customize how it all gets stitched together for each target environment.

One last aside before we jump in... The sample project is configured to use aws-vault. If you aren't already using this great utility, you should be! Particularly useful when working across a lot of different accounts, it manages AWS-related environment details for you so you can focus on getting work done. It keeps credentials locked away in your OS' keychain and uses temporary credentials to access your infrastructure – a security win even if you only use a single account. You don't have to use aws-vault to use Terratest, but the scripts we'll be using assume it's in place. Feel free to refactor to meet your personal tastes, or take a few minutes to get aws-vault installed before jumping into the examples.

Boilerplate

Before we can test, we need a few lines of boilerplate to load any required modules and configure Terratest. Luckily it's not much, and once you work it out for one project it's easy to copy or turn into a template:

package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestExamplesComplete(t *testing.T) {
	t.Parallel()

	terraformOptions := &terraform.Options{
		TerraformDir: "../../examples/complete",
		Upgrade:      true,
		VarFiles:     []string{"fixtures.us-east-2.tfvars"},
	}

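	// Always tear down the resources, even if an assertion fails.
	defer terraform.Destroy(t, terraformOptions)

	// Run terraform init and terraform apply; the test fails if either one errors.
	terraform.InitAndApply(t, terraformOptions)

	// ... assertions from the following sections go here ...
}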
Terratest Project Boilerplate

Note how the terraform import is under a terratest/modules path. We'll see examples of using modules below, but it's a signal of Terratest's modular approach (check out the full list in their repo, or the module documentation). Aside from the expected Terraform coverage, there are modules for your IaaS of choice, ways to validate common DevOps tooling (Docker, k8s), as well as http and shell modules which provide a lot of flexibility.

Technically you don't need an assertion framework, but we also pull in testify for simplified assertions. You can easily swap this out if you have a preferred framework, or drop it entirely by using native comparisons. Next, we configure terraformOptions, providing the path to the code to test and passing any var files to be used (relative to TerraformDir). Lastly, in typical test fashion, we defer a cleanup operation to ensure we don't leave artifacts around (more on this below), and use InitAndApply to run terraform init and terraform apply as part of each test. There are also variations such as InitAndPlan, Init, Plan and Apply.
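If you do decide to drop testify, a native comparison can be as small as the sketch below. It only uses the standard library, and requirePrefix is a hypothetical helper name, not something Terratest or testify provides. It returns an error instead of failing the test directly, so inside a real test function you would hand a non-nil result to t.Error or t.Fatal.

```go
package main

import (
	"fmt"
	"strings"
)

// requirePrefix is a hypothetical stand-in for an assertion library call.
// It returns nil when value starts with prefix, and a descriptive error otherwise.
func requirePrefix(name, value, prefix string) error {
	if strings.HasPrefix(value, prefix) {
		return nil
	}
	return fmt.Errorf("%s = %q, want prefix %q", name, value, prefix)
}

func main() {
	// In a test you would call t.Fatal(err) instead of printing.
	if err := requirePrefix("bucket_name", "logs-dev", "logs-"); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

The tradeoff is the usual one: less to install, but you write (and maintain) the failure messages yourself.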

Since we are testing a simple example, we'll just build one "complete" test suite covering all functionality. In a real-world project supporting different use cases, you might have several test suites covering common configurations, with different fixtures customized accordingly.

A Simple Test

In our contrived example, we have a network module that discovers the default VPC and associated subnets. In the real world you might have complicated infrastructure you manage or shared infrastructure from another team that you simply consume via state. Regardless, you'll typically need to utilize network components like VPCs and subnets to get your service deployed. Wouldn't it be nice to confirm your sensitive production service actually deploys to the desired network?

Let's write the simplest test we can... Since the example code selects the default VPC in an attempt to run anywhere, we'll build a test that confirms the returned VPC ID starts with the string vpc-. This is a great example of a useless test, but easy to extend in your environments. For example, you might want to ensure target VPCs have specific tags.
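To give a feel for the tag idea, here is a rough sketch of the comparison half of such a test. It assumes you have already fetched the VPC's tags into a map (how you do that is not shown), and missingTags is a hypothetical helper of my own, not part of Terratest.

```go
package main

import (
	"fmt"
	"sort"
)

// missingTags returns the sorted keys from required whose value is
// absent from, or different in, actual.
func missingTags(actual, required map[string]string) []string {
	var missing []string
	for key, want := range required {
		if got, ok := actual[key]; !ok || got != want {
			missing = append(missing, key)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Hypothetical tag values, purely for illustration.
	actual := map[string]string{"Environment": "dev", "Owner": "platform"}
	required := map[string]string{"Environment": "dev", "CostCenter": "1234"}
	fmt.Println(missingTags(actual, required)) // prints [CostCenter]
}
```

In a test, asserting that the returned slice is empty gives you a failure message that names exactly which tags are wrong.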

In classic red/green/refactor style, let's write the test in a way we know will fail:

// ...

	vpcID := terraform.Output(t, terraformOptions, "vpc_id")
	assert.Equal(t, "vpc-foobah", vpcID)
A Simple Terratest

As expected, running our test returns:

    TestExamplesComplete: examples_complete_test.go:37: 
                Error Trace:    examples_complete_test.go:37
                Error:          Not equal: 
                                expected: "vpc-foobah"
                                actual  : "vpc-7a5ce123"
                            
                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1 +1 @@
                                -vpc-foobah
                                +vpc-7a5ce123
                Test:           TestExamplesComplete
--- FAIL: TestExamplesComplete (293.52s)
FAIL
exit status 1
FAIL    test    293.834s 
An Expected Failure

It's always nice to confirm things fail when expected... Let's fix that:

import (
	"strings"
	// ...
)

// ...

	vpcID := terraform.Output(t, terraformOptions, "vpc_id")
	assert.True(t, strings.HasPrefix(vpcID, "vpc-"))
Fixed?

Note how we used the standard strings library to extend our test. This is generic enough that it should match any returned VPC. Does it?

--- PASS: TestExamplesComplete (291.19s)
PASS
ok      test    291.875s
A Passing Test
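One caveat: a prefix check is loose enough that even our intentionally wrong vpc-foobah would pass it. If you want something stricter, a regular expression is one option. The sketch below assumes VPC IDs are vpc- followed by either 8 or 17 lowercase hex characters; treat that pattern as my assumption and adjust it to what your accounts actually return. looksLikeVPCID is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// vpcIDPattern encodes an assumption about the ID format:
// "vpc-" followed by 8 or 17 lowercase hexadecimal characters.
var vpcIDPattern = regexp.MustCompile(`^vpc-([0-9a-f]{8}|[0-9a-f]{17})$`)

// looksLikeVPCID reports whether id matches the assumed VPC ID format.
func looksLikeVPCID(id string) bool {
	return vpcIDPattern.MatchString(id)
}

func main() {
	for _, id := range []string{"vpc-7a5ce123", "vpc-foobah", "vpc-"} {
		fmt.Printf("%q -> %v\n", id, looksLikeVPCID(id))
	}
}
```

Only the first ID in the loop matches; the other two would have satisfied a plain prefix check.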

Awesome, now we have confidence things work as expected. Just to make this a bit more interesting, let's ensure the list of availability zones output by the network module matches the region specified in our fixtures:

// ...

	availabilityZones := terraform.OutputList(t, terraformOptions, "availability_zones")
	for _, az := range availabilityZones {
		assert.True(t, strings.HasPrefix(az, "us-east-2"))
	}
Extending Our Tests

Ideally we've added coverage, and everything still passes:

--- PASS: TestExamplesComplete (308.07s)
PASS
ok      test    308.782s
A Passing Test Suite
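A small refinement worth considering: asserting inside the loop only tells you that some zone was wrong. Collecting the offenders first makes the failure message more useful. This is a standard-library sketch; zonesOutsideRegion is a hypothetical helper, and the region value is hard-coded here only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// zonesOutsideRegion returns every availability zone name that does not
// start with the given region string.
func zonesOutsideRegion(zones []string, region string) []string {
	var outside []string
	for _, az := range zones {
		if !strings.HasPrefix(az, region) {
			outside = append(outside, az)
		}
	}
	return outside
}

func main() {
	region := "us-east-2" // illustration only; keep this in sync with your fixtures
	zones := []string{"us-east-2a", "us-east-2b", "us-west-2a"}
	fmt.Println(zonesOutsideRegion(zones, region)) // prints [us-west-2a]
}
```

In the test itself you would then make a single assertion that the returned slice is empty.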

One gotcha to be aware of: there is a lot of output when tests are running. I've purposefully zeroed in on the more informational parts. One section shows the outputs from the terraform run. Here's an example:

TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: Apply complete! Resources: 11 added, 0 changed, 0 destroyed.
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: Outputs:
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: availability_zones = [
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66:   "us-east-2a",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66:   "us-east-2b",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66:   "us-east-2c",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: ]
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: cloudwatch_log_group = /ecs/terratest-experiment-dev
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: dns_name = terratest-experiment-dev-2000011916.us-east-2.elb.amazonaws.com
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: ecr_repository_url = 012345678901.dkr.ecr.us-east-2.amazonaws.com/terratest-experiment-dev
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: subnet_cidrs = [
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66:   "172.31.32.0/20",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66:   "172.31.0.0/20",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66:   "172.31.16.0/20",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: ]
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: subnet_ids = [
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66:   "subnet-abcdef",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66:   "subnet-ghjkil",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66:   "subnet-mnopqr",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: ]
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: vpc_cidr = 172.31.0.0/16
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: vpc_id = vpc-7a5ce123
Terratest Outputs

When I first started writing tests that compare outputs, I naively reached for terraform.Output and tried treating things like subnet_cidrs as lists. Output returns a string. Be aware of Terratest's other output methods such as OutputList (seen above) and OutputMap. These will let you use native slices and map methods without cryptic errors about byte slices. :-)
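To make the distinction concrete, a list output is structured data rather than one string. The sketch below shows the kind of decoding you would otherwise write by hand, assuming the raw value is a JSON array like the one terraform output -json subnet_cidrs prints. parseListOutput is a hypothetical helper for illustration; I'm not claiming this is how Terratest implements OutputList.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseListOutput decodes a JSON array of strings. It returns an error
// when the raw value is some other shape, such as a single string.
func parseListOutput(raw string) ([]string, error) {
	var items []string
	if err := json.Unmarshal([]byte(raw), &items); err != nil {
		return nil, fmt.Errorf("output is not a list of strings: %w", err)
	}
	return items, nil
}

func main() {
	raw := `["172.31.32.0/20", "172.31.0.0/20", "172.31.16.0/20"]`
	cidrs, err := parseListOutput(raw)
	if err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println(len(cidrs), cidrs[0]) // prints 3 172.31.32.0/20
}
```

With OutputList you get the slice directly and can skip all of this.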

That's a lot to take in for the basic example... One last thing before we move on. It takes a while, huh? That's why I consider Terratest more of a "functional test" framework vs unit tests that typically only take seconds (or sub-seconds) to run. You can help this a bit by testing with intent (avoid too many tests, don't duplicate coverage, and minimize network calls), but in the end running Terraform and confirming the actions it took takes time.

Using Modules

Technically, you've already used modules... terraform is a module like all the others. Let's extend this a bit by roping in the aws module to do some IaaS-specific probing:

import (
	"github.com/gruntwork-io/terratest/modules/aws"
	// ...
)

// ...

	deploymentSubnets := terraform.OutputList(t, terraformOptions, "deployment_subnets")
	for _, s := range deploymentSubnets {
		assert.True(t, aws.IsPublicSubnet(t, s, "us-east-2"))
	}
Using the AWS Module

Our network module just exposes the default subnets provided by AWS. In your case you likely have public and private subnets. Something like a database cluster is usually on a set of private subnets. Just aws.IsPrivateSubnet, right? Good guess, but this is where browsing the module documentation pays off. As it turns out, there is no IsPrivateSubnet method, but that's easy to work around by inverting our assertion:

	privateSubnets := terraform.OutputList(t, terraformOptions, "private_subnet_ids")
	for _, s := range privateSubnets {
		assert.False(t, aws.IsPublicSubnet(t, s, "us-east-2"))
	}
Testing for Private Subnets

Advanced Topics

I'm sure you've noticed I kept showing tests, but not how I ran them... Since there's a bit of setup and you'll likely be running tests a lot (including in pipelines), I prefer a simple "test harness" to add consistency and reduce typing. One way is using a Makefile:

AWS_PROFILE := personal
REGION := us-east-2
VAULT_CMD := aws-vault exec $(AWS_PROFILE) --
# TF_CMD := $$GOPATH/bin/terraform
TF_CMD := terraform

export TF_DATA_DIR ?= $(CURDIR)/.terraform
export TF_CLI_ARGS_init ?= -get-plugins=true

.PHONY: init tidy test clean

init:
	cd src && go mod init test

tidy:
	cd src && go mod tidy

test:
	$(TF_CMD) fmt --write=false -check -diff -recursive ..
	cd src && $(VAULT_CMD) go test -v -timeout 30m

clean:
	cd src && rm -rf $(TF_DATA_DIR) go.mod go.sum
Saving Keystrokes

Another thing to think about if running a lot of parallel tests or using Terratest in a large organization is how to safely orchestrate tests at scale. Since tests are running Terraform and creating real resources, it's possible to have collisions if teams are testing in the same account. Even when doing everything right, you could leave artifacts behind or get something in a bad state. Agree on an approach to reduce stress... One way is dedicated test accounts. It's good to test new code in disposable environments. While you may ultimately run your tests in production to validate deployments, by then you will have confidence the code works as expected.
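Another common way to reduce collisions is to give every test run its own resource names. Terratest has a random module aimed at this kind of thing; the sketch below sticks to the standard library so it stands alone. uniqueName is a hypothetical helper, and it assumes your Terraform code accepts a name (or name prefix) variable that you would pass in through the test's options.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// uniqueName appends a random 6-character hex suffix to prefix so that
// parallel runs in the same account are unlikely to pick the same name.
func uniqueName(prefix string) (string, error) {
	buf := make([]byte, 3) // 3 random bytes encode to 6 hex characters
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return prefix + "-" + hex.EncodeToString(buf), nil
}

func main() {
	name, err := uniqueName("terratest-experiment")
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // the suffix differs on each run
}
```

Six hex characters keeps names short while making accidental clashes rare. It does not replace cleanup, so the deferred destroy still matters.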

Conclusion

If your project is worth creating, it's probably worth testing... While wasteful tests are to be avoided, testing with intent is essential to ensure quality. Luckily, it's easy to get up and running with Terratest. You don't need to reinvent the wheel: you can use a veritable Swiss Army Knife to cover your infrastructure code and leverage the flexibility of modules to creatively extend validation beyond Terraform itself.

Best of all, Terratest is open source. Whether you want to help the community by submitting PRs or read the code to understand how it works and gain confidence in your tooling, the code is there for you to browse and extend.
