🏗️ Infrastructure as Code with Pulumi

At Skutopia we provision and manage our cloud infrastructure (GCP) using Pulumi.

In the past I have provisioned infrastructure with tools such as CloudFormation (both when it had to be JSON 😭 and in the more modern YAML format), Terraform, and now Pulumi. Pulumi has impressed me with its ease of use and developer-friendly layout.

A Tool For Engineers

Most tooling around Infrastructure as Code puts little emphasis on the code aspect. While we now have the AWS CDK and CDK for Terraform, the vast majority of infrastructure is still written in YAML or HCL. That has proven to be a steep learning curve for product-focussed engineers, and often results in a separate "DevOps" team being created to manage it. I put DevOps in quotes because having a team labelled DevOps goes completely against the DevOps philosophy.

How Does Pulumi Solve This?

We have found that Pulumi bridges the gap between infrastructure and product engineers. Our engineers install the Pulumi npm package along with the provider package, and off we go, writing infrastructure in TypeScript just like our production code. This brings two core benefits.

The first is familiarity. Our infrastructure repo is set up just like any other repo. All of our engineers know how to install an npm package and call the functions within it, so they now see infrastructure as just another package. This keeps things simple.

The second core benefit is removing the context switch. As someone who has written code and infra at the same time, I can say there is definitely a mind shift when moving between them. Whether it is putting curly braces { } where they are not needed or not indenting to the required format, it can become a frustrating experience. The mistake looks correct to you because that is what you have been writing for the preceding few hours, which makes mistakes slow to resolve. Keeping our infrastructure in the same language as our code makes the shift seamless.
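To make the "just another package" point concrete, here is a small dependency-free sketch. The Bucket class below is a hypothetical stand-in for a resource class that a provider package would supply; it is not the Pulumi API, and the region and versioning rule are assumptions made up for illustration. The only point is that naming conventions, helper functions, and loops look the same as they do in application code.

```typescript
// Hypothetical stand-in for a provider resource class (NOT the real Pulumi API).
interface BucketArgs {
  location: string;
  versioning: boolean;
}

class Bucket {
  readonly bucketName: string;
  readonly args: BucketArgs;

  constructor(bucketName: string, args: BucketArgs) {
    this.bucketName = bucketName;
    this.args = args;
  }
}

type Environment = "dev" | "stg" | "prd";

// An ordinary function: one place to encode a naming convention and defaults.
function createBucket(purpose: string, env: Environment): Bucket {
  return new Bucket(`${purpose}-${env}`, {
    location: "australia-southeast1", // assumed region, for illustration only
    versioning: env === "prd", // assumed rule: only production keeps old versions
  });
}

// An ordinary loop: declare one bucket per purpose for the current environment.
const purposes = ["orders", "invoices"];
const buckets = purposes.map((purpose) => createBucket(purpose, "dev"));
```

A misspelled property or a wrong type here is caught by the TypeScript compiler and the editor, the same way it would be in production code.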

Understanding Pulumi

Pulumi Projects

A Pulumi project is the parent of any Pulumi setup. Running pulumi new creates the project-level Pulumi.yaml, which contains key information about the project, as seen below.

name: <the project name>
description: <a short description of the project>
runtime: <in our case - nodejs>

Pulumi Stacks

A stack in Pulumi is an independent, isolated, and separately configurable instance of a Pulumi program. Stacks live within a Pulumi project.

We use stacks to differentiate our phases of development. Within a repo we will generally have at least two stacks and sometimes three. These are our environments, and each is represented by its own config file:

Pulumi.dev.yaml
Pulumi.stg.yaml
Pulumi.prd.yaml

Each of the YAML files holds environment-specific variables and encrypted secrets. Each stack has its own encryption key within Pulumi, meaning our secrets are protected between stacks as well as from the outside world.
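The relationship between a stack and its config file can be sketched in a few lines. This is a plain TypeScript model rather than Pulumi code: the stack names come from the files listed above, while StackSettings, the project names, and the instance counts are hypothetical values standing in for whatever each YAML file actually holds.

```typescript
// Stack names taken from the config files listed above.
type StackName = "dev" | "stg" | "prd";

// Each stack's settings live in a file named Pulumi.<stack>.yaml.
function stackConfigFile(stack: StackName): string {
  return `Pulumi.${stack}.yaml`;
}

// Hypothetical per-stack values (illustration only, not our real settings).
interface StackSettings {
  gcpProject: string;
  minInstances: number;
}

const settingsByStack: Record<StackName, StackSettings> = {
  dev: { gcpProject: "example-dev", minInstances: 0 },
  stg: { gcpProject: "example-stg", minInstances: 1 },
  prd: { gcpProject: "example-prd", minInstances: 2 },
};

// The same program runs against every stack; only the selected settings differ.
function describeDeployment(stack: StackName): string {
  const settings = settingsByStack[stack];
  return `${stackConfigFile(stack)} -> ${settings.gcpProject} (min instances: ${settings.minInstances})`;
}
```

Calling describeDeployment("stg") picks up the staging values without any change to the program itself, which is the property that keeps our environments consistent with each other.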

Pulumi Teams

The concept of teams in Pulumi allows us to put role-based access control in place at the stack level. We use the GitHub identity provider, which allows us to import our teams directly from GitHub. The benefit is that being granted access to a repo via a GitHub team also grants you access to the matching Pulumi team.

We see it as absolutely critical that our staging and production environments are only deployed via our CI pipelines. Therefore our engineers in the teams are granted read-only permissions on those stacks. This doesn't prohibit development in any way. Engineers can still run pulumi preview locally and create or modify secrets. They simply can't deploy and make changes to the Pulumi state.

For our dev or playground environments we allow a little more freedom, depending on the stack. Engineers are granted write access, as it is important that they have space to learn and understand the impact of their changes.
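The access rules described above can be summarised as a small lookup table. This is only a model of our setup, not how Pulumi stores or evaluates permissions: the actor names "engineers" and "ci-pipeline" are hypothetical labels, and the two access levels are the read and write permissions mentioned in the text.

```typescript
type StackName = "dev" | "stg" | "prd";
type StackAccess = "read" | "write";
type Actor = "engineers" | "ci-pipeline"; // hypothetical labels for illustration

// Who gets which access level on which stack, following the rules in the text.
const accessByStack: Record<StackName, Record<Actor, StackAccess>> = {
  dev: { engineers: "write", "ci-pipeline": "write" },
  stg: { engineers: "read", "ci-pipeline": "write" },
  prd: { engineers: "read", "ci-pipeline": "write" },
};

// Deploying changes the stack state, so it requires write access.
function canDeploy(actor: Actor, stack: StackName): boolean {
  return accessByStack[stack][actor] === "write";
}
```

With this table, canDeploy("engineers", "prd") is false while canDeploy("engineers", "dev") is true, which matches the split between protected environments and the space we leave for learning.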

Pulumi Policies

Pulumi allows us to introduce policies as code. This is a feature we are only just starting to get across, but we can see how powerful it can be and are eager to introduce it. Pulumi Policies give us the ability to implement business and security rules as functions that are executed across our projects and stacks.

This means that we can choose to meet certain compliance levels and proactively validate that the stacks meet them when deployed. Traditionally, compliance is checked after the fact in a slow feedback loop: we deploy with one tool, scan with another, and then create remediation tickets. The common result is that remediation never happens, leaving security vulnerabilities in place.

By executing the policies at deployment time we have a fast feedback loop and can apply the remediation before our infrastructure is provisioned.
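The idea of a rule expressed as a function can be sketched without any dependencies. The code below is a simplified model, not the Pulumi policy SDK: ResourceDeclaration, PolicyRule, the "storage-bucket" type string, and the publicRead property are all hypothetical names chosen for illustration.

```typescript
// Simplified, dependency-free model of a policy check (NOT the Pulumi policy SDK).
interface ResourceDeclaration {
  type: string;
  name: string;
  props: Record<string, unknown>;
}

interface PolicyRule {
  name: string;
  description: string;
  // Returns a violation message, or null when the resource complies.
  validate(resource: ResourceDeclaration): string | null;
}

// Hypothetical security rule: storage buckets must not be publicly readable.
const noPublicBuckets: PolicyRule = {
  name: "no-public-buckets",
  description: "Storage buckets must not be publicly readable.",
  validate: (resource) =>
    resource.type === "storage-bucket" && resource.props["publicRead"] === true
      ? `${resource.name} is publicly readable`
      : null,
};

// Run every rule against every planned resource before anything is provisioned.
function checkPolicies(resources: ResourceDeclaration[], rules: PolicyRule[]): string[] {
  const found: string[] = [];
  for (const resource of resources) {
    for (const rule of rules) {
      const message = rule.validate(resource);
      if (message !== null) {
        found.push(`[${rule.name}] ${message}`);
      }
    }
  }
  return found;
}

// Made-up planned resources: one compliant, one not.
const planned: ResourceDeclaration[] = [
  { type: "storage-bucket", name: "invoices", props: { publicRead: false } },
  { type: "storage-bucket", name: "exports", props: { publicRead: true } },
];

const violations = checkPolicies(planned, [noPublicBuckets]);
```

Here violations contains a single entry for the exports bucket. A deployment step that refuses to continue while that list is non-empty is what turns the after-the-fact scan into a check at deployment time.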

Pulumi State

Pulumi state is similar to the state of other IaC tools. One of the nicer things about it is the way we can visualise it in the Pulumi console. This is something I have had trouble with in the past with Terraform, where the state is a giant blob of JSON that makes it difficult to see what the current state actually is.

In Pulumi we can easily drill into each resource and see its individual state along with the outputs that are available to be consumed. We can quickly audit whether a resource is protected, as a simple padlock is displayed next to its name. There is no need to find the resource in GCP and dig through the settings.
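The same protection audit can be expressed over state data. The sketch below assumes a simplified shape for one resource entry (a URN, a type, an optional protect flag, and outputs); treat the field names as an assumption rather than a definitive description of Pulumi's state format, and the sample entries as made up.

```typescript
// Assumed, simplified shape of one resource entry in a stack's state.
interface ResourceStateEntry {
  urn: string;
  type: string;
  protect?: boolean;
  outputs?: Record<string, unknown>;
}

// List the URNs of resources that are not marked as protected.
function unprotectedResources(entries: ResourceStateEntry[]): string[] {
  return entries.filter((entry) => entry.protect !== true).map((entry) => entry.urn);
}

// Made-up sample entries for illustration.
const sampleState: ResourceStateEntry[] = [
  {
    urn: "urn:pulumi:prd::orders::gcp:storage/bucket:Bucket::invoices",
    type: "gcp:storage/bucket:Bucket",
    protect: true,
    outputs: { url: "gs://invoices" },
  },
  {
    urn: "urn:pulumi:prd::orders::gcp:storage/bucket:Bucket::exports",
    type: "gcp:storage/bucket:Bucket",
  },
];

const needsAttention = unprotectedResources(sampleState);
```

In this sample needsAttention holds only the URN of the exports bucket, the one entry that would show no padlock next to its name.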

Wrapping Up

Using Pulumi as our Infrastructure as Code tooling has allowed our product engineers to quickly and easily expand their skill set into the infrastructure realm. This is because our production code and infrastructure are written in the same language. We see fewer mistakes and faster cycle times. Overall, the Pulumi structure is clean and easy to follow.

SKUTOPIA Technology