Version: 5.11.x

Building Custom Machine Images

Aspect provides some simple Machine Images. However, these might not be suitable for you:

  • You need tight control over security patches and versions of the operating system and packages.
  • We cannot guarantee our images have fixes for vulnerabilities.
  • If the build and tests are not 100% hermetic, they might require some system-level packages to be installed.

This page documents how to build a custom machine image, using Packer.

Preparation

First, you'll need the Packer program. You can download it from https://developer.hashicorp.com/packer/downloads.

If you'd like to have Bazel manage the Packer binary, chat with your Aspect account rep. That's how we do it in our own private monorepo.

Packer needs to make API calls to the cloud provider (AWS/GCP), so make sure you're authenticated as a role that has access to create new EC2/Compute instances.

Choose a base image

Navigate to EC2 > AMIs. Make sure you're in the region where you plan to deploy.

We recommend you start from an Amazon-supplied image, for example search for amzn2-ami. Consult Getting Started with AWS for background on using Packer to make AMIs.
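Rather than hard-coding an AMI ID, Packer's amazon-ebs source can look the base image up by name. A minimal sketch, assuming an Amazon Linux 2 base image (the name pattern and source label are illustrative):

```hcl
source "amazon-ebs" "example" {
  # Select the most recent Amazon-owned AMI matching the name pattern
  source_ami_filter {
    filters = {
      name                = "amzn2-ami-kernel-5.10-hvm-*-x86_64-gp2"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    owners      = ["amazon"]
    most_recent = true
  }
}
```

Using a filter keeps the template working as Amazon publishes patched revisions of the base image, at the cost of less reproducible builds than pinning a single AMI ID.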

Create a Packer script

Packer scripts are written in the HashiCorp Configuration Language (HCL), just like Terraform. We commonly use the .pkr.hcl file extension and place the file next to main.tf or workflows.tf.

First, install plugins.
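For the Amazon plugin, the declaration looks like the following (the version constraint is illustrative):

```hcl
packer {
  required_plugins {
    amazon = {
      version = ">= 1.2.0"
      source  = "github.com/hashicorp/amazon"
    }
  }
}
```

Run `packer init .` once to download the declared plugins before building.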

You can create a locals block to hold values in a way that's easier to read, for example:

```hcl
locals {
  # This is a public Amazon Linux 2 image in us-east-2.
  # We use this AMI because it already contains everything needed to interact with AWS.
  # Name: amzn2-ami-kernel-5.10-hvm-2.0.20220719.0-x86_64-gp2
  source_ami = "ami-051dfed8f67f095f5"
  region     = "us-east-2"
  platform   = "linux/amd64"
}
```

Make a source and build block following the Packer documentation, or look at our full example below.
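A minimal sketch of the two blocks, reusing the locals above (the AMI name, source label, and instance type are illustrative):

```hcl
source "amazon-ebs" "runner" {
  ami_name      = "workflows-runner-{{timestamp}}"
  instance_type = "t3.medium"
  region        = local.region
  source_ami    = local.source_ami
  ssh_username  = "ec2-user"
}

build {
  sources = ["source.amazon-ebs.runner"]
}
```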

To add system dependencies, add a provisioner to the build block of your script.

Minimal dependencies

To run Aspect Workflows, a machine image needs these programs installed:

| Dependency | Purpose |
| --- | --- |
| fuse | required for the Aspect Workflows high-performance remote cache configuration |
| git | fetching source code to be tested |
| mdadm | required when mounting NVMe drives with RAID 0 |
| rsync | used during bootstrap |
| rsyslog | used for system logging |
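On an Amazon Linux 2 base image, these could be installed with a shell provisioner along these lines (a sketch; package names may differ on other distributions):

```hcl
provisioner "shell" {
  inline = [
    # Install the minimal packages Aspect Workflows expects
    "sudo yum -y install fuse git mdadm rsync rsyslog",
  ]
}
```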

The following programs are required on AWS AMIs:

| Dependency | Purpose |
| --- | --- |
| amazon-cloudwatch-agent | used for gathering logs |
| amazon-ssm-agent | required for AWS SSM support |

Note: amazon-ssm-agent comes pre-installed on Amazon Linux 2 and Amazon Linux 2023 base AMIs.
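On Amazon Linux 2, the CloudWatch agent is available as a package, so a provisioner can install it (a sketch; other distributions require downloading the agent from AWS):

```hcl
provisioner "shell" {
  inline = [
    # Install the CloudWatch agent for log collection
    "sudo yum -y install amazon-cloudwatch-agent",
  ]
}
```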

The following programs are required on GCP machine images:

| Dependency | Purpose |
| --- | --- |
| google-osconfig-agent | Google operational monitoring tools used to collect and alarm on critical telemetry |

The following programs are required when running Aspect Workflows on GitHub Actions:

| Dependency | Purpose |
| --- | --- |
| libicu | needed by the GitHub Actions agent; see https://github.com/actions/runner/issues/2511 |

The following dependencies are recommended but may not be needed for all use cases:

| Dependency | Purpose |
| --- | --- |
| zip | required by Bazel if any tests create zips of undeclared test outputs |
| patch | may be used by some rulesets and package managers during dependency fetching |

Add dependencies for your build

For example, many builds require that docker is installed and running.
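On Amazon Linux 2, for example, Docker can be installed and enabled so it starts on boot (a sketch; other distributions use different package sources):

```hcl
provisioner "shell" {
  inline = [
    # Install Docker from the Amazon Linux Extras repository
    "sudo amazon-linux-extras install -y docker",
    # Start the Docker daemon on every boot
    "sudo systemctl enable docker",
  ]
}
```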

You can install packages from apt-get or yum (depending on your Linux distribution):

```hcl
build {
  # ...

  provisioner "shell" {
    inline = [
      # Install additional dependencies
      "sudo apt-get update",
      "sudo apt-get --assume-yes install clang-13 libgdal-dev openjdk-17-jdk-headless libtinfo5",
    ]
  }
}
```

Full examples

See our https://github.com/aspect-build/workflows-images repository for full Packer files, along with a Bazel setup that lets you easily run a hermetic Packer binary.

Testing a new image

After an initial trial, once your Aspect Workflows deployment is serving real traffic, changes to the machine image risk breaking CI and slowing down developers.

We recommend creating a separate "canary" runner group. In your Workflows terraform, add:

  • another data "aws_ami" or data "google_compute_image" block to select the "canary" image
  • another entry in the resource_types block setting the image_id to the "canary" image id
  • another entry in [CI platform]_runner_groups that selects the new resource_type
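The first bullet is standard Terraform. A sketch of the data block, assuming the canary AMI is built in your own account with a "canary" name prefix (the resource_types and runner_groups entries belong in your existing Workflows module inputs, whose exact schema comes from the module documentation):

```hcl
data "aws_ami" "canary" {
  # Look up the most recent canary image built in this account
  owners      = ["self"]
  most_recent = true

  filter {
    name   = "name"
    values = ["workflows-runner-canary-*"]
  }
}
```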

After a terraform apply, the new infrastructure should be running.

Next, make a "canary" version of the Aspect Workflows configuration, at a path such as .aspect/workflows/config-canary.yaml. Select the canary queue for all jobs, for example:

```yaml
tasks:
  - test:
      queue: canary
```

Finally, you can modify your CI configuration file to use a conditional to select a subset of traffic to run on the "canary" runner group.

For example, in GitHub Actions you can use the queue input of the reusable workflow with an expression that checks the name of the branch where a PR originates. We'll use this to set the queue property, which applies to the setup job, as well as the aspect-config property, pointing to the config-canary.yaml file created above.

```yaml
queue: ${{ contains(github.head_ref, '__canary__') && 'canary' || 'default' }}
aspect-config: .aspect/workflows/config${{ contains(github.head_ref, '__canary__') && '-canary' || '' }}.yaml
```

Now any PR from a branch containing __canary__ in the name will run on the new machine image.

You may want to ensure that builds on this branch don't get any cache hits, so you're sure the new machine image is able to build and test everything in a non-incremental build. You can edit the .bazelrc file to include cache-busting environment variables, both for actions and for repository rules. For example:

```
# cache-bust for testing canary, values are arbitrary
build --repo_env=CACHE_BUST=001 --action_env=CACHE_BUST=001
```