Building Custom Machine Images
Aspect provides some simple Machine Images. However, they may not be suitable for you:
- You need tight control over security patches and the versions of the operating system and packages.
- We cannot guarantee that our images contain fixes for known vulnerabilities.
- If your build and tests are not 100% hermetic, they may require system-level packages that are not installed on our images.
This page documents how to build a custom machine image using Packer.
Preparation
First, you'll need the Packer program. You can download it from https://developer.hashicorp.com/packer/downloads.
If you'd like to have Bazel manage the Packer binary, chat with your Aspect account rep. That's how we do it in our own private monorepo.
Packer will need to make API calls to the cloud provider (AWS/GCP), so make sure you're authenticated as a role that has access to create new EC2/Compute instances.
Choose a base image
- AWS: Navigate to EC2 > AMIs and make sure you're in the region where you plan to deploy. We recommend you start from an Amazon-supplied image, for example search for `amzn2-ami`. Consult Getting Started with AWS for background on using Packer to make AMIs.
- GCP: Navigate to Images and search for a suitable base image, for example `ubuntu` or `debian`. Consult Google Compute Builder for background on using Packer to make Images.
Create Packer script
Packer scripts are written in the HashiCorp Configuration Language (HCL), just like Terraform.
We commonly use a `.pkr.hcl` file extension, in a file next to the `main.tf` or `workflows.tf`.
First, install plugins.
- For GCP, use https://developer.hashicorp.com/packer/plugins/builders/googlecompute
- For AWS, use https://developer.hashicorp.com/packer/plugins/builders/amazon/ebs
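For example, a `packer` block with `required_plugins` lets `packer init` install the builder plugin; the version constraints below are placeholders you may want to pin differently:

```hcl
# Declare the builder plugin so `packer init` can install it.
packer {
  required_plugins {
    # AWS builder plugin
    amazon = {
      version = ">= 1.2.0"
      source  = "github.com/hashicorp/amazon"
    }
    # Or, for GCP:
    # googlecompute = {
    #   version = ">= 1.1.0"
    #   source  = "github.com/hashicorp/googlecompute"
    # }
  }
}
```

Running `packer init` in the directory containing the script downloads the declared plugins.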
You can create a `locals` block to hold values in a way that's easier to read, for example:
```hcl
locals {
  # This is a public Amazon Linux 2 image in us-east-2
  # We use this AMI because it already contains everything needed to interact with AWS
  # Name: amzn2-ami-kernel-5.10-hvm-2.0.20220719.0-x86_64-gp2
  source_ami = "ami-051dfed8f67f095f5"
  region     = "us-east-2"
  platform   = "linux/amd64"
}
```
Make a `source` and `build` block following the Packer documentation, or look at our full example below.
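As a rough sketch that reuses the `locals` above (the resource label, AMI name prefix, instance type, and SSH username here are assumptions to adjust), an AWS pair of blocks might look like:

```hcl
# Sketch of an amazon-ebs source built on the locals defined above.
source "amazon-ebs" "runner" {
  ami_name      = "my-workflows-runner-${formatdate("YYYYMMDDhhmmss", timestamp())}"
  instance_type = "t3.medium"
  region        = local.region
  source_ami    = local.source_ami
  ssh_username  = "ec2-user"
}

build {
  sources = ["source.amazon-ebs.runner"]

  # provisioner blocks for extra packages go here (see the next section)
}
```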
To add system dependencies, add a `provisioner` to the `build` block of your script.
Minimal dependencies
To run Aspect Workflows, a machine image needs these programs installed:
Dependency | Purpose |
---|---|
fuse | required for the Aspect Workflows high-performance remote cache configuration |
git | fetching source code to be tested |
mdadm | required when mounting NVMe drives with raid 0 |
rsync | used during bootstrap |
rsyslog | used for system logging |
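For example, on a yum-based base image such as Amazon Linux 2, a shell provisioner inside the `build` block along these lines could install them (the package names are assumed to match the distribution's repositories):

```hcl
# Sketch only: install the minimal Workflows dependencies on a yum-based image.
provisioner "shell" {
  inline = [
    "sudo yum -y install fuse git mdadm rsync rsyslog"
  ]
}
```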
The following programs are required on AWS AMIs:
Dependency | Purpose |
---|---|
amazon-cloudwatch-agent | used for gathering logs |
amazon-ssm-agent | required for AWS SSM support |
`amazon-ssm-agent` comes pre-installed on Amazon Linux 2 and Amazon Linux 2023 base AMIs.
The following programs are required on GCP machine images:
Dependency | Purpose |
---|---|
google-osconfig-agent | Google operational monitoring tools used to collect and alarm on critical telemetry |
The following programs are required when running Aspect Workflows on GitHub Actions:
Dependency | Purpose |
---|---|
libicu | needed by the GitHub Actions agent; see https://github.com/actions/runner/issues/2511 |
Recommended dependencies
The following dependencies are recommended but may not be needed for all use cases:
Dependency | Purpose |
---|---|
zip | required by Bazel if any tests create zips of undeclared test outputs |
patch | may be used by some rulesets and package managers during dependency fetching |
Add dependencies for your build
For example, many builds require that `docker` is installed and running.
You can install packages from `apt-get` or `yum` (depending on your Linux distribution):
```hcl
build {
  ...

  provisioner "shell" {
    inline = [
      # Install additional dependencies
      "sudo apt-get update",
      "sudo apt-get --assume-yes install clang-13 libgdal-dev openjdk-17-jdk-headless libtinfo5"
    ]
  }
}
```
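If your builds also need `docker` itself, a sketch for an Amazon Linux 2 base image might look like the following; the `amazon-linux-extras` step is specific to that distribution, so substitute the appropriate package source elsewhere:

```hcl
provisioner "shell" {
  inline = [
    # Install Docker and start it automatically on boot (Amazon Linux 2 only)
    "sudo amazon-linux-extras install -y docker",
    "sudo systemctl enable docker"
  ]
}
```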
Full examples
See our https://github.com/aspect-build/workflows-images repository for full Packer files, along with a Bazel setup allowing you to easily run a hermetic Packer binary.
Testing a new image
Once your trial is over and your Aspect Workflows deployment is handling real traffic, changes to the machine image could break CI and slow down developers.
We recommend creating a separate "canary" runner group. In your Workflows Terraform, add:
- another `data "aws_ami"` or `data "google_compute_image"` block to select the "canary" image (a sketch follows below)
- another entry in the `resource_types` block setting the `image_id` to the "canary" image id
- another entry in `[CI platform]_runner_groups` that selects the new `resource_type`
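For illustration, the extra AMI lookup might resemble this sketch; the name filter is an assumption based on however you name the images produced by your new Packer script:

```hcl
# Hypothetical lookup for the "canary" AMI built by the new Packer script
data "aws_ami" "canary" {
  owners      = ["self"]
  most_recent = true

  filter {
    name   = "name"
    values = ["my-workflows-image-canary-*"]
  }
}
```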
After an `apply`, the new infra should be running.
Next, make a "canary" version of the Aspect Workflows configuration, at a path such as `.aspect/workflows/config-canary.yaml`. Select the `canary` queue for all jobs, for example:
```yaml
tasks:
  - test:
      queue: canary
```
Finally, you can modify your CI configuration file to use a conditional to select a subset of traffic to run on the "canary" runner group.
For example, in GitHub Actions you can use the `queue` input of the reusable workflow with an expression that checks the name of the branch where a PR originates.
We'll use this to set the `queue` property, which applies to the `setup` job, as well as the `aspect-config` property, pointing to the `config-canary.yaml` file created above.
```yaml
queue: ${{ contains(github.head_ref, '__canary__') && 'canary' || 'default' }}
aspect-config: .aspect/workflows/config${{ contains(github.head_ref, '__canary__') && '-canary' || '' }}.yaml
```
Now any PR from a branch containing `__canary__` in the name will run on the new machine image.
You may want to ensure that builds on this branch don't get any cache hits, so you're sure the new machine image is able to build and test everything in a non-incremental build.
You can edit the `.bazelrc` file to include cache-busting environment variables, both for actions and for repository rules. For example:
```
# cache-bust for testing canary, values are arbitrary
build --repo_env=CACHE_BUST=001 --action_env=CACHE_BUST=001
```