pnpm and rules_js
rules_js models npm package dependency handling on pnpm. Our design goal is to closely mimic pnpm's behavior.
Our story begins when some non-Bazel-specific tool (typically pnpm) performs dependency resolution and solves version constraints.
It also determines how the node_modules tree will be structured for runtime.
This information is encoded into a lockfile which is checked into the source repository.
The pnpm lockfile format includes all the information needed to define npm_import rules for each package, allowing Bazel's downloader to do the fetches individually.
This info includes the integrity hash, as calculated by the package manager, so that Bazel can guarantee supply-chain security.
Bazel will only fetch the packages which are required for the requested targets to be analyzed.
Thus it is performant to convert a very large pnpm-lock.yaml file without concern for users needing to fetch many unnecessary packages.
We have benchmarked this code on a lockfile with 800+ importers and ~15,000 npm packages: it runs in 3 seconds when Bazel determines that an input changed.
While the npm_import rule can be used to bring individual packages into Bazel, most users will want to import their entire lockfile.
The npm_translate_lock rule does this, and its operation is described below.
You may wish to read the generated API documentation as well.
Rules overview
At a high level, the primary rules and targets used by developers to fetch and link npm package dependencies are:
- npm_translate_lock(): generates targets representing packages from a pnpm lockfile.
- npm_link_all_packages(): defines a node_modules tree and the associated node_modules/{package} targets. This rule is required in the BUILD file of each package in the pnpm workspace that has npm packages linked into a node_modules folder, as well as in the BUILD file of the package that corresponds to the root of the pnpm workspace, where the pnpm lock file resides.
- :node_modules/{package}: targets generated by npm_link_all_packages() representing each package dependency from a package.json within the pnpm workspace.
For example:
pnpm-lock.yaml
WORKSPACE.bazel
    > npm_translate_lock()
BUILD.bazel
    > npm_link_all_packages()
├── A/
    ├── BUILD.bazel
        > npm_link_all_packages()
├── B/
    ├── BUILD.bazel
        > npm_link_all_packages()
Where the lockfile was generated from a pnpm workspace with two projects, A and B:
package.json
pnpm-lock.yaml
pnpm-workspace.yaml
├── A/
    ├── package.json
├── B/
    ├── package.json
Bazel targets such as js_library() rules can now depend on npm packages using the :node_modules/{package} targets generated from each npm_link_all_packages().
The :node_modules/{package} targets accessible to a package align with how Node.js resolves npm dependencies: node_modules trees declared in the BUILD file of the current directory, or of any directory above it, can be depended on for resolution at runtime.
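As a minimal sketch of how this fits together, the BUILD file for project A might look like the following. The lib target name, the index.js source, and the lodash dependency are hypothetical; the load path @npm//:defs.bzl assumes the npm_translate_lock repository is named "npm".

```starlark
load("@aspect_rules_js//js:defs.bzl", "js_library")
load("@npm//:defs.bzl", "npm_link_all_packages")

# Links the npm dependencies declared in A/package.json into A/node_modules
npm_link_all_packages(name = "node_modules")

js_library(
    name = "lib",  # hypothetical target name
    srcs = ["index.js"],  # hypothetical source file
    deps = [
        # Assumes "lodash" is listed as a dependency in A/package.json
        ":node_modules/lodash",
    ],
)
```

A target in a subdirectory of A could likewise depend on //A:node_modules/lodash, mirroring how Node.js walks up the directory tree.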
Using npm_translate_lock
In WORKSPACE, call the repository rule pointing to your pnpm-lock.yaml file:
load("@aspect_rules_js//npm:repositories.bzl", "npm_translate_lock")

# Uses the pnpm-lock.yaml file to automate creation of npm_import rules
npm_translate_lock(
    # Creates a new repository named "@npm" - you could choose any name you like
    name = "npm",
    pnpm_lock = "//:pnpm-lock.yaml",
    # Recommended attribute that also checks the .bazelignore file
    verify_node_modules_ignored = "//:.bazelignore",
)
You can immediately load from the generated repositories.bzl file in WORKSPACE.
This is similar to the pip_parse rule in rules_python, for example.
It has the advantage of also creating aliases for simpler dependencies that don't require spelling out the version of the packages.
# Following our example above, we named this "npm"
load("@npm//:repositories.bzl", "npm_repositories")
npm_repositories()
Note that you could call npm_translate_lock more than once, if you have more than one pnpm workspace in your Bazel workspace.
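A hedged sketch of what that might look like in WORKSPACE, assuming two pnpm workspaces rooted in hypothetical frontend/ and tools/ directories. The repository names and lockfile labels are illustrative only.

```starlark
load("@aspect_rules_js//npm:repositories.bzl", "npm_translate_lock")

# One repository per pnpm workspace; names and paths here are hypothetical
npm_translate_lock(
    name = "npm_frontend",
    pnpm_lock = "//frontend:pnpm-lock.yaml",
)

npm_translate_lock(
    name = "npm_tools",
    pnpm_lock = "//tools:pnpm-lock.yaml",
)

# Alias the loaded symbols so the two npm_repositories macros don't collide
load("@npm_frontend//:repositories.bzl", frontend_repositories = "npm_repositories")

frontend_repositories()

load("@npm_tools//:repositories.bzl", tools_repositories = "npm_repositories")

tools_repositories()
```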
If you really don't want to rely on this being generated at runtime, we have experimental support to check in the result instead. See checked-in repositories.bzl below.
Hoisting
The node_modules tree laid out by rules_js should be bug-for-bug compatible with the node_modules tree that pnpm lays out, when hoisting is disabled.
To make the behavior outside of Bazel match, we recommend adding hoist=false to your .npmrc:
echo "hoist=false" >> .npmrc
This will prevent pnpm from creating a hidden node_modules/.pnpm/node_modules folder with hoisted dependencies, which allows packages to depend on "phantom" undeclared dependencies.
With hoisting disabled, most import/require failures (in type-checking or at runtime) in 3rd party npm packages when using rules_js will be reproducible with pnpm outside of Bazel.
rules_js does not and will not support pnpm "phantom" hoisting, which allows packages to depend on undeclared dependencies.
All dependencies between packages must be declared under rules_js in order to support lazy fetching and lazy linking of npm dependencies.
See Troubleshooting for suggestions on how to fix problems caused by hoisting.
Creating and updating the pnpm-lock.yaml file
Manual (typical)
If your developers are fully converted to using pnpm, then they'll likely perform workflows like adding new dependencies by running the pnpm tool in the source directory outside of Bazel.
This results in updates to the pnpm-lock.yaml file, and then Bazel naturally finds those updates next time it reads the file.
update_pnpm_lock
During a migration, you may have a legacy lockfile from another package manager.
You can use the update_pnpm_lock attribute of npm_translate_lock to have Bazel manage the pnpm-lock.yaml file for you.
You might also choose this mode if you want changes like additions to package.json to be automatically reflected in the lockfile, unlike a typical frontend developer workflow.
Use of update_pnpm_lock requires the data attribute be used as well.
This should include the pnpm-workspace.yaml file as well as all package.json files in the pnpm workspace.
The pnpm lock file update will fail if data is missing any files required to run pnpm install --lockfile-only or pnpm import.
To list all local package.json files that pnpm needs to read, you can run pnpm recursive ls --depth -1 --porcelain.
When the pnpm-lock.yaml file needs updating, npm_translate_lock will automatically:
- run pnpm import if there is a npm_package_lock or yarn_lock attribute specified.
- run pnpm install --lockfile-only otherwise.
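The following is a minimal sketch of this mode for the two-project workspace (A and B) from the earlier example. It assumes a legacy yarn.lock at the repository root; the exact file labels are illustrative.

```starlark
load("@aspect_rules_js//npm:repositories.bzl", "npm_translate_lock")

npm_translate_lock(
    name = "npm",
    pnpm_lock = "//:pnpm-lock.yaml",
    # Let Bazel regenerate pnpm-lock.yaml when its inputs change
    update_pnpm_lock = True,
    # With a legacy lockfile specified, the update runs `pnpm import`;
    # remove this attribute to run `pnpm install --lockfile-only` instead
    yarn_lock = "//:yarn.lock",
    # Every file pnpm needs to read in order to produce the lockfile
    data = [
        "//:package.json",
        "//:pnpm-workspace.yaml",
        "//A:package.json",
        "//B:package.json",
    ],
)
```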
To update the pnpm-lock.yaml file manually, either
- install pnpm and run pnpm install --lockfile-only or pnpm import
- use the Bazel-managed pnpm by running bazel run -- @pnpm//:pnpm --dir $PWD install --lockfile-only or bazel run -- @pnpm//:pnpm --dir $PWD import
If the ASPECT_RULES_JS_FROZEN_PNPM_LOCK environment variable is set and update_pnpm_lock is True, the build will fail if the pnpm lock file needs updating.
It is recommended to set this environment variable on CI when update_pnpm_lock is True.
If the ASPECT_RULES_JS_DISABLE_UPDATE_PNPM_LOCK environment variable is set, update_pnpm_lock is disabled even if set to True. This can be useful for some CI use cases where multiple jobs run Bazel but you only want one of the jobs checking that the pnpm lock file is up-to-date.
npm_translate_lock_<hash>
A .aspect/rules/external_repository_action_cache/npm_translate_lock_<hash> file will be created and used to determine when the pnpm-lock.yaml file should be updated.
This file persists the state of package and lock files that may affect the pnpm-lock.yaml generation and should be checked into source control along with the pnpm-lock.yaml file.
The npm_translate_lock_<hash> file has been a known source of merge conflicts in workspaces with frequent lockfile or package.json changes.
As it is a generated file, manual resolution of merge conflicts is unnecessary: it should only be generated and updated by npm_translate_lock.
To reduce the impact on developer workflows, git can be configured to ignore merge conflicts using .gitattributes and a custom merge driver.
See our blog post for a longer explanation.
First, mark the npm_translate_lock_<hash> file (with <hash> replaced with the hash generated in your workspace) to use a custom merge driver, in this example named ours:
.aspect/rules/external_repository_action_cache/npm_translate_lock_<hash>= merge=ours
Second, developers must define the ours custom merge driver in their git configuration to always accept the local change:
git config --global merge.ours.driver true
Working with packages
Patching via pnpm.patchedDependencies
Patches included in pnpm.patchedDependencies are automatically applied by rules_js.
These patches must be included in the data attribute of npm_translate_lock, for example:
{
    ...
    "pnpm": {
        "patchedDependencies": {
            "fum@0.0.1": "patches/fum@0.0.1.patch"
        }
    }
}
npm_translate_lock(
    ...
    data = [
        "//:patches/fum@0.0.1.patch",
    ],
)
Patching applied by rules_js may slightly deviate from standard pnpm patching behavior. The bazel-lib patch util is used for patching within rules_js instead of the internal pnpm patching mechanism. For example, a bad patch file may be partially applied when using pnpm outside of Bazel but fail when applied by rules_js; see rules_js #1915.
Patching via patches attribute
We recommend patching via pnpm.patchedDependencies as above, but if you are importing a yarn or npm lockfile and do not have this field in your package.json, you can apply additional patches using the patches and patch_args attributes of npm_translate_lock.
These are designed to be similar to the same-named attributes of http_archive.
Paths in patch files must be relative to the root of the package. If the version is left out of the package name, the patch will be applied to every version of the npm package.
patch_args defaults to -p0, but -p1 will usually be needed for patches generated by git.
In case multiple entries in patches match, the lists of patches are additive.
(More specific matches are appended to previous matches.)
However, if multiple entries in patch_args match, then the more specific name matches take precedence.
Patches in patches are applied after any patches included in pnpm.patchedDependencies.
For example,
npm_translate_lock(
    ...
    patches = {
        "@foo/bar": ["//:patches/foo+bar.patch"],
        "fum@0.0.1": ["//:patches/fum@0.0.1.patch"],
    },
    patch_args = {
        "*": ["-p1"],
        "@foo/bar": ["-p0"],
        "fum@0.0.1": ["-p2"],
    },
)
Lifecycles
npm packages have "lifecycle scripts" such as postinstall which are documented here: https://docs.npmjs.com/cli/v9/using-npm/scripts#life-cycle-scripts
We refer to these as "lifecycle hooks".
The lifecycle hooks of a package are determined by the package.json pnpm.onlyBuiltDependencies attribute.
If pnpm.onlyBuiltDependencies is unspecified, npm_translate_lock will fall back to the legacy pnpm lockfile requiresBuild attribute.
The requiresBuild attribute is only available in pnpm before v9; see pnpm #7707 for the reasons why it was removed.
When a package has lifecycle hooks, the lifecycle_* attributes are applied to filter which hooks are run and how they are run.
For example, you can restrict lifecycle hooks across all packages to only run postinstall:
lifecycle_hooks = { "*": ["postinstall"] } in npm_translate_lock.
Because rules_js models the execution of these hooks as build actions, rather than repository rules, the result can be stored in the remote cache and shared between developers. Typically these actions are not run in Bazel's action sandbox because of the overhead of setting up and tearing down the sandboxes.
In addition to sandboxing, Bazel supports other execution_requirements for actions, set through the attribute of that name on https://bazel.build/rules/lib/actions#run.
You can control these using the lifecycle_hooks_execution_requirements attribute of npm_translate_lock.
Some hooks may fail to run under rules_js, and you may not care to run them.
You can use the lifecycle_hooks_exclude attribute of npm_translate_lock to turn them off for a package, which is equivalent to setting the lifecycle_hooks to an empty list for that package.
You can set environment variables for hook build actions using the lifecycle_hooks_envs attribute of npm_translate_lock.
Some hooks may depend on environment variables that are only present when use_default_shell_env is set; this can be enabled for hook build actions using the lifecycle_hooks_use_default_shell_env attribute of npm_translate_lock. Requires bazel-lib >= 2.4.2.
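As a hedged sketch of these two attributes, the fragment below turns off all hooks for @foo/bar and enables the default shell environment for fum at version 0.0.1. The value shapes are assumptions (a list of package names for lifecycle_hooks_exclude, and a dict of package name to the string "true" for lifecycle_hooks_use_default_shell_env); check the generated API documentation for the exact types.

```starlark
load("@aspect_rules_js//npm:repositories.bzl", "npm_translate_lock")

npm_translate_lock(
    name = "npm",
    pnpm_lock = "//:pnpm-lock.yaml",
    # Assumed to be a list of package names;
    # equivalent to lifecycle_hooks = {"@foo/bar": []}
    lifecycle_hooks_exclude = ["@foo/bar"],
    # Assumed to be a dict of package name to "true"/"false";
    # requires bazel-lib >= 2.4.2
    lifecycle_hooks_use_default_shell_env = {
        "fum@0.0.1": "true",
    },
)
```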
In case there are multiple matches, some attributes are additive. (More specific matches are appended to previous matches.) Other attributes have specificity: the most specific match wins and the others are ignored.
attribute | behavior |
---|---|
lifecycle_hooks | specificity |
lifecycle_hooks_envs | additive |
lifecycle_hooks_execution_requirements | specificity |
Here's a complete example of managing lifecycles:
npm_translate_lock(
    ...
    lifecycle_hooks = {
        # These three values are the default if lifecycle_hooks was absent
        # do not sort
        "*": [
            "preinstall",
            "install",
            "postinstall",
        ],
        # This package comes from a git url so prepare has to run to compile some things
        "@kubernetes/client-node": ["prepare"],
        # Disable install and preinstall for this package, maybe they are broken
        "fum@0.0.1": ["postinstall"],
    },
    lifecycle_hooks_envs = {
        # Set some values for all hook actions
        "*": [
            "GLOBAL_KEY1=value1",
            "GLOBAL_KEY2=value2",
        ],
        # ... but override for this package
        "@foo/bar": [
            "GLOBAL_KEY2=",
            "PREBULT_BINARY=http://downloadurl",
        ],
    },
    lifecycle_hooks_execution_requirements = {
        # This is the default if lifecycle_hooks_execution_requirements was absent
        "*": ["no-sandbox"],
        # Omit no-sandbox for this package, maybe it relies on sandboxing to succeed
        "@foo/bar": [],
        # This one is broken in remote execution for whatever reason
        "fum@0.0.1": ["no-sandbox", "no-remote-exec"],
    },
)
In this example:
- Only the prepare lifecycle hook will be run for the @kubernetes/client-node npm package, only the postinstall will be run for fum at version 0.0.1, and the default hooks are run for remaining packages.
- @foo/bar lifecycle hooks will run with Bazel's sandbox enabled, with an effective environment:
  - GLOBAL_KEY1=value1
  - GLOBAL_KEY2=
  - PREBULT_BINARY=http://downloadurl
- fum at version 0.0.1 has remote execution disabled. Like other packages aside from @foo/bar, the action sandbox is disabled for performance.
Checked-in repositories.bzl
This usage is experimental and difficult to get right! Read on with caution.
You can check in the repositories.bzl file to version control, and load that instead.
This makes it easier to ship a ruleset that has its own npm dependencies, as users don't have to install those dependencies.
It also avoids eager-evaluation of npm_translate_lock for builds that don't need it.
This is similar to the update-repos approach from bazel-gazelle.
The tradeoffs are similar to this rules_python thread.
In a BUILD file, use a rule like write_source_files to copy the generated file to the repo and test that it stays updated:
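load("@aspect_bazel_lib//lib:write_source_files.bzl", "write_source_files")

# Copies the generated file into the source tree and tests that it stays updated
write_source_files(
    name = "update_repos",
    files = {
        "repositories.bzl": "@npm//:repositories.bzl",
    },
)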
Then in WORKSPACE, load from that checked-in copy or instruct your users to do so.
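For instance, assuming the write_source_files target above lives in the root BUILD file, so that the copy is written to repositories.bzl at the workspace root, the load in WORKSPACE might look like this sketch:

```starlark
# Load the checked-in copy rather than the one generated inside @npm.
# The label assumes the file was copied to the workspace root.
load("//:repositories.bzl", "npm_repositories")

npm_repositories()
```

Users of a ruleset shipped this way would load the same file through that ruleset's repository name instead of the root label.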