Mostly just a place where I dump interesting things I've learned that I'm pretty sure I'll forget. It's my own personal knowledge base.

shell path caching and the hash -r command

2025-10-20 23:00:00 +0000

About

I ran into a confusing issue recently: I had installed Java via Homebrew and my $PATH was correctly configured, but the java command kept pointing to the wrong version. The solution turned out to be a shell feature I’d never heard of: command path caching. This post explains what happened and how hash -r saved the day.

The Problem

After installing OpenJDK 17 via Homebrew and running brew link openjdk@17, I expected everything to work. My $PATH was configured correctly with /opt/homebrew/bin before /usr/bin:

$ echo $PATH
/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin

But when I checked which java command was being used, it was still pointing to the macOS stub:

$ which java
/usr/bin/java

$ java -version
The operation couldn't be completed. Unable to locate a Java Runtime...

This made no sense! The correct java binary existed at /opt/homebrew/bin/java, and that path was first in my $PATH. Why wasn’t the shell finding it?

The Culprit: Shell Command Caching

It turns out that shells like Bash and Zsh don’t search your entire $PATH every single time you run a command. Instead, they maintain an in-memory hash table that caches the full path to each command the first time you run it.

Here’s how it works:

  1. First time you run java: The shell searches through each directory in $PATH (from left to right), finds /usr/bin/java, and caches that location.
  2. Every subsequent time: The shell uses the cached location - no PATH search needed.
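
You can actually watch the cache being populated using bash’s type builtin, which reports whether a command has been hashed (the java paths below are purely illustrative):

$ type java
java is /usr/bin/java

$ java -version    # actually running it stores the resolved path in the hash table

$ type java
java is hashed (/usr/bin/java)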

This is a performance optimization… searching the filesystem is expensive, so caching the results makes sense. But it causes problems when:

  • You install a new version of a command
  • You change your $PATH order
  • You run brew link to create new symlinks
  • You install global npm/pip packages

In my case, the shell had cached java → /usr/bin/java before I installed Homebrew’s Java. Even though /opt/homebrew/bin/java now existed and should have taken precedence, the shell kept using the stale cached location.

The Solution: hash -r

The hash -r command tells your shell to clear all cached command locations:

$ hash -r

$ which java
/opt/homebrew/bin/java

$ java -version
openjdk version "17.0.16" 2025-07-15
OpenJDK Runtime Environment Homebrew (build 17.0.16+0)
OpenJDK 64-Bit Server VM Homebrew (build 17.0.16+0, mixed mode, sharing)

Perfect! After clearing the cache, the shell re-searched $PATH and found the correct Java installation.

Inspecting the Hash Table

You can see what’s currently cached in your shell:

$ hash
hits    command
   5    /bin/ls
   2    /usr/bin/git
   1    /opt/homebrew/bin/npm

The “hits” column shows how many times you’ve used each command in this session.

To see a specific command’s cached path:

$ hash -v java
hash: java=/opt/homebrew/bin/java

To clear just one command (instead of the entire cache):

$ hash -d java
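
In bash, you can also print just the cached path for a single command with hash -t (it reports an error if the command hasn’t been hashed yet):

$ hash -t java
/opt/homebrew/bin/java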

Shell Differences: Bash vs Zsh

Bash:

  • Uses the hash command
  • Usually auto-clears the cache when $PATH changes
  • hash -r clears all cached commands

Zsh:

  • Provides a rehash builtin in addition to hash
  • More aggressive caching - doesn’t always auto-clear on PATH changes
  • rehash is the “zsh way” to say hash -r

If you’re using Zsh (which is the default on modern macOS), you might encounter this issue more often than Bash users.
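
If you hit this often in Zsh, one option is a small wrapper in your ~/.zshrc that refreshes the command cache after every Homebrew operation. This is just a hypothetical convenience function of my own, not something Homebrew provides:

# ~/.zshrc (sketch): rehash automatically after brew runs
brew() {
  command brew "$@"   # run the real brew binary
  local rc=$?
  rehash              # refresh zsh's command cache
  return $rc
}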

Other Ways to Bypass the Cache

If you don’t want to clear the entire cache, you can force a fresh PATH lookup:

# Use the full path directly
/opt/homebrew/bin/java -version

# Use 'command -v' to bypass the cache
command -v java

You can also check if the cache is causing your issue:

# These should match:
which java
command -v java

# If they're different → your cache is stale!

When to Use hash -r

Run hash -r (or rehash in Zsh) after:

  • Installing software via Homebrew: brew install something && hash -r
  • Changing your $PATH: export PATH="/new/path:$PATH" && hash -r
  • Installing global npm packages: npm install -g some-tool && hash -r
  • Updating dotfiles that modify $PATH: source ~/.zshrc && hash -r

Or just open a new terminal - fresh shells start with an empty cache!

Final Thoughts

The shell’s command caching is a clever performance optimization that works great… until it doesn’t. When you install new software or change your environment, stale cache entries can cause confusing behavior where which shows the wrong path despite your $PATH being correct.

Now you know: when commands aren’t being found where you expect them, try hash -r before diving deeper into troubleshooting. It’s saved me hours of debugging time, and hopefully it’ll save you some too!

docker caching monorepo NodeJS projects

2025-07-28 23:00:00 +0000

About

Building efficient Dockerfiles is critical for ensuring you have clean, optimized and reproducible builds. Unfortunately, sometimes while designing your Dockerfile you can fall into simple traps that wind up with your builds spending tons of time rebuilding layers that don’t actually need to be rebuilt.

This post will talk about NodeJS “monorepos”, and how to properly keep your dependencies (node_modules/**) cached while files in your application are constantly changing.

NodeJS Monorepo Setup

I’ve created a very simple monorepo at https://github.com/diranged/blog-docker-caching-with-node-projects. This repository has a pretty simple structure that you would see in many projects:

package.json
packages
packages/lib-c
packages/lib-c/package.json
packages/lib-b
packages/lib-b/package.json
packages/lib-a
packages/lib-a/package.json

In the top level package.json, I’ve informed Yarn that we have a series of packages in the packages/* path:

# package.json
{
  "name": "my-demo-monorepo",
  "private": true,
  "workspaces": ["packages/*"]
}

Then each of the nested libraries has a really simple package definition that installs a different dependency. For example, here’s packages/lib-c/package.json:

# packages/lib-c/package.json
{
  "name": "lib-c",
  "version": "1.0.0",
  "main": "index.js",
  "dependencies": {
    "moment": "^2.29.4"
  }
}

With this environment, we can see that a yarn install will automatically install all of the dependencies for all of the packages/lib-*/package.json packages (see the yarn.lock for details):

% yarn install
yarn install v1.22.22
[1/4] 🔍  Resolving packages...
[2/4] 🚚  Fetching packages...
[3/4] 🔗  Linking dependencies...
[4/4] 🔨  Building fresh packages...

success Saved lockfile.
✨  Done in 1.16s.
% yarn list
yarn list v1.22.22
...
├─ axios@1.11.0
│  └─ ...
├─ lodash@4.17.21
├─ moment@2.30.1
...
└─ proxy-from-env@1.1.0
✨  Done in 0.05s.

Example of a “bad” Dockerfile

In Dockerfile.bad, I have laid out a structure that at first glance seems like it should work:

################################################################################
# Common base layer
################################################################################
FROM node:20-alpine AS base
WORKDIR /app

################################################################################
# An easy trap to fall into ... thinking that you can do a yarn install in one
# stage and have that cached, and then not re-build later stages.
################################################################################
FROM base AS installer
WORKDIR /app
COPY . /app
RUN yarn install
RUN echo "If this ran, cache was invalidated inside the installer stage"

###############################################################################
# Now you might think that the install is cached ... and that code changes
# won't invalidate the cache.
###############################################################################
FROM base
COPY --from=installer /app /app
RUN echo "If this ran, then the installer cache has invalidated the final stage cache"

Running a first docker build, we can see that all of the layers need to be built:

% BUILDKIT_PROGRESS=plain docker build . --file Dockerfile.bad
#0 building with "orbstack" instance using docker driver

#1 [internal] load build definition from Dockerfile.bad
#1 transferring dockerfile: 1.11kB done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/node:20-alpine
#2 DONE 0.7s

#3 [internal] load .dockerignore
#3 transferring context: 55B done
#3 DONE 0.0s

#4 [internal] load build context
#4 transferring context: 52.36kB done
#4 DONE 0.0s

#5 [base 1/2] FROM docker.io/library/node:20-alpine@sha256:df02558528d3d3d0d621f112e232611aecfee7cbc654f6b375765f72bb262799
#5 resolve docker.io/library/node:20-alpine@sha256:df02558528d3d3d0d621f112e232611aecfee7cbc654f6b375765f72bb262799 0.0s done
...
#5 sha256:daf846a830553a0ff809807b7f2d956dbd9dcb959c875d23b6feb3d3aecdecef 0B / 42.67MB 0.2s
#5 DONE 2.5s

#6 [base 2/2] WORKDIR /app
#6 DONE 0.2s

#7 [installer 1/4] WORKDIR /app
#7 DONE 0.0s

#8 [installer 2/4] COPY . /app
#8 DONE 0.0s

#9 [installer 3/4] RUN yarn install
#9 0.252 yarn install v1.22.22
#9 0.285 [1/4] Resolving packages...
#9 0.304 [2/4] Fetching packages...
#9 1.122 [3/4] Linking dependencies...
#9 1.287 [4/4] Building fresh packages...
#9 1.291 success Saved lockfile.
#9 1.293 Done in 1.04s.
#9 DONE 1.3s

#10 [installer 4/4] RUN echo "If this ran, cache was invalidated inside the installer stage"
#10 0.115 If this ran, cache was invalidated inside the installer stage
#10 DONE 0.1s

#11 [stage-2 1/2] COPY --from=installer /app /app
#11 DONE 0.1s

#12 [stage-2 2/2] RUN echo "If this ran, then the installer cache has invalidated the final stage cache"
#12 0.091 If this ran, then the installer cache has invalidated the final stage cache
#12 DONE 0.1s

#13 exporting to image
#13 exporting layers
#13 exporting layers 0.1s done
#13 writing image sha256:b98364c03053a5b34eb2e882a342c2eb6fee170a7bd8d919d0531c7d1dc0b06d done
#13 DONE 0.1s
%

Now, re-running the build without changing any files, we can see it does seem to be cached:

% BUILDKIT_PROGRESS=plain docker build . --file Dockerfile.bad
#0 building with "orbstack" instance using docker driver

#1 [internal] load build definition from Dockerfile.bad
#1 transferring dockerfile: 1.11kB done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/node:20-alpine
#2 DONE 0.3s

#3 [internal] load .dockerignore
#3 transferring context: 55B done
#3 DONE 0.0s

#4 [base 1/2] FROM docker.io/library/node:20-alpine@sha256:df02558528d3d3d0d621f112e232611aecfee7cbc654f6b375765f72bb262799
#4 DONE 0.0s

#5 [internal] load build context
#5 transferring context: 4.78kB done
#5 DONE 0.0s

#6 [installer 4/4] RUN echo "If this ran, cache was invalidated inside the installer stage"
#6 CACHED

#7 [stage-2 1/2] COPY --from=installer /app /app
#7 CACHED

#8 [base 2/2] WORKDIR /app
#8 CACHED

#9 [installer 3/4] RUN yarn install
#9 CACHED

#10 [installer 1/4] WORKDIR /app
#10 CACHED

#11 [installer 2/4] COPY . /app
#11 CACHED

#12 [stage-2 2/2] RUN echo "If this ran, then the installer cache has invalidated the final stage cache"
#12 CACHED

#13 exporting to image
#13 exporting layers done
#13 writing image sha256:b98364c03053a5b34eb2e882a342c2eb6fee170a7bd8d919d0531c7d1dc0b06d done
#13 DONE 0.0s
%

Now here’s the problem… most application repositories do not change their dependencies nearly as often as the actual application code changes. So when application code (or even some unrelated file like a README) is changed, we want to avoid re-installing all of the dependencies. Let’s see what happens if we touch an unrelated file and re-run the build:

# First, we'll touch a totally unrelated file that just so happens to be in the Docker build context.
% echo $(date) > tmp/trigger

# Now we re-run the build..
% BUILDKIT_PROGRESS=plain docker build . --file Dockerfile.bad
#0 building with "orbstack" instance using docker driver

#1 [internal] load build definition from Dockerfile.bad
#1 transferring dockerfile: 1.11kB done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/node:20-alpine
#2 DONE 0.3s

#3 [internal] load .dockerignore
#3 transferring context: 55B done
#3 DONE 0.0s

#4 [base 1/2] FROM docker.io/library/node:20-alpine@sha256:df02558528d3d3d0d621f112e232611aecfee7cbc654f6b375765f72bb262799
#4 DONE 0.0s

#5 [internal] load build context
#5 transferring context: 4.82kB done
#5 DONE 0.0s

#6 [base 2/2] WORKDIR /app
#6 CACHED

#7 [installer 1/4] WORKDIR /app
#7 CACHED

#8 [installer 2/4] COPY . /app
#8 DONE 0.0s                         <<<<<< The cache has been invalidated

#9 [installer 3/4] RUN yarn install  <<<<<< The `yarn install` is being re-run now
#9 0.290 yarn install v1.22.22
#9 0.328 [1/4] Resolving packages...
#9 0.347 [2/4] Fetching packages...
#9 1.194 [3/4] Linking dependencies...
#9 1.362 [4/4] Building fresh packages...
#9 1.365 success Saved lockfile.
#9 1.367 Done in 1.08s.
#9 DONE 1.4s

#10 [installer 4/4] RUN echo "If this ran, cache was invalidated inside the installer stage"
#10 0.114 If this ran, cache was invalidated inside the installer stage
#10 DONE 0.1s                        <<<<<<  We can see we've fully invalidated the "installer" stage

#6 [base 2/2] WORKDIR /app
#6 CACHED

#11 [stage-2 1/2] COPY --from=installer /app /app
#11 DONE 0.1s

#12 [stage-2 2/2] RUN echo "If this ran, then the installer cache has invalidated the final stage cache"
#12 0.091 If this ran, then the installer cache has invalidated the final stage cache
#12 DONE 0.1s                        <<<<<<  And now in the final application stage, we can see that we re-ran our fake build command

#13 exporting to image
#13 exporting layers 0.1s done
#13 writing image sha256:1da5edf25b9a75945992940ef7d4b1811f4b957fbcf5237ce4f74bcb170dc4ec done
#13 DONE 0.1s
%

What went wrong?

If we look carefully at Dockerfile.bad, we can see that any change to any file in the Docker build context will trigger cache invalidation starting at the COPY . /app command:

FROM base AS installer
WORKDIR /app
COPY . /app          <<<<<< Any file change is going to cause this to be invalidated
RUN yarn install
RUN echo "If this ran, cache was invalidated inside the installer stage"

As soon as the COPY . /app is invalidated, the RUN yarn install is also invalidated and must be re-run, which is what we wanted to avoid.

Making matters even worse, the odds that the RUN yarn install will output exactly the same bytes run-after-run are very low… so when that command is re-run, it in-turn invalidates the next stage:

FROM base
COPY --from=installer /app /app
RUN echo "If this ran, then the installer cache has invalidated the final stage cache"

Looking closely at the output, we can see that not only was the installer stage invalidated, but the final application stage was invalidated as well!

...

#10 [installer 4/4] RUN echo "If this ran, cache was invalidated inside the installer stage"
#10 0.114 If this ran, cache was invalidated inside the installer stage
#10 DONE 0.1s                        <<<<<<  We can see we've fully invalidated the "installer" stage
...
#12 [stage-2 2/2] RUN echo "If this ran, then the installer cache has invalidated the final stage cache"
#12 0.091 If this ran, then the installer cache has invalidated the final stage cache
#12 DONE 0.1s                        <<<<<<  And now in the final application stage, we can see that we re-ran our fake build command

This invalidation of the final build stage could be extremely costly depending on the size of your NodeJS project… we have several projects that take 30-40 minutes to build, and a bug like this could cause these 30-40 minute rebuilds to occur when we’ve actually made no changes to the underlying application (a README update, script update, etc).

A “good” Dockerfile that avoids intermediate stage rebuilds

The fix here is subtle… we need to isolate changes to the package.json and yarn.lock files and only re-run the installation step when those files change. However, because it’s a monorepo, we want to discover these files dynamically, so that as team members add new packages they don’t have to remember to touch the Dockerfile every time. How do we do that?

The simple answer here is that we need more build stages… we need a stage that dynamically finds all of the package.json and yarn.lock files, and then a separate stage that runs the yarn install, and finally a stage that runs our application build.

Let’s now look at Dockerfile.good:

################################################################################
# Start off with a common base image - this is primarily to make sure that all
# of our yarn steps (install, cache, etc.) are run in the same environment.
################################################################################
FROM node:20-alpine AS base
WORKDIR /app

###############################################################################
# In the first stage, we have to dynamically find all of the package.json and
# yarn.lock files in our repository - while excluding anything found in the
# nested node_modules directory.
#
# Hint: Definitely add "node_modules" to your .dockerignore file to avoid even
# copying that path into the build context.
###############################################################################
FROM base AS package_parser
COPY . /app
RUN mkdir /out
RUN find . \
        \( -name "package.json" -o -name "yarn.lock" \) \
        -not -path "*/node_modules/*" \
        -exec cp --parents {} /out \;
RUN echo "Discovered Package Files:" && find /out
RUN echo "If this ran, cache was invalidated inside the installer stage"

###############################################################################
# Now we separate out the yarn install step into its own stage. This is the
# critical piece ... pulling this out into its own stage means that it only
# has its cache invalidated if the "COPY --from=package-parser..." step changes
# its output. Otherwise, the yarn install is cached and not re-run on a build
# where some other unrelated file is changed.
###############################################################################
FROM base AS package_installer
COPY --from=package_parser /out /app
RUN --mount=type=cache,target=/usr/local/share/.cache yarn install

###############################################################################
# Application Stage - Do your application build here...
###############################################################################
FROM base
COPY --from=package_installer /app /app
RUN echo "If this ran, then the installer cache has invalidated the final stage cache"

With this file in place, we can re-run our test case from above: we’ll touch the trigger file again and see which steps rebuild:

# Let's touch our trigger file again
% echo $(date) > tmp/trigger

# Now we run our build
% docker build . --file Dockerfile.good
#0 building with "orbstack" instance using docker driver

#1 [internal] load build definition from Dockerfile.good
#1 transferring dockerfile: 2.21kB done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/node:20-alpine
#2 DONE 0.3s

#3 [internal] load .dockerignore
#3 transferring context: 55B done
#3 DONE 0.0s

#4 [base 1/2] FROM docker.io/library/node:20-alpine@sha256:df02558528d3d3d0d621f112e232611aecfee7cbc654f6b375765f72bb262799
#4 DONE 0.0s

#5 [internal] load build context
#5 transferring context: 4.82kB 0.0s done
#5 DONE 0.0s

#6 [base 2/2] WORKDIR /app
#6 CACHED

#7 [package_parser 1/5] COPY . /app
#7 DONE 0.0s

#8 [package_parser 2/5] RUN mkdir /out
#8 DONE 0.1s

#9 [package_parser 3/5] RUN find .         ( -name "package.json" -o -name "yarn.lock" )         -not -path "*/node_modules/*"         -exec cp --parents {} /out ;
#9 DONE 0.1s

#10 [package_parser 4/5] RUN echo "Discovered Package Files:" && find /out
#10 0.092 Discovered Package Files:
#10 0.093 /out
#10 0.093 /out/package.json
#10 0.093 /out/packages
#10 0.093 /out/packages/lib-a
#10 0.093 /out/packages/lib-a/package.json
#10 0.093 /out/packages/lib-b
#10 0.093 /out/packages/lib-b/package.json
#10 0.093 /out/packages/lib-c
#10 0.093 /out/packages/lib-c/package.json
#10 0.093 /out/yarn.lock
#10 DONE 0.1s

#11 [package_parser 5/5] RUN echo "If this ran, cache was invalidated inside the installer stage"
#11 0.145 If this ran, cache was invalidated inside the installer stage       <<<<<< This is OK - we invalidated our package parsing stage.. but the final output stays the same.
#11 DONE 0.2s

#12 [package_installer 1/2] COPY --from=package_parser /out /app
#12 CACHED                                                                    <<<<<< Now in the installer stage, we are still cached, so there is no yarn install!

#13 [package_installer 2/2] RUN --mount=type=cache,target=/usr/local/share/.cache yarn install
#13 CACHED

#14 [stage-3 1/2] COPY --from=package_installer /app /app
#14 CACHED

#15 [stage-3 2/2] RUN echo "If this ran, then the installer cache has invalidated the final stage cache"
#15 CACHED

#16 exporting to image
#16 exporting layers done
#16 writing image sha256:583d977647355590e3cf84a382e19a13096db97ae9cce33c7c86ecbd55d32b77 done
#16 DONE 0.0s
%

How does this work?

The “good” Dockerfile has a few important changes that work together:

Dynamic package.json and yarn.lock discovery

In the package_parser stage, we copy the entire application context in (COPY . /app), but then we search for the files we care about with a RUN find ... command and copy those files into a temporary directory /out. By doing this, we isolate the files that change infrequently, and allow them to be cached in the next stage.
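
If you want to sanity-check which files that stage will pick up, you can run the same find locally before building (output order may vary):

% find . \( -name "package.json" -o -name "yarn.lock" \) -not -path "*/node_modules/*"
./package.json
./yarn.lock
./packages/lib-a/package.json
./packages/lib-b/package.json
./packages/lib-c/package.json

Keeping node_modules in your .dockerignore, as the comment in the Dockerfile suggests, also keeps the COPY . /app in this stage small and fast.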

Installation in a separate stage

After the package_parser stage has executed, we use COPY --from=package_parser /out /app to copy in only the infrequently changing dependency-management files. By doing this, the package_installer stage’s cache is only invalidated if the contents of the /out directory change… a change to a file like /app/tmp/trigger does not change the /out files, and therefore the package_installer stage stays cached.
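
It’s also worth verifying the inverse: a real dependency change should invalidate the install. A quick way to test this (left-pad is just an arbitrary example package):

# Add a dependency to one workspace, then rebuild
% yarn workspace lib-c add left-pad
% docker build . --file Dockerfile.good
# Expect: /out now contains a changed packages/lib-c/package.json and yarn.lock,
# so the "yarn install" step in the package_installer stage is re-run this time.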

Final Thoughts

There are a number of ways to approach this problem … and tools like Turborepo are great and can solve these issues for you. The examples above though hopefully give you a very clear low-level idea of what can go wrong, and how to approach fixing it if something like Turborepo isn’t right for your situation!

docker cache invalidation and the ARG command

2025-06-19 23:00:00 +0000

About

We ran into a caching issue on a docker build that was triggered by a really easy mistake to make in your Dockerfile… this seemed like as good a place as any to start a new blog.

Setup

Given a simple Dockerfile like this:

FROM alpine:latest

ARG CACHE_BUSTER

WORKDIR /app

RUN echo "Slow step..." && sleep 5
RUN echo "Fast step"
ENV CACHE_BUSTER=${CACHE_BUSTER}
RUN echo "Updated Cache Buster ${CACHE_BUSTER}"

It might be natural to expect that the slow and fast RUN steps would not be invalidated when only CACHE_BUSTER is updated. However, here’s the reality… once the ARG CACHE_BUSTER line appears, its value is implicitly part of the environment of every RUN instruction that follows it, so any change to the CACHE_BUSTER argument invalidates all of those steps.

In the first build we prep the cache:

$ BUILDKIT_PROGRESS=plain \
  DOCKER_BUILDKIT=1 \
  docker buildx build \
    -f Dockerfile.simple \
    --build-arg CACHE_BUSTER=0 .

#0 building with "default" instance using docker driver

#1 [internal] load build definition from Dockerfile.simple
#1 transferring dockerfile: 233B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/alpine:latest
#2 DONE 0.3s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [1/5] FROM docker.io/library/alpine:latest@sha256:8a1f59ffb675680d47db6337b49d22281a139e9d709335b492be023728e11715
#4 DONE 0.0s

#5 [2/5] WORKDIR /app
#5 CACHED

#6 [3/5] RUN echo "Slow step..." && sleep 5
#6 0.335 Slow step...
#6 DONE 5.4s

#7 [4/5] RUN echo "Fast step"
#7 0.624 Fast step
#7 DONE 0.7s

#8 [5/5] RUN echo "Updated Cache Buster 0"
#8 0.476 Updated Cache Buster 0
#8 DONE 0.5s

#9 exporting to image
#9 exporting layers 0.1s done
#9 writing image sha256:8728d27baadfc85c7dd0858b9cf1b22f7803709d43d142b562d7bfeef94cc19e done
#9 DONE 0.1s
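
As a quick control (output omitted), re-running the exact same command with CACHE_BUSTER=0 again leaves every step cached; it is only a changed value that busts the cache:

$ BUILDKIT_PROGRESS=plain \
  DOCKER_BUILDKIT=1 \
  docker buildx build \
    -f Dockerfile.simple \
    --build-arg CACHE_BUSTER=0 .
# Every step should report CACHED this time - nothing has changed.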

Now we re-build, but with an updated CACHE_BUSTER value…

$ BUILDKIT_PROGRESS=plain \
  DOCKER_BUILDKIT=1 \
  docker buildx build \
    -f Dockerfile.simple \
    --build-arg CACHE_BUSTER=100 .

#0 building with "default" instance using docker driver

#1 [internal] load build definition from Dockerfile.simple
#1 transferring dockerfile: 233B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/alpine:latest
#2 DONE 0.5s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [1/5] FROM docker.io/library/alpine:latest@sha256:8a1f59ffb675680d47db6337b49d22281a139e9d709335b492be023728e11715
#4 DONE 0.0s

#5 [2/5] WORKDIR /app
#5 CACHED

#6 [3/5] RUN echo "Slow step..." && sleep 5
#6 0.243 Slow step...
#6 DONE 5.3s

#7 [4/5] RUN echo "Fast step"
#7 0.546 Fast step
#7 DONE 0.6s

#8 [5/5] RUN echo "Updated Cache Buster 100"
#8 0.649 Updated Cache Buster 100
#8 DONE 0.7s

#9 exporting to image
#9 exporting layers 0.1s done
#9 writing image sha256:fe4d3903b58dc81b6668a6275884ad5ec96348556a4c3b695490595e815c5774 done
#9 DONE 0.1s

We can see that the first two RUN steps (which don’t reference CACHE_BUSTER at all) were re-run even though we didn’t expect them to be. Why?

The Fix - Move the ARG

When the value of the ARG changes, every RUN step after its declaration is re-calculated. The fix is to move the ARG statement down, to right before the ENV statement where it’s used:

FROM alpine:latest

WORKDIR /app

RUN echo "Slow step..." && sleep 5
RUN echo "Fast step"

ARG CACHE_BUSTER
ENV CACHE_BUSTER=${CACHE_BUSTER}

RUN echo "Updated Cache Buster ${CACHE_BUSTER}"

Now the first build, which primes the cache, works like this:

$ BUILDKIT_PROGRESS=plain DOCKER_BUILDKIT=1 docker buildx build -f Dockerfile.simple --build-arg CACHE_BUSTER=0 .
#0 building with "default" instance using docker driver

#1 [internal] load build definition from Dockerfile.simple
#1 transferring dockerfile: 235B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/alpine:latest
#2 DONE 0.3s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [1/5] FROM docker.io/library/alpine:latest@sha256:8a1f59ffb675680d47db6337b49d22281a139e9d709335b492be023728e11715
#4 DONE 0.0s

#5 [2/5] WORKDIR /app
#5 CACHED

#6 [3/5] RUN echo "Slow step..." && sleep 5
#6 0.293 Slow step...
#6 DONE 5.3s

#7 [4/5] RUN echo "Fast step"
#7 0.621 Fast step
#7 DONE 0.6s

#8 [5/5] RUN echo "Updated Cache Buster 0"
#8 0.602 Updated Cache Buster 0
#8 DONE 0.6s

#9 exporting to image
#9 exporting layers 0.1s done
#9 writing image sha256:d4b922a88bf6d9a3e7beb097d4c00c8c8173c2af22c3a8ce44c937adf773b28f done
#9 DONE 0.1s

and a follow-up build with a new CACHE_BUSTER value only recalculates the steps that come after the ARG command:

$ BUILDKIT_PROGRESS=plain DOCKER_BUILDKIT=1 docker buildx build -f Dockerfile.simple --build-arg CACHE_BUSTER=100 .
#0 building with "default" instance using docker driver

#1 [internal] load build definition from Dockerfile.simple
#1 transferring dockerfile: 235B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/alpine:latest
#2 DONE 0.3s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [1/5] FROM docker.io/library/alpine:latest@sha256:8a1f59ffb675680d47db6337b49d22281a139e9d709335b492be023728e11715
#4 DONE 0.0s

#5 [2/5] WORKDIR /app
#5 CACHED

#6 [3/5] RUN echo "Slow step..." && sleep 5
#6 CACHED

#7 [4/5] RUN echo "Fast step"
#7 CACHED

#8 [5/5] RUN echo "Updated Cache Buster 100"
#8 0.450 Updated Cache Buster 100
#8 DONE 0.5s

#9 exporting to image
#9 exporting layers 0.0s done
#9 writing image sha256:3f42832a297ad4d4bbd9fa1586c3e1c81b0055fd7fd207a71a49f5c9a3fe8db2 done
#9 DONE 0.0s
