Docker BuildKit in Practice: Optimizing Dependency Management with Cache to Accelerate Builds
What is Docker BuildKit
Docker BuildKit is Docker’s next-generation build engine, offering more efficient and flexible container image building capabilities. Introduced in 2018, BuildKit has been integrated into the Docker engine since version 18.09 and became the default build system starting with Docker version 23.0.
Key Features of BuildKit
- Parallel Builds: Executes independent build steps in parallel, significantly improving build efficiency.
- Advanced Caching Mechanism: A smarter caching system supporting content-addressable storage.
- Mounting Capabilities: Supports mounting file systems during the build process, such as caches and secrets.
- Cross-Platform Builds: Enables building images for other platforms from a single platform.
- Improved Security: Supports running builds with reduced privileges, which provides better security isolation.
How to Enable BuildKit
Starting from Docker Engine v23.0, BuildKit is used as Docker’s default build engine.
You can disable it by setting the environment variable DOCKER_BUILDKIT=0.
If you are using a Docker version earlier than v23.0, you can enable it via environment variables:
# Temporarily enable
export DOCKER_BUILDKIT=1
# Permanently enable in .zshrc
echo 'export DOCKER_BUILDKIT=1' >> ~/.zshrc
Alternatively, enable it in the Docker configuration file (/etc/docker/daemon.json, or the Docker Desktop settings on macOS):
{
  "features": {
    "buildkit": true
  }
}
Remember to restart the Docker Engine after making changes.
Example
Using a Java project as an example, we will leverage Docker BuildKit’s caching capabilities to optimize the download and build process for Maven dependencies. Here’s the Dockerfile:
# Stage 1: Build the application
FROM maven:3.9-eclipse-temurin-17 AS build
# Set the working directory
WORKDIR /app
# Copy pom.xml
COPY pom.xml .
# Copy source code
COPY src/ /app/src/
# Record a checksum of pom.xml inside the build stage.
# Note: this does not select the cache. The cache mount below is keyed by
# its id (default: the target path), not by the contents of pom.xml.
COPY pom.xml pom.xml.checksum
RUN md5sum pom.xml > pom.xml.checksum
# Build the application (with cache mount support)
RUN --mount=type=cache,target=/root/.m2 mvn clean package -DskipTests
# Stage 2: Create the runtime image
FROM eclipse-temurin:17-jre-jammy
WORKDIR /app
# Copy the JAR file from the build stage
COPY --from=build /app/target/spring-boot-3-rest-api-sample-1.0-SNAPSHOT.jar app.jar
# Expose the port the app runs on
EXPOSE 8080
# Command to run the application
ENTRYPOINT ["java", "-jar", "app.jar"]
In this example, the command RUN --mount=type=cache,target=/root/.m2 mvn clean package -DskipTests replaces the original RUN mvn clean package -DskipTests. With the original command, Maven dependencies are downloaded again whenever the layer is rebuilt (for example, after any change to pom.xml or the source code), resulting in slow builds and wasted network traffic.
After the modification, the first build will still take time to download all Maven dependencies, but subsequent builds will be significantly faster. You can test this using the code from this repository.
BuildKit Cache Command Analysis
A typical command using BuildKit cache in a Dockerfile looks like this:
RUN --mount=type=cache,target=/root/.m2 mvn clean package -DskipTests
Command Components
- RUN: A standard Dockerfile instruction for executing commands.
- --mount=type=cache,target=/root/.m2: BuildKit's cache mount parameter:
  - type=cache: Specifies the mount type as cache.
  - target=/root/.m2: Specifies the mount point as the Maven repository directory inside the container.
- mvn clean package -DskipTests: The actual Maven build command.
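In addition to the cache type used here, BuildKit supports other mount types:
- bind: Bind-mounts files or directories from the build context or another build stage (read-only by default).
- secret: Mounts secret files without writing them into the image.
- ssh: Gives the build access to SSH keys through an SSH agent.
- tmpfs: Mounts a temporary file system.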
For more information, refer to the Dockerfile documentation.
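As an illustration of combining mount types, the following is a minimal sketch that adds a secret mount next to the cache mount. The secret id maven_settings and the idea of passing a private settings.xml are assumptions made for this example; they are not part of the sample project.

```dockerfile
# syntax=docker/dockerfile:1
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app
COPY pom.xml .
COPY src/ src/
# "maven_settings" is a hypothetical secret id chosen for this sketch.
# The secret file is available only during this RUN step and is not
# written into any image layer. The cache mount targets the repository
# subdirectory so that the two mounts do not overlap.
RUN --mount=type=secret,id=maven_settings,target=/root/.m2/settings.xml \
    --mount=type=cache,target=/root/.m2/repository \
    mvn clean package -DskipTests
```

With this sketch, the secret would be supplied at build time, for example with docker build --secret id=maven_settings,src=$HOME/.m2/settings.xml -t myapp .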
How It Works
BuildKit’s caching mechanism optimizes the build process as follows:
1. First Build:
- Maven downloads all dependencies to the /root/.m2 directory.
- BuildKit saves this directory as a cache.
2. Subsequent Builds:
- BuildKit mounts the previously cached directory.
- Maven detects existing dependencies and uses the cached versions.
- Build time and network traffic are significantly reduced.
In addition to Java projects, BuildKit caching is also applicable to other languages and tools, such as Node.js, Golang, and Python.
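The same pattern can be sketched for those ecosystems as follows. The image tags and the file names (package-lock.json, requirements.txt, go.sum) are assumptions for illustration; the cache paths are the default locations used by npm, pip, and Go when running as root in the official images.

```dockerfile
# syntax=docker/dockerfile:1

# Node.js: npm keeps its download cache in /root/.npm
FROM node:20 AS node-build
WORKDIR /app
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm npm ci

# Python: pip keeps its cache in /root/.cache/pip
FROM python:3.12 AS python-build
WORKDIR /app
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Go: cache both the module cache and the compiler build cache
FROM golang:1.22 AS go-build
WORKDIR /src
COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod go mod download
COPY . .
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    go build -o /out/app .
```

Each stage is independent; in a real project you would keep only the stage that matches your toolchain.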
Cache Lifecycle
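BuildKit caches do not have a fixed expiration time by default and will persist until one of the following happens:
- They are manually cleared using the docker builder prune command.
- BuildKit's garbage collection removes them because the build cache has exceeded its configured storage limits.
- The Docker daemon restarts in an environment where its storage is not persistent (for example, an ephemeral CI runner).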
Advanced Usage
Adding Cache IDs for Management
When using BuildKit caching, you can specify an ID for the cache to facilitate management and cleanup:
RUN --mount=type=cache,target=/root/.m2,id=maven-deps-myproject mvn clean package
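An explicit id can also be combined with the sharing option of the cache mount, which controls what happens when several builds use the same cache at once. The sketch below reuses the id from the line above and assumes that concurrent Maven builds should not write to the local repository simultaneously; whether you need this depends on your setup.

```dockerfile
# syntax=docker/dockerfile:1
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app
COPY pom.xml .
COPY src/ src/
# sharing accepts shared (the default), private, or locked.
# With locked, a second build that needs this cache waits until the
# first build has released it instead of writing concurrently.
RUN --mount=type=cache,id=maven-deps-myproject,target=/root/.m2,sharing=locked \
    mvn clean package -DskipTests
```

Because the cache is looked up by id, any other Dockerfile built by the same builder that uses id=maven-deps-myproject would share these dependencies.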
Adding Cache Statistics
Output cache statistics during the build process to understand cache usage:
RUN --mount=type=cache,target=/root/.m2 \
echo "==== Maven Repository Cache Information ====" && \
find /root/.m2 -name "*.jar" | wc -l | xargs echo "Number of cached JAR files:" && \
du -sh /root/.m2 | xargs echo "Maven cache size:" && \
echo "=======================================" && \
mvn clean package -DskipTests
Cache Management Based on Dependency Changes
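In some cases, you may want to manage the cache based on the hash value of dependency files. Note that a cache mount is identified by its id (which defaults to the target path), not by the contents of pom.xml, so the same cache is reused even after pom.xml changes, and Maven only downloads the dependencies that are missing. The checksum steps from the example above record the hash of pom.xml in the build, but they do not by themselves create a new cache: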
COPY pom.xml pom.xml.checksum
RUN md5sum pom.xml > pom.xml.checksum
RUN --mount=type=cache,target=/root/.m2 \
mvn clean package -DskipTests
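A common way to make dependency resolution react to pom.xml changes is to combine ordinary layer caching with the cache mount: copy pom.xml first, resolve dependencies in their own step, and only then copy the source code. The following is a sketch of that approach, not the layout used in the sample project. It relies on the dependency:go-offline goal of the Maven dependency plugin, which may not pre-fetch every artifact for every build.

```dockerfile
# syntax=docker/dockerfile:1
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app

# This layer is invalidated only when pom.xml changes.
COPY pom.xml .
# Resolve dependencies and plugins into the cache mount. When pom.xml
# changes, this step reruns, but artifacts already in the cache are reused.
RUN --mount=type=cache,target=/root/.m2 mvn -B dependency:go-offline

# Source changes invalidate only the layers from here on.
COPY src/ src/
RUN --mount=type=cache,target=/root/.m2 mvn -B clean package -DskipTests
```

With this ordering, a source-only change skips the dependency step entirely through layer caching, while a pom.xml change downloads only what the cache mount does not already hold.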
Cache Management Commands
Clear caches:
# Clear all unused build caches
docker builder prune
# Clear only cache mounts
docker builder prune --filter type=exec.cachemount
# Clear caches unused for over 7 days
docker builder prune --filter "until=168h"
Comparison with Traditional Methods
Traditional methods use Docker layer caching and volume mounts to manage dependencies but have some limitations, as shown in the table below:
| Feature | BuildKit Cache | Traditional Docker Layer Cache | Volume Mount |
| --- | --- | --- | --- |
| Configuration Complexity | Low | Medium | High |
| Persistence | High | Medium | High |
| Effectiveness on Dependency Changes | High | Low | High |
| CI/CD Suitability | High | Medium | Low |
| Cleanup Management | Simple | Simple | Complex |
Verifying Cache Effectiveness
After enabling caching, how can you verify that it is working correctly? Here are a few ways to check:
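1. Compare Build Times:
time docker build -t myapp . # First build
time docker build -t myapp . # Second build, after changing a file under src/, should be significantly faster
- If nothing changes between the two builds, the second one is served entirely from the layer cache and does not exercise the cache mount.
2. Check Maven Logs:
docker build --progress=plain -t myapp . 2>&1 | grep "Downloaded from"
- BuildKit writes its progress output to stderr, hence the 2>&1 redirection.
- The first build will have multiple “Downloaded from” logs.
- Subsequent builds should have few or no such logs.
3. View Cache Statistics: Add cache information output commands during the build.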