GitHub Actions Caching 101
What is Caching in CI/CD?
In Continuous Integration/Continuous Deployment (CI/CD) workflows, caching dependencies is a crucial technique used to reduce the time needed for jobs to run.
Specifically, dependency caching in workflows aims to avoid the repetitive downloading and compiling of package dependencies during subsequent builds.
When the dependencies from a previous successful run are stored, a later run can get a cache hit and restore those files instead of fetching them again, which shortens execution time.
GitHub Actions Caching
GitHub Actions uses the actions/cache action to manage dependency caching. When implementing caching, you must define two primary input parameters: key and path.
The path parameter defines the file paths or directories on the runner that should be cached or restored. You can specify single paths, multiple paths separated by newlines, or use glob patterns.
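As a minimal sketch of those forms of path, the step below lists several entries under a single path input, one per line, including a glob pattern. The directory names and the lock file name are hypothetical placeholders for whatever your project uses.

```yaml
- name: Cache several directories
  uses: actions/cache@v5
  with:
    # One entry per line; the second entry uses a glob pattern.
    # Both directories are hypothetical examples.
    path: |
      ~/.npm
      packages/*/node_modules
    key: ${{ runner.os }}-deps-${{ hashFiles('**/package-lock.json') }}
```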
The key is required both when saving a new cache and when searching for an existing one. Cache keys have a maximum length of 512 characters and can incorporate static strings, variables, context values, and functions. A powerful best practice is to use expressions that calculate a hash of dependency files (like package-lock.json) in the key. This ensures that when dependencies change, the key changes, and a new cache is automatically created.
Here is an example of defining a cache using the runner’s operating system and the hash of a lock file:
```yaml
- name: Cache dependencies
  id: cache-dependencies
  uses: actions/cache@v5
  with:
    path: ~/.dependency-installation-directory
    # env.cache-name is expected to be set in an env block elsewhere in the workflow
    key: ${{ runner.os }}-build-${{ env.cache-name }}-${{ hashFiles('**/dependency-lock.json') }}
    # Fallback prefixes, from most specific to least specific
    restore-keys: |
      ${{ runner.os }}-build-${{ env.cache-name }}-
      ${{ runner.os }}-build-
      ${{ runner.os }}-
```

If an exact match for the key is found, it is a cache hit, and the cached files are restored. If there is a cache miss, the action automatically creates a new cache upon successful job completion, using the provided key and the files specified in path.
When no cache matches the key exactly, the action falls back to restore-keys. These optional keys are tried in the order listed, and each one matches any cache whose key begins with it, so a cache built from an earlier set of dependencies can still be restored.
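The prefix matching can be illustrated with literal keys. This is only a sketch: the stored key, the shortened hashes, and the cached path are all made up for the example.

```yaml
# Assumed state: the only stored cache has the key
#   Linux-build-deps-1111   (hash shortened for readability)
# The lock file has since changed, so this run computes
#   Linux-build-deps-2222
- name: Restore with fallback
  uses: actions/cache@v5
  with:
    path: ~/.npm
    key: Linux-build-deps-2222
    # Tried in order after the exact key misses. The first entry is a
    # prefix of Linux-build-deps-1111, so that cache is restored.
    restore-keys: |
      Linux-build-deps-
      Linux-
```

Because the match here is partial, the run does not count as an exact cache hit, and a new cache is saved under Linux-build-deps-2222 when the job succeeds. If several stored caches share the matching prefix, the most recently created one is restored.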
You can check whether the key matched exactly through the cache-hit output parameter, which is 'true' only on an exact match:

```yaml
- if: ${{ steps.cache-dependencies.outputs.cache-hit != 'true' }}
  name: List the state of node modules
  continue-on-error: true
  run: npm list
```

Alternatively, for package managers such as npm, pip, Gradle, or RubyGems, the dedicated setup-* actions (e.g., setup-node, setup-python) can usually handle dependency caching with little to no configuration.
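For instance, a Node.js project might rely on the cache input of setup-node instead of a separate actions/cache step. This is a sketch under assumptions: the action versions and the Node.js version are examples, and the project is assumed to have a package-lock.json.

```yaml
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-node@v4
    with:
      node-version: 20   # example version, adjust to your project
      cache: npm         # caches npm's download cache, keyed on the lock file
  - run: npm ci
```

Note that this caches the package manager's download cache rather than node_modules, so the install step still runs but has far less to download.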
Limitations and Best Practices
To optimize your caching strategy, it is important to understand storage limits and access restrictions:
Usage Limits and Eviction:
- GitHub imposes a default storage limit of 10 GB per repository across all caches. If the limit is raised above that default, usage beyond the included 10 GB is billed to your account.
- Cache entries that have not been accessed for more than seven days are automatically removed.
- If a repository reaches its maximum storage limit, the oldest caches (based on the last access date) are evicted to make space for new ones.
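To see which entries are closest to eviction, one option is to list the repository's caches from a workflow with the GitHub CLI. The job below is a sketch, assuming a GitHub-hosted runner (where gh is preinstalled); the job and step names are arbitrary.

```yaml
jobs:
  list-caches:
    runs-on: ubuntu-latest
    permissions:
      actions: read   # needed to read the repository's cache entries
    steps:
      - name: List caches, least recently used first
        env:
          GH_TOKEN: ${{ github.token }}
        run: gh cache list --repo "$GITHUB_REPOSITORY" --sort last_accessed_at --order asc
```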
Access Restrictions (Scope):
- For security and isolation, workflow runs can only restore caches created in the current branch or the default branch (usually main).
- A workflow run triggered by a pull request can also restore caches created in the base branch.
- Caches created for pull requests (on the merge ref) have a limited scope and can only be restored by re-runs of that specific pull request.
- Crucially, workflow runs cannot restore caches created for sibling branches or child branches.
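One consequence of these rules is that a cache saved on the default branch is usable by every other branch. A possible way to take advantage of that is a small workflow that populates the cache on each push to the default branch. The sketch below assumes the default branch is main and an npm project; the workflow name, job name, and cached path are illustrative.

```yaml
name: Warm dependency cache
on:
  push:
    branches: [main]   # assumes main is the default branch
jobs:
  warm:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v5
        with:
          path: ~/.npm
          key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
      # On a cache miss, the cache is saved after the job completes successfully
      - run: npm ci
```

Feature branches and pull requests that compute the same key can then restore this cache, even though they cannot read each other's caches.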
Security and Best Practices:
- Order restore-keys wisely: List your restore-keys from the most specific to the least specific (e.g., specific hash, then feature prefix, then general prefix) to ensure the best possible match is found during a cache miss.
- Do not store sensitive data: Avoid placing sensitive information, such as login credentials or access tokens, in the cached path. Anyone with read access to the repository, or who can create a pull request, may be able to access the cache contents.
- Use conditional execution: Use the cache-hit output to skip time-consuming installation steps if the cache was successfully restored.
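Conditional execution based on cache-hit might look like the following sketch. The step id cache-deps, the cached node_modules directory, and the npm ci command are assumptions chosen for illustration.

```yaml
- name: Cache node_modules
  id: cache-deps
  uses: actions/cache@v5
  with:
    path: node_modules
    key: ${{ runner.os }}-node-${{ hashFiles('package-lock.json') }}

- name: Install dependencies
  # Runs only when the key did not match exactly
  if: steps.cache-deps.outputs.cache-hit != 'true'
  run: npm ci
```

Skipping the install is only safe because the key contains the lock file hash, so an exact hit means the cached directory was built from the same dependency set.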
Effectively using caching is one of the best ways to speed up your GitHub Actions workflows, especially for projects with large dependency trees. By following the guidelines and best practices outlined in this guide, you can optimize your CI/CD pipelines for faster builds and deployments.