Debugging Phantom Hash Changes in CI
Asset hashes that change on every CI run — even when no source file was touched — are one of the most disorienting problems in a frontend build pipeline. The filename on disk changes, the CDN cache busts needlessly, and every debugging session ends in the same frustration: the source is identical, so why is the output different?
These are phantom hash changes. They are caused by non-deterministic inputs that bundlers silently embed in compiled artifacts: wall-clock timestamps, machine-specific absolute paths, randomized module IDs, and transitive dependency drift. Because content hashes are derived from the raw bytes of the output file, a single non-deterministic byte anywhere in the bundle produces a completely different fingerprint. The avalanche property that makes SHA-256 secure for cryptography makes it unforgiving for reproducible builds.
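The effect of a single changed byte can be seen in a minimal sketch. The two bundle strings below are hypothetical, and SHA-256 is used only because it is the hash named above; your bundler may use a different algorithm internally.

```javascript
const { createHash } = require('node:crypto');

// Hash a string and keep the first 8 hex characters, similar to a [contenthash:8] token.
function shortHash(content) {
  return createHash('sha256').update(content).digest('hex').slice(0, 8);
}

// Two hypothetical bundles that differ only in one digit of an embedded timestamp.
const bundleA = 'console.log("app");/* built 1735689600 */';
const bundleB = 'console.log("app");/* built 1735689601 */';

// The two fingerprints share no visible relationship despite the one-character change.
console.log(shortHash(bundleA), shortHash(bundleB));
```

Hashing the same input twice always yields the same fingerprint, so any difference in the printed values comes from the input bytes alone.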
This guide provides a systematic approach to identifying and eliminating every category of phantom hash change, with per-bundler fixes and a step-by-step bisection workflow for CI environments. For background on why byte-exact outputs matter structurally, see Deterministic Build Outputs and Why Deterministic Builds Matter for Asset Fingerprinting.
Diagnosis Table
The five root causes below account for the vast majority of phantom hash changes in production CI pipelines. Match your symptom to the correct row before applying any fix.
| Symptom / Observation | Root Cause | Bundler(s) Affected | Fix |
|---|---|---|---|
| Hash changes between two runs on the same commit, same runner | Build timestamp embedded in a bundle comment or sourcemap timestamp field | Webpack, Rollup, esbuild | Set SOURCE_DATE_EPOCH; disable Terser comment injection; in esbuild, map build-time Date.now references to a fixed-value shim with --define and --inject |
| Hash changes between different CI runners on the same commit | Filesystem ordering differences: readdir order is filesystem-specific (ext4 with directory indexing returns filename-hash order) and is not guaranteed to match between Linux and macOS | All bundlers | Sort entry points explicitly; set LC_ALL=C.UTF-8; use `find … |
| Hash changes after adding an unrelated dependency | Non-deterministic module IDs — webpack assigns sequential numeric IDs based on module graph resolution order, which shifts when new modules appear | Webpack | Set optimization.moduleIds: 'deterministic' and optimization.chunkIds: 'deterministic' |
| Absolute paths appear in output or hash changes across machines | Absolute path embedded in sourcemap sourceRoot or __filename / __dirname references compiled into the bundle | Webpack, Vite/Rollup, esbuild | Set output.devtoolModuleFilenameTemplate to a relative path; use build.sourcemap: 'hidden'; replace __filename with import.meta.url |
| Hash changes after dependency hoisting changes in a workspace | Transitive dependency version drift — pnpm or npm workspace hoisting resolves the same logical package to a different concrete version across lockfile installs | All bundlers | Use npm ci / pnpm install --frozen-lockfile; pin exact versions in pnpm.overrides; audit with pnpm why <package> |
Bisecting with sha256sum Diffs
When the diagnosis table does not immediately identify the root cause, a structured bisection workflow finds the exact non-deterministic byte. This is the most reliable technique for CI pipelines where you cannot attach a debugger.
Step 1 — Build twice and capture sorted checksums
Run the build twice in sequence on the same commit, clearing only intermediate caches (not node_modules) between runs. Sort by filename so the diff is stable regardless of output ordering.
#!/usr/bin/env bash
set -euo pipefail
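# First build
npm run build
# Null-delimited pipeline so filenames containing spaces are handled correctly
find dist -type f -print0 | sort -z | xargs -0 sha256sum > /tmp/run1.sha256
# Clear only build output and bundler caches, keep installed packages
# (node_modules/.vite is Vite's default cache directory)
rm -rf dist node_modules/.cache node_modules/.vite .vite
# Second build
npm run build
find dist -type f -print0 | sort -z | xargs -0 sha256sum > /tmp/run2.sha256
# Show diverging files (diff exits non-zero when the two runs differ)
diff /tmp/run1.sha256 /tmp/run2.sha256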
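The diff output names every file whose bytes changed between the two runs. If the diff is empty, the build is stable on this machine, and the phantom change comes from something that differs between environments: dependency drift (see Lockfile Discipline), filesystem ordering, or absolute paths, rather than per-run entropy such as timestamps.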
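On runners without sha256sum, the same manifest can be produced from Node. The following is a minimal sketch using only built-in modules; the function names listFiles and hashTree are illustrative, not part of any tool.

```javascript
const { createHash } = require('node:crypto');
const fs = require('node:fs');
const path = require('node:path');

// Recursively collect every file under `dir`, as forward-slash paths relative to `base`.
function listFiles(dir, base = dir) {
  const out = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) out.push(...listFiles(full, base));
    else if (entry.isFile()) out.push(path.relative(base, full).split(path.sep).join('/'));
  }
  return out;
}

// Build a manifest with one "<sha256>  <relative path>" line per file, sorted by path.
function hashTree(dir) {
  return listFiles(dir)
    .sort() // the default sort compares UTF-16 code units, so it ignores the locale
    .map((rel) => {
      const bytes = fs.readFileSync(path.join(dir, rel));
      return `${createHash('sha256').update(bytes).digest('hex')}  ${rel}`;
    })
    .join('\n');
}
```

Calling hashTree('dist') after each build and writing the result to a file gives two manifests that can be compared with any text diff.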
Step 2 — Inspect the changed binary
Use strings to extract all printable sequences from the changed file, then grep for telltale patterns. For a typical JS bundle, timestamps appear as ISO-8601 strings or Unix epoch integers.
# Replace dist/assets/app.a1b2c3d4.js with the file identified in Step 1
# -o prints only the matching token; a minified bundle is often one very long line
strings dist/assets/app.a1b2c3d4.js | grep -oE '[0-9]{10}|[0-9]{4}-[0-9]{2}-[0-9]{2}|/home/|/Users/'
Alternatively, use xxd and pipe through grep to locate the byte offset of a known string pattern:
xxd dist/assets/app.a1b2c3d4.js | grep -i "2026"
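The same search can be sketched in Node, which also reports the character offset of each hit. The pattern set below is an assumption that mirrors the grep expression above; adjust it to the tokens you expect in your own bundle.

```javascript
// Patterns for the token classes discussed above (illustrative, not exhaustive).
const SUSPECT_PATTERNS = {
  epochSeconds: /\b1[0-9]{9}\b/g,
  isoDate: /\b[0-9]{4}-[0-9]{2}-[0-9]{2}(?:T[0-9:.]+Z?)?/g,
  absolutePath: /(?:\/home\/|\/Users\/)[^\s"'`)]+/g,
};

// Return every match with its character offset, ordered by position in the source.
function findSuspectTokens(source) {
  const hits = [];
  for (const [kind, pattern] of Object.entries(SUSPECT_PATTERNS)) {
    for (const match of source.matchAll(pattern)) {
      hits.push({ kind, offset: match.index, text: match[0] });
    }
  }
  return hits.sort((a, b) => a.offset - b.offset);
}
```

Pass the bundle contents as a string, for example the result of reading the changed file as UTF-8. Expect false positives: any ten-digit number starting with 1 looks like an epoch value to this scan.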
Step 3 — Locate the exact token in a JS bundle
For JavaScript bundles, reading the file with node --print and slicing around the suspect offset is more readable than raw hex. The command prints the value of the expression to stdout; it does not execute the bundle. For source-mapped bundles, a sourcemap decoder can then translate the offset back to an original module line:
# Print the 200 characters starting at the suspect offset (14200 is a placeholder)
node --print "require('fs').readFileSync('dist/assets/app.a1b2c3d4.js','utf8').slice(14200,14400)"
Once the token is identified — a timestamp, an absolute path segment, a /*! ... */ comment block — search your bundler configuration and plugins for the injection site.
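If you keep a copy of the changed file from each build, the offset to inspect does not have to be guessed. This sketch finds the first differing byte between two buffers and returns a little context around it; the function names are illustrative.

```javascript
// Return the index of the first differing byte, or -1 if the buffers are identical.
function firstDiffOffset(a, b) {
  const limit = Math.min(a.length, b.length);
  for (let i = 0; i < limit; i++) {
    if (a[i] !== b[i]) return i;
  }
  // One buffer is a prefix of the other: they diverge where the shorter one ends.
  return a.length === b.length ? -1 : limit;
}

// Show up to `radius` bytes of context on each side of the first difference.
function diffContext(a, b, radius = 40) {
  const offset = firstDiffOffset(a, b);
  if (offset === -1) return null;
  const start = Math.max(0, offset - radius);
  return {
    offset,
    first: a.subarray(start, offset + radius).toString('utf8'),
    second: b.subarray(start, offset + radius).toString('utf8'),
  };
}
```

Both arguments are Buffers, such as those returned by fs.readFileSync on the two saved copies. The reported offset can then be used directly in the slice call shown above.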
Per-Bundler Fixes
Webpack
The three most common phantom hash sources in webpack are module IDs, chunk IDs, and Terser comment blocks. All three are fixable with configuration, as detailed in Webpack Output Hashing Setup.
// webpack.config.js
const TerserPlugin = require('terser-webpack-plugin');

module.exports = {
  mode: 'production',
  output: {
    filename: '[name].[contenthash:8].js',
    chunkFilename: '[name].[contenthash:8].chunk.js',
    hashSalt: 'my-project-v1',
    devtoolModuleFilenameTemplate: 'webpack://[namespace]/[resource-path]'
  },
  optimization: {
    moduleIds: 'deterministic',
    chunkIds: 'deterministic',
    runtimeChunk: 'single',
    minimizer: [
      new TerserPlugin({
        terserOptions: {
          format: {
            comments: false
          }
        },
        extractComments: false
      })
    ]
  }
};
hashSalt is particularly useful in monorepos where multiple webpack configs share output directories: it scopes the hash namespace so that identical module content in different projects does not produce identical filenames. For monorepos, prefer contenthash:12 or contenthash:16 to reduce collision probability across thousands of assets — see Fixing Missing Asset Hashes in Webpack 5 for the full configuration pattern.
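To see why sequential module IDs are fragile, consider the simplified model below. It is not webpack's implementation: sequentialIds and hashedIds are hypothetical helpers, and webpack's 'deterministic' mode uses its own hashing and collision handling. The point is only that an ID derived from a module's path does not move when an unrelated module is added.

```javascript
const { createHash } = require('node:crypto');

// Sequential IDs: each module gets its position in discovery order.
function sequentialIds(modules) {
  return Object.fromEntries(modules.map((name, index) => [name, index]));
}

// Path-derived IDs: a short number computed from the module path alone.
function hashedIds(modules) {
  return Object.fromEntries(
    modules.map((name) => [
      name,
      parseInt(createHash('sha256').update(name).digest('hex').slice(0, 6), 16),
    ]),
  );
}

// Module graph before and after adding an unrelated dependency (hypothetical paths).
const before = ['./src/a.js', './src/b.js'];
const after = ['./src/a.js', './node_modules/new-dep/index.js', './src/b.js'];

// The sequential ID of b.js shifts; its path-derived ID does not.
console.log(sequentialIds(before)['./src/b.js'], sequentialIds(after)['./src/b.js']);
console.log(hashedIds(before)['./src/b.js'] === hashedIds(after)['./src/b.js']);
```

Because module IDs are written into the emitted chunks, a shifted ID changes the bytes of every chunk that references that module, and therefore its content hash.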
Vite and Rollup
Vite’s [hash] token in chunkFileNames is content-based by default, but two configuration mistakes introduce phantom changes: non-sorted entry point globs and plugins that call Date.now() during the build transform phase.
// vite.config.js
import { defineConfig } from 'vite';
import { globSync } from 'glob';

// Sort so the entry order does not depend on filesystem readdir order
const entries = globSync('src/pages/**/*.js').sort();

export default defineConfig({
  build: {
    rollupOptions: {
      input: entries,
      output: {
        chunkFileNames: 'assets/[name]-[hash:8].js',
        assetFileNames: 'assets/[name]-[hash:8][extname]',
        entryFileNames: 'assets/[name]-[hash:8].js'
      }
    },
    sourcemap: 'hidden'
  }
});
Calling .sort() on the glob result is the single most effective fix for cross-runner hash divergence in Vite projects. File ordering from the filesystem is not guaranteed to be consistent across Linux and macOS, as noted in the diagnosis table. For Vite asset pipeline configuration in production, also audit every custom plugin for Date.now(), Math.random(), or crypto.randomBytes() calls executed outside of a lazy context.
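The plugin audit can start with a plain text scan. The sketch below is a heuristic, not a static analyzer: it flags the three calls named above wherever they appear in a plugin's source, and it cannot tell whether a call runs at build time or only in a lazy context. The function name auditPluginSource is illustrative.

```javascript
// Calls that return a different value on every invocation.
const ENTROPY_CALLS = /\b(Date\.now|Math\.random|crypto\.randomBytes)\s*\(/g;

// Return { line, call } for every entropy call found in the given source text.
function auditPluginSource(source) {
  const findings = [];
  source.split('\n').forEach((text, index) => {
    for (const match of text.matchAll(ENTROPY_CALLS)) {
      findings.push({ line: index + 1, call: match[1] });
    }
  });
  return findings;
}
```

Run it over the source text of each custom plugin and review every finding by hand; a call inside a transform or generateBundle hook is a likely culprit, while one inside code that only executes in the browser is not.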
esbuild
esbuild is deterministic by default for most inputs, but two patterns break reproducibility: code that calls Date.now() at module evaluation time, and the use of --metafile without a fixed output path.
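# Map Date.now to a fixed-value function for build-time evaluation.
# esbuild's --define accepts only JSON literals or identifiers, not function
# expressions, so Date.now is mapped to an identifier supplied by an injected shim.
# build/fixed-date-now.js is a hypothetical shim file containing:
#   export const fixed_date_now = () => 1735689600000;
# SOURCE_DATE_EPOCH is set by the CI environment wrapper
esbuild src/index.ts \
  --bundle \
  --minify \
  --outdir=dist \
  --entry-names='[dir]/[name]-[hash]' \
  --define:Date.now=fixed_date_now \
  --inject:./build/fixed-date-now.js \
  --metafile=dist/meta.json \
  --log-level=warning
The --define:Date.now substitution rewrites every static reference to Date.now in the bundled source to the injected function, which returns a constant. Use the millisecond timestamp corresponding to your SOURCE_DATE_EPOCH value (SOURCE_DATE_EPOCH is in seconds, so multiply by 1000). This affects only references that esbuild can resolve statically; new Date() calls are untouched, still run at runtime, and do not affect the build hash.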
All Bundlers: Environment Variable Wrapper
Regardless of bundler, apply these three environment variables before every build invocation in CI. They neutralize the most common sources of cross-runner divergence:
export TZ=UTC
export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)
export LC_ALL=C.UTF-8
npm run build
TZ=UTC prevents timezone-dependent date formatting from the host OS from appearing in any string that a plugin might embed. SOURCE_DATE_EPOCH is respected by a growing number of bundler plugins, archivers, and compilers. LC_ALL=C.UTF-8 forces byte-order string comparison so that any locale-dependent sort inside a build tool produces the same sequence on every machine.
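For plugins or build scripts you control, honoring SOURCE_DATE_EPOCH takes only a few lines. The following is a minimal sketch; resolveBuildDate and buildBanner are hypothetical helper names, and the banner format is an example rather than any tool's convention.

```javascript
// Resolve the build time from SOURCE_DATE_EPOCH (seconds since the Unix epoch).
// Falls back to the current clock only when the variable is unset or malformed.
function resolveBuildDate(env = process.env) {
  const raw = env.SOURCE_DATE_EPOCH;
  if (raw !== undefined && /^[0-9]+$/.test(raw)) {
    return new Date(Number(raw) * 1000);
  }
  return new Date();
}

// Format in UTC via toISOString so the text does not depend on the runner's TZ.
function buildBanner(env = process.env) {
  return `/* built ${resolveBuildDate(env).toISOString()} */`;
}
```

With the wrapper above in place, every runner building the same commit produces the same banner, because the value comes from the commit timestamp rather than the wall clock.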
Lockfile Discipline
Dependency drift is a phantom hash cause that looks like a code change: the source files are identical, but the compiled output differs because a transitive dependency resolved to a different version, changing the minified output.
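npm install may re-resolve the dependency graph and rewrite the lockfile whenever package.json and the lockfile disagree, so its result can differ between invocations. npm ci installs exactly what the lockfile records and fails if the lockfile is inconsistent with package.json. In CI, always use npm ci.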
For pnpm workspaces, the workspace: protocol pins workspace package references to the local copy rather than the registry. When hoisting changes, typically after adding a new package to any workspace member, transitive dependencies can resolve differently. Audit with:
pnpm why <package-name>
If a transitive dependency appears at two different versions in the output, add an explicit entry to pnpm.overrides in the root package.json to pin it to a single resolved version across the entire workspace.
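For npm projects, a similar audit can be scripted against package-lock.json, which is plain JSON. This sketch swaps in npm's lockfile rather than pnpm-lock.yaml (which would need a YAML parser) and assumes the packages map of lockfileVersion 2 or 3; findDuplicateVersions is a hypothetical helper name.

```javascript
// Group every installed copy of each package by its resolved version.
// Keys in `packages` look like "node_modules/a" or "node_modules/b/node_modules/a".
function findDuplicateVersions(lockfile) {
  const versions = new Map();
  for (const [location, meta] of Object.entries(lockfile.packages ?? {})) {
    const marker = location.lastIndexOf('node_modules/');
    // Skip the root project, workspace folders, and link entries without a version.
    if (marker === -1 || !meta.version) continue;
    const name = location.slice(marker + 'node_modules/'.length);
    if (!versions.has(name)) versions.set(name, new Set());
    versions.get(name).add(meta.version);
  }
  // Keep only packages that resolved to more than one version.
  return [...versions]
    .filter(([, set]) => set.size > 1)
    .map(([name, set]) => ({ name, versions: [...set].sort() }));
}
```

Feed it the parsed lockfile, for example JSON.parse of the file contents. Each entry in the result is a candidate for an explicit override.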
Verification Command
After applying fixes, run the dual-build bisect in a single command to confirm phantom hash changes are eliminated:
npm run build && find dist -type f | sort | xargs sha256sum | sort > /tmp/b1.sha256 && rm -rf dist node_modules/.cache && npm run build && find dist -type f | sort | xargs sha256sum | sort > /tmp/b2.sha256 && diff /tmp/b1.sha256 /tmp/b2.sha256 && echo "PASS: build is deterministic"
An empty diff and a PASS message confirm that the output is byte-stable across consecutive runs on the same machine. For cross-runner verification, run the same command on two separate CI jobs and upload the .sha256 files as artifacts, then compare them in a third job.
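The comparison job can be sketched as a small Node script that parses two sha256sum manifests and reports where they diverge. The function names are illustrative, and how the artifacts are downloaded depends on your CI system.

```javascript
// Parse sha256sum output ("<64 hex chars>  <path>") into a Map of path -> digest.
function parseManifest(text) {
  const entries = new Map();
  for (const line of text.split('\n')) {
    const match = /^([0-9a-f]{64}) [ *](.+)$/.exec(line);
    if (match) entries.set(match[2], match[1]);
  }
  return entries;
}

// Report paths whose digest differs, plus paths present on only one side.
function compareManifests(textA, textB) {
  const a = parseManifest(textA);
  const b = parseManifest(textB);
  const changed = [...a.keys()].filter((p) => b.has(p) && b.get(p) !== a.get(p));
  const onlyInA = [...a.keys()].filter((p) => !b.has(p));
  const onlyInB = [...b.keys()].filter((p) => !a.has(p));
  const identical = changed.length + onlyInA.length + onlyInB.length === 0;
  return { changed, onlyInA, onlyInB, identical };
}
```

Because fingerprinted filenames contain the content hash, a file whose bytes changed usually appears as one entry in onlyInA and another in onlyInB rather than in changed. Failing the job whenever identical is false catches both cases.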
When to Reconsider
The configurations in this guide optimize for byte-exact reproducibility. There are cases where strict determinism is the wrong tradeoff:
- Development builds should not use SOURCE_DATE_EPOCH or fixed module IDs. Fast rebuild performance matters more than reproducibility in local development, and tools like HMR depend on dynamic identifiers.
- Long-lived monorepos with many contributors may find that contenthash:8 produces too many collisions as the asset count grows. Increase hash length to 12 or 16 characters. Collision probability is negligible for fewer than a few thousand assets at 8 characters, but 16 characters provides margin for repositories that generate tens of thousands of fingerprinted files per build.
- Source maps for production debugging require a careful balance. Setting sourcemap: false maximizes bundle determinism but eliminates error traceability. Use sourcemap: 'hidden' and store maps in a private artifact store rather than on the public CDN. Ensure the sourceRoot in the emitted map uses a relative path, not an absolute machine path, or every developer workstation will produce a different map hash.
- Third-party build plugins that inject build metadata (license headers, version comments) should be audited and either disabled in production or configured with a fixed input value derived from SOURCE_DATE_EPOCH.
Frequently Asked Questions
Why does my hash change on every CI run even though no file changed?
The most likely cause is a build timestamp embedded somewhere in the bundle. Run the bisect workflow above: build twice, diff the sha256 outputs, then use strings on the changed file and grep for date-like patterns. Set SOURCE_DATE_EPOCH and disable Terser comment injection to eliminate the two most common sources.
Why do hashes differ between my Mac and the Linux CI runner?
Filesystem readdir ordering is not guaranteed to be consistent across operating systems. Linux ext4 with directory indexing returns entries in filename-hash order, and macOS APFS makes no promise of alphabetical order either, so the two can list the same directory in different sequences. Any bundler that globs entry points without sorting will pick them up in different sequences, producing different module graph resolution and therefore different output bytes. Sort all glob results explicitly and set LC_ALL=C.UTF-8.
My lockfile is committed but hashes still differ after a fresh CI install.
Check whether you are using npm install instead of npm ci in your CI pipeline. npm install may update the lockfile silently when package ranges are satisfied by a newer transitive version. Switch to npm ci and fail the build if the lockfile is dirty after install. For pnpm workspaces, use pnpm install --frozen-lockfile.
How many hex characters should my content hash be?
Eight characters (32 bits) is the conventional default and is sufficient for most single-application builds. For monorepos that produce thousands of assets per build, use 12 or 16 characters to reduce collision probability. Hash collisions in asset fingerprinting do not cause security issues but do cause incorrect cache behavior — two different files would share a fingerprinted URL, and CDN edge nodes would serve whichever version they cached first.
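The trade-off between hash length and asset count can be estimated with the birthday bound. This sketch assumes hashes are uniformly distributed and that every asset competes in one shared namespace, which overstates the risk when files have distinct base names; collisionProbability is a hypothetical helper.

```javascript
// Approximate probability that at least two of `assetCount` files share the same
// truncated hash: p ≈ 1 - exp(-n(n-1) / (2 * 16^hexChars)).
function collisionProbability(assetCount, hexChars) {
  const space = Math.pow(16, hexChars);
  // -expm1(x) equals 1 - exp(x) and stays accurate when x is very close to zero.
  return -Math.expm1((-assetCount * (assetCount - 1)) / (2 * space));
}

// Compare the three hash lengths discussed above for a 10,000-asset build.
for (const hexChars of [8, 12, 16]) {
  console.log(hexChars, collisionProbability(10000, hexChars).toExponential(2));
}
```

At 10,000 assets the estimate for 8 characters is on the order of one percent, which is why longer hashes are worth considering for very large builds.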
Related
- Deterministic Build Outputs — parent page covering the full scope of build reproducibility for asset fingerprinting
- Why Deterministic Builds Matter for Asset Fingerprinting — cryptographic and caching rationale for byte-exact outputs
- Webpack Output Hashing Setup — complete webpack contenthash configuration reference
- Vite Asset Pipeline Configuration — Vite rollupOptions, hash length, and plugin determinism
- CI/CD Asset Pipeline Integration — environment isolation, artifact manifests, and cross-runner hash validation