Fingerprinting in HTTP Headers
Static assets deployed with content-hashed filenames still fail to cache correctly when HTTP headers contradict the filename contract — understanding the full dual-layer strategy is what separates a cache that works from one that silently wastes origin bandwidth.
When to Use Header-Level Fingerprinting
Not every project needs every header technique described here. Use this decision matrix to pick the right combination for your deployment.
| Scenario | Recommended approach |
|---|---|
| SPA with Vite/webpack, assets in /dist | Cache-Control: public, max-age=31536000, immutable on asset paths; no-cache on HTML |
| Multi-region CDN (Cloudflare, Fastly) | Add Surrogate-Control or Surrogate-Key tags for tag-based purging |
| Files served from Nginx directly | Disable Nginx's default mtime/size ETags; inject content-hash ETags from build manifest |
| AWS CloudFront with S3 origin | Set Cache-Control metadata on S3 objects at upload time; CloudFront inherits it |
| Monorepo with thousands of chunks | Use 12–16 hex character hashes; default 8-char hashes risk collisions at scale |
| Legacy clients, no immutable support | Keep max-age=31536000; add stale-while-revalidate=86400 as fallback |
| Assets served behind a cookie-based auth layer | Strip Vary: Cookie at the CDN edge; keep assets on a separate cookieless domain |
Prerequisites
Before applying the configurations in this guide, confirm the following:
- Nginx 1.7.5+: the always parameter of add_header was introduced in 1.7.5. Without it, Nginx adds the header only to 200, 201, 204, 206, 301, 302, 303, 304, 307, and 308 responses, so it vanishes from error responses such as 404.
- OpenSSL 1.1.1+ on the build host: needed if you are generating SHA-256 ETags with openssl dgst -sha256.
- jq 1.6+: used in the manifest-to-map conversion script.
- Cloudflare Cache Rules: available on all Cloudflare plans, including Free. The "matches regex" operator used in the rules below requires a Business or Enterprise plan; on lower plans, express the same conditions with the non-regex operators.
- AWS CloudFront: Cache-Control headers set on S3 object metadata propagate automatically; no additional CloudFront distribution config is required for basic max-age/immutable behaviour.
- Browser support for immutable: Firefox 49+ and Safari 11+. Chromium-based browsers (Chrome, current Edge) and legacy IE ignore the directive and fall back gracefully to the max-age TTL.
HTTP Header Configuration Reference
| Header | Type | Default (no config) | Effect on fingerprinted assets |
|---|---|---|---|
| Cache-Control: max-age | Integer seconds | Browser heuristic (~10% of Last-Modified age) | Sets absolute TTL; use 31536000 (1 year) for immutable assets |
| Cache-Control: immutable | Flag | Absent | Tells the browser not to revalidate during TTL; eliminates conditional requests |
| Cache-Control: no-cache | Flag | Absent | Forces revalidation before each use; required for HTML entry points |
| Cache-Control: stale-while-revalidate | Integer seconds | Absent | Serves stale copy while fetching fresh; useful as legacy fallback |
| ETag | String | Nginx generates an ETag from file mtime and size | Validation token; must be content-hash-derived for deterministic caching |
| Last-Modified | HTTP date | File system mtime | Less reliable than ETag for fingerprinted assets; mtime varies across nodes |
| Vary | Header name list | None | Instructs CDN to key cache on listed request headers; must be kept minimal |
| Surrogate-Control | Directive string | None | Varnish/Fastly override for CDN-side TTL, independent of browser Cache-Control |
| Surrogate-Key | Space-separated tags | None | Fastly/Varnish tag-based purge; lets you purge all assets for a release atomically |
Implementation: Step by Step
Step 1 — Decide on a hash length
Eight hex characters (32 bits) is enough for most projects. By the birthday bound, the chance that any two files share a hash reaches about 1% at roughly 9,300 files and about 39% at roughly 65,000 files (2^16), so single-application builds with a few hundred or a few thousand files are comfortably covered. For monorepos or build pipelines emitting many thousands of chunks, move to 12 or 16 characters. All examples below use 8 chars; a comment marks where to change the cut length.
Step 2 — Generate content-hash ETags at build time
Do not rely on Nginx's default ETag generation: it derives the value from the file's mtime and size, and mtime changes between deployments and differs across nodes even when file content is identical. Instead, generate hashes during the build and write them to a manifest.
#!/usr/bin/env bash
# scripts/generate-asset-manifest.sh
# Produces dist/asset-manifest.json mapping URL paths → content hashes
set -euo pipefail
DIST_DIR="./dist/assets"
MANIFEST="./dist/asset-manifest.json"
printf '{\n' > "$MANIFEST"
first=true
for file in "$DIST_DIR"/*; do
  [[ -f "$file" ]] || continue
  filename=$(basename "$file")
  # Feed the file on stdin so the digest is always the last output field,
  # even when the filename contains spaces
  # Change cut -c1-8 to cut -c1-12 or cut -c1-16 for monorepos
  hash=$(openssl dgst -sha256 -hex < "$file" | awk '{print $NF}' | cut -c1-8)
  if [ "$first" = true ]; then
    first=false
  else
    printf ',\n' >> "$MANIFEST"
  fi
  printf '  "/assets/%s": "%s"' "$filename" "$hash" >> "$MANIFEST"
done
printf '\n}\n' >> "$MANIFEST"
echo "Manifest written to $MANIFEST"
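The output looks like the following. The values here mirror the filename hashes for readability; the SHA-256 prefix only equals the filename hash if the bundler uses the same algorithm and truncation.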
{
"/assets/app.a1b2c3d4.js": "a1b2c3d4",
"/assets/vendor.e5f6a7b8.js": "e5f6a7b8",
"/assets/main.9c0d1e2f.css": "9c0d1e2f"
}
Step 3 — Configure Nginx with content-hash ETags and immutable caching
Convert the manifest to an Nginx map block at deploy time, then include it in the server configuration. This avoids a server restart — only a reload is needed.
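# scripts/build-nginx-etag-map.sh
# Run after generate-asset-manifest.sh
# Note: if nginx.conf includes conf.d/*.conf at the http level, this entries
# file would also be parsed there and fail; in that case write it to another
# path and update the include inside the map block to match.
jq -r 'to_entries[] | "    \"\(.key)\" \"\(.value)\";"' \
    dist/asset-manifest.json \
    > /etc/nginx/conf.d/etag-entries.conf
nginx -t && nginx -s reload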
Full Nginx server block:
# /etc/nginx/sites-available/myapp.conf
# Map URI → content-hash ETag value (populated at deploy time)
map $uri $asset_etag {
    default "";
    include /etc/nginx/conf.d/etag-entries.conf;
}
# Wrap the hash in double quotes; stays empty for unknown files.
# Combining text and a variable in a map value needs Nginx 1.11.0+.
map $asset_etag $asset_etag_header {
    ""      "";
    default "\"$asset_etag\"";
}
server {
    listen 443 ssl http2;
    server_name example.com;
    # ssl_certificate and ssl_certificate_key directives omitted here
    root /var/www/myapp/dist;
    index index.html;
    # ── Fingerprinted static assets ─────────────────────────────────────────
    location ~* ^/assets/.*\.(js|css|woff2?|png|jpg|webp|svg|ico)$ {
        # Disable Nginx's mtime/size-based ETag; we set our own below
        etag off;
        # Inject the content-hash ETag from the deploy-time map.
        # add_header skips empty values, so unknown files get no ETag header.
        # An "if" block is avoided on purpose: add_header inside "if" replaces
        # the location's other add_header lines instead of extending them.
        add_header ETag $asset_etag_header always;
        # One-year TTL + immutable: browsers skip revalidation entirely
        add_header Cache-Control "public, max-age=31536000, immutable" always;
        # Accept-Encoding only, to prevent cache fragmentation
        add_header Vary "Accept-Encoding" always;
        # Never let a stray Set-Cookie leak onto static responses
        # (only has an effect if this location proxies to an upstream)
        proxy_hide_header Set-Cookie;
        # No "expires" directive here: it would emit a second Cache-Control header
    }
    # ── HTML entry points ────────────────────────────────────────────────────
    location ~* \.html$ {
        # Force a fresh fetch on every request: HTML references hashed URLs,
        # so the browser must always get the latest entry point
        add_header Cache-Control "no-cache, no-store, must-revalidate" always;
        add_header Pragma "no-cache" always;
        # No "expires 0" here: it would add a duplicate Cache-Control: max-age=0
    }
    # ── Everything else ──────────────────────────────────────────────────────
    location / {
        try_files $uri $uri/ /index.html;
        add_header Cache-Control "public, max-age=3600" always;
    }
}
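Directives such as expires and add_header can each emit their own Cache-Control field, so it is worth confirming that a response carries exactly one. The following is a minimal sketch of such a check; count_header is a hypothetical helper name, and the canned headers stand in for real curl -sI output.

```shell
#!/usr/bin/env bash
# count_header NAME: read raw response headers on stdin and print how many
# times the header NAME occurs (case-insensitive). Hypothetical helper.
count_header() {
  local name="$1"
  # Strip carriage returns, then compare the field name before the first colon.
  tr -d '\r' | awk -F: -v want="$name" '
    tolower($1) == tolower(want) { n++ }
    END { print n + 0 }
  '
}

# Example with canned headers; in practice pipe `curl -sI <url>` into it.
printf 'HTTP/2 200\r\ncache-control: max-age=31536000\r\nCache-Control: public, max-age=31536000, immutable\r\n' \
  | count_header cache-control   # prints 2
```

A result other than 1 for a fingerprinted asset suggests two directives are competing to set the header.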
Step 4 — Configure Cloudflare Cache Rules
Cloudflare Cache Rules (dashboard → Caching → Cache Rules) let you override Cache-Control at the edge without touching your origin config. Create two rules in order:
Rule 1 — Fingerprinted assets (highest priority)
IF URI Path matches regex ^/assets/.*\.(js|css|woff2?|png|webp|svg|ico)$
THEN
Cache eligibility: Eligible for cache
Edge TTL: Override — 1 year (31536000 seconds)
Browser TTL: Override — 1 year
Respect origin Cache-Control: disabled (use rule values)
Set response header: Cache-Control = public, max-age=31536000, immutable
Rule 2 — HTML entry points
IF URI Path matches regex \.html$ OR URI Path equals /
THEN
Cache eligibility: Bypass cache
Set response header: Cache-Control = no-cache, no-store, must-revalidate
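Before saving Rule 1, it can help to try its expression against a handful of real paths. The sketch below uses grep -E locally; it assumes that POSIX extended regex and the edge's regex engine agree for this particular pattern, which holds for simple alternations like this one but is not guaranteed in general. The function name and sample paths are illustrative.

```shell
#!/usr/bin/env bash
# Same expression as Rule 1 above.
ASSET_RE='^/assets/.*\.(js|css|woff2?|png|webp|svg|ico)$'

# is_fingerprinted_asset PATH: succeed if PATH matches the Rule 1 expression.
is_fingerprinted_asset() {
  printf '%s\n' "$1" | grep -Eq "$ASSET_RE"
}

for path in /assets/app.a1b2c3d4.js /assets/font.0f1e2d3c.woff \
            /index.html /assets/app.a1b2c3d4.js.map; do
  if is_fingerprinted_asset "$path"; then
    echo "match     $path"
  else
    echo "no match  $path"
  fi
done
```

Source maps (.js.map) fall outside the pattern, so they would be handled by whatever rule or default applies to other paths.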
By default, Cloudflare converts strong ETag headers to weak ones when it modifies a response, for example by compressing it. To preserve your content-hash ETags as strong validators, either:
- Enable “Respect Strong ETags” in Cloudflare’s Speed → Optimization settings (Cloudflare then keeps the strong ETag instead of weakening it), or
- Disable Cloudflare’s automatic compression for the asset path and handle Content-Encoding at the origin.
Step 5 — AWS CloudFront note
CloudFront forwards Cache-Control headers from S3 object metadata to the browser unchanged and uses them to set the edge TTL by default (when the “Cache based on selected request headers” behaviour is set to “None (Improves Caching)”). Upload fingerprinted assets with --cache-control "public, max-age=31536000, immutable" in the S3 metadata at CI time:
aws s3 cp dist/assets/ s3://my-bucket/assets/ \
--recursive \
--cache-control "public, max-age=31536000, immutable" \
--metadata-directive REPLACE
CloudFront does not inject immutable automatically — you must set it on the S3 object or via a CloudFront Function / Lambda@Edge response handler.
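Because HTML and fingerprinted assets need different metadata, a single recursive upload with one --cache-control value is not enough for a whole dist directory. The sketch below picks the value per file using the policies from the Nginx configuration earlier in this guide; cache_control_for is a hypothetical helper, the file list is a placeholder, and the loop only prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# cache_control_for PATH: print the Cache-Control value this guide uses for PATH.
cache_control_for() {
  case "$1" in
    *.html)              echo "no-cache, no-store, must-revalidate" ;;
    assets/*|*/assets/*) echo "public, max-age=31536000, immutable" ;;
    *)                   echo "public, max-age=3600" ;;
  esac
}

# Dry run: print the upload commands instead of executing them.
for f in index.html assets/app.a1b2c3d4.js favicon.ico; do
  echo aws s3 cp "dist/$f" "s3://my-bucket/$f" \
    --cache-control "\"$(cache_control_for "$f")\""
done
```

Removing the echo turns the dry run into real uploads; reviewing the printed commands first is a cheap way to catch a misclassified path.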
Step 6 — Add Surrogate-Key headers for tag-based purging (Fastly / Varnish)
When you use Fastly or Varnish in front of an origin, Surrogate-Key (Fastly) or Xkey (Varnish Plus) lets you purge entire sets of assets atomically by release tag, without knowing every individual URL. This is the server-side complement to filename hashing — see cache key architecture for the broader strategy.
Add a release tag header at your origin:
# In the Nginx assets location block, add:
add_header Surrogate-Key "assets release-2024-10-01" always;
add_header Surrogate-Control "max-age=31536000" always;
Surrogate-Control sets the CDN-side TTL without affecting the Cache-Control header the browser sees. After a deploy, purge all assets for the old release in one API call:
# Fastly instant purge by Surrogate-Key tag
curl -X POST "https://api.fastly.com/service/${FASTLY_SERVICE_ID}/purge/release-2024-10-01" \
-H "Fastly-Key: ${FASTLY_API_KEY}"
Dual-Layer Caching Strategy Diagram
Verification Commands
Run these after each deploy to confirm headers are set correctly at both origin and CDN.
# 1. Check Cache-Control and ETag on a fingerprinted asset at the CDN edge
curl -sI "https://example.com/assets/app.a1b2c3d4.js" | grep -Ei "cache-control|etag|vary|cf-cache-status|x-cache"
# Expected output:
# cache-control: public, max-age=31536000, immutable
# etag: "a1b2c3d4"
# vary: Accept-Encoding
# cf-cache-status: HIT ← Cloudflare; CDNs such as CloudFront and Fastly report this in x-cache
# 2. Confirm the edge answers a conditional request for the matching ETag
curl -sI "https://example.com/assets/app.a1b2c3d4.js" \
-H 'If-None-Match: "a1b2c3d4"' | head -1
# Should return "HTTP/2 304" once the asset is cached at the edge: the
# validator matches, so no body is sent. The immutable directive only stops
# browsers from sending this request; it does not change how the CDN answers it.
# 3. Confirm HTML entry point is not cached
curl -sI "https://example.com/" | grep -i cache-control
# Expected: cache-control: no-cache, no-store, must-revalidate
# 4. Verify ETag matches build manifest hash
EXPECTED=$(jq -r '.["/assets/app.a1b2c3d4.js"]' dist/asset-manifest.json)
# Strip a weak-validator prefix (W/), the quotes, and the trailing carriage return
ACTUAL=$(curl -sI "https://example.com/assets/app.a1b2c3d4.js" | grep -i '^etag:' | awk '{print $2}' | sed 's#^W/##' | tr -d '"\r')
echo "Manifest: $EXPECTED CDN: $ACTUAL"
[ "$EXPECTED" = "$ACTUAL" ] && echo "MATCH" || echo "MISMATCH — check deploy"
# 5. Check Vary header does not include Cookie or User-Agent
curl -sI "https://example.com/assets/app.a1b2c3d4.js" | grep -i "^vary:"
# Must NOT show: vary: Cookie, vary: User-Agent
# 6. For Fastly: verify Surrogate-Key tag is present on origin responses
curl -sI "https://origin.example.com/assets/app.a1b2c3d4.js" | grep -i surrogate
# Expected: surrogate-key: assets release-2024-10-01
# surrogate-control: max-age=31536000
Edge Cases and Known Issues
ETag stripping by CDN compression
When a CDN compresses a response (gzip/brotli), it changes the byte content, so the origin's strong ETag no longer describes the bytes that are sent. By default, Cloudflare converts a strong ETag to a weak one (W/ prefix) rather than stripping it. Fastly strips ETags on compression by default. Mitigation options:
- Pre-compress assets at build time (.js.gz, .js.br) and serve the pre-compressed file directly, disabling on-the-fly compression for those paths. The ETag then matches the pre-compressed bytes and remains stable.
- On Cloudflare: enable “Respect Strong ETags” in Speed → Optimization. Cloudflare then preserves the origin's strong ETag instead of weakening it.
- On Nginx: use gzip_static on and brotli_static on (the latter requires the third-party ngx_brotli module) so Nginx serves pre-compressed files directly, keeping the ETag you injected intact.
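When comparing an ETag seen at the edge with the value in the build manifest, it helps to strip the decorations that intermediaries may add. The sketch below is a hypothetical helper for verification scripts only: it drops a weak W/ prefix, the surrounding quotes, a trailing carriage return, and a -gzip or -br suffix of the kind some servers append. It deliberately performs a loose comparison and is not a substitute for HTTP's strong validation.

```shell
#!/usr/bin/env bash
# normalize_etag VALUE: reduce an ETag header value to its bare opaque tag.
normalize_etag() {
  printf '%s' "$1" \
    | tr -d '\r' \
    | sed -E 's#^W/##; s/^"//; s/"$//; s/-(gzip|br)$//'
}

normalize_etag 'W/"a1b2c3d4"'     # prints a1b2c3d4
echo
normalize_etag '"a1b2c3d4-gzip"'  # prints a1b2c3d4
echo
```

If the normalized edge value still differs from the manifest entry, the mismatch is in the content hash itself rather than in how the CDN rewrote the header.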
immutable ignored by older clients
Firefox (49+) and Safari (11+) honour immutable. Chromium-based browsers and all IE clients treat the directive as unknown and rely on the max-age TTL only. They still revalidate at the end of max-age, so the behaviour is correct, just slightly less optimal. No special fallback is needed beyond including a long max-age.
Last-Modified unreliability across nodes
Last-Modified reflects the filesystem mtime of the served file. In a multi-node deployment, even when file content is identical, mtime will differ across nodes if they received files at slightly different timestamps. A client switching nodes between requests may receive a 200 instead of a 304, wasting bandwidth. For cache key architecture decisions, treat Last-Modified as a secondary fallback only — ETag is the authoritative validator.
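One possible way to reduce this drift, assuming you control the deploy step on every node, is to pin each file's mtime to a fixed build timestamp after copying. This is a sketch of that idea rather than something the configuration above depends on; pin_mtime is a hypothetical helper and the timestamp is a placeholder.

```shell
#!/usr/bin/env bash
# pin_mtime DIR STAMP: set the modification time of every regular file under
# DIR to STAMP, given in touch -t format ([[CC]YY]MMDDhhmm[.SS]), read as UTC.
pin_mtime() {
  local dir="$1" stamp="$2"
  find "$dir" -type f -exec env TZ=UTC touch -m -t "$stamp" {} +
}

# Usage (placeholder timestamp for 2024-10-01 00:00:00 UTC):
#   pin_mtime ./dist 202410010000.00
```

With identical mtimes on every node, Last-Modified becomes consistent, though ETag remains the more reliable validator.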
Vary: Cookie causing cache fragmentation
If your application sets a session cookie on responses, even on static asset responses, and those responses carry Vary: Cookie, the CDN creates a separate cache entry per distinct Cookie header value. A site with 10,000 users can generate 10,000 cache entries for the same file. Fix: serve static assets from a cookieless subdomain (assets.example.com) and ensure no Set-Cookie header appears on asset responses.
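A deploy check can fail fast when a fragmenting Vary value slips in. The sketch below reads raw response headers on stdin and succeeds only if no Vary field lists Cookie or User-Agent; vary_is_safe is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# vary_is_safe: succeed only if no Vary header on stdin names Cookie or
# User-Agent. A response with no Vary header at all counts as safe.
vary_is_safe() {
  ! tr -d '\r' \
    | grep -i '^vary:' \
    | grep -Eiq '(^vary:|,)[[:space:]]*(cookie|user-agent)[[:space:]]*(,|$)'
}

# Example with canned headers; in practice pipe `curl -sI <url>` into it.
if printf 'HTTP/2 200\r\nvary: Accept-Encoding, Cookie\r\n' | vary_is_safe; then
  echo "Vary is minimal"
else
  echo "Vary fragments the cache"
fi
```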
AWS CloudFront and the immutable directive
CloudFront does not natively interpret immutable to modify its own TTL behaviour — it uses Cache-Control: max-age for the edge TTL by default. The immutable flag passes through to the browser unchanged, where it does take effect. If you need CloudFront to honour a longer or independent edge TTL, set a custom “Default TTL” and “Maximum TTL” in the CloudFront cache behaviour. S3 object metadata Cache-Control takes precedence over the CloudFront default TTL when present.
Cloudflare Cache-Control overrides
Cloudflare’s default “Browser Cache TTL” setting (under Caching → Configuration) can override origin Cache-Control headers for the browser. Set it to “Respect Existing Headers” to ensure your origin-set max-age=31536000, immutable reaches the browser unchanged.
Hash length and collision risk
The default 8 hex characters (32 bits of entropy) gives a collision probability of ~1% at around 9,300 files. Most single-app projects stay well below this. However, monorepos with multiple apps sharing a single CDN bucket, or build pipelines that emit thousands of code-split chunks, should move to 12 chars (48 bits, ~1% at around 2.4 million files) or 16 chars (64 bits, ~1% at roughly 600 million files, effectively collision-proof). See hash algorithm choice for detailed collision probability tables.
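The figures for any hash length can be reproduced with the standard birthday-bound approximation, p ≈ 1 − exp(−n(n−1) / (2·2^bits)). The sketch below wraps that formula and its inverse in two hypothetical helper functions using awk; it assumes hashes are uniformly distributed, which holds for a truncated SHA-256 digest.

```shell
#!/usr/bin/env bash
# collision_probability BITS FILES: chance that any two of FILES hashes,
# each truncated to BITS bits, collide.
collision_probability() {
  awk -v bits="$1" -v n="$2" 'BEGIN {
    printf "%.4f\n", 1 - exp(-n * (n - 1) / (2 * 2 ^ bits))
  }'
}

# files_at_probability BITS P: approximate file count at which the collision
# probability reaches P, from n ≈ sqrt(2 * 2^BITS * ln(1 / (1 - P))).
files_at_probability() {
  awk -v bits="$1" -v p="$2" 'BEGIN {
    printf "%d\n", sqrt(2 * 2 ^ bits * log(1 / (1 - p)))
  }'
}

collision_probability 32 9300   # 8 hex characters; prints 0.0100
files_at_probability 48 0.01    # 12 hex characters
files_at_probability 64 0.01    # 16 hex characters
```

Each hex character contributes 4 bits, so pass the character count multiplied by 4 as BITS.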
Performance Impact
The dual-layer strategy eliminates the most expensive category of browser-to-origin requests: conditional revalidation on unchanged assets.
Without immutable: When the user reloads a page, a browser that revalidates on reload sends an If-None-Match request for each cached asset, even though the asset is still within its max-age window. A page with 20 assets generates 20 conditional requests. Even if all return 304 Not Modified, each round-trip adds latency, typically 20–100 ms per hop depending on geography.
With immutable: The browser skips conditional requests entirely for the max-age duration. For a one-year TTL, the only origin (or CDN) fetch per asset happens once per browser. Subsequent page loads serve everything from the browser’s local disk cache with zero network activity for those assets.
CDN hit ratio effect: Because the cache key is the full fingerprinted URL path, and that URL never changes for a given file content, CDN hit ratios for fingerprinted assets routinely exceed 99% once the cache warms. Non-fingerprinted assets keyed on plain paths with Vary: Cookie or Vary: User-Agent often see hit ratios below 50% on high-traffic sites.
Surrogate-Key purge speed: Tag-based purging on Fastly propagates globally in under 150 ms. Compared to path-based purge loops (one API call per file), tag purging is O(1) from the operator’s perspective regardless of how many assets share the tag. For immutable TTL tuning details including staggered deployment patterns, see the linked guide.
Bandwidth savings from pre-compression + stable ETags: Serving brotli-pre-compressed assets (.js.br) avoids on-the-fly compression CPU cost at the origin and CDN, while keeping ETags stable across compressions.
FAQ
Why do I need both a content hash in the filename and headers like Cache-Control: immutable — isn’t one of them enough?
The filename hash ensures that every unique content version has a unique URL, so browser and CDN caches never serve stale content at a known URL. But without Cache-Control: immutable, browsers may still send a conditional request (If-None-Match) when the user reloads the page, even while the asset is fresh and even though the URL can never change. The immutable directive tells the browser: “for the entire max-age window, do not even attempt to revalidate.” The two layers solve different problems: the filename hash handles cache busting; the header handles revalidation frequency. Removing either layer degrades caching efficiency. For a detailed comparison of the ETag vs immutable strategies, see the linked page.
Should I disable ETags entirely for fingerprinted assets, or keep them?
Keep strong, content-hash-derived ETags even when using immutable. The immutable directive only affects how browsers revalidate; ETags are used when the CDN itself revalidates a stale copy against the origin after its own TTL expires. Without a validator, the origin must send a full 200 response rather than a 304, wasting bandwidth on the CDN-to-origin leg. Disable only the server’s automatic ETag generation (etag off in Nginx), and replace it with the build-manifest-derived value.
What Cache-Control header should I set on the HTML entry point (index.html)?
Use Cache-Control: no-cache, no-store, must-revalidate on index.html and any other HTML entry points. The HTML file is the only document that references all the hashed asset URLs. If the browser caches a stale HTML file, it requests old asset URLs — which may still be cached at the CDN but represent the previous deployment. Keeping HTML uncached guarantees the browser always fetches the latest manifest of hashed references, while every referenced asset itself is cached aggressively. The extra latency of fetching HTML on every navigation is negligible compared to the asset payload it references.
How does Surrogate-Key purging interact with fingerprinted filenames during a rolling deploy?
During a rolling deploy, old asset filenames (old hashes) and new asset filenames (new hashes) coexist on the CDN simultaneously. You should not purge old-hash URLs during the deploy — in-flight HTML responses referencing old hashes must still resolve. Instead, tag new assets with the new release tag and old assets with the old release tag. Only purge the old release tag after HTML delivery has fully transitioned to the new deployment (typically after your old instance terminates). Alternatively, let old-hash assets expire naturally — their max-age=31536000 means they occupy CDN cache space but are never requested again once all HTML is updated.
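Before purging an old release tag, one simple safeguard is to confirm that the live HTML no longer references any old-hash file. The sketch below extracts /assets/ paths from HTML with a regex scan (not a real HTML parser) and checks for a given hash; both function names are hypothetical.

```shell
#!/usr/bin/env bash
# referenced_assets: read HTML on stdin and print each distinct /assets/ URL
# path it references, one per line.
referenced_assets() {
  grep -oE '/assets/[A-Za-z0-9._-]+' | sort -u
}

# still_references HASH: succeed if the HTML on stdin references a file whose
# name contains ".HASH." (for example app.a1b2c3d4.js).
still_references() {
  referenced_assets | grep -qF ".$1."
}

# Example with canned HTML; in practice pipe `curl -s https://example.com/`.
html='<script src="/assets/app.a1b2c3d4.js"></script>'
if printf '%s\n' "$html" | still_references a1b2c3d4; then
  echo "old hash still referenced; do not purge yet"
fi
```

Running the check against every node or region that serves HTML gives a clearer signal than checking a single endpoint during a rolling deploy.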
Related
- Static Asset Fingerprinting Fundamentals — parent overview
- ETag vs Immutable Cache-Control for Assets — deep comparison of the two validators
- Best Practices for Static Asset Naming Conventions — filename patterns that align with header strategy
- Cache Key Architecture — how CDN cache keys interact with fingerprinting
- Cache-Control Immutable and TTL Tuning — fine-tuning TTL values and staggered purge patterns