CDN Edge Caching Configuration: Frontend Asset Delivery & Performance Tuning

This guide sits under the broader advanced caching and CDN architecture reference and focuses on one job: making the edge serve your frontend bundles within budget. Modern frontend architectures rely on distributed delivery networks to minimize latency and maximize throughput. Proper edge caching is not merely about setting expiration headers; it requires precise rule orchestration, diagnostic validation, and protocol alignment. The budgets held throughout: cached static assets should return a Time-To-First-Byte (TTFB) of ≤ 200ms (ideally ≤ 50ms from a warm edge), the cache hit ratio for hashed assets should exceed 85%, and the downstream metric, Largest Contentful Paint (LCP), must stay under 2.5s at the field 75th percentile.

The workflow below moves from baseline capture, through bottleneck isolation, to a validated fix. Every configuration snippet carries an explicit trade-off comment so you know when not to apply it.

Figure: Edge cache request path and budgets. A browser request (service worker + disk cache) resolves at the nearest edge node (TTFB ≤ 50ms), falls back to a regional shield cache, then to origin as a last resort, each hop with a latency budget. Hit ratio target: > 85% static, > 60% dynamic API. Each miss that reaches origin costs a full round trip, so keep it rare. Shielding collapses many edge misses into one origin fetch. Field p75 ships; synthetic probes only locate the slow hop.

Problem Framing: Where Edge Latency Actually Comes From

When a frontend feels slow despite a "fully cached" CDN, the regression almost always traces to one of four measurable failure modes: a cache key that fragments identical assets into distinct objects, a Vary header that multiplies stored variants, a TTL that forces conditional revalidation on every request, or a TLS handshake that adds a hidden round trip before any byte of payload moves. Each degrades a specific number. Fragmentation and over-broad Vary crush the hit ratio below the 85% floor. Short or absent TTLs push TTFB past the 200ms boundary because the browser issues If-None-Match and waits for a 304. Handshake bloat inflates the connection setup phase that precedes TTFB entirely.

The diagnostic discipline is to attribute the slowdown to one phase before touching configuration. A 600ms TTFB on a "cached" asset is not a TTL problem if CF-Cache-Status: HIT is present — it is an origin-shield or handshake problem. Read the headers first.
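
The attribution rule above can be sketched as a small decision function. This is a minimal illustration, not a complete classifier: the function name, the input shape, and the returned labels are hypothetical, and the status values assume Cloudflare-style CF-Cache-Status strings.

```javascript
// Hypothetical helper: names and return labels are illustrative only.
// Maps one probe (cache status + measured TTFB) to the phase worth investigating.
function attributeSlowdown({ cacheStatus, ttfbMs, budgetMs = 200 }) {
  const status = (cacheStatus || '').toUpperCase();
  if (ttfbMs <= budgetMs) return 'within-budget';
  // Slow despite a HIT: the TTL is not the cause, look at shield or handshake.
  if (status === 'HIT') return 'shield-or-handshake';
  // The edge had to go back to origin to validate the object: TTL is too short.
  if (status === 'REVALIDATED' || status === 'EXPIRED') return 'ttl-revalidation';
  // Anything else (MISS, BYPASS, DYNAMIC, unknown): audit the cache key and Vary.
  return 'miss-check-key-and-vary';
}
```

Feeding it the 600ms HIT case from the paragraph above returns the shield-or-handshake label, which is the point: the header decides which knob to turn.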

1. Environment Setup: Categorize Assets by Volatility

The foundation of edge caching is TTL alignment with asset volatility. Categorize every build output from Vite or Webpack into three tiers before writing a single rule.

Immutable assets — content-hashed JS/CSS bundles like app.a1b2c3d4.js — should enforce max-age=31536000 with the immutable directive. This eliminates conditional If-None-Match requests entirely, cutting origin load by roughly 90% for static resources because the browser never revalidates. The exact directive semantics that make this safe are covered in HTTP Cache-Control headers explained; the short version is that immutable is a promise the filename will change before the content does.

Versioned assets — non-hashed images, legacy polyfills, favicons — require max-age=86400 through max-age=604800 depending on release cadence. They lack a content hash, so they need an expiry that bounds staleness.

Dynamic payloads — JSON configs, feature flags, user-adjacent data — should lean on stale-while-revalidate to keep sub-100ms responses during background origin fetches. The full implementation of that directive across origin, edge, and service worker is documented in stale-while-revalidate implementation.
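
One way to make the three tiers concrete is a single function that maps a request path to its Cache-Control value. The sketch below is an assumption-laden example: the function name and the regular expressions are hypothetical, the hash pattern assumes eight or more hex characters before the extension, and the stale-while-revalidate numbers are borrowed from the middleware snippet later in this guide.

```javascript
// Hypothetical tier classifier; adjust the patterns to your bundler's naming.
const HASHED_BUNDLE = /\.[0-9a-f]{8,}\.(?:js|css)$/i; // e.g. app.a1b2c3d4.js
const DYNAMIC_PAYLOAD = /\.json$/i;                   // configs, feature flags

function cacheControlFor(path) {
  // SPA entry point: always revalidate so route updates propagate.
  if (path === '/' || path.endsWith('/index.html')) return 'no-cache';
  // Immutable tier: the filename changes before the content does.
  if (HASHED_BUNDLE.test(path)) return 'public, max-age=31536000, immutable';
  // Dynamic tier: serve instantly, refresh in the background.
  if (DYNAMIC_PAYLOAD.test(path)) return 'public, max-age=60, stale-while-revalidate=86400';
  // Versioned tier: no content hash, so bound staleness to one day.
  return 'public, max-age=86400';
}
```

Note that an unhashed /assets/app.js falls into the versioned tier here, which is the safe default when a file cannot prove it is immutable.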

2. Capture Baseline: Cache Status & TTFB Probes

Before deploying new edge rules, capture a baseline so you can prove the fix worked. Audit current cache-status headers across your top 20 traffic paths with curl -I or the DevTools Network tab, recording X-Cache, CF-Cache-Status, or Age for each.

bash
# Probe cache status and TTFB across a path list from one region.
for p in / /assets/app.js /assets/main.css /api/config; do
  # Anchor header names so a bare "age" cannot match unrelated header lines.
  curl -sw 'TTFB=%{time_starttransfer}s\n' -o /dev/null -D - "https://your-domain.com$p" \
    | grep -iE '^(cf-cache-status|x-cache|age|cache-control):|^TTFB='
  echo "--- $p"
done
# trade-off: single-region curl misses geographic latency entirely.
# Do NOT trust these numbers as a field baseline — use WebPageTest from
# distributed probes when the issue is regional, not configurational.

Run synthetic monitoring (WebPageTest, Lighthouse CI) from geographically distributed probes to measure TTFB at the edge. If TTFB exceeds 200ms for assets that report a cache HIT, the problem is origin shielding or handshake latency, not TTL. Set alert thresholds in your observability stack: warn when the static hit ratio drops below 75% for 10 consecutive minutes, or when TTFB p95 crosses 200ms. Correlate those alerts with Core Web Vitals regressions so you remediate what actually moves the field number.
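
The alert thresholds above reduce to two numbers per time window: a hit ratio and a TTFB p95. A minimal sketch of that arithmetic follows; the sample shape ({ cacheStatus, ttfbMs }) and all function names are hypothetical, and the "10 consecutive minutes" condition is left to whatever alerting system consumes the result.

```javascript
// Fraction of samples whose cache status is HIT.
function hitRatio(statuses) {
  if (statuses.length === 0) return 0;
  const hits = statuses.filter((s) => String(s).toUpperCase() === 'HIT').length;
  return hits / statuses.length;
}

// Nearest-rank percentile: smallest value with at least p% of samples at or below it.
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Evaluate one window of probe samples against the thresholds from the text.
function evaluateWindow(samples) {
  const ratio = hitRatio(samples.map((s) => s.cacheStatus));
  const p95 = percentile(samples.map((s) => s.ttfbMs), 95);
  return { ratio, p95, warnHitRatio: ratio < 0.75, warnTtfb: p95 > 200 };
}
```

Nearest-rank is chosen here for simplicity; an observability stack may interpolate percentiles differently, so small samples can disagree with its dashboards.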

3. Isolate Bottleneck: Cache Key & Vary Audit

A low hit ratio with correct TTLs almost always means cache-key fragmentation. The CDN treats /assets/app.js?utm_source=x and /assets/app.js as two objects, so a marketing campaign can silently halve your hit ratio overnight. Audit the cache key composition first.

Configure path-based rules to intercept /assets/, /static/, and /*.js, then normalize the key to strip tracking parameters (utm_*, fbclid, gclid) that fragment storage without altering the payload. For SPA fallbacks like index.html, enforce no-cache so route updates propagate immediately without a full purge. The second fragmentation source is Vary: a Vary: User-Agent header explodes a single asset into hundreds of variants. Keep Vary to Accept-Encoding and, where genuinely needed, Accept for content negotiation — nothing else.
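
The key normalization described above can be illustrated with the standard URL API. This is a sketch of the idea rather than any provider's implementation: the function name and the tracking-parameter list are assumptions, and a real deployment would express the same rule in the CDN's own cache-key settings.

```javascript
// Parameters that change the URL but not the payload (illustrative list).
const TRACKING_PARAMS = [/^utm_/i, /^fbclid$/i, /^gclid$/i];

// Returns the URL that should be used as the cache key.
function normalizeCacheKey(rawUrl) {
  const url = new URL(rawUrl);
  // Copy the keys first: deleting while iterating the live list skips entries.
  for (const name of [...url.searchParams.keys()]) {
    if (TRACKING_PARAMS.some((re) => re.test(name))) url.searchParams.delete(name);
  }
  url.searchParams.sort(); // ?a=1&b=2 and ?b=2&a=1 become one object
  url.hash = '';           // fragments never reach the server
  return url.toString();
}
```

With this in place, /assets/app.js?utm_source=x and /assets/app.js resolve to the same key, so a campaign link no longer creates a second cached object.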

When the slowdown survives a clean key and tight Vary, the remaining suspect is invalidation strategy — stale objects being purged on every deploy and re-fetched cold. The patterns that avoid that thrash are covered in cache invalidation patterns, which contrasts tag-based purging against versioned URLs.

4. Apply Fix: Edge Rules for Frontend Bundles

With assets categorized and keys normalized, write provider rules. The three snippets below are production-ready starting points.

Cloudflare Cache Rules for Immutable Assets

Cloudflare's modern Cache Rules (replacing the deprecated Page Rules API) use the Ruleset Engine. Configure via Caching > Cache Rules in the dashboard or the API:

json
{
  "rules": [
    {
      "expression": "(http.request.uri.path matches \"^/assets/\")",
      "action": "set_cache_settings",
      "action_parameters": {
        "cache": true,
        "edge_ttl": { "mode": "override_origin", "default": 31536000 },
        "browser_ttl": { "mode": "override_origin", "default": 31536000 }
      }
    },
    {
      "expression": "(http.request.uri.path eq \"/index.html\")",
      "action": "set_cache_settings",
      "action_parameters": { "cache": false }
    }
  ]
}

This enforces a 1-year edge and browser TTL for hashed assets while bypassing cache for the SPA entry point. JSON cannot carry comments, so the trade-off is stated here: do NOT apply the ^/assets/ immutable rule unless every file under that path carries a content hash. A single unhashed file there will be pinned for a year and require a manual purge to update.

Vercel Edge Middleware Cache Headers

javascript
import { NextResponse } from 'next/server';

export function middleware(request) {
  const response = NextResponse.next();
  const url = request.nextUrl.pathname;

  if (url.startsWith('/_next/static/')) {
    // Hashed build output: pin for a year, skip revalidation entirely.
    response.headers.set('Cache-Control', 'public, max-age=31536000, immutable');
  } else if (url.endsWith('.json')) {
    // Semi-dynamic config: instant serve, refresh in background.
    response.headers.set('Cache-Control', 'public, max-age=60, stale-while-revalidate=86400');
  }
  // trade-off: middleware runs on EVERY matched request and adds edge compute
  // cost. Do NOT use it for headers you can set statically in next.config —
  // reserve it for conditional logic that static config cannot express.
  return response;
}

AWS CloudFront Cache Policy

json
{
  "CachePolicyConfig": {
    "Name": "Frontend-Static-Assets",
    "ParametersInCacheKeyAndForwardedToOrigin": {
      "EnableAcceptEncodingGzip": true,
      "EnableAcceptEncodingBrotli": true,
      "CookiesConfig": { "CookieBehavior": "none" },
      "HeadersConfig": { "HeaderBehavior": "none" },
      "QueryStringsConfig": { "QueryStringBehavior": "none" }
    },
    "DefaultTTL": 31536000,
    "MaxTTL": 31536000,
    "MinTTL": 0
  }
}

Keeps cookies, headers, and query strings out of the cache key to maximize hit ratio for versioned assets; the two EnableAcceptEncoding flags let CloudFront key on a normalized Accept-Encoding so compressed variants are cached separately. The trade-off: CookieBehavior: none means this policy can never serve personalized responses. Do NOT attach it to any route that reads a session cookie, or every user will receive the first cached variant.

Deconstructing the TTFB Phases at the Edge

To tune systematically, break the cached-asset response into its timing phases, each with its own budget and its own fix. A 250ms TTFB is not one problem; it is a stack of four.

  • DNS + connection (target ≤ 30ms warm): resolved by edge anycast and connection reuse. If this dominates, your problem is handshake, not cache.
  • TLS handshake (target ≤ 100ms on 4G, near 0ms when resumed with 0-RTT): TLS 1.3 completes key exchange and authentication in a single round trip; session resumption shortens the handshake for returning users, and 0-RTT early data lets the request travel in the first flight.
  • Edge processing (target ≤ 10ms on HIT): cache lookup and rule evaluation. Heavy edge-worker logic on the hot path shows up here.
  • Origin round trip (only on MISS): the phase you are trying to make rare. Every uncached request pays it in full.

Attribute the dominant phase from a WebPageTest connection view before changing anything. Fixing TTL when the cost is in the handshake wastes a deploy cycle.
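
As a rough in-browser complement to the WebPageTest view, the same phases can be derived from a Resource Timing entry. The sketch below assumes an object with the standard PerformanceResourceTiming fields; the function names are hypothetical, and cross-origin entries report zeros for these fields unless the response carries Timing-Allow-Origin.

```javascript
// Split one timing entry into the phases discussed above (all values in ms).
function ttfbPhases(t) {
  // secureConnectionStart is 0 for plain HTTP or a reused connection.
  const tlsStart = t.secureConnectionStart > 0 ? t.secureConnectionStart : t.connectEnd;
  return {
    dns: t.domainLookupEnd - t.domainLookupStart,
    tcp: tlsStart - t.connectStart,
    tls: t.connectEnd - tlsStart,
    // Edge processing plus, on a MISS, the origin round trip.
    wait: t.responseStart - t.requestStart,
  };
}

// Name of the phase that contributes the most time.
function dominantPhase(phases) {
  return Object.entries(phases).sort((a, b) => b[1] - a[1])[0][0];
}
```

In a page, the input would come from performance.getEntriesByType('resource'); the wait phase cannot separate edge processing from an origin fetch on its own, so pair it with the cache-status header.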

Advanced Diagnostics: TLS, Protocol, and Service Worker Coordination

TLS handshake and session resumption

Even with optimal caching, an unoptimized handshake negates edge gains. Prefer TLS 1.3, which completes key exchange and authentication in a single round trip. Enable session resumption (session tickets, plus session IDs for TLS 1.2 clients), targeting a resumption rate above 70% for returning users. For any TLS 1.2 fallback, disable legacy cipher suites (RC4, 3DES, CBC) and enforce forward secrecy via ECDHE; TLS 1.3 already excludes them. Use OCSP stapling at the edge to remove certificate-validation latency on first connect. Hold full connection establishment (TCP + TLS) under 150ms for cached routes; inspect chrome://net-export traces or WebPageTest connection timings when handshake latency spikes during traffic surges.

Protocol selection and resource hints

Edge nodes are the first point of TLS negotiation and HTTP protocol selection. Enable HTTP/2 multiplexing — or HTTP/3 where available — to reuse connections. Note that HTTP/2 Server Push was removed from Chrome 106+, so use Link: rel=preload response headers from an edge worker for above-the-fold CSS and fonts only. Verify with performance.getEntriesByType('resource') that preloaded assets are consumed within 500ms of navigation start; if not, drop the hint to avoid priority inversion against the LCP resource.

javascript
// Edge worker injecting a preload hint for the critical stylesheet only.
addEventListener('fetch', (event) => {
  event.respondWith((async () => {
    const res = await fetch(event.request);
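    // Match on the pathname so a query string does not defeat the check.
    if (new URL(event.request.url).pathname.endsWith('/index.html')) {
      const h = new Headers(res.headers);
      h.append('Link', '</assets/critical.css>; rel=preload; as=style');
      return new Response(res.body, { status: res.status, statusText: res.statusText, headers: h });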
    }
    return res;
    // trade-off: a preload hint for a resource the page does not use early
    // wastes bandwidth and can demote the real LCP image. Do NOT preload
    // speculatively — only assets the first paint provably needs.
  })());
});
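
To check whether a preload hint earns its place, the resource entries can be audited against the 500ms window mentioned above. This sketch makes an assumption worth stating loudly: Resource Timing does not record when a resource was consumed, so it treats "finished downloading within the window" (responseEnd) as a proxy. The function name and return shape are hypothetical.

```javascript
// entries: objects with the PerformanceResourceTiming fields `name` and `responseEnd`.
// preloadedUrls: absolute URLs that were hinted via Link: rel=preload.
function auditPreloads(entries, preloadedUrls, windowMs = 500) {
  const byName = new Map(entries.map((e) => [e.name, e]));
  const late = [];    // fetched, but finished after the window
  const missing = []; // hinted, but never requested by the page
  for (const url of preloadedUrls) {
    const entry = byName.get(url);
    if (!entry) missing.push(url);
    else if (entry.responseEnd > windowMs) late.push(url);
  }
  return { late, missing };
}
```

In a page, entries would be performance.getEntriesByType('resource'); anything that lands in either list is a candidate for dropping the hint.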

Service worker and edge coordination

Frontend apps often run dual cache layers: the CDN edge and a client-side service worker. Without coordination they conflict, serving stale shells or double-fetching. Use cache-first at the edge for versioned assets while the service worker owns offline fallback and dynamic routes. On deploy, purge edge caches for changed assets and message the waiting service worker via postMessage so it calls skipWaiting(). For the strategy patterns that govern the client layer, study service worker caching strategies, and verify version parity across layers with caches.keys() and navigator.serviceWorker.controller.
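
The version-parity check can be reduced to comparing cache names against the version the current deploy expects. The sketch below assumes a naming convention of a fixed prefix followed by the build version (for example app-shell-v42); that convention, the prefix, and the function name are all hypothetical.

```javascript
// Returns the cache names that belong to this app but not to the current version.
function staleCacheNames(cacheNames, currentVersion, prefix = 'app-shell-') {
  const expected = prefix + currentVersion;
  return cacheNames.filter((name) => name.startsWith(prefix) && name !== expected);
}
```

Inside a service worker's activate handler, the input would be the result of caches.keys(), and each returned name could be passed to caches.delete() so the client layer never mixes assets from two builds.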

The remaining edge-case failure mode is the origin outage. When the origin returns 5xx mid-revalidation, a naive config surfaces the error to users. The stale-if-error directive serves the last good cached copy instead — the complete edge configuration for that is covered in configuring stale-if-error for origin outages.

Validation & Budgeting in CI

Make the thresholds executable so a regression fails the build, not the field. Assert cache headers and hit-status in CI against a staging deploy before promotion.

javascript
// playwright + a header assertion run in CI against staging.
import { test, expect } from '@playwright/test';

test('hashed assets ship immutable with a year TTL', async ({ request }) => {
  const res = await request.get('https://staging.your-domain.com/assets/app.js');
  const cc = res.headers()['cache-control'] ?? '';
  expect(cc).toContain('immutable');
  expect(cc).toMatch(/max-age=31536000/);
  expect(Number(res.headers()['age'] ?? 0)).toBeGreaterThanOrEqual(0);
  // trade-off: asserting on a hashed filename couples the test to one build.
  // Do NOT hardcode the hash — glob the manifest, or this test breaks every deploy.
});

Pair the header assertion with a Lighthouse CI budget that holds TTFB and LCP. Set a time-to-first-byte budget at 200ms and a largest-contentful-paint budget at 2500ms so a cache regression that pushes either past the boundary blocks the merge. Track the same metrics in RUM keyed on CF-Cache-Status so you can distinguish an edge MISS spike from an origin slowdown in production.
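
A possible shape for those two budgets as Lighthouse CI assertions is sketched below. Two assumptions apply: the staging URL is the placeholder used elsewhere in this guide, and Lighthouse's server-response-time audit (which measures the main document's response time) stands in for the TTFB budget. The variable name is hypothetical; in a real project this object would be exported from lighthouserc.js.

```javascript
// Assumed Lighthouse CI config: fails the run when either budget is exceeded.
const lighthouseCiConfig = {
  ci: {
    collect: {
      url: ['https://staging.your-domain.com/'], // placeholder staging origin
      numberOfRuns: 3,                           // several runs to dampen noise
    },
    assert: {
      assertions: {
        // Stand-in for the 200ms TTFB budget (value in milliseconds).
        'server-response-time': ['error', { maxNumericValue: 200 }],
        // 2.5s LCP budget (value in milliseconds).
        'largest-contentful-paint': ['error', { maxNumericValue: 2500 }],
      },
    },
  },
};
```

Lab runs from a CI machine only approximate field conditions, so treat a failure here as a regression signal and keep the field p75 as the number that decides.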

Common Pitfalls

  • Identical TTLs for hashed and unhashed assets — stale delivery for the unhashed files or needless origin fetches; tier them.
  • Ignoring Vary fragmentation: a Vary: User-Agent header multiplies storage and tanks the hit ratio; keep it to Accept-Encoding.
  • Purging instead of versioning — manual purges cause cold-cache TTFB spikes on every deploy; prefer content hashing.
  • Unnormalized query parameters — duplicate cached objects for the same resource; strip tracking params from the key.
  • Uncoordinated SW and edge invalidation — mixed-version asset loading and broken hydration; add a version handshake in the app shell.
  • Caching index.html as immutable — pins the SPA entry point and freezes route updates for a year.
  • Trusting single-region curl as a baseline — masks the geographic latency that field users actually feel.

FAQ

What is the optimal cache hit ratio for frontend assets?

Target above 85% for static, versioned assets and above 60% for dynamic API responses. Ratios below those floors signal cache-key fragmentation, misconfigured TTLs, or over-broad Vary usage.

Should I use stale-while-revalidate for all frontend routes?

No. Reserve it for semi-dynamic payloads. Immutable assets should use strict max-age with immutable to avoid revalidation overhead entirely.

When should I purge the CDN versus deploying versioned assets?

Prefer versioned naming (app.a1b2c3.js) over purges. Reserve purges for emergency hotfixes or origin config changes that fingerprinting cannot express.