Service Worker Caching Strategies: Implementation, Diagnostics & Thresholds

This guide sits under the broader discipline of advanced caching and CDN architecture and focuses on the one layer you fully control in JavaScript: the service worker.

Implementing robust caching at the network boundary requires moving beyond browser defaults and programmatically controlling asset delivery. Service worker caching gives deterministic control over fetch interception, cache population, and fallback routing. The actionable boundaries you are defending are concrete: a repeat-visit Time to First Byte (TTFB) at or under 200ms when assets are served from the Cache API, a cache hit rate above 85% for versioned static assets, and zero regression in interaction responsiveness — an Interaction to Next Paint under 200ms — caused by background revalidation work. This page targets engineers who need to implement, measure, and troubleshoot caching logic without compromising data freshness or Core Web Vitals.

The workflow below is ordered: set up the environment and asset taxonomy, capture a baseline hit rate, isolate the dominant bottleneck, then apply the matching strategy. Skipping the baseline is the most common reason teams ship a service worker that feels fast in DevTools and regresses in the field.

1. Environment Setup & Strategy Selection Matrix

Before writing interception logic, pin your tooling and map your asset taxonomy to a strategy. Assume a Workbox 7.x toolchain (or hand-rolled Cache API), an HTTPS origin (or localhost for development), and a registered worker scoped to your app root.

Service workers intercept fetch events at the network boundary, letting you route requests through caches.match(), fetch(), or hybrid patterns. Static assets — JS bundles, CSS, hashed images — call for cache-first logic; dynamic API payloads call for network-first or stale-while-revalidate. Because the Cache API is independent of the HTTP cache, you must explicitly manage cache keys, expiration, and cleanup. Understanding how these patterns interact with origin directives is critical; the precedence rules live in HTTP Cache-Control headers explained.

The selection decision hinges on a single question per asset class: what is the cost of serving a value that is one revision out of date? For an immutable, content-hashed bundle the answer is "zero" — the filename changes when the bytes change, so a cached copy is never wrong, and cache-first is unambiguously correct. For a stock ticker or a checkout total the answer is "a correctness bug", and no amount of cache cleverness is worth it; those endpoints must stay network-only. Most real assets sit between those poles, and that middle ground is where teams get it wrong by reaching for whichever strategy they implemented first instead of the one the freshness tolerance demands.

A useful default mapping:

Asset class	Freshness tolerance	Strategy
Hashed JS/CSS/images	Immutable	Cache-first
App shell HTML	Minutes	Network-first w/ cache fallback
Read-heavy API (feeds, profiles)	Seconds–minutes	Stale-while-revalidate
Transactional API (cart, auth)	Real-time	Network-only

Environment checklist

Confirm registration scope: a worker at /sw.js controls /; a worker at /app/sw.js only controls /app/ unless the origin sends Service-Worker-Allowed.
Audit asset types via Chrome DevTools > Application > Cache Storage to verify pre-cached routes.
Map URL patterns to strategy types using explicit routing tables (/\.[0-9a-f]{8}\.js$/ for hashed assets vs /api/).
Verify the fetch listener wraps the entire promise chain in event.respondWith().
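
The routing table from the checklist can be sketched as one small pure function that the fetch listener consults. This is a minimal sketch, not a library API: the names ROUTES and pickStrategy are hypothetical, the hashed-asset pattern is the one above extended to CSS and a few image types, and the /api/cart and /api/auth prefixes are assumed stand-ins for your transactional endpoints.

```javascript
// Hypothetical routing table: the first matching rule wins, so order rules
// from most specific to least specific.
const ROUTES = [
  { test: (url) => /\.[0-9a-f]{8}\.(js|css|png|jpg|svg|webp)$/.test(url.pathname), strategy: 'cache-first' },
  { test: (url) => /^\/api\/(cart|auth)(\/|$)/.test(url.pathname), strategy: 'network-only' },
  { test: (url) => url.pathname.startsWith('/api/'), strategy: 'stale-while-revalidate' },
];

// Returns the strategy name for a request. Non-GET requests are never cached,
// navigations that match no rule use network-first, and any other unmatched
// request passes straight through to the network.
function pickStrategy(href, { method = 'GET', mode = 'no-cors' } = {}) {
  if (method !== 'GET') return 'network-only';
  const url = new URL(href);
  const rule = ROUTES.find((r) => r.test(url));
  if (rule) return rule.strategy;
  return mode === 'navigate' ? 'network-first' : 'network-only';
}
```

Keeping the decision in a pure function means the mapping can be unit-tested outside a browser, and the fetch listener only has to switch on the returned string.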

2. Capture a Baseline Hit Rate

You cannot improve what you have not measured. Before tuning strategies, instrument the live hit rate so every later change has a before/after. Detect a cache hit in the field by reading the Resource Timing entry: a transferSize of 0 alongside a non-zero encodedBodySize means the bytes came from a local store rather than the network.

javascript

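// Field beacon: classify each resource as hit or miss for RUM.
const RUM_ENDPOINT = '/rum/cache';
const po = new PerformanceObserver((list) => {
  // Skip uncontrolled pages so first-install misses do not skew the baseline.
  if (!navigator.serviceWorker || !navigator.serviceWorker.controller) return;
  const samples = list.getEntries()
    // Ignore the beacon's own entry, or each report would trigger another.
    .filter((e) => !e.name.includes(RUM_ENDPOINT))
    .map((e) => ({ name: e.name, hit: e.transferSize === 0 && e.encodedBodySize > 0 }));
  // One batched beacon per callback instead of one request per resource.
  if (samples.length) navigator.sendBeacon(RUM_ENDPOINT, JSON.stringify(samples));
});
po.observe({ type: 'resource', buffered: true });
// trade-off: transferSize is unreliable for cross-origin resources without
// Timing-Allow-Origin; skip this beacon for third-party hosts or it will
// misclassify every CDN asset as a miss.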

Baseline thresholds

Cache hit rate target: >85% for versioned static assets, measured at the field p75.
Max static asset cache size: ~50MB per origin before risking QuotaExceededError on low-end devices — confirm actual headroom with navigator.storage.estimate().
Stale data tolerance: <24h for non-critical API endpoints.

Record the p50/p75/p95 hit rate over at least one full deploy cycle. First-install misses are expected and must be filtered out by checking that navigator.serviceWorker.controller exists. The reason a full cycle matters is that hit rate is a lagging indicator tied to the deploy cadence: it dips sharply right after every release as users pick up the new worker and repopulate caches, then climbs as the population warms. A single afternoon's sample taken just after a deploy will look alarming and just before one will look flawless; only the distribution across a whole cycle tells you the true floor you are defending. Capture lab numbers in parallel — a Lighthouse run under throttling locates which asset is missing — but let the field p75 be the number that gates whether a change ships.
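
To make the p50/p75/p95 recording concrete, here is a minimal aggregation sketch. It assumes the beacons have already been grouped per session into lists of { hit } samples, and it uses the nearest-rank percentile definition; the function names hitRate, percentile, and summarize are illustrative, not part of any RUM product.

```javascript
// Hit rate for one session: hits / total. Returns null for an empty session
// so it can be filtered out rather than counted as 0%.
function hitRate(samples) {
  if (samples.length === 0) return null;
  const hits = samples.filter((s) => s.hit).length;
  return hits / samples.length;
}

// Nearest-rank percentile over a list of numbers (p between 0 and 100).
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Summarize per-session hit rates into the three numbers worth recording.
function summarize(sessionRates) {
  return {
    p50: percentile(sessionRates, 50),
    p75: percentile(sessionRates, 75),
    p95: percentile(sessionRates, 95),
  };
}
```

Run summarize() over each day of the deploy cycle rather than over the pooled total, so the post-release dip stays visible instead of being averaged away.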

3. Isolate the Dominant Bottleneck

With a baseline in hand, find the single largest source of misses or staleness before writing fixes. Most regressions trace to one of three causes: URL normalization drift, scope mismatch, or strategy mismatch.

Isolation workflow

Simulate offline mode in DevTools to verify fallback routing and caches.match() resolution.
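Filter the Network tab to worker-handled requests (in Chrome, the is:service-worker-intercepted filter); requests whose Size column does not read (ServiceWorker) bypassed the worker, and a companion row marked with a gear icon means the worker itself went to the network, which is a cache miss.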
Compare caches.match() versus fetch() latency using performance.getEntriesByType('resource') to confirm the cache is actually faster, not just present.
Track storage usage via navigator.storage.estimate() to rule out silent quota eviction.

When the routing itself looks correct but assets still miss, the failure is almost always in cache-key construction. The full production playbook for that — query-string drift, Vary mismatches, scope conflicts — is covered in debugging service worker cache misses in production.
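
Query-string drift is the easiest of those key problems to show in code. The sketch below canonicalizes a URL before it is used as a cache key. The parameter list is an assumption for illustration; audit your own traffic before deciding which parameters are safe to drop, and treat normalizeCacheKey as a hypothetical helper name.

```javascript
// Query parameters that vary per visit but never change the response body.
// This list is an assumption; verify it against your own URLs.
const IGNORED_PARAMS = ['utm_source', 'utm_medium', 'utm_campaign', 'fbclid', 'gclid'];

// Returns a canonical URL string to use as the cache key: tracking parameters
// removed, remaining parameters sorted, and the fragment dropped.
function normalizeCacheKey(href) {
  const url = new URL(href);
  for (const name of IGNORED_PARAMS) url.searchParams.delete(name);
  url.searchParams.sort();
  url.hash = '';
  return url.toString();
}
```

Use the same normalized string on both sides, for example caches.match(normalizeCacheKey(request.url)) on read and cache.put(normalizeCacheKey(request.url), response) on write, or the lookup and the write will still disagree.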

The discipline at this step is to change one variable at a time. It is tempting, having found a low hit rate, to simultaneously normalize URLs, widen scope, and rotate cache names in a single commit. That makes the regression go away but teaches you nothing about which lever moved the number, and it leaves you with three new behaviors to reason about the next time something breaks. Isolate the dominant cause, confirm it against the baseline beacon from the previous step, fix only that, and re-measure before touching the next suspect. Caching bugs are state bugs, and state bugs reward patience over breadth.

4. Apply the Matching Strategy

Cache-first with versioned fallback

Cache-first minimizes repeat-visit TTFB by answering from caches.match() before touching the network. Use a strict versioning scheme for cache names (static-v1.2.0, or simply static-v2 as in the example) to prevent stale bundle execution, and call cache.addAll() in the install event for critical shell assets.

javascript

const STATIC_CACHE = 'static-v2';

self.addEventListener('fetch', (event) => {
  const { request } = event;
  // The Cache API only stores GET requests; let everything else pass through.
  if (request.method !== 'GET' || !request.url.includes('/static/')) return;
  event.respondWith(
    caches.match(request).then((cached) => {
      if (cached) return cached;
      return fetch(request).then((res) => {
        // Cache only successful responses so a 404 or 500 is never pinned.
        if (res.ok) {
          const copy = res.clone();
          event.waitUntil(caches.open(STATIC_CACHE).then((cache) => cache.put(request, copy)));
        }
        return res;
      });
    })
  );
});
// trade-off: cache-first serves stale bytes forever for any URL that is not
// content-hashed. Never apply it to mutable HTML or unversioned assets, or a
// bad deploy will be pinned in users' caches until the cache name rotates.

Network-first with timeout

For the app shell and near-real-time reads, prioritize live data but wrap fetch() in a timeout and fall back to cache if the network exceeds 3 seconds. When coordinating with the edge, keep origin TTLs aligned with worker logic — misaligned TTLs cause double-fetching. Synchronize edge behavior using CDN edge caching configuration.

javascript

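self.addEventListener('fetch', (event) => {
  if (event.request.mode !== 'navigate') return;
  const live = fetch(event.request);
  // Refresh the fallback copy on every successful load; otherwise the cache
  // lookup below only ever finds precached pages.
  event.waitUntil(live.then((res) => {
    if (!res.ok) return undefined;
    const copy = res.clone(); // clone before the page consumes the body
    return caches.open('shell-v1').then((cache) => cache.put(event.request, copy));
  }).catch(() => {}));
  const timeout = new Promise((_, rej) => setTimeout(() => rej(new Error('timeout')), 3000));
  event.respondWith(
    Promise.race([live, timeout])
      .catch(() => caches.match(event.request))
      // Nothing cached either: wait for the slow network instead of resolving
      // respondWith() with undefined.
      .then((res) => res || live)
  );
});
// trade-off: a 3s timeout still blocks render for up to 3s on a flaky network
// before falling back. For shells that must paint instantly even when offline,
// prefer cache-first-with-revalidation over network-first.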

Stale-while-revalidate for read-heavy API routes

Stale-while-revalidate (SWR) returns cached content immediately, then refreshes the entry in the background — ideal where perceived speed outweighs absolute freshness. It is the right default for content that changes on a human timescale: a profile page, a notification list, a product catalog where a few seconds of lag is invisible to the user but the instant first paint is not. The pattern's power is that it decouples perceived latency from network latency entirely; the user never waits on the origin for a route they have visited before. Its danger is that the staleness is silent — there is no spinner, no "loading", just confidently rendered old data — so the freshness budget has to be a deliberate decision rather than an accident of which strategy happened to be wired up. This is the headline pattern of stale-while-revalidate implementation.

javascript

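self.addEventListener('fetch', (event) => {
  const { request } = event;
  // The Cache API only stores GET requests; writes must go straight through.
  if (request.method !== 'GET' || !request.url.includes('/api/')) return;
  const network = fetch(request);
  // Revalidate in the background. waitUntil() keeps the worker alive until the
  // cache write finishes, even after the cached response has been returned.
  event.waitUntil(network.then((res) => {
    if (!res.ok) return undefined;
    const copy = res.clone(); // clone synchronously, before the body is read
    return caches.open('api-cache').then((cache) => cache.put(request, copy));
  }).catch(() => {}));
  // Serve the cached copy if there is one; otherwise wait for the network.
  event.respondWith(caches.match(request).then((cached) => cached || network));
});
// trade-off: SWR always shows one-revision-old data on the first paint after a
// change. Do not use it for cart totals, balances, or auth state where a stale
// value is a correctness bug, not a cosmetic delay.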

Offline resilience & fallback routing

A resilient worker degrades gracefully when the network fails. Add a catch-all handler that routes unmatched navigations to a pre-cached offline shell, and pre-cache that shell — minimal HTML, inline critical CSS, and an SVG logo — during install. Intercept only request.mode === 'navigate' so document requests get the fallback while subresources fail normally. Keep the fallback under 50KB gzipped so it paints instantly, and test it under both throttled 3G and full offline so you catch the difference between "slow" and "unreachable".

javascript

self.addEventListener('fetch', (event) => {
  if (event.request.mode === 'navigate') {
    event.respondWith(
      fetch(event.request).catch(() => caches.match('/offline.html'))
    );
  }
});
// trade-off: this fallback only covers full-document navigations. Subresource
// failures (a missing chunk, a 404 image) still surface as broken UI, so pair
// it with per-asset error handling rather than treating it as a safety net for
// everything.

When the offline shell renders but interactive features are missing, the cause is usually that runtime caches were never populated for the routes the shell links to — verify those routes are reachable from a cold cache, not just from a warmed-up development session.

Deconstructing the Cache Lifecycle Into Timing Phases

A request handled by a service worker passes through four measurable phases, each with its own budget. Diagnosing slowness means identifying which phase dominates rather than blaming "the cache" as a whole.

Registration & activation (one-time per version): keep install-phase cache.addAll() to essential shell files so activation completes in under ~1s; large precache manifests delay clientsClaim() and leave the first navigation uncontrolled.
Lookup (caches.match()): typically sub-5ms; if it dominates, your cache holds too many near-duplicate keys from unnormalized URLs.
Revalidation (background fetch() in SWR): must stay off the critical path; it should never delay the response already returned to the page. The worker runs on its own thread, but revalidation still competes with the page for CPU, network, and disk I/O, and any update it pushes to the page triggers main-thread work, so budget it against your INP ceiling of 200ms.
Eviction & rotation (activate): old caches must be deleted in the same activation that introduces the new version, or storage climbs toward the quota and the browser evicts unpredictably.

The dominant phase tells you which fix to apply first. A slow lookup is a key-hygiene problem; slow activation is a precache-size problem; staleness complaints are a strategy-selection problem.

Two cross-phase concerns deserve their own thresholds. First, storage pressure: once the origin's cache footprint approaches the browser's quota, eviction stops being something you control and becomes something the browser does to you at unpredictable moments. Poll navigator.storage.estimate() and treat crossing ~80% of the reported quota as a hard signal to trim runtime caches or shorten retention. Second, revalidation contention: background fetches in SWR are cheap on the network but not free for the page. The fetch and the cache write run on the worker's own thread, yet on a low-end device they still compete with the page for CPU and I/O, and any fresh data the worker posts to the page is parsed and rendered on the main thread. Schedule that work so it never lands inside an interaction window, or the very pattern you added to feel faster will quietly push INP past 200ms during scrolling and tapping.
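
The ~80% storage signal can be expressed as a small check. This is a sketch under stated assumptions: storagePressure and checkStoragePressure are hypothetical helper names, and the threshold constant simply encodes the guideline from the paragraph above. The only platform call used is navigator.storage.estimate(), which resolves to an object with usage and quota fields.

```javascript
// Fraction of quota at which runtime caches should be trimmed.
const PRESSURE_THRESHOLD = 0.8;

// Pure helper: takes the { usage, quota } shape returned by
// navigator.storage.estimate() and reports how close the origin is to quota.
function storagePressure({ usage = 0, quota = 0 } = {}) {
  const ratio = quota > 0 ? usage / quota : 0;
  return { ratio, overThreshold: ratio >= PRESSURE_THRESHOLD };
}

// Browser-side wiring: resolves to null where the Storage API is unavailable.
async function checkStoragePressure() {
  if (typeof navigator === 'undefined' || !navigator.storage || !navigator.storage.estimate) {
    return null;
  }
  return storagePressure(await navigator.storage.estimate());
}
```

Both values reported by estimate() are approximations, so treat overThreshold as a prompt to trim caches rather than as an exact measurement.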

A practical instrumentation habit is to tag every cache write with a timestamp in a sidecar IndexedDB record. That single piece of metadata turns "is this entry stale?" from a guess into a comparison, and it is the foundation of the per-strategy expiration thresholds described next.

Advanced Diagnostics & Framework Failure Modes

SPA hydration races. Calling skipWaiting() plus clients.claim() (clientsClaim() in Workbox) mid-session can swap the controlling worker while a single-page app is mid-render, producing partial states or a mismatch between server- and client-rendered markup. Gate the swap behind an explicit user prompt or a navigation boundary. For frameworks that hydrate aggressively, validate that the new worker does not interrupt an in-flight hydration pass.

Opaque responses. A no-cors cross-origin fetch returns a response with status: 0 whose body cannot be inspected. Store these only when you can tolerate not knowing whether they are error pages, and configure CORS headers on your origin or CDN to get transparent, validatable responses instead.

Background Sync persistence. Pair SWR with the Background Sync API to queue failed writes and replay them on reconnect. Use workbox-background-sync or a custom IndexedDB queue so the queue survives browser restarts; an in-memory queue silently drops everything on tab close.

Cache stampedes. Under SWR, many concurrent requests for the same stale entry can each trigger a revalidation fetch. Deduplicate by holding a Map of in-flight promises keyed by URL and returning the existing promise for duplicate requests.
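
A minimal sketch of that in-flight Map, assuming a hypothetical helper named dedupe and a caller-supplied start function that returns a promise (for example, one that fetches and re-caches a URL):

```javascript
// In-flight revalidations keyed by URL. An entry exists only while its work
// is pending, so a request arriving after completion starts a fresh one.
const inFlight = new Map();

// Runs `start` at most once per key at a time; concurrent callers receive the
// same promise. The entry is removed whether the work succeeds or fails.
function dedupe(key, start) {
  const pending = inFlight.get(key);
  if (pending) return pending;
  const promise = Promise.resolve(start()).finally(() => inFlight.delete(key));
  inFlight.set(key, promise);
  return promise;
}
```

Because a Response body can be read only once, it is safer for start to perform the cache write itself and resolve to nothing than to hand one shared Response to several callers.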

Timestamp-based expiration. Cache-first and SWR both need a maximum age or they serve indefinitely. Read the stored write timestamp on retrieval (the sidecar record, or a header stamped onto the response as in the example below), discard entries older than 24 hours for API data and 30 days for static assets, and fall through to the network when the entry is too old. Without this check, "cache-first" silently becomes "cache-forever" for any URL that is not content-hashed.

javascript

async function freshMatch(request, maxAgeMs) {
  const cached = await caches.match(request);
  if (!cached) return null;
  const ts = Number(cached.headers.get('sw-cached-at')) || 0;
  return (Date.now() - ts) < maxAgeMs ? cached : null;
}
// trade-off: this reads a custom header you must set at put() time, which means
// you cannot age-check responses cached by another tool (e.g. Workbox) that
// does not write the same header. Standardize the timestamp source or the
// check silently returns null and forces a network fetch every time.
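
The write side of that header could look like the sketch below. It is one possible approach, not the only one: stampResponse is a hypothetical helper, and it rebuilds the response with the standard Response and Headers constructors because the headers of a fetched response are immutable.

```javascript
// Header name read by freshMatch() above.
const STAMP_HEADER = 'sw-cached-at';

// Returns a copy of `response` carrying the write time in a custom header.
// Opaque responses cannot be rebuilt (status 0, unreadable headers), so they
// are returned unchanged and will always look expired to freshMatch().
function stampResponse(response, now = Date.now()) {
  if (response.type === 'opaque') return response;
  const headers = new Headers(response.headers);
  headers.set(STAMP_HEADER, String(now));
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}
```

Call it on the clone at write time, for example cache.put(request, stampResponse(res.clone())), so the response returned to the page is left untouched. The rebuilt copy does not keep the original response's url property, which is acceptable for entries that are only ever looked up by request.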

When invalidation logic itself becomes the bottleneck — coordinating versioned rotation with edge purges and hashed-asset lifecycles — graduate to the dedicated patterns in cache invalidation patterns. And when you are choosing between SWR and cache-first specifically for a React SPA, the trade-off matrix in SWR vs cache-first service worker for React SPAs walks through the decision.

Workbox vs hand-rolled handlers

The snippets above are intentionally framework-free so the mechanics are visible, but most production teams reach for Workbox 7.x rather than maintaining bespoke fetch listeners. Workbox earns its place by collapsing the error-prone parts — precache manifest generation tied to your build hashes, Vary-aware matching, expiration plugins with quota enforcement, and Background Sync queues that survive restarts — into declarative routes. The cost is a layer of abstraction over fetch that you must still understand to debug, because a misconfigured Workbox route fails in exactly the same ways a hand-rolled one does: wrong scope, unnormalized keys, stale strategy choice.

javascript

import { registerRoute } from 'workbox-routing';
import { CacheFirst, StaleWhileRevalidate } from 'workbox-strategies';
import { ExpirationPlugin } from 'workbox-expiration';

registerRoute(/\.[0-9a-f]{8}\.(js|css)$/, new CacheFirst({
  cacheName: 'static-v3',
  plugins: [new ExpirationPlugin({ maxEntries: 60, maxAgeSeconds: 30 * 24 * 60 * 60 })],
}));
registerRoute(/\/api\/feed/, new StaleWhileRevalidate({ cacheName: 'api-feed' }));
// trade-off: Workbox adds ~8-12KB to the worker and hides the fetch flow behind
// strategy classes. For a worker with two or three routes, the hand-rolled
// version is smaller and easier to step through in DevTools — only adopt
// Workbox once precache generation and expiration become real maintenance pain.

Whichever path you take, the diagnostic and budgeting discipline below is identical; the routing layer does not change what "healthy" looks like.

Validation & CI Budgeting

Treat caching behavior as a budgeted, asserted property — not something verified by eyeballing DevTools once.

javascript

// Lighthouse CI assertion: enforce an efficient cache policy on every build.
module.exports = {
  ci: {
    assert: {
      assertions: {
        'uses-long-cache-ttl': ['error', { maxLength: 0 }],
        // The two PWA audits below exist only in older Lighthouse releases;
        // remove them if your Lighthouse version no longer ships them.
        'service-worker': 'error',
        'works-offline': 'error',
      },
    },
  },
};
// trade-off: works-offline as an error will fail builds for apps that
// legitimately have no offline mode. Drop it to 'warn' (or remove it) unless
// offline support is a product requirement.

After each deploy, run Lighthouse under Fast 3G throttling, confirm in the DevTools Network panel that critical-path requests are served by the worker (in Chrome the Size column reads (ServiceWorker)), and check that the field hit rate has not regressed below the 85% floor. Treat the worker like any other production dependency: it gets a rollback plan, a phased rollout, and an alert. Ship the new version to a small slice of traffic first, watch the cache-hit-rate and origin-latency beacons for a stable window, and only then promote to the full population. Staged rollout matters more for service workers than for ordinary code because a broken worker can pin itself in users' browsers: a cache-first handler serving a corrupt asset keeps serving it until the cache name rotates, so the blast radius of a bad release outlives the release itself. If miss rates spike above 20% within 15 minutes of a release, roll back by bumping the cache version and deleting stale versions in activate.

javascript

// Every cache name this version of the worker still uses.
const CURRENT_CACHES = new Set(['static-v3']);
self.addEventListener('activate', (event) => {
  event.waitUntil(
    caches.keys().then((keys) =>
      Promise.all(keys.filter((k) => !CURRENT_CACHES.has(k)).map((k) => caches.delete(k)))
    )
  );
});
// trade-off: deleting every cache outside the set on activate purges runtime
// API caches too, so the first post-deploy visit pays a full network round-trip.
// Add long-lived runtime caches (for example 'api-cache') to the set if that
// cold-start cost is unacceptable.
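
The rollback trigger described above (miss rate over 20% within 15 minutes of a release) can be evaluated server-side from the same hit/miss beacons. The sketch below assumes each stored sample has the shape { time, hit } with time in milliseconds since the epoch; shouldRollBack and the minSamples floor of 100 are illustrative choices, not values from any monitoring product.

```javascript
// Thresholds from the rollback rule: 20% misses inside a 15-minute window.
const MAX_MISS_RATE = 0.2;
const WINDOW_MS = 15 * 60 * 1000;

// Returns true when the miss rate inside the post-release window exceeds the
// threshold. With too few samples it returns false rather than guessing.
function shouldRollBack(samples, releasedAt, minSamples = 100) {
  const recent = samples.filter(
    (s) => s.time >= releasedAt && s.time < releasedAt + WINDOW_MS
  );
  if (recent.length < minSamples) return false;
  const misses = recent.filter((s) => !s.hit).length;
  return misses / recent.length > MAX_MISS_RATE;
}
```

Some dip is expected right after any release while caches repopulate, so compare the result against the post-deploy dip recorded in your baseline before wiring this to an automatic rollback.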

Common Implementation Pitfalls

Ignoring HTTP cache headers. Relying solely on worker logic causes double-fetching or stale content when origin Cache-Control directives conflict with routing. Align edge and worker TTLs.
Skipping cache cleanup. Without versioned rotation in activate, storage exhausts and the browser evicts silently. Tie cache names to build hashes.
Cache-first on dynamic endpoints. Reserve cache-first for immutable, hashed assets only; transactional data such as cart and auth must stay network-only, and other dynamic reads belong on network-first or stale-while-revalidate.
Careless skipWaiting()/clientsClaim(). These cause partial updates or broken hydration in SPAs. Gate them behind explicit version checks or user prompts.
Unversioned cache names. Hardcoded names make deployment-time invalidation impossible.
Caching opaque responses blindly. no-cors responses (status: 0) cannot be validated; store them only when error detection is unnecessary.

HTTP Cache-Control headers explained — align origin directives with your worker routing.
Stale-while-revalidate implementation — the canonical SWR pattern across edge and worker layers.
SWR vs cache-first service worker for React SPAs — decide which strategy fits a hydrating SPA.
Debugging service worker cache misses in production — trace and fix routing and key-hygiene failures live.
Cache invalidation patterns — coordinate worker rotation with edge purges.