JavaScript Bundle Optimization & Code Splitting

JavaScript is the single most expensive resource on most modern sites, and unlike images it costs twice: once to download and again to parse, compile, and execute on the main thread. Both halves of that cost map directly onto field metrics. Initial transfer size puts a floor under Largest Contentful Paint when scripts are render-blocking or when hydration gates first paint; execution time puts a floor under Interaction to Next Paint, because while a long task occupies the main thread no click handler can be dispatched. The actionable boundaries are explicit: INP under 200ms at field p75, no single task longer than the 50ms long-task threshold, LCP under 2.5s, and a defensible byte budget for the initial route. A working default is 150KB gzipped of JavaScript before interactivity, with per-route chunks under 50KB gzipped.

This guide is organized around where bytes and milliseconds are actually lost: network delivery (how much ships and in how many requests), client execution (parse, compile, tree-shaking, and hydration), and module-format and scheduling decisions that determine whether your bundler can eliminate dead code and whether your code yields to the browser. Each section moves from a measurable baseline to a root-cause isolation step to a targeted fix you can verify in CI. The assumption throughout is that you already use Chrome DevTools, a modern bundler (Vite, webpack 5, esbuild, or Rollup), and a framework with route-based code splitting available.

Diagnostic Overview: Lab to Locate, Field to Decide

The first discipline is separating the two questions a bundle audit must answer: what is shipping and what is it costing real users. Lab tooling answers the first deterministically; only field data answers the second.

Field data is the boundary that ships. Pull Core Web Vitals from the Chrome User Experience Report or your own RUM beacon and read them at percentiles, not averages. The p75 of INP and LCP is the number Google uses for ranking and the number that reflects your slower-device, slower-network users; p50 will flatter you and p95 will panic you. If field INP p75 sits above 200ms while your lab Lighthouse run is green, the bottleneck is almost always main-thread execution under real input pressure — hydration, third-party scripts, or a heavy event handler — not download size.
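
To make the percentile point concrete, here is a minimal sketch of a nearest-rank percentile over raw RUM samples. The sample values are invented for illustration, and nearest-rank is only one of several percentile definitions; your RUM vendor or CrUX may interpolate differently.

```javascript
// Nearest-rank percentile: the smallest sample with at least p% of all
// samples less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical INP samples in milliseconds, as collected by a RUM beacon.
const inpSamples = [48, 64, 72, 96, 120, 240, 260, 410];

const p50 = percentile(inpSamples, 50); // 96: looks comfortably "good"
const p75 = percentile(inpSamples, 75); // 240: over the 200ms boundary
```

The same data set passes at the median and fails at p75, which is why the decision should be made on the latter.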

Lab data is how you locate the bottleneck once the field tells you one exists. Run Lighthouse under simulated mid-tier mobile throttling (a Moto G-class CPU multiplier and a slow 4G profile) so the numbers approximate your p75 device, not your workstation. The Performance panel's main-thread flame chart shows you which scripts dominate Total Blocking Time and where long tasks cluster around hydration. The Coverage panel quantifies unused bytes per file at first load; anything above roughly 40% unused in a route chunk is a code-splitting opportunity. For the byte side, generate a bundle treemap on every build (see Webpack bundle analysis techniques) so duplicate dependencies and oversized transitive packages surface before review, not after deploy.
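
The 40% rule of thumb can be applied mechanically to exported coverage data. The sketch below assumes entries shaped like Puppeteer's JS coverage output (`url`, `text`, and non-overlapping used `ranges`); it counts characters rather than transferred bytes, so treat the ratio as an approximation.

```javascript
// Fraction of a script's source that never executed during the recording.
function unusedRatio(entry) {
  const total = entry.text.length;
  if (total === 0) return 0;
  const used = entry.ranges.reduce((sum, r) => sum + (r.end - r.start), 0);
  return (total - used) / total;
}

// Files whose unused share exceeds the threshold, worst offenders first.
function splitCandidates(entries, threshold = 0.4) {
  return entries
    .map((entry) => ({ url: entry.url, unused: unusedRatio(entry) }))
    .filter((result) => result.unused > threshold)
    .sort((a, b) => b.unused - a.unused);
}
```

Run it against coverage captured at first load only; coverage recorded after extended interaction will understate how much of a chunk could be deferred.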

The workflow that follows is the same regardless of framework: capture the field p75 baseline, reproduce the dominant cost in the lab under throttling, attribute it to a specific module or chunk, apply one targeted change, and re-measure both lab and field before moving on.

Architecture 1: Network Delivery and Code-Splitting

The delivery layer governs how many bytes cross the wire before the user can interact and how those bytes are partitioned across requests. The goal is not minimum total bytes — it is minimum critical-path bytes with a request count that fits the connection.

Start by splitting along the two axes that change at different rates: vendor code (stable, cacheable for a year) and application code (volatile, rebuilt every deploy). Keeping framework runtime in its own long-lived chunk means a feature change never invalidates React or Vue for returning users. Layered on top of that, route-based splitting ensures the dashboard's charting library never ships to the marketing landing page. The detailed mechanics — import(), framework lazy boundaries, and route manifests — are covered in dynamic imports and route-based splitting.

The failure mode at this layer is over-splitting. Each dynamic chunk is a separate request, and on a high-latency connection a dozen tiny chunks loaded sequentially can be slower than one medium chunk, because the browser only discovers chunk N+1 after parsing chunk N. The countermeasure is modulepreload: emit <link rel="modulepreload"> for the chunks a route is known to need so they download in parallel rather than in a waterfall. Reserve prefetch (lower priority) for chunks the user is likely to need next, such as the route behind the primary call-to-action.
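
As a rough illustration of the two hint types, the sketch below renders link tags from a route manifest. The manifest shape, file names, and function names are hypothetical; real bundlers and frameworks emit their own manifest formats, and `modulepreload` only applies to chunks loaded as ES modules.

```javascript
// Hypothetical route manifest: route path -> chunk URLs the route is known to need.
const routeManifest = {
  '/': ['/assets/framework-3f2a1c.js', '/assets/home-71d0aa.js'],
  '/dashboard': [
    '/assets/framework-3f2a1c.js',
    '/assets/dashboard-9b7e44.js',
    '/assets/charts-c04e1b.js',
  ],
};

// Escape the characters that could break out of a double-quoted attribute.
function escapeAttr(value) {
  return String(value).replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

function chunksFor(manifest, route) {
  return Object.hasOwn(manifest, route) ? manifest[route] : [];
}

// Chunks the current route needs now: fetched in parallel instead of discovered serially.
function modulePreloadTags(manifest, route) {
  return [...new Set(chunksFor(manifest, route))].map(
    (href) => `<link rel="modulepreload" href="${escapeAttr(href)}">`
  );
}

// Chunks for the likely next route, minus anything the current route already loads.
function prefetchTags(manifest, route, nextRoute) {
  const current = new Set(chunksFor(manifest, route));
  return [...new Set(chunksFor(manifest, nextRoute))]
    .filter((href) => !current.has(href))
    .map((href) => `<link rel="prefetch" href="${escapeAttr(href)}" as="script">`);
}
```

On the landing page this would emit two `modulepreload` tags for the home route and two `prefetch` tags for the dashboard chunks the visitor does not already have.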

javascript

// webpack.config.js — split by cache lifetime, not by file count
module.exports = {
  optimization: {
    runtimeChunk: 'single', // one shared runtime so chunk hashes stay stable across deploys
    splitChunks: {
      chunks: 'all',
      maxInitialRequests: 25, // cap parallel initial requests; on HTTP/1.1 a higher cap queues requests behind the per-origin connection limit
      minSize: 20000,         // trade-off: below ~20KB a separate chunk costs more in request overhead than it saves
      cacheGroups: {
        framework: {
          test: /[\\/]node_modules[\\/](react|react-dom|scheduler)[\\/]/,
          name: 'framework',
          priority: 40,       // highest: framework changes least often, deserves its own long-lived hash
        },
        vendor: {
          test: /[\\/]node_modules[\\/]/,
          name: 'vendor',
          priority: 20,
        },
      },
    },
  },
};
// trade-off: this manual grouping is for webpack 5; Vite/Rollup users should prefer
// manualChunks only when the default heuristics produce a duplicated or oversized vendor chunk.
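
For Vite or Rollup, the equivalent grouping could be sketched as a `manualChunks` function. This is an illustration under assumptions, not a recommended default: the package list mirrors the webpack example above, and the path matching assumes a conventional `node_modules` layout (some package managers nest paths differently).

```javascript
// Same cache-lifetime split as the webpack config, expressed for Rollup's
// output.manualChunks, which receives each module id (an absolute file path).
const FRAMEWORK = /[\\/]node_modules[\\/](react|react-dom|scheduler)[\\/]/;
const ANY_DEPENDENCY = /[\\/]node_modules[\\/]/;

function manualChunks(id) {
  if (FRAMEWORK.test(id)) return 'framework';
  if (ANY_DEPENDENCY.test(id)) return 'vendor';
  return undefined; // leave application modules to the bundler's default chunking
}

// In vite.config.js this would be wired as:
//   build: { rollupOptions: { output: { manualChunks } } }
```

Forcing every dependency into one `vendor` chunk can pull lazily loaded libraries into the initial load, so check the resulting treemap before keeping a rule like this.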

Delivery is also where caching and compression compound the wins. Content-hashed filenames make Cache-Control: public, max-age=31536000, immutable safe, so returning users pay zero bytes for unchanged chunks — the deployment-time invalidation pattern is detailed under advanced caching strategies and CDN architecture. Enforce Brotli for JavaScript at the edge; it typically beats gzip by 15-20% on minified code. Over HTTP/2 or HTTP/3 the cost of an extra parallel request drops sharply, which is why your maxInitialRequests ceiling should be tuned to your actual protocol, not copied from an HTTP/1.1-era config.
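
A minimal sketch of that caching policy as a header-selection function follows. The filename pattern (a dot or hyphen followed by at least eight hex characters) and the `no-cache` default for unhashed files are assumptions for illustration; match the pattern to the hash format your bundler actually emits.

```javascript
// Assumed naming convention: name-<8+ hex chars>.ext, e.g. dashboard-9b7e44c1.js
const HASHED_ASSET = /[.-][0-9a-f]{8,}\.(js|mjs|css)$/i;

function cacheControlFor(pathname) {
  // Content-hashed files never change under the same URL, so they can be
  // cached for a year. Everything else must revalidate so a new deploy is seen.
  return HASHED_ASSET.test(pathname)
    ? 'public, max-age=31536000, immutable'
    : 'no-cache';
}
```

The important property is the asymmetry: the HTML document that references the chunks must stay revalidated, otherwise returning users keep requesting chunk names from a previous deploy.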

Architecture 2: Client Execution, Tree-Shaking, and Hydration

Once bytes arrive, the main thread pays the execution cost — and this is where INP is won or lost. Three levers dominate: how much dead code survived into the bundle, how the framework hydrates, and how long any single task runs.

Tree-shaking is the cheapest byte reduction available, but it is fragile. A bundler can only prove a module is unused if it can statically analyze the import graph, which means ESM import/export and an honest sideEffects declaration. One CommonJS dependency, one re-export barrel that touches every file, or a missing "sideEffects": false can retain an entire package. The mechanics and the common offenders — Lodash, Moment, and barrel files — are covered in tree-shaking and dead-code elimination.

json

{
  "name": "@org/frontend-app",
  "type": "module",
  "sideEffects": [
    "*.css",
    "./src/polyfills.js"
  ],
  "exports": {
    "./utils": {
      "import": "./dist/esm/utils.js",
      "require": "./dist/cjs/utils.js"
    }
  }
}

text

// trade-off: "sideEffects": false enables aggressive pruning, but if any module relies on
// import-time effects (polyfills, CSS injection, global registration) you MUST list it here,
// or tree-shaking will silently delete code your app depends on at runtime.

Hydration is the execution cost most teams under-measure. Server-rendered HTML paints fast, but the framework must then re-attach event listeners by walking the component tree — and on the initial route that walk is frequently the largest long task in the entire load. The flame chart will show it as a single uninterrupted block right after the main bundle parses. The fixes are progressive: ship less hydration-critical code (move logic behind interaction or visibility), use partial or selective hydration where the framework supports it, and ensure hydration work is broken into yielding chunks rather than one monolithic task. Anything that pushes a hydration task past the 50ms long-task threshold is directly delaying INP for any user who taps during load.

The discipline that ties this section together is to ship only what the first interaction needs synchronously and defer everything else. Analytics, A/B frameworks, charting, rich-text editors, and date pickers almost never belong in the initial execution path. Loading them lazily — on idle, on interaction, or on viewport entry — keeps the initial main-thread budget intact while still delivering the feature when it is actually used.
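
One way to sketch the idle and interaction triggers is a memoized loader shared by both, so whichever fires first starts the download and the other reuses it. The helper names, the module path, and the selector in the usage comment are hypothetical.

```javascript
// Wrap a dynamic import so it runs at most once, no matter how many triggers fire.
function lazyOnce(loader) {
  let pending;
  return () => {
    if (!pending) {
      pending = loader().catch((error) => {
        pending = undefined; // allow a later trigger to retry after a failed load
        throw error;
      });
    }
    return pending;
  };
}

// Run a callback when the main thread is idle, with a timer fallback for
// browsers that do not implement requestIdleCallback.
function whenIdle(callback, timeout = 2000) {
  if (typeof requestIdleCallback === 'function') {
    requestIdleCallback(callback, { timeout });
  } else {
    setTimeout(callback, 1);
  }
}

// Usage in the browser (module path and selector are placeholders):
//   const loadEditor = lazyOnce(() => import('./rich-text-editor.js'));
//   whenIdle(loadEditor);                                          // warm up on idle
//   const button = document.querySelector('#edit-button');
//   button.addEventListener('pointerenter', loadEditor, { once: true });
//   button.addEventListener('click', async () => (await loadEditor()).open());
```

Because the promise is shared, a click that arrives before the idle callback simply awaits the same in-flight request rather than starting a second one.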

Architecture 3: Module Formats and Long-Task Scheduling

The format your dependencies ship in and the way your code yields to the browser are the two structural decisions that determine whether the previous two sections can succeed at all.

Module format is upstream of tree-shaking. Native ESM is statically analyzable, so bundlers can prune exports and hoist modules; CommonJS is evaluated dynamically, so a bundler must conservatively keep whole modules. Prefer dependencies that publish an ESM exports entry, and prefer named imports over namespace imports so the analyzer sees exactly what you use. The interop edge cases — dual-package hazards, require of an ESM-only package, and conditional exports — are detailed in modern module formats: ESM vs CommonJS. The practical rule: every CommonJS dependency in your critical path is a tree-shaking risk worth auditing.
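
An audit along those lines can start from each dependency's `package.json`. The heuristic below is a sketch, not a guarantee: it only checks whether a package advertises an ESM entry through `type`, the bundler-convention `module` field, or an `import` condition in `exports`, and it says nothing about whether the code inside is actually side-effect free.

```javascript
// True if any nested key in an "exports" map is the "import" condition.
function exportsMentionImport(node) {
  if (node === null || typeof node !== 'object') return false;
  return Object.entries(node).some(
    ([key, value]) => key === 'import' || exportsMentionImport(value)
  );
}

// Heuristic: does this package.json advertise an ESM entry point?
function hasEsmEntry(pkg) {
  if (pkg.type === 'module') return true;
  if (typeof pkg.module === 'string') return true;
  return exportsMentionImport(pkg.exports);
}

// Given parsed package.json objects, list the names that look CommonJS-only.
function commonJsOnly(packages) {
  return packages.filter((pkg) => !hasEsmEntry(pkg)).map((pkg) => pkg.name);
}
```

Feeding it the manifests of your critical-path dependencies gives a short list of packages worth checking in the treemap first.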

Scheduling is what keeps execution inside the 50ms budget even when the work is irreducible. A long task is any task that runs the main thread for more than 50ms without yielding; during that window the browser cannot respond to input, which is the direct mechanism of poor INP. The fix is to break the work into chunks and yield between them so the scheduler can interleave input handling. The modern primitive is scheduler.yield(), which lets you cooperatively cede the thread mid-task; where it is not available, a setTimeout-based yield is the usual fallback. This pairs naturally with INP work covered under optimizing input responsiveness.

javascript

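// Break a heavy synchronous loop into yielding chunks so input can be serviced.
async function processInChunks(items, handle, budgetMs = 40) {
  let deadline = performance.now() + budgetMs;
  for (let i = 0; i < items.length; i++) {
    handle(items[i]);
    // Yield on elapsed time rather than item count, so no slice crosses the 50ms
    // long-task threshold (assuming a single handle() call is short). The 40ms
    // default leaves headroom; yielding only when input is already pending would
    // never yield in browsers that lack isInputPending.
    if (performance.now() >= deadline) {
      if ('scheduler' in window && 'yield' in scheduler) {
        await scheduler.yield();
      } else {
        await new Promise((r) => setTimeout(r)); // fallback for browsers without scheduler.yield
      }
      deadline = performance.now() + budgetMs;
    }
  }
}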
// trade-off: yielding adds wall-clock latency to the total job, so do NOT chunk work that must
// complete atomically or that the user is actively waiting on with no other interaction possible —
// for pure background computation, prefer offloading to a Web Worker instead.

For genuinely heavy, parallelizable computation — large JSON parsing, image processing, search indexing — the right structural move is to leave the main thread entirely and run the work in a Web Worker, so hydration and input handling are never contended. Scheduling and offloading are complementary: yield for work that must touch the DOM, offload for work that does not.

Monitoring & CI: Budgets That Block Regressions

Optimizations that are not defended in CI decay within a few sprints, because every feature adds bytes and no single PR looks expensive. Two gates, run on every pull request, hold the line.

The first is a hard byte budget per chunk. A size check fails the build when the initial JavaScript exceeds its budget or any route chunk crosses its limit, forcing a documented decision rather than a silent regression. The second is Lighthouse CI under throttling, asserting on Total Blocking Time and LCP so an execution regression is caught even when byte size is flat — the canonical setup is documented in the best Lighthouse CI setup for frontend pipelines.

yaml

# .github/workflows/perf-budget.yml — fail the PR before bytes reach production
- name: Check bundle budgets
  run: npx bundlesize
# "bundlesize" field in package.json:
#   [{ "path": "dist/assets/index-*.js", "maxSize": "150 kB", "compression": "gzip" },
#    { "path": "dist/assets/route-*.js", "maxSize": "50 kB", "compression": "gzip" }]

- name: Lighthouse CI
  run: npx lhci autorun --collect.settings.throttlingMethod=simulate
# "ci.assert.assertions" in lighthouserc.json:
#   { "total-blocking-time": ["error", { "maxNumericValue": 300 }],
#     "largest-contentful-paint": ["error", { "maxNumericValue": 2500 }] }
# trade-off: hard error budgets stop regressions but block legitimate large features;
# keep an explicit, reviewed override path (e.g. raising the budget in the same PR) so the
# gate forces a conversation instead of being disabled outright the first time it's inconvenient.

Pair these synthetic gates with a field RUM beacon so you know whether the lab improvements actually moved real users. Report the metric, the device class, and the connection type so a regression can be attributed to a segment rather than averaged away.

javascript

import { onLCP, onINP, onCLS } from 'web-vitals';

function sendToAnalytics(metric) {
  const body = JSON.stringify({
    name: metric.name,
    value: metric.value,
    rating: metric.rating,
    // Segment by capability so a regression on low-end devices is not hidden in the p50.
    cores: navigator.hardwareConcurrency || 0,
    connection: navigator.connection?.effectiveType || 'unknown',
  });
  navigator.sendBeacon('/api/vitals', body); // trade-off: sendBeacon is fire-and-forget — use fetch+keepalive if you need delivery confirmation
}

onLCP(sendToAnalytics);
onINP(sendToAnalytics);
onCLS(sendToAnalytics);
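
On the receiving side, the segmentation could be sketched as a p75 per device class and connection type. The payload fields match the beacon above; the four-core cutoff for "low-end" and the function names are arbitrary assumptions for illustration.

```javascript
// Nearest-rank 75th percentile of a non-empty array of numbers.
function p75(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.max(Math.ceil(0.75 * sorted.length), 1) - 1];
}

// Assumed cutoff: four or fewer logical cores counts as a low-end device.
function deviceClass(cores) {
  if (!cores) return 'unknown';
  return cores <= 4 ? 'low-end' : 'high-end';
}

// Group beacons for one metric by "deviceClass/connection" and report p75 per group.
function p75BySegment(beacons, metricName) {
  const groups = new Map();
  for (const beacon of beacons) {
    if (beacon.name !== metricName) continue;
    const key = `${deviceClass(beacon.cores)}/${beacon.connection}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(beacon.value);
  }
  return Object.fromEntries(
    [...groups].map(([key, values]) => [key, p75(values)])
  );
}
```

A regression that only appears in the `low-end/4g` bucket is exactly the kind that a single site-wide number hides.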

Reference Implementation: Route-Level Lazy Boundary with Resilient Fallback

A dynamic import that lacks an error boundary turns a transient chunk-load failure (a deploy mid-session, a flaky network) into a blank screen. The production pattern wraps every lazy boundary in both a loading state and a recovery path.

jsx

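import React, { Suspense, lazy } from 'react';
import ErrorBoundary from './ErrorBoundary';
// RetryPrompt and RouteSkeleton are app-specific components; these import paths are placeholders.
import RetryPrompt from './RetryPrompt';
import RouteSkeleton from './RouteSkeleton';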

const Dashboard = lazy(() => import(/* webpackChunkName: "dashboard" */ './routes/Dashboard'));

function DashboardRoute() {
  return (
    <ErrorBoundary fallback={<RetryPrompt />}>
      <Suspense fallback={<RouteSkeleton />}>
        <Dashboard />
      </Suspense>
    </ErrorBoundary>
  );
}
// trade-off: a skeleton fallback prevents layout shift but adds a render; for instant routes
// (already-cached chunks) the extra Suspense boundary is pure overhead — only wrap routes whose
// chunk is genuinely deferred, not every component.
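
For the flaky-network case specifically, the recovery path can be complemented by retrying the import before the error boundary is reached. The helper below is a sketch with hypothetical names and arbitrary retry counts and delays. It will not help after a deploy has removed the old chunk, and a browser may remember a failed module fetch for the rest of the page's lifetime, so the error boundary's fallback should still offer a reload.

```javascript
// Retry a dynamic import with exponential backoff before giving up.
function retryImport(loader, { retries = 2, delayMs = 300 } = {}) {
  return loader().catch((error) => {
    if (retries <= 0) throw error; // out of attempts: let the error boundary handle it
    return new Promise((resolve) => setTimeout(resolve, delayMs)).then(() =>
      retryImport(loader, { retries: retries - 1, delayMs: delayMs * 2 })
    );
  });
}

// Usage with React.lazy:
//   const Dashboard = lazy(() => retryImport(() => import('./routes/Dashboard')));
```

Keep the retry count small: the user is looking at the skeleton for the whole backoff period, and a chunk that fails three times in a row is unlikely to succeed on the fourth.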

Framework-specific wiring, including Next.js route segments and prefetch tuning, is covered in implementing route-level code splitting in Next.js.

Common Pitfalls

Over-splitting into micro-chunks. Dozens of sub-20KB chunks loaded sequentially create a request waterfall that delays interactivity more than a single medium chunk would. Set a minSize floor and count requests, not just total bytes.
A single CommonJS dependency defeating tree-shaking. One dynamically-evaluated package or barrel re-export can retain an entire module. Audit the treemap for surprisingly large dependencies and prefer ESM builds.
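Missing or wrong sideEffects declaration. Setting "sideEffects": false while a module relies on import-time effects deletes needed code; omitting the field makes the bundler assume every module may have side effects, so unused modules are retained unless it can prove they are pure. List the exact effectful files.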
Treating hydration as free. Server-rendered HTML paints fast, but the hydration walk is often the largest long task on the initial route. Measure it on the flame chart and defer non-critical components.
Synchronous third-party scripts in the critical path. Analytics and tag managers loaded synchronously block the main thread and spike INP. Load them on idle or after first interaction.
No modulepreload for known route chunks. Without preload hints, dependent chunks download in a serial waterfall. Emit modulepreload for chunks the route is certain to need.
Optimizing against lab numbers only. A green Lighthouse run on a fast machine says nothing about field p75 INP on a mid-tier phone. Always validate against RUM before declaring a win.
Budgets that exist but never block. A size check that only warns is ignored within weeks. Make the gate a hard error with a reviewed override path so regressions force a decision.

Frequently Asked Questions

What initial JavaScript budget keeps LCP under 2.5s? Target under 150KB gzipped of JavaScript before interactivity, with individual route chunks under 50KB gzipped. That figure accounts for download plus parse, compile, and execute time on a mid-tier mobile device under slow-4G throttling. Beyond it, main-thread blocking during startup typically pushes LCP past 2.5s even when the LCP element itself is an image.

Why is my field INP above 200ms when Lighthouse is green? Lighthouse measures a cold load on a relatively fast environment and does not exercise real interactions under contention. Field INP above 200ms almost always traces to main-thread execution — hydration, a heavy event handler, or a third-party script — running long tasks while the user is trying to interact. Profile real interactions in the Performance panel and break any task over 50ms into yielding chunks.

Why isn't tree-shaking removing code I'm sure is unused? Tree-shaking requires statically analyzable ESM and an accurate sideEffects declaration. CommonJS modules, namespace imports, and barrel files that re-export everything force the bundler to retain whole modules. Switch to named imports from ESM builds, declare sideEffects precisely, and verify removal in the bundle treemap rather than assuming.

Does more aggressive code-splitting always improve performance? No. Splitting reduces the bytes on any one route, but each chunk is a request, and too many small chunks loaded sequentially create a waterfall that can be slower than a single larger chunk — especially over HTTP/1.1 or high-latency links. Balance chunk count with modulepreload hints, set a minSize floor, and tune maxInitialRequests to your actual transport protocol.

Should I prioritize HTTP/3 or bundle work first? Bundle work first. HTTP/2 multiplexing already removes HTTP-level head-of-line blocking for parallel chunk fetches; HTTP/3 mainly helps on lossy networks, where it also removes the TCP-level head-of-line blocking that HTTP/2 still inherits. Immutable caching, Brotli compression, dead-code elimination, and right-sized chunks deliver larger and more predictable gains than a protocol upgrade, and they compound regardless of transport.

Dynamic imports and route-based splitting — the request-level mechanics of deferring chunks without creating waterfalls.
Tree-shaking and dead-code elimination — why pruning fails and how to make every export statically analyzable.
Modern module formats: ESM vs CommonJS — the format decisions that determine whether your bundler can optimize at all.
Webpack bundle analysis techniques — generating and reading treemaps to find duplicate and oversized dependencies.
Optimizing input responsiveness — keeping INP under 200ms once the bundle is lean by scheduling and offloading main-thread work.
Image and media optimization — the other half of the critical-path byte budget once JavaScript is under control.