tooling.report

Avoid cascading hash changes

Can a map be used to centralize bundle hashes?

Introduction

When using hashed URLs for long-term caching, hashes can be collected into a centralized mapping (like an Import Map) to reduce the scope of cache invalidation. The mapping allows individual resource URLs to change without invalidating the URLs of every resource that references them.

This is most beneficial for JavaScript bundles, where each bundle is often referenced from at least one other bundle. Adopting a mapping technique for hashed bundle URLs makes it possible to push new versions of dependencies without having to push new versions of everything that references them.
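
As a rough illustration of the idea, the sketch below keeps every hash in one map and has bundles refer to each other by a stable logical name. The names hashMap and resolveBundle, and all of the hash values, are hypothetical and only serve the example; a real bundler would generate the map.

```javascript
// Hypothetical central map from stable logical names to hashed URLs.
// In a real build this object would be generated by the bundler.
const hashMap = {
  index: '/index.3f2a1c.js',
  utils: '/utils.9b8e7d.js',
  lazy: '/lazy.c41d02.js',
};

// Resolve a logical bundle name to its current hashed URL.
function resolveBundle(name, map = hashMap) {
  if (!Object.prototype.hasOwnProperty.call(map, name)) {
    throw new Error(`No hashed URL registered for "${name}"`);
  }
  return map[name];
}

// Shipping a new utils bundle only needs a new map entry. The index bundle
// refers to "utils" by name, so its own content and hash stay the same.
const nextMap = { ...hashMap, utils: '/utils.aa00ff.js' };

console.log(resolveBundle('utils'));
console.log(resolveBundle('utils', nextMap));
```

Only the file that contains the map has to change when a dependency is updated, which is the property the test below checks for.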

The Test

This test simulates a multi-page application, with two entry modules representing pages, a shared dependency, a code-split bundle, and two text files (one loaded initially and one on demand). Each tool is configured to centralize bundle hashes, which generally requires embedding a module loader capable of performing URL hash resolution at runtime.

index.js

import { logCaps } from './utils.js';
import { exclaim } from './exclaim.js';
logCaps(exclaim('This is index'));
import('./lazy.js').then(({ str }) => logCaps(str));

profile.js

import txtURL from './some-asset.txt';
import { logCaps } from './utils.js';
logCaps('This is profile');
fetch(txtURL).then(async response => console.log(await response.text()));

exclaim.js

export function exclaim(msg) {
  return msg + '!';
}

utils.js

export function logCaps(msg) {
  console.log(msg.toUpperCase());
}

lazy.js

export const str = 'This is a string';

import txtURL from './some-asset2.txt';
fetch(txtURL).then(async response => console.log(await response.text()));

some-asset.txt

This is an asset!

some-asset2.txt

This is another asset!

The build should produce four JavaScript bundles and two text files, all with hashed URLs. There should be one bundle each for the index.<hash>.js and profile.<hash>.js "routes", a third for their shared logCaps() dependency, and a fourth for lazy.js.

To pass this test:

  • All output files except the HTML files must be hashed, and their hash must change if their content changes.
  • Changing some-asset.txt or some-asset2.txt should only change the hash of its output file and the hash of the file containing the mapping.
  • Changing utils.js should only change the hash of its output file and the hash of the file containing the mapping.
  • An entry point should still pick up the new some-asset.txt, some-asset2.txt and utils.js.
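
One way to check the criteria above is to compare the output file names of two builds. The sketch below assumes a hypothetical manifest format (source name mapped to hashed output name) and made-up hash values; none of the tools are claimed to emit exactly this.

```javascript
// List the manifest keys whose hashed output name differs between two builds.
function changedFiles(before, after) {
  const names = new Set([...Object.keys(before), ...Object.keys(after)]);
  return [...names].filter((name) => before[name] !== after[name]).sort();
}

// True when exactly the expected files changed: nothing more, nothing less.
function onlyExpectedChanged(before, after, expected) {
  const changed = changedFiles(before, after);
  const wanted = [...expected].sort();
  return (
    changed.length === wanted.length &&
    changed.every((name, i) => name === wanted[i])
  );
}

// Hypothetical manifests: "mapping" stands for the file holding the hash map.
const before = {
  'index.js': 'index.1a2b.js',
  'utils.js': 'utils.3c4d.js',
  'some-asset.txt': 'some-asset.5e6f.txt',
  mapping: 'mapping.7a8b.js',
};
const afterUtilsEdit = {
  ...before,
  'utils.js': 'utils.9f9f.js',
  mapping: 'mapping.0c0c.js',
};
// A failing build: the utils.js change also cascaded into index.js.
const cascaded = { ...afterUtilsEdit, 'index.js': 'index.dddd.js' };

console.log(onlyExpectedChanged(before, afterUtilsEdit, ['utils.js', 'mapping']));
console.log(onlyExpectedChanged(before, cascaded, ['utils.js', 'mapping']));
```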

Conclusion

browserify

Please refer to the js-entry-cascade test. This test is marked as failing because hashing is done after browserify runs the build, which would change the content regardless of resource name. This test does not reflect how browserify expects code to be written.

Issues

  • N/A

parcel

If you build HTML pages that reference scripts via <script src>, Parcel will use a custom runtime which acts as a registry and loader. This means that dynamically loaded content, such as assets and dynamic imports, can be looked up in the registry.

If you build HTML pages that reference scripts via <script type="module" src>, this causes Parcel to output ECMAScript modules rather than use its own loader. This doesn't yet allow for a central registry, though import maps may allow for this in the future.

rollup

You need to use SystemJS, which is a little weighty at 2.4k, but it's easy to get it working for both chunks and assets.
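
SystemJS can resolve module specifiers through an import map, so one possible way to centralize the hashes is to generate that map from the build output. The sketch below is an assumption about how this could be wired up, not the configuration used in this test; buildImportMap and the manifest shape are hypothetical.

```javascript
// Build the JSON body of an import map from a hypothetical manifest that
// maps unhashed module specifiers to hashed output file names.
function buildImportMap(manifest, baseUrl = '/') {
  const imports = {};
  for (const [specifier, hashedFile] of Object.entries(manifest)) {
    imports[specifier] = baseUrl + hashedFile;
  }
  return JSON.stringify({ imports }, null, 2);
}

// Example manifest with made-up hashes for a chunk and an asset.
const importMapJson = buildImportMap({
  utils: 'utils.9b8e7d.js',
  'some-asset.txt': 'some-asset.5e6f70.txt',
});

console.log(importMapJson);
```

The resulting JSON would be placed in a `<script type="systemjs-importmap">` element, so that a changed chunk or asset only alters the page's map rather than every module that imports it.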

webpack

One of Webpack's core principles is that bundled modules are loaded using a small runtime, which acts as a module registry and loader depending on the configuration. The generally recommended approach is to configure Webpack to produce a standalone runtime script.

The runtime script is a natural place to embed bundle hash information, since it already contains the mappings required for translating bundle identifiers into URLs. This centralized hash disambiguation allows dependency bundles to be updated with new hashed URLs, without requiring their dependent bundles to be updated with those hash values. The result is something like an Import Map, and it's enabled by default in Webpack. With a few configuration changes, it's relatively easy to ensure hash changes don't cascade between bundles.
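
A minimal sketch of such a configuration follows, assuming the test's two entry modules sit next to the config file. It uses the optimization.runtimeChunk and [contenthash] options; it is not necessarily the complete set of changes the test needs, and the variable name webpackConfig is only for illustration.

```javascript
// Sketch of a webpack.config.js body; export this object from the config file.
const webpackConfig = {
  entry: {
    index: './index.js',
    profile: './profile.js',
  },
  output: {
    // Name every bundle after a hash of its content.
    filename: '[name].[contenthash].js',
    chunkFilename: '[name].[contenthash].js',
  },
  optimization: {
    // Emit one standalone runtime script shared by all entry points, so the
    // chunk-name-to-URL mapping lives in a single file.
    runtimeChunk: 'single',
  },
};

console.log(JSON.stringify(webpackConfig, null, 2));
```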

One trade-off of centralizing hash information is that all hashes must be loaded up-front, which can impact page load for larger applications. To address this, Webpack uses a one-level hash cascade by default: JS bundle hashes are loaded up-front at initial page load (since this number is small compared to all modules/assets), whereas asset hashes (for things like file-loader) are inlined into the chunks that reference the asset. This ensures most hashes are loaded as-needed and don't impact the initial page load.
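
The cascade described above can be modelled with a toy build. This is not Webpack's implementation: tinyHash, build, loadChunk and the generated file contents are all hypothetical, and the hash is a simple non-cryptographic one chosen so the example needs no dependencies.

```javascript
// Tiny non-cryptographic string hash (djb2 variant), for illustration only.
function tinyHash(text) {
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = ((h * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return h.toString(16).padStart(8, '0');
}

// Toy build: returns the hashed file names produced for a given asset text.
function build(assetText) {
  const assetFile = `some-asset2.${tinyHash(assetText)}.txt`;
  // The asset URL is inlined into the chunk that references it...
  const lazySource = `fetch("${assetFile}")`;
  const lazyFile = `lazy.${tinyHash(lazySource)}.js`;
  // ...while the chunk URL lives only in the runtime's map.
  const runtimeSource = JSON.stringify({ lazy: lazyFile });
  const runtimeFile = `runtime.${tinyHash(runtimeSource)}.js`;
  // The entry asks the runtime for "lazy" by name, so it embeds no hash.
  const indexSource = 'loadChunk("lazy")';
  const indexFile = `index.${tinyHash(indexSource)}.js`;
  return { assetFile, lazyFile, runtimeFile, indexFile };
}

const first = build('This is another asset!');
const second = build('This is another asset, edited!');
console.log(first);
console.log(second);
```

In this model, editing the asset changes the asset file, the chunk that inlines its URL, and the runtime, while the entry bundle keeps its name.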

In cases where there are very few assets or where chunks containing asset references are very large, it can make sense to opt out of the default one-level cascade behavior using splitChunks.cacheGroups. This can be used to move on-demand-loaded asset hashes into the runtime so they are loaded up-front, or into a separate bundle (by omitting name) containing the asset hashes and initialization code. It's also worth noting that assets don't cascade when referenced by entry bundles, which are loaded directly without relying on hashes in the runtime.