Browser HTTP caching

A browser has three main ways to use its cache when fetching a resource:

  1. Do not use the cache.
  2. Use the cache only if the resource has not been modified.
  3. Use the cache until the resource’s expiry date has been reached.

Do not use the browser’s cache

The server may forbid the browser from storing a resource in its cache. To do so, the server adds the Cache-Control: no-store directive to the response headers. This is useful for preventing private information from being stored in the cache.
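As an illustration, here is a minimal sketch of a server-side handler that marks a response as non-storable. It assumes a Python WSGI application; the function name private_app and the response body are hypothetical and only serve the example.

```python
def private_app(environ, start_response):
    """Hypothetical WSGI app serving a page that must never be stored in a cache."""
    body = b"private account details\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # no-store: the browser (and any intermediate cache) must not keep a copy.
        ("Cache-Control", "no-store"),
    ]
    start_response("200 OK", headers)
    return [body]
```

The app can be served locally with the standard library, for instance with wsgiref.simple_server.make_server("127.0.0.1", 8000, private_app).serve_forever().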

A user may also bypass the browser’s cache for all resources of a web page by doing a “hard refresh” of the page (press Ctrl+F5 or Shift+F5 in the browser). This also adds the Cache-Control: no-cache directive to the request headers, asking intermediate caches such as proxies not to answer with a stored copy of the resource without first checking with the origin server. A developer may also add this directive to a scripted request (e.g. with cURL) to make sure the latest version of the resource is fetched.
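A scripted request carrying this directive could look like the following sketch. It uses Python's urllib instead of cURL; the helper name build_no_cache_request and the URL in the usage note are assumptions made for the example.

```python
import urllib.request


def build_no_cache_request(url):
    """Build a GET request asking caches along the way not to answer from a stored copy."""
    return urllib.request.Request(url, headers={"Cache-Control": "no-cache"})
```

Passing the result to urllib.request.urlopen, e.g. urllib.request.urlopen(build_no_cache_request("https://example.com/app.js")), would send the request with the directive set.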

Use the cache only if the resource has not been modified

There are two ways to verify whether a resource has changed:

  1. Compare the file’s last modification date on the server with the last modification date known by the client (using the Last-Modified response header and the If-Modified-Since request header).
  2. Compare the file’s ETag value with the last ETag value known by the client (using the ETag response header and the If-None-Match request header). The ETag value is often a hash of the file’s content, but depending on the web server it can also be a revision number or another generated value (for example, one derived from the file’s size and modification time).

Note: the file’s ETag value is considered more accurate than the last modification date. When the ETag is derived from the file’s content, rewriting the file with identical data leaves the ETag unchanged, whereas the last modification date changes. The last modification date is used as a fallback mechanism for resources that don’t have an ETag value.
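To make the content-hash case concrete, here is a small sketch of how such an ETag could be computed. The function name content_etag, the choice of SHA-256, and the truncation to 16 hex characters are assumptions for the example; real web servers each have their own scheme.

```python
import hashlib


def content_etag(content: bytes) -> str:
    """Derive a strong ETag from the content alone (hypothetical scheme)."""
    digest = hashlib.sha256(content).hexdigest()[:16]
    # ETag values are sent as quoted strings.
    return f'"{digest}"'
```

Because only the bytes are hashed, saving the file again with the same data yields the same ETag, while any change to the content yields a different one.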

Once the browser has cached a resource together with these validators, it includes the If-None-Match and/or If-Modified-Since request headers when it requests that resource again: If-None-Match carries the last known ETag value of the cached resource, and If-Modified-Since carries its last known modification date.

The server compares these values with the current ETag value and last modification date of the file. If the resource has been modified, the server sends the resource back with a 200 status, putting the file’s new ETag value in the ETag response header and its new last modification date in the Last-Modified response header. The client uses these two new values for the If-None-Match and If-Modified-Since request headers in subsequent requests.

If the resource has not been modified, the server sends a 304 Not Modified response without a body. The browser then uses the copy stored in its cache.
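The server-side decision described above can be sketched roughly as follows. This is a simplified illustration, not how any particular web server implements it: the function name conditional_response and its parameters are hypothetical, last_modified is assumed to be a timezone-aware UTC datetime, and the If-None-Match parsing naively splits on commas.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def _opaque(tag):
    """Strip whitespace and the weak-validator prefix so W/"x" matches "x"."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def conditional_response(request_headers, etag, last_modified):
    """Return (status, validator headers) for a GET on one resource."""
    validators = {
        "ETag": etag,
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")

    if if_none_match is not None:
        # When both headers are sent, If-None-Match takes precedence.
        client_tags = [_opaque(t) for t in if_none_match.split(",")]
        not_modified = "*" in client_tags or _opaque(etag) in client_tags
    elif if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
        except (TypeError, ValueError):
            since = None  # unparsable date: ignore the header
        # HTTP dates have one-second resolution, so drop microseconds.
        not_modified = (
            since is not None and last_modified.replace(microsecond=0) <= since
        )
    else:
        not_modified = False

    # 304 means "send no body, reuse the cached copy"; 200 means "send the file".
    return (304 if not_modified else 200), validators
```

The returned status tells the caller whether to attach the file body (200) or send headers only (304).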

This caching mechanism doesn’t reduce the number of requests sent to the server, as the client has to validate its cached version of a resource against the version on the server. However, if the client has the latest version (the server returns a 304 status for the requested resource), the content of the file is not downloaded; instead, the client uses the copy in its cache, which saves server bandwidth and is much faster (the copy is read from the client’s memory or local filesystem).

Use the cache until the resource’s expiry date has been reached

The server sets the resource’s lifetime in the response headers, for example:

Cache-Control: max-age=600

The max-age directive gives the number of seconds, counted from the time the response was generated, during which the resource may be served from the cache without contacting the server.

The server may also add the Cache-Control: must-revalidate directive, which forbids the browser from using the resource once it has expired without first revalidating it with the server.

As long as the resource is not expired, the resource will be fetched from the browser’s cache; so in contrast to the previous strategy, the browser will not even send a request to the server to verify if the resource has been modified. Server bandwidth is reduced even further and resources are fetched even faster (the browser doesn’t have to wait for a server request to complete before using its cache).
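The freshness check a cache performs before deciding whether to contact the server can be sketched like this. It is a simplified model that only looks at max-age; the function names are hypothetical, and real browsers take more directives and headers into account.

```python
def max_age_seconds(cache_control):
    """Return the max-age value of a Cache-Control header, or None if absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip().strip('"')
        if name.strip().lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def can_use_without_request(cache_control, age_seconds):
    """True while the cached copy is still fresh, i.e. younger than max-age."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age
```

With Cache-Control: max-age=600, a copy that is 599 seconds old is still served straight from the cache, while a copy that is 600 seconds old has expired and needs to be revalidated or re-fetched.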

Default behavior when no Cache-Control header is present

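If no Cache-Control header (and no Expires header) is present, the browser may still cache static assets such as JS, CSS, images, etc. and will use a heuristic to estimate each asset's expiration date. A common heuristic is to treat the asset as fresh for 10% of the time elapsed since its Last-Modified date.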
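One widely used heuristic takes a fraction of the time between the response's Date and Last-Modified headers. The sketch below assumes a fraction of 10%, a typical value; the function name heuristic_lifetime_seconds is hypothetical, and the exact rule varies between browsers.

```python
from email.utils import parsedate_to_datetime


def heuristic_lifetime_seconds(date_header, last_modified_header, fraction=0.1):
    """Estimate how long a response without explicit expiry may be treated as fresh."""
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    elapsed = (date - last_modified).total_seconds()
    # A Last-Modified date in the future gives no usable estimate.
    return max(0.0, elapsed * fraction)
```

Under this rule, a file last modified ten days before it was served would be considered fresh for roughly one day.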

A common caching strategy

A common caching strategy is to cache static resources such as JS and CSS files with a long expiration date, say 1 year (max-age=31536000), and change the URL of the resource whenever its content changes. The URL may change by appending a hash of the file's content or a version number to the filename. This combines the benefits of both approaches: the browser sends no request at all for a resource it has cached, yet picks up a modified resource immediately because the new URL is not in its cache.
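A build step implementing this naming scheme might look like the following sketch. The function name fingerprinted_name, the use of SHA-256, and the 8-character digest are assumptions for illustration; bundlers and frameworks each have their own conventions.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(filename, content, digest_length=8):
    """Insert a short hash of the content before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

Calling fingerprinted_name("static/app.js", content) returns a path of the form static/app.<hash>.js. The name stays stable as long as the content is unchanged, so the file can safely be served with a one-year max-age.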
