Understanding the JMeter cache

How to use the HTTP Cache Manager to make your JMeter load testing more realistic

Web caching can be a complicated topic, and it is vital to web application performance. The main reasons for caching are to reduce latency and network traffic between a client and the origin server. Most representations, such as HTML, JavaScript, CSS and images, are cacheable, yet implementations of caching are often misunderstood. In this post we'll deal with the most obvious cache, the browser cache, and how JMeter emulates this browser behaviour.

JMeter is not a Browser

A common misconception among first-time users of JMeter is that it functions exactly like a browser. It doesn't. JMeter, like most performance test tools, can emulate browser behaviour in a number of ways. Static resource fetching, caching, and cookie and header management are some of the main components of browser emulation. Let's talk about caching and how it can be implemented in JMeter.

Browser Cache

You're probably already familiar with your browser's local cache. This lets you keep representations of content on local storage. When you hit your browser's back button, for example, chances are you're seeing cached content. Based on simple rules, the browser cache typically checks that representations are fresh according to the current browser session. If representations are determined to be stale, new requests will be made to the origin server.

Other Caches

Other caches, such as proxies, can exist between your browser cache and the origin server. It's important to be aware of these as intermediary caching obviously impacts the workload profile of the origin server itself. Often these are outside of your control, and are particularly prevalent in any form of testing outside of a controlled network environment.

Cache Management

Freshness and validation are two important concepts of cache management. Fresh content is typically served straight from the browser cache, whilst validation lets the browser ask the origin server whether cached content has changed and avoid downloading it again if it hasn't. Cache management, in terms of determining if content is stale or fresh, is mostly implemented via HTTP headers.

Pragma: no-cache

Pragma headers specify optional behaviour. The Pragma: no-cache header is only defined for backwards compatibility with HTTP/1.0 and is equivalent to the Cache-Control: no-cache header directive. Some caches will simply ignore the Pragma directive so it's not a great way of ensuring content is not cached.

Cache-Control

Cache-Control headers were introduced in HTTP/1.1 to specify directives that must be followed by all caching implementations in the request/response chain. The common Cache-Control directives you will come across are:

  • public
  • private
  • no-cache
  • no-store
  • no-transform
  • must-revalidate
  • proxy-revalidate
  • max-age
  • s-maxage
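To make the shape of these directives concrete, here is a minimal Python sketch that splits a Cache-Control header value into its directives. The function name parse_cache_control is made up for this illustration, and the parsing is deliberately naive: it assumes no directive carries a quoted value that itself contains a comma.

```python
def parse_cache_control(header_value):
    """Split a Cache-Control header value into a dict of directives.

    Directives without a value (for example no-cache) map to None.
    Naive: does not handle quoted values that contain commas.
    """
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directive names are case-insensitive, so normalise to lower case.
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("public, max-age=3600, must-revalidate"))
# {'public': None, 'max-age': '3600', 'must-revalidate': None}
```

Values are kept as strings here; a real cache would convert max-age and s-maxage to integers before using them.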

Expires

The Expires header gives the date/time after which the response is considered stale. The presence of an Expires header field with a date value of some time in the future on a response that otherwise would by default be non-cacheable indicates that the response is cacheable, unless indicated otherwise by a Cache-Control header field.
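As a rough illustration of that freshness check, the following Python sketch compares an Expires value against the current time. The helper name is_fresh is hypothetical, and treating an unparseable date as already expired is an assumption made for this sketch rather than a description of any particular cache.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(expires_header, now=None):
    """Return True if the Expires date is still in the future."""
    now = now or datetime.now(timezone.utc)
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # Assumption: invalid dates (for example "0") count as expired.
        return False
    if expires.tzinfo is None:
        # Dates without a usable zone are interpreted as UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    return expires > now


now = datetime(2013, 10, 29, 12, 0, 0, tzinfo=timezone.utc)
print(is_fresh("Wed, 30 Oct 2013 12:00:00 GMT", now))  # True
print(is_fresh("Mon, 28 Oct 2013 12:00:00 GMT", now))  # False
```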

Last-Modified

The Last-Modified header field indicates the date and time at which the origin server believes the variant was last modified.

ETag

The ETag header field provides the current value of the entity tag for the requested variant.
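Last-Modified and ETag are the validators a client sends back when it revalidates cached content. In HTTP, Last-Modified is echoed in an If-Modified-Since request header and ETag in an If-None-Match request header. A small Python sketch of that mapping follows; the function name conditional_headers and the sample values are invented for the example.

```python
def conditional_headers(last_modified=None, etag=None):
    """Build the request headers for a conditional GET from stored validators."""
    headers = {}
    if last_modified:
        # Ask the server to respond only if the resource changed after this date.
        headers["If-Modified-Since"] = last_modified
    if etag:
        # Ask the server to respond only if the entity tag no longer matches.
        headers["If-None-Match"] = etag
    return headers


print(conditional_headers(last_modified="Tue, 29 Oct 2013 09:20:07 GMT", etag='"abc123"'))
```

If the server decides nothing has changed, it can answer with a 304 and no body.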

JMeter Cache Manager

The JMeter HTTP Cache Manager is used to add caching functionality to HTTP requests within its scope. Essentially, it checks response headers and respects most of the headers related to cache management, including Expires, ETag and Cache-Control directives.

LRU Map

It implements a per-user/thread Map which has a maximum size and uses a Least Recently Used algorithm to remove items from the Map when the maximum size is reached and new items are added.

The default size is 5000 items, an item being stored as the URL along with Last-Modified, Expires and ETag header information. It's possible to increase this value; however, setting an appropriate value is difficult because JMeter itself gives no feedback. Flood IO provides a Response Code timeline chart, so if you only see response code 200, or a much higher rate of response code 200 than of the expected response codes 304 and 204 when using the Cache Manager, you may need to adjust the maximum size.
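The idea of a bounded map with Least Recently Used eviction can be sketched in a few lines of Python. This is not JMeter's code (JMeter is written in Java); the class name LRUCache and its methods are made up for illustration, with the default of 5000 taken from the text above.

```python
from collections import OrderedDict


class LRUCache:
    """A bounded map that evicts the least recently used entry when full."""

    def __init__(self, max_size=5000):
        self.max_size = max_size
        self._entries = OrderedDict()

    def get(self, url):
        entry = self._entries.get(url)
        if entry is not None:
            # A read counts as a use, so move the entry to the "recent" end.
            self._entries.move_to_end(url)
        return entry

    def put(self, url, last_modified=None, expires=None, etag=None):
        self._entries[url] = {
            "last_modified": last_modified,
            "expires": expires,
            "etag": etag,
        }
        self._entries.move_to_end(url)
        if len(self._entries) > self.max_size:
            # Drop the entry at the "oldest" end.
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)


cache = LRUCache(max_size=2)
cache.put("/a", etag='"1"')
cache.put("/b")
cache.get("/a")        # "/a" is now the most recently used entry
cache.put("/c")        # the map is full, so "/b" is evicted
print(cache.get("/b"))  # None
```

With a map that is too small for the number of distinct URLs a virtual user visits, entries are evicted before they can be reused, which is one way to end up with mostly 200 responses.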

Expected Response Codes

When using the JMeter Cache Manager you can expect to see HTTP response codes 304 or 204.

When JMeter makes a conditional GET request but the document has not been modified, a server will typically respond with a 304 response code and no response body.

Thread Name: Thread Group 1-1
Sample Start: 2013-10-29 20:20:07 EST
Load time: 3
Latency: 0
Size in bytes: 150
Headers size in bytes: 150
Body size in bytes: 0
Sample Count: 1
Error Count: 0
Response code: 304
Response message: Not Modified

When JMeter simulates using content directly from its cache, it will record a 204 response code and no response body.

Thread Name: Thread Group 1-1
Sample Start: 2013-10-29 20:20:07 EST
Load time: 0
Latency: 0
Size in bytes: 0
Headers size in bytes: 0
Body size in bytes: 0
Sample Count: 1
Error Count: 0
Response code: 204
Response message: No Content

Confusingly, this 204 does not carry its usual HTTP/1.1 meaning, where the server has fulfilled the request but does not need to return an entity-body and might want to return updated meta information. The HTTPClient4 implementation in JMeter simply uses a constant from HttpURLConnection to set this value. So rest assured, no actual requests are made to the origin server in this case.

Testing the Directives

Putting this all together, we can test a simple scenario which has two iterations with a Cache Manager, to simulate the first visit (with an empty browser cache) and a second visit (with a primed cache). The target site is a simple stub using Ruby / Sinatra to exercise different cache control headers.

no-cache

If a no-cache directive is set, JMeter will keep the response in its cache but set the expires date to null, which triggers a revalidation for each request. You will generally see an HTTP response code 304 for this type of conditional request. This is consistent with the HTTP/1.1 specification, which says a cache MUST NOT use the response to satisfy a subsequent request without successful revalidation with the origin server.

no-store

If no-store is sent in a response, a cache MUST NOT store any part of either this response or the request that elicited it. In our test, JMeter did in fact cache this type of response, which is not consistent with the HTTP/1.1 specification.

This has since been fixed via bug ID 55721 thanks to prompt support by the JMeter core team!

Remaining Directives

The remaining directives, including the Expires and ETag headers, all behave consistently with the HTTP/1.1 specification. In general, if the "Use Cache-Control/Expires header" option is selected in the Cache Manager, JMeter will record an HTTP response code 204 for content in its cache that is still fresh; otherwise it will make a conditional request and you will see an HTTP response code 304 where appropriate.
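The behaviour described in this section can be summarised as a small decision function. The Python sketch below is a simplified model, not JMeter's implementation: the name plan_request, the entry dictionary and the use_expires flag (standing in for the "Use Cache-Control/Expires header" option) are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def plan_request(entry, now, use_expires=True):
    """Return what a cache-aware client would do for a URL.

    entry is None (not cached) or a dict with the optional keys
    "expires" (a datetime or None), "last_modified" and "etag".
    """
    if entry is None:
        return "full request"         # expect a 200 from the server
    expires = entry.get("expires")
    if use_expires and expires is not None and expires > now:
        return "serve from cache"     # JMeter records this as a 204
    if entry.get("etag") or entry.get("last_modified"):
        return "conditional request"  # expect a 304 if unchanged
    return "full request"


now = datetime(2013, 10, 29, 20, 20, 7, tzinfo=timezone.utc)
fresh = {"expires": now + timedelta(hours=1), "etag": '"abc"'}
no_cache = {"expires": None, "etag": '"abc"'}  # no-cache: kept, but never fresh

print(plan_request(None, now))                      # full request
print(plan_request(fresh, now))                     # serve from cache
print(plan_request(no_cache, now))                  # conditional request
print(plan_request(fresh, now, use_expires=False))  # conditional request
```

Modelling no-cache as an entry with no expiry date keeps the rule for that directive in the data rather than in a special case.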

A Real Example: squarefoot.hk

Let's look at a realistic example using a real estate site based in Hong Kong. This example is interesting because it has a wide range of content and is fairly heavy in terms of a first visit to the site.

YSlow shows that the page has a total of 136 HTTP requests and a total weight of 933.6K bytes with an empty cache.

Subsequent visits to the site should see 17 requests with total weight of 165.1K bytes with a primed cache.

Using a simple JMeter plan with an HTTP Cache Manager, 1 thread and 1 iteration with 2 requests to the home page produces the following summary statistics when sampled via a debugging proxy (Charles):

First Visit Empty Cache
Requests 100
Responses 1.03 MB

Second Visit Primed Cache
Requests 37
Responses 123 KB

That's a reasonable drop in requests and response size, but the reduction in requests is not as large as the one we observed with the simple YSlow comparison.
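To put numbers on that comparison, the request counts quoted above can be turned into percentage reductions. The helper name reduction is made up for this quick calculation.

```python
def reduction(first, second):
    """Percentage drop from the first visit to the second visit."""
    return round(100 * (first - second) / first, 1)


print(reduction(136, 17))  # YSlow requests: 87.5
print(reduction(100, 37))  # JMeter requests: 63.0
```

So YSlow predicts roughly an 87.5% drop in requests on the second visit, while the JMeter run saw about 63%.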

Part of the reason for this is that we're getting JMeter to automatically parse the HTML file and send HTTP/HTTPS requests for all images, Java applets, JavaScript files, CSS files, etc. referenced in the file. This is not 100% accurate, but it also allows us to use a pool of concurrent connections to get embedded resources, which provides a more realistic simulation of browser behaviour. This is evident in a waterfall chart of the traffic generated with this approach.

What this means is that you will need to determine which resources were not downloaded using this approach and include them manually in your test plan if they are significant. There may also be third-party domains which we don't want to test, for example:

http://b.scorecardresearch.com
http://reagroup.122.2o7.net
http://secure-sg.imrworldwide.com
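When reviewing a list of URLs captured by a proxy, a quick way to separate in-scope requests from these third-party hosts is to filter on the hostname. The Python sketch below assumes the captured URLs are available as plain strings; the names EXCLUDED_HOSTS and in_scope, and the sample paths, are invented for the example.

```python
from urllib.parse import urlsplit

# Third-party hosts we do not want to include in the test.
EXCLUDED_HOSTS = {
    "b.scorecardresearch.com",
    "reagroup.122.2o7.net",
    "secure-sg.imrworldwide.com",
}


def in_scope(url):
    """Return True if the URL's host is not on the exclusion list."""
    return urlsplit(url).hostname not in EXCLUDED_HOSTS


captured = [
    "http://squarefoot.hk/",
    "http://b.scorecardresearch.com/beacon.js",  # hypothetical path
    "http://secure-sg.imrworldwide.com/pixel",   # hypothetical path
]
print([url for url in captured if in_scope(url)])
# ['http://squarefoot.hk/']
```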

That's why it's extremely useful to run your JMeter test plans through a debugging proxy like Charles or Fiddler to get an idea of what your script is doing for a single user over multiple iterations. You can also do this more simply with a View Results Tree listener in JMeter, but the waterfall view is particularly nice for inspecting network traffic.

The other reason for discrepancies between first and second visits to the site is that the contents regularly change. Different images are displayed each time we visit the site so those items won't already be cached, hence a new request to the origin server. In the case of YSlow, it is comparing the exact same request/response chain.

TL;DR

The JMeter HTTP Cache Manager is used to add caching functionality to HTTP requests within its scope. Essentially, it stores a copy of the URL along with the Last-Modified, Expires and ETag headers taken from the response in a map which evicts items based on a Least Recently Used algorithm.

It will typically record HTTP response codes 204 or 304 for items which are cacheable and meet freshness or validation criteria. The response body will be empty for these items, so you need to be careful with any assertions on content.

It's extremely useful to run your JMeter test plans through a debugging proxy like Charles or Fiddler to get an idea of what your script is doing for a single user over multiple iterations. Alternatively, use a View Results Tree listener in JMeter.
