Planning for High Concurrency Load Tests: Millions of Users and Beyond

Concurrency: what is it? It can simply be defined as

The property of systems in which several computations are executing simultaneously, and potentially interacting with each other.

Our customers frequently share requirements with us like:

JMeter Load Test with 1,000,000 threads
JMeter Load Test with 1,000,000 RPS
JMeter Load Test with 10,000,000 Requests Per Minute

While Flood can certainly support tests of this scale, we find most companies haven't given enough thought to what type of workload they really need to test with.

Concurrency is often used to define workload for load testing, as in concurrent users. Too often it's the only input defined. In reality there are a number of factors which contribute to workload and affect concurrency itself. Some of these I will talk about in this post.

How to calculate target concurrent users?

A common method is to divide the number of unique users in a given period by the number of average-length visits that fit into that period.

For example, 12,000 unique visitors per hour / ( 60 minutes / 15 minutes per visit ) would equal 3,000 concurrent users.
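
The same arithmetic can be written as arrival rate multiplied by average visit duration, which is the relationship usually known as Little's Law. The sketch below is only an illustration of that calculation; the function and parameter names are made up for this example.

```python
def concurrent_users(unique_visitors: int, period_minutes: float,
                     avg_visit_minutes: float) -> float:
    """Average concurrency = arrival rate x average visit duration."""
    # Arrival rate, assuming visitors are spread evenly over the period.
    arrival_rate_per_minute = unique_visitors / period_minutes
    return arrival_rate_per_minute * avg_visit_minutes


# 12,000 unique visitors per hour, 15 minutes per visit
print(concurrent_users(12_000, 60, 15))  # 3000.0
```

Note that the result is an average. It says nothing about peaks, which is where the problems below come in.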

Problems with this approach include:

Assumption that visitors are spread evenly across the 1 hour period sampled. Visitors are more likely to follow some form of Poisson process.
The number of unique visitors may be miscalculated. Depending on the method used to sample uniques, this can often be misleading, particularly if using things like source IPs which can often be shared or proxied by real users.
Concurrency contributes to but does not define overall workload. What were those users actually doing? Idle or busy?

More rigorous methods might be to profile or trace real users over different periods, or to use descriptive statistics beyond the averages above, such as a frequency distribution or box plot, so you can better model concurrent use.

Concurrency is important to get right because ultimately, the workload is most likely being processed by some form of queuing system(s), and concurrency can affect arrival time on those queues.

In real life computer systems, distribution is rarely uniform so it is good to try and observe, understand and model concurrency within your system under test.
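
To get a feel for how far concurrency can drift from the average, here is a minimal simulation sketch. It assumes arrivals follow a Poisson process and, purely for illustration, that visit durations are exponentially distributed around the 15 minute mean. Neither assumption comes from real traffic, and all names are hypothetical.

```python
import bisect
import random
import statistics


def simulate_concurrency(rate_per_min, mean_visit_min, total_min,
                         warmup_min, seed=42):
    """Return active-user counts sampled once per minute after a warm-up."""
    rng = random.Random(seed)
    starts, ends = [], []
    t = 0.0
    while True:
        t += rng.expovariate(rate_per_min)  # exponential gap between arrivals
        if t >= total_min:
            break
        starts.append(t)
        # Assumed visit length: exponential with the given mean.
        ends.append(t + rng.expovariate(1.0 / mean_visit_min))
    ends.sort()  # starts are already in ascending order
    samples = []
    for minute in range(warmup_min, total_min):
        started = bisect.bisect_right(starts, minute)
        finished = bisect.bisect_right(ends, minute)
        samples.append(started - finished)
    return samples


# 200 arrivals per minute (12,000 per hour), 15 minute average visit
samples = simulate_concurrency(200, 15, total_min=360, warmup_min=120)
q1, median, q3 = statistics.quantiles(samples, n=4)
print("min", min(samples), "q1", q1, "median", median,
      "q3", q3, "max", max(samples))
```

The five numbers printed are the ingredients of a box plot. Swapping in distributions observed from your own users would be the obvious next step.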

How to start concurrent users?

Assuming you've created a model, good or bad, it's now time to turn to your favourite load testing tool and simulate that model under load.

At the heart of JMeter is the Thread Group, which lets you define the number of users, a rampup period, and either the number of times to execute the test or a fixed duration for the test. Essentially, rampup delays the start of individual threads.

For example, 3,000 concurrent users with a rampup of 300 seconds will impose a 0.1 second delay between the start of each user (300 seconds / 3,000 users).

This provides a linear, uniform distribution for starting threads, so unless your objectives are to measure performance once all users have started up (and ignore startup itself) then this method is unlikely to simulate a realistic load profile.
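
As a rough sketch of that linear ramp, assuming thread starts are spaced evenly across the rampup period, the start offset of each user can be computed as below. The function name is made up for illustration.

```python
from typing import List


def start_offsets(threads: int, ramp_up_seconds: float) -> List[float]:
    """Seconds after test start at which each thread begins (linear ramp)."""
    delay = ramp_up_seconds / threads  # fixed gap between consecutive starts
    return [i * delay for i in range(threads)]


offsets = start_offsets(3000, 300)
print(offsets[1] - offsets[0])  # gap between the first two users, in seconds
print(offsets[-1])              # start time of the last user, in seconds
```

Every gap is identical, which is exactly what makes this profile unlike real arrivals.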

The popular JMeter Plugins library provides an alternative called the Ultimate Thread Group which, as its name implies, gives you more options for starting concurrent users. This includes variable quantities of threads, rampup times and duration. With patience and a little wrist RSI from plugging the information into the UI, you can create more realistic concurrent load profiles.

FWIW I often find many performance related defects during rampup, especially with more realistic models for starting users. It's an important period of your test to monitor and measure for performance. Too often this period is discarded from test results.

How to run concurrent users?

Now that your users have started and the test is under way, many testers will focus on this period of the test and use it as the basis for further observations and results. Some testers may refer to it as "peak concurrent load" or "steady state load".

A key component aside from concurrency which will affect this period is throughput.

Throughput can be measured in many different ways, such as network throughput or number of requests per second. But ultimately throughput is created by users performing some action on the system under test. As I mentioned earlier, a high concurrent user test might not mean much if the majority of users are just idle, so throughput is just as important to model and effectively simulate.

Throughput is most often impacted by things like think time (between user transactions) or pacing (time between each iteration), but it is also affected by the service time in any of the system queues, such as web app response time, or network latency and bandwidth.

In a Poisson process, the time between each pair of consecutive events has an exponential distribution, and each of these inter-arrival times is assumed to be independent of other inter-arrival times.

So how can we simulate that?

JMeter has a decent set of Timers which can be used to help randomize the inter-arrival times of user actions. I like to use the Gaussian Random Timer which

Pauses each thread request for a random amount of time, with most of the time intervals occurring near a particular value. The total delay is the sum of the Gaussian distributed value (with mean 0.0 and standard deviation 1.0) times the deviation value you specify, and the offset value.
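
The formula in that description can be sketched in a few lines. This is an illustration of the arithmetic only, not JMeter's code. Clamping negative values to zero is my own assumption for the sketch, and the names are hypothetical.

```python
import random


def gaussian_delay_ms(deviation_ms: float, offset_ms: float,
                      rng: random.Random) -> float:
    """Delay = standard normal value x deviation + constant offset."""
    delay = rng.gauss(0.0, 1.0) * deviation_ms + offset_ms
    # Assumption for this sketch: a pause can never be negative.
    return max(0.0, delay)


rng = random.Random(1)
delays = [gaussian_delay_ms(500, 3000, rng) for _ in range(10_000)]
print(sum(delays) / len(delays))  # average pause, close to the 3000 ms offset
```

Most pauses land near the offset, with the deviation controlling how widely they spread around it.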

JMeter also has a Test Action sampler which I like to use to add a variable pause at the end of each iteration. This can help determine the pacing, or the rate at which each user goes through a list of transactions. Still, pacing between iterations can be difficult to get right, especially if you're aiming to simulate average visit duration, for example.

The best way, I believe, is to shake out your scripts manually with a single user and multiple iterations, so you get an understanding of timing between iterations.

Frequency of execution for particular code blocks can be controlled with the Throughput Controller. I prefer to use percent execution so that I can 'weight' a certain percentage of the iterations through the test plan.
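
To illustrate the idea of weighting by percentage, here is a small sketch that runs a block on an evenly spaced share of iterations. It is one way to get the effect and is not meant to describe how JMeter implements it; the function name is made up.

```python
import math


def should_run(iteration: int, percent: float) -> bool:
    """True when this 0-based iteration falls in the weighted share."""
    before = math.floor(iteration * percent / 100.0)
    after = math.floor((iteration + 1) * percent / 100.0)
    # The running quota ticks up by one on the iterations that should execute.
    return after > before


runs = sum(should_run(i, 25) for i in range(100))
print(runs)  # 25
```

With 25 percent, the block runs on every fourth iteration rather than in a random burst, which keeps the mix of transactions steady throughout the test.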

The Constant Throughput Timer can also be useful as it introduces variable pauses, calculated to keep the total throughput (in terms of samples per minute) as close as possible to a given figure, which itself can be random.

Translating samples per minute (as JMeter executed them) to your model's throughput targets can be difficult.

A certain amount of effort / understanding is required to accurately model this.
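
As a starting point, the back-of-envelope sketch below relates users, iteration time and samples per minute. It assumes every user loops continuously and that an iteration's duration (response times plus think time plus pacing) is roughly constant, which real tests only approximate. All names and figures are hypothetical.

```python
import math


def samples_per_minute(users: int, samples_per_iteration: int,
                       iteration_seconds: float) -> float:
    """Expected samples per minute across all users."""
    iterations_per_minute = users * 60.0 / iteration_seconds
    return iterations_per_minute * samples_per_iteration


def users_for_target(target_samples_per_minute: float,
                     samples_per_iteration: int,
                     iteration_seconds: float) -> int:
    """Users needed to reach a target rate, rounded up."""
    per_user = samples_per_iteration * 60.0 / iteration_seconds
    return math.ceil(target_samples_per_minute / per_user)


# 3,000 users, 5 samples per iteration, 75 seconds per iteration
print(samples_per_minute(3000, 5, 75))  # 12000.0
print(users_for_target(12_000, 5, 75))  # 3000
```

Measuring the real iteration time during a single user shakeout gives you the input this calculation needs.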

Understanding results!

Assuming you've shaken out your JMeter scripts standalone, you can also run them on Flood IO using our free nodes to get a feel for the reports that we produce.

Concurrency in Flood IO is the number of active threads for all thread groups in your test plan. However before you get too excited about the massive concurrency numbers you can generate with JMeter or any other popular load testing tool, have a think about what concurrency actually means in your test plan.

It is possible to have 100K users sitting bone idle or 1K users iterating faster than a thousand startled gazelles, effectively applying the same workload on the system under test. So concurrency as a single measure lacks context.

Other important metrics you should look at include throughput and errors. There are a couple of ways to view throughput: by network or by requests. I generally prefer the latter as it ties in well with external monitoring tools such as New Relic.

Monitoring for errors from the client side is extremely important to help track down potential bottlenecks on the server side. For example, the Response Codes graph will help show you non-200 HTTP response codes. You can also view failed transaction counts in the Transaction Response Times graph.

TL;DR

Concurrency on its own is not enough to define system workload. Other metrics such as Throughput can be used in combination to help describe workload.

Beware of averages and uniform distributions. Default settings in the majority of commercial and open source testing tools are normally not enough to simulate a realistic workload model.

The starting or ramping up of users in your test plan is just as important as "steady state" load. Don't discard the results!

Always look for relationships between different metrics such as Concurrency, Throughput and Response Time. Never ignore Errors!
