Testing

Use Cases to Test Cases

Applications are typically designed with specific end-user scenarios documented as use cases (for example, see the book Writing Effective Use Cases by Alistair Cockburn). These use cases drive the test cases that are created for load testing.

100% vs 80/20 rule?

A common perception in IT is that performance testing can be accommodated by what is known as the 80/20 rule: test the 80% of actions that users perform most often and ignore the 20% that they perform less frequently. However, this does not address the 20% of actions that can induce a negative performance event and seriously degrade performance for the other 80%. Performance testing should always test 100% of the documented use cases.

The 80/20 rule also applies to how far you should tune. You can increase performance by disabling things such as performance metrics (PMI) and logging, but this may sacrifice serviceability and maintainability. Unless you are actually benchmarking for top speed, we do not recommend applying such tuning.

Load Testing

General testing guidelines:

Begin by choosing a benchmark, a standard set of operations to run. This benchmark exercises those application functions experiencing performance problems. Complex systems frequently need a warm-up period to cache objects, optimize code paths, and so on. System performance during the warm-up period is usually much slower than after the warm-up period. The benchmark must be able to generate work that warms up the system prior to recording the measurements that are used for performance analysis. Depending on the system complexity, a warm-up period can range from a few thousand transactions to longer than 30 minutes.
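
The warm-up requirement can be sketched as a small harness that runs the workload for a while before it starts recording. This is a minimal illustration, not a replacement for a load testing tool: the function name run_benchmark, the iteration counts, and the stand-in workload are all assumptions made for the example.

```python
import time
from typing import Callable, List


def run_benchmark(operation: Callable[[], None],
                  warmup_iterations: int,
                  measured_iterations: int) -> List[float]:
    """Run `operation` with a warm-up phase; return measured latencies in seconds."""
    # Warm-up phase: exercise the system but discard the timings so that
    # cache population and code path optimization do not skew the results.
    for _ in range(warmup_iterations):
        operation()

    # Measurement phase: record one latency per operation.
    latencies = []
    for _ in range(measured_iterations):
        start = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - start)
    return latencies


if __name__ == "__main__":
    # Stand-in workload (assumption); replace with a real request
    # to the system under test.
    def sample_operation() -> None:
        sum(range(1000))

    results = run_benchmark(sample_operation,
                            warmup_iterations=2000,
                            measured_iterations=500)
    mean_ms = 1000.0 * sum(results) / len(results)
    print(f"measured {len(results)} operations, mean {mean_ms:.3f} ms")
```

The right warm-up length depends on the system; a time-based warm-up (for example, run for a fixed number of minutes) may suit larger systems better than a fixed iteration count.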

Another key requirement is that the benchmark must be able to produce repeatable results. If the results vary more than a few percent from one run to another, consider the possibility that the initial state of the system might not be the same for each run, or the measurements are made during the warm-up period, or that the system is running additional workloads.
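
One way to check repeatability is to compare the throughput of several identical runs and flag the set if the spread exceeds a tolerance. The sketch below assumes "a few percent" means 3%; that threshold and the function names are illustrative choices, not values from any tool.

```python
import statistics
from typing import Sequence


def relative_spread(throughputs: Sequence[float]) -> float:
    """Return (max - min) / mean of the per-run throughputs, as a fraction."""
    mean = statistics.mean(throughputs)
    return (max(throughputs) - min(throughputs)) / mean


def is_repeatable(throughputs: Sequence[float], tolerance: float = 0.03) -> bool:
    """True if the runs differ by no more than `tolerance` (assumed 3%)."""
    return relative_spread(throughputs) <= tolerance


if __name__ == "__main__":
    stable_runs = [100.0, 101.0, 102.0]    # requests per second, per run
    unstable_runs = [100.0, 110.0, 120.0]
    print(is_repeatable(stable_runs))
    print(is_repeatable(unstable_runs))
```

If a set of runs fails this check, investigate the initial state, the warm-up period, and competing workloads before comparing tuning changes.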

Several tools facilitate benchmark development. They range from tools that simply invoke a URL to script-based products that can interact with dynamic data generated by the application. IBM Rational has tools that can generate complex interactions with the system under test and simulate thousands of users. Producing a useful benchmark requires effort and needs to be part of the development process. Do not wait until an application goes into production to determine how to measure performance.

The benchmark records throughput and response time results in a form that allows graphing and other analysis techniques.
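
As an illustration of a graph-friendly form, the following sketch groups raw samples into fixed intervals and writes throughput and mean response time per interval as CSV. The sample format (completion time, response time), the one-second interval, and the column names are assumptions for the example.

```python
import csv
import io
from typing import Iterable, List, TextIO, Tuple

Row = Tuple[float, float, float]


def summarize_intervals(samples: Iterable[Tuple[float, float]],
                        interval_seconds: float = 1.0) -> List[Row]:
    """Group (completion_time_s, response_time_s) samples into fixed intervals.

    Returns rows of (interval_start_s, throughput_per_s, mean_response_ms).
    """
    buckets = {}
    for completed_at, response_time in samples:
        index = int(completed_at // interval_seconds)
        buckets.setdefault(index, []).append(response_time)

    rows = []
    for index in sorted(buckets):
        times = buckets[index]
        rows.append((index * interval_seconds,
                     len(times) / interval_seconds,
                     1000.0 * sum(times) / len(times)))
    return rows


def write_csv(rows: Iterable[Row], stream: TextIO) -> None:
    """Write the summary rows with a header line, ready for graphing."""
    writer = csv.writer(stream)
    writer.writerow(["interval_start_s", "throughput_per_s", "mean_response_ms"])
    writer.writerows(rows)


if __name__ == "__main__":
    demo_samples = [(0.1, 0.05), (0.9, 0.15), (1.2, 0.2)]
    buffer = io.StringIO()
    write_csv(summarize_intervals(demo_samples), buffer)
    print(buffer.getvalue())
```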

Reset as many variables as possible for each test. This is most important for tests involving databases, which tend to accumulate data that can negatively impact performance. If possible, data should be truncated and reloaded for each test.
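
A truncate-and-reload step might look like the following sketch. It uses Python's built-in sqlite3 module only so that the example is self-contained; the orders table, its columns, and the baseline rows are hypothetical, and a real test would use its own database and fixture data.

```python
import sqlite3
from typing import Iterable, Tuple

# Hypothetical baseline data set; a real test would load its own fixture.
BASELINE_ROWS = [(1, "IBM", 100), (2, "ACME", 250)]


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the hypothetical `orders` table used in this sketch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "id INTEGER PRIMARY KEY, symbol TEXT NOT NULL, quantity INTEGER NOT NULL)"
    )


def reset_orders(conn: sqlite3.Connection,
                 rows: Iterable[Tuple[int, str, int]] = BASELINE_ROWS) -> None:
    """Empty the table and reload the baseline rows in one transaction."""
    with conn:  # commits on success, rolls back on error
        # SQLite has no TRUNCATE statement; DELETE without WHERE empties the table.
        conn.execute("DELETE FROM orders")
        conn.executemany(
            "INSERT INTO orders (id, symbol, quantity) VALUES (?, ?, ?)", rows
        )


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_schema(connection)
    # Simulate data left behind by a previous test run.
    connection.execute("INSERT INTO orders VALUES (99, 'OLD', 1)")
    reset_orders(connection)
    print(connection.execute("SELECT COUNT(*) FROM orders").fetchone()[0])
```

Running the reset before every test, rather than after, means that a test that aborts midway cannot leave stale data for the next run.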

Stress Testing Tool

There are various commercial products such as IBM Rational Performance Tester. If such a tool is not available, there are various open source alternatives such as Apache Bench, Apache JMeter, Siege, and OpenSTA. Apache JMeter is generally recommended and is covered in more detail in the Major Tools chapter.

Apache Bench

Apache Bench is a binary distributed in the "bin" folder of the httpd package (and therefore with IBM HTTP Server as well). It can do very simple benchmarking of a single URL, specifying the total number of requests (-n) and the concurrency at which to send the requests (-c):

 $ ./ab -n 100 -c 5 http://ibm.com/
 This is ApacheBench, Version 2.0.40-dev <$Revision: 30701 $> apache-2.0
 Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd
 Copyright (c) 1998-2002 The Apache Software Foundation
 
 Benchmarking ibm.com (be patient).....done
 
 Server Software:        
 Server Hostname:        ibm.com
 Server Port:            80
 
 Document Path:          /
 Document Length:        227 bytes
 
 Concurrency Level:      5
 Time taken for tests:   2.402058 seconds
 Complete requests:      100
 Failed requests:        0
 Write errors:           0
 Non-2xx responses:      100
 Total transferred:      49900 bytes
 HTML transferred:       22700 bytes
 Requests per second:    41.63 [#/sec] (mean)
 Time per request:       120.103 [ms] (mean)
 Time per request:       24.021 [ms] (mean, across all concurrent requests)
 Transfer rate:          19.98 [Kbytes/sec] received
 
 Connection Times (ms)
               min  mean[+/-sd] median   max
 Connect:       44   56   8.1     55      85
 Processing:    51   61   6.9     60      79
 Waiting:       51   60   6.8     59      79
 Total:         97  117  12.1    115     149
 
 Percentage of the requests served within a certain time (ms)
   50%    115
   66%    124
   75%    126
   80%    128
   90%    132
   95%    141
   98%    149
   99%    149
  100%    149 (longest request)
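
The summary figures in the output above follow from the request count, the concurrency level, and the elapsed time. The sketch below recomputes three of them from the sample run; the function and key names are illustrative. Note also that the sample reports 100 non-2xx responses out of 100 requests, so these numbers describe non-2xx responses rather than successful page loads.

```python
from typing import Dict


def ab_summary(total_requests: int, concurrency: int,
               elapsed_seconds: float) -> Dict[str, float]:
    """Recompute ApacheBench summary figures from the basic run parameters."""
    return {
        # "Requests per second ... (mean)"
        "requests_per_second": total_requests / elapsed_seconds,
        # "Time per request ... (mean)": mean latency seen by one client
        "time_per_request_ms":
            concurrency * elapsed_seconds * 1000.0 / total_requests,
        # "Time per request ... (mean, across all concurrent requests)"
        "time_per_request_all_ms": elapsed_seconds * 1000.0 / total_requests,
    }


if __name__ == "__main__":
    # Values taken from the sample output above (-n 100 -c 5, 2.402058 seconds).
    summary = ab_summary(total_requests=100, concurrency=5,
                         elapsed_seconds=2.402058)
    print(f"{summary['requests_per_second']:.2f} requests/sec")   # 41.63
    print(f"{summary['time_per_request_ms']:.3f} ms")             # 120.103
    print(f"{summary['time_per_request_all_ms']:.3f} ms")         # 24.021
```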

Common Benchmarks

DayTrader

DayTrader is a commonly used benchmark application for Java Enterprise Edition. It simulates an online stock trading system and exercises servlets, JSPs, JDBC, JTA transactions, EJBs, MDBs, and more.

There are open source versions of DayTrader for Java EE 7 and Java EE 8.

DayTrader provides three different implementations of the business services:

  1. TradeDirect (default): The TradeDirect class performs CRUD (create, read, update, and delete) operations directly against the supporting database using custom JDBC code. Database connections, commits, and rollbacks are managed manually in the code. JTA user transactions are used to coordinate 2-phase commits.
  2. TradeJDBC: The TradeJDBC stateless session bean serves as a wrapper for TradeDirect. The session bean assumes control of all transaction management while TradeDirect remains responsible for handling the JDBC operations and connections. This implementation reflects the most commonly used Java EE application design pattern.
  3. TradeBean: The TradeBean stateless session bean uses Container Managed Persistence (CMP) entity beans to represent the business objects. The state of these objects is completely managed by the application server's EJB container.

IBMStockTrader

IBMStockTrader is an open source sample application that simulates an online stock trading system. It exercises MicroServices, OpenShift operators and more.

Acme Air

Acme Air is an open source benchmark application for Java MicroServices. It simulates a fictitious airline called Acme Air which handles flight bookings.

Acme Air is available as part of multiple repositories with the mainservice holding installation instructions:

There is a monolithic version of the application:

There are Spring Boot versions of the microservices as well:

Think Times

Think time is defined as the amount of time a user spends between requests. The amount of time a user spends on a page depends on how complex the page is and how long it takes the user to find the next action to take. The less complex the page, the less time the user needs to take the next action. However, no two users are the same, so there is some variability between users. Therefore, think time is generally defined as a time range, such as 4-15 seconds, and the load test tool attempts to drive load within that range. Testing that incorporates think time attempts to simulate live production workload so that the application can be tuned for optimal performance.
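
The effect of think time on offered load can be estimated with the interactive response time law, X = N / (R + Z), where N is the number of users, R the mean response time, and Z the mean think time. The sketch below also draws think times from the 4-15 second range; the uniform distribution and the function names are assumptions, since load test tools offer various distributions.

```python
import random
from typing import Optional


def sample_think_time(min_seconds: float = 4.0, max_seconds: float = 15.0,
                      rng: Optional[random.Random] = None) -> float:
    """Draw one think time from the range (uniform distribution assumed)."""
    source = rng if rng is not None else random
    return source.uniform(min_seconds, max_seconds)


def estimated_throughput(users: int, mean_response_seconds: float,
                         mean_think_seconds: float) -> float:
    """Interactive response time law: X = N / (R + Z), in requests per second."""
    return users / (mean_response_seconds + mean_think_seconds)


if __name__ == "__main__":
    generator = random.Random(42)  # seeded so the demo is reproducible
    pauses = [sample_think_time(rng=generator) for _ in range(5)]
    print([round(pause, 1) for pause in pauses])

    # 1000 users, 0.5 s mean response time, 9.5 s mean think time
    # (9.5 s is the midpoint of the 4-15 second range).
    print(estimated_throughput(1000, 0.5, 9.5))
```

With these example numbers, 1000 users generate about 100 requests per second. The same law shows that reducing the think time sharply increases the load that the same number of users offers to the system.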

There is also a "stress" test where think time is turned down to zero. Stress testing is typically used to simulate a negative production event in which some of the application servers have gone offline, putting undue load on those remaining. Stress testing shows how the application will perform during such an event, which helps the operations team understand what to expect. Stress testing also typically breaks the application in ways not encountered with normal think time testing. Therefore, stress testing is a great way to both:

  • Break the application and have a chance to fix it before it is placed in production, and
  • Provide the operations staff with information about what production will look like during a negative event.