Connection pool hangs in createOrWaitForConnection

A thread waiting in com/ibm/ejs/j2c/FreePool.createOrWaitForConnection is waiting to get a connection from a connection pool. This may mean either that there are more threads than the maximum connection pool size (and a subset of them is using all connections in the pool), or, less commonly, that one or more threads are holding more than one connection per thread.
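As a rough way to see how many threads are in this state, the following sketch counts stack traces that contain the frame named above. It assumes it runs inside the JVM being inspected; the class name PoolWaiterCounter and its helper methods are hypothetical, and in practice the same check is usually done by searching a thread dump.

```java
import java.util.Collection;

final class PoolWaiterCounter {
    // Names as they appear in a Java stack frame (dots rather than the
    // slashes shown in thread dump output).
    static final String POOL_CLASS = "com.ibm.ejs.j2c.FreePool";
    static final String WAIT_METHOD = "createOrWaitForConnection";

    /** Counts stacks that contain a FreePool.createOrWaitForConnection frame. */
    static long countWaiters(Collection<StackTraceElement[]> stacks) {
        return stacks.stream()
                .filter(PoolWaiterCounter::isWaitingForConnection)
                .count();
    }

    static boolean isWaitingForConnection(StackTraceElement[] stack) {
        for (StackTraceElement frame : stack) {
            if (POOL_CLASS.equals(frame.getClassName())
                    && WAIT_METHOD.equals(frame.getMethodName())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Snapshot of the stack of every live thread in this JVM.
        long waiters = countWaiters(Thread.getAllStackTraces().values());
        System.out.println("Threads waiting for a pooled connection: " + waiters);
    }
}
```

If the count stays close to the number of worker threads minus the connection pool maximum, the first cause described above is the more likely one.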

More threads than the maximum connection pool size

Normally, connections are requested from managed thread pools such as WebContainer, ORB.thread.pool, WMQJCAResourceAdapter, etc. If the sum of the maximum sizes of the thread pools that may use connections is greater than the maximum size of the connection pool, the connection pool can become a bottleneck under sufficient concurrent workload, or when a backup in the backend resource (database, queue, etc.) or something else causes connections to be held for a long time.
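The effect can be reproduced outside an application server with a small simulation. This is only a sketch: a java.util.concurrent.Semaphore stands in for the connection pool, a fixed thread pool stands in for a managed thread pool, and the name PoolBottleneckDemo and the sizes used are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PoolBottleneckDemo {
    /** Returns how many of the workers found the simulated pool exhausted. */
    static int workersLeftWaiting(int threads, int maxConnections) throws InterruptedException {
        Semaphore pool = new Semaphore(maxConnections);   // stand-in for the connection pool
        CountDownLatch allAttempted = new CountDownLatch(threads);
        AtomicInteger leftWaiting = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            workers.execute(() -> {
                boolean gotConnection = pool.tryAcquire(); // non-blocking attempt
                allAttempted.countDown();
                try {
                    // Hold the connection until every worker has made its attempt,
                    // which models connections being held for a long time.
                    allAttempted.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    if (gotConnection) {
                        pool.release();
                    } else {
                        leftWaiting.incrementAndGet();
                    }
                }
            });
        }
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);
        return leftWaiting.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 4 threads sharing 2 connections: prints 2
        System.out.println(workersLeftWaiting(4, 2));
    }
}
```

In a real server the threads that did not get a connection would block in createOrWaitForConnection instead of returning immediately.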

Common solutions include:

  1. Optimize whatever bottleneck is holding the connections for a long time, and/or,
  2. Increase the maximum connection pool size to sum(max(connection-using thread pool size))+1, and/or,
  3. Reduce the maximum thread pool sizes so that their sum is less than or equal to the maximum connection pool size - 1.

Option 2 may put unexpected stress on the backend resource (database, queue, etc.) and/or on the application server (e.g. CPU, Java heap, etc.) because concurrent work that was previously bottlenecked at the connection pool can now run. Option 3 may increase response times due to queuing.

There is therefore no single right answer; choosing among these options is a tuning exercise.

Note: the reason for +1 in the formulas above is to avoid the case of a deadlock due to threads requesting more than one connection (though this can still be a livelock; see next section).
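The arithmetic behind option 2 can be written out as a short sketch. The class name ConnectionPoolSizing and the maximums used in main are hypothetical illustration values, not recommendations; the real values come from the configured thread pools.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ConnectionPoolSizing {
    /** sum(max(connection-using thread pool size)) + 1 */
    static int recommendedMaxConnections(Map<String, Integer> threadPoolMaximums) {
        int sum = threadPoolMaximums.values().stream()
                .mapToInt(Integer::intValue)
                .sum();
        return sum + 1; // the +1 is the deadlock-avoidance margin from the note above
    }

    public static void main(String[] args) {
        // Hypothetical maximums for the thread pools that use this connection pool.
        Map<String, Integer> maximums = new LinkedHashMap<>();
        maximums.put("WebContainer", 50);
        maximums.put("ORB.thread.pool", 50);
        maximums.put("WMQJCAResourceAdapter", 25);
        // 50 + 50 + 25 + 1: prints 126
        System.out.println(recommendedMaxConnections(maximums));
    }
}
```

Only thread pools whose threads can actually request connections from this particular pool belong in the sum.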

Threads holding more than one connection per thread

Typically, a thread uses only one connection from a connection pool at a time, though an application may legitimately use multiple concurrent connections per thread, so it depends on the use case.

In either case (as temporary relief when the extra connections are unintended, or as the expected setting when they are legitimate), the connection pool maximum may be set to a multiple of sum(max(connection-using thread pool size))+1, with the same caveats as above about additional stress on the backend resource and/or application server.
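The following sketch illustrates why holding more than one connection per thread is dangerous when the pool is too small. It is a simulation under stated assumptions: a Semaphore stands in for the connection pool, exactly two worker threads each hold one connection and then request a second, and the name TwoConnectionsPerThreadDemo is hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class TwoConnectionsPerThreadDemo {
    /** Returns how many of two workers obtained their second connection in time. */
    static int workersThatFinished(int maxConnections, long timeoutMillis) throws InterruptedException {
        if (maxConnections < 2) {
            throw new IllegalArgumentException("each worker needs a first connection");
        }
        Semaphore pool = new Semaphore(maxConnections);   // stand-in for the connection pool
        CountDownLatch bothHoldOne = new CountDownLatch(2);
        CountDownLatch bothAttempted = new CountDownLatch(2);
        AtomicInteger finished = new AtomicInteger();
        Runnable worker = () -> {
            try {
                pool.acquire();                           // first connection
                bothHoldOne.countDown();
                bothHoldOne.await();                      // both workers now hold one connection
                if (pool.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    finished.incrementAndGet();
                    pool.release(2);                      // work done: return both connections
                    bothAttempted.countDown();
                } else {
                    bothAttempted.countDown();
                    bothAttempted.await();                // keep holding, as a stuck thread would
                    pool.release();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread a = new Thread(worker);
        Thread b = new Thread(worker);
        a.start();
        b.start();
        a.join();
        b.join();
        return finished.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(workersThatFinished(2, 200));  // pool == thread count: prints 0
        System.out.println(workersThatFinished(3, 2000)); // pool == thread count + 1: prints 2
    }
}
```

With a pool equal to the thread count, both workers time out while holding the connection the other needs. With one extra connection, one worker completes and releases its connections, which lets the other proceed.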

Connection leaks

Connection leaks are normally not a problem because Java Enterprise Edition managed resources default to shareable connections, which means that the application server closes the connection at the end of the transaction containment even if the application forgot to call close.

However, shareable connections may be held for longer than expected because of a delay elsewhere within the same transaction containment. For example, imagine that the application gets a connection from the pool in a local transaction containment (e.g. servlet, etc.), performs some work, "closes" the connection, and then performs a very slow web service call. The connection is held on that thread until the slow web service call responds, which may put extra pressure on the connection pool. In this case, unshareable connections may help because the connection is given back to the pool when the application calls close on it. So there are tradeoffs between shareable and unshareable connections.
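The request flow in that example can be sketched as follows. The class CloseBeforeSlowCall, the DatabaseWork callback, and the handleRequest method are hypothetical names; only the java.sql and javax.sql types are real. Whether the connection is shareable or unshareable is decided by the resource reference configuration, not by this code.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

final class CloseBeforeSlowCall {
    /** Hypothetical callback for whatever the application does with the connection. */
    interface DatabaseWork {
        void run(Connection connection) throws SQLException;
    }

    static void handleRequest(DataSource dataSource, DatabaseWork databaseWork, Runnable slowRemoteCall)
            throws SQLException {
        // try-with-resources calls close() as soon as the database work is done.
        try (Connection connection = dataSource.getConnection()) {
            databaseWork.run(connection);
        }
        // Unshareable: the connection went back to the pool at close() above.
        // Shareable: as described in the text, it stays associated with this
        // thread until the transaction containment ends, so it is still
        // unavailable to other threads while the slow call below runs.
        slowRemoteCall.run();
    }
}
```

Closing the connection before the slow call is good practice in both modes, but per the description above it only frees the pooled connection early when the connection is unshareable.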

Connection pool maximum less than thread pools' maximums

A connection pool maximum may be consciously set to less than the sum of the thread pools' maximums. This may be done because the backend resource (e.g. database, queue, etc.) has limited capacity; however, it's generally recommended to queue work in front of the application server rather than within it:

"In general, requests wait in the network in front of the web server, rather than waiting in WebSphere Application Server. This configuration only supports those requests that are ready for processing to enter the queuing network."

Setting the thread pool maximum somewhat larger than the connection pool maximum may be acceptable so that queued work is available immediately as connections are freed under maximum load:

"Because there is work waiting to enter a component at each point upstream, no component in this system must wait for work to arrive. The bulk of the requests wait in the network, outside of WebSphere Application Server."
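One way to sanity-check such a configuration is to confirm that the maximums do not grow from one stage to the next. This is a minimal sketch; the class name QueueFunnelCheck and the stage values in the usage note are hypothetical.

```java
final class QueueFunnelCheck {
    /** True when each stage's maximum is at least as large as the stage after it. */
    static boolean narrowsDownstream(int... stageMaximums) {
        for (int i = 1; i < stageMaximums.length; i++) {
            if (stageMaximums[i] > stageMaximums[i - 1]) {
                return false;
            }
        }
        return true;
    }
}
```

For example, narrowsDownstream(75, 50, 25) for hypothetical web server, thread pool, and connection pool maximums returns true, while narrowsDownstream(50, 75, 25) returns false because the thread pool would admit more work than the stage in front of it can deliver.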