WebSphere Liberty

WebSphere Liberty Recipe

  1. Review the Operating System recipe for your OS. The highlights are to ensure CPU, RAM, network, and disk are not consistently saturated.
  2. Review the Java recipe for your JVM. The highlights are to tune the maximum heap size (-Xmx or -XX:MaxRAMPercentage) and the maximum nursery size (-Xmn), and to enable verbose garbage collection and review its output with the GCMV tool.
  3. Liberty has a single thread pool where most application work occurs and this pool is auto-tuned based on throughput. In general, it is not recommended to tune nor specify this element; however, if there is a throughput problem or there are physical or virtual memory constraints, test with <executor maxThreads="X" />. If an explicit value is better, consider opening a support case to investigate why the auto-tuning is not optimal.
  4. If receiving HTTP(S) requests:
    1. If using a servlet feature version below 4.0, consider explicitly enabling HTTP/2 by setting protocolVersion="http/2" on the httpEndpoint.
    2. For HTTP/1.0 and HTTP/1.1, avoid client keepalive socket churn by setting maxKeepAliveRequests="-1". This is the new default as of Liberty 21.0.0.6.
    3. For servers with incoming LAN HTTP traffic from clients using persistent TCP connection pools with keep alive (e.g. a reverse proxy like IHS/httpd or web service client), consider increasing persistTimeout to reduce keepalive socket churn.
    4. For HTTP/1.0 and HTTP/1.1, minimize the number of application responses with HTTP codes 400, 402-417, or 500-505 to reduce keepalive socket churn or use HTTP/2.
    5. If using HTTP session database persistence, tune the <httpSessionDatabase /> element.
    6. If possible, configure and use HTTP response caching.
    7. If using TLS, set -DtimeoutValueInSSLClosingHandshake=1.
    8. Consider enabling the HTTP NCSA access log with response times for post-mortem traffic analysis.
    9. If there is available CPU, test enabling HTTP response compression.
    10. If the applications don't use resources in META-INF/resources directories of embedded JAR files, then set <webContainer skipMetaInfResourcesProcessing="true" />.
    11. Consider reducing each HTTP endpoint's tcpOptions maxOpenConnections to approximately the expected maximum number of concurrently processing and queued requests (a value that avoids excessive request queuing under stress), and verify with a saturation and slow-performance test.
  5. If using databases (JDBC):
    1. Connection pools generally should not be consistently saturated. Tune <connectionManager maxPoolSize="X" />.
    2. Consider tuning each connectionManager's numConnectionsPerThreadLocal and purgePolicy, and each dataSource's statementCacheSize and isolationLevel.
    3. Consider disabling idle and aged connection timeouts (and tune any firewalls, TCP keep-alive, and/or database connection timeouts, if needed).
  6. If using JMS MDBs without a message ordering requirement, tune activation specifications' maxConcurrency to control the maximum concurrent MDB invocations and maxBatchSize to control message batch delivery size.
  7. If using EJBs:
    1. If using non-@Asynchronous remote EJB interfaces in the application for EJBs available within the same JVM, consider using local interface or no-interface equivalents instead to avoid extra processing and thread usage.
    2. If an EJB only needs to be accessed locally within the same server, use local interfaces (pass-by-reference) instead of remote interfaces (pass-by-value) to avoid serialization.
  8. If using security, consider tuning the authentication cache and LDAP sizes.
  9. Use the minimal feature set needed to run your application to reduce startup time and footprint.
  10. Upgrade to the latest version and fixpack as there is a history of making performance improvements and fixing issues or regressions over time.
  11. Consider enabling request timing which will print a warning and stack trace when requests exceed a time threshold.
  12. Review logs for any errors, warnings, or high volumes of messages.
  13. Monitor, at minimum, response times, number of requests, thread pools, connection pools, and CPU and Java heap usage using mpMetrics-2.3, monitor-1.0, JAX-RS Distributed Tracing, and/or a third party monitoring program.
  14. Consider enabling event logging which will print a message when request components exceed a time threshold.
  15. Consider running with a sampling profiler such as Health Center or Mission Control for post-mortem troubleshooting.
  16. Disable automatic configuration and application update checking if such changes are unexpected.
  17. If the application writes a lot to messages.log, consider switching to binary logging for improved performance.
  18. Review the performance tuning topics in the OpenLiberty and WebSphere Liberty documentation.
  19. If running on z/OS:
    1. Consider enabling SMF 120 records.
    2. Consider WLM classification: zosWlm-1.0
    3. Enable hardware cryptography for Java 8, Java 11, or Java 17

Documentation

OpenLiberty is an open source Java application server. WebSphere Liberty is IBM's commercial extension of OpenLiberty. Both may be entitled for support.

For feature differences between editions, find the latest Quarterly Webinar by searching for "Liberty quarterly" (without double quotes) under Search Library Entries, then sort by Most Recent, click the link, open the PDF, and search for "Periodic Table of Liberty".

Continuous Delivery

If you have longer-term support considerations, consider using Liberty releases ending in .3, .6, .9 or .12.

server.xml

Liberty is configured through a server.xml:
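
For example, a minimal server.xml sketch (the feature and ports are illustrative):

<server description="example server">
  <featureManager>
    <feature>servlet-4.0</feature>
  </featureManager>
  <httpEndpoint id="defaultHttpEndpoint" host="*" httpPort="9080" httpsPort="9443" />
</server>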

jvm.options

Generic JVM arguments are set either in $LIBERTY/usr/servers/$SERVER/jvm.options for a particular JVM or in $LIBERTY/etc/jvm.options as defaults for all servers.

Put each option on its own line. The file supports commented lines that start with #.
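
For example, a minimal jvm.options sketch (the values are illustrative; -Xverbosegclog applies to OpenJ9/IBM Java):

# Maximum heap size
-Xmx1g
# Verbose garbage collection (OpenJ9/IBM Java)
-Xverbosegclog:logs/verbosegc.%seq.log,20,50000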

Verbose Garbage Collection

Consider enabling Java verbose garbage collection on all JVMs, including production. This will help with performance analysis, OutOfMemoryErrors, and other post-mortem troubleshooting. Benchmarks show an overhead of less than 1% and usually less than 0.5%.

  • OpenJ9 or IBM Java -Xverbosegclog:
    -Xverbosegclog:logs/verbosegc.%seq.log,20,50000
  • HotSpot -Xloggc:
    • Java >= 9:
      -Xlog:safepoint=info,gc:file=logs/verbosegc.log:time,level,tags:filecount=5,filesize=20M
    • Java 8:
      -Xloggc:logs/verbosegc.log
      -XX:+UseGCLogFileRotation
      -XX:NumberOfGCLogFiles=5
      -XX:GCLogFileSize=20M
      -XX:+PrintGCDateStamps
      -XX:+PrintGCDetails

Monitor for high pause times and that the proportion of time in GC pauses is less than ~5-10% using the GCMV tool.

Logs and Trace

See WebSphere Liberty Logging Documentation and OpenLiberty Logging Documentation.

Always provide or analyze both console.log and messages.log as they may have different sets of messages. Ideally, when gathering Liberty server logs, gather the entire logs folder under $LIBERTY/usr/servers/$SERVER/logs, as well as the server.xml under $LIBERTY/usr/servers/$SERVER/.

Liberty uses a Java retransformation agent to support some of its logging capabilities. On IBM Java < 7.1, the mere presence of a retransformation agent will cause the VM to double the amount of class native memory used, whether any individual class is transformed or not. On IBM Java >= 7.1, additional memory is allocated on-demand only for the classes that are retransformed.

WAS traditional Cross-Component Trace (XCT) and Request Metrics are not available in Liberty.

messages.log

messages.log includes WAS messages equal to or above the INFO threshold (non-configurable), java.util.logging.Logger messages equal to or above traceSpecification (default *=info), System.out, and System.err, with timestamps.

On server startup, any old messages.log is first rotated (unless the bootstrap property com.ibm.ws.logging.newLogsOnStart=false is set).

To specify the maximum file size (in MB) and maximum number of historical files:

<logging maxFiles="X" maxFileSize="Y" />

For those experienced with WAS traditional, messages.log is like the combination of SystemOut.log and SystemErr.log; however, unlike WAS traditional, which appends to SystemOut.log across restarts, Liberty rotates messages.log on restart unless the bootstrap property com.ibm.ws.logging.newLogsOnStart=false is set.

console.log

console.log includes native stdout, native stderr, WAS messages (except trace) equal to or above the threshold set by consoleLogLevel (AUDIT by default), and System.out and System.err (if copySystemStreams is true, which is the default). Entries are written without timestamps unless consoleFormat is set to simple. The console.log is always truncated on server startup when using $LIBERTY/bin/server start and does not support a maximum size or rollover.

For those experienced with WAS traditional, by default, console.log is like the combination of native_stdout.log, native_stderr.log, and SystemOut.log and SystemErr.log messages above AUDIT without timestamps.

If you would like to use console.log for native stdout and native stderr only and use messages.log for everything else (note: a couple of AUDIT messages will still show up before the logging configuration is read):

<logging copySystemStreams="false" consoleLogLevel="OFF" />

Starting with Liberty 20.0.0.5, consoleFormat supports the simple value that adds timestamps. For example:

$ docker run --rm -e "WLP_LOGGING_CONSOLE_FORMAT=simple" -it open-liberty:latest
Launching defaultServer (Open Liberty 20.0.0.6/wlp-1.0.41.cl200620200528-0414) on Eclipse OpenJ9 VM, version 1.8.0_252-b09 (en_US)
[6/24/20 19:21:53:874 GMT] 00000001 com.ibm.ws.kernel.launch.internal.FrameworkManager           A CWWKE0001I: The server defaultServer has been launched.

trace.log

Diagnostic trace is enabled with traceSpecification. For example:

<logging traceSpecification="*=info" maxFileSize="250" maxFiles="4" />

For a more compact format, add traceFormat="BASIC".

If trace is enabled, trace.log includes WAS messages, java.util.logging.Logger messages, WAS diagnostic trace, System.out, and System.err, with timestamps.

Sending trace to native stdout:

<logging traceSpecification="*=info" traceFileName="stdout" maxFileSize="0" traceFormat="BASIC" />

Request Timing

Consider enabling and tuning request timing which will print a warning and stack trace when requests exceed a time threshold. Even in the case that you use a monitoring product to perform slow request detection, the built-in Liberty detection is still valuable for easy searching, alerting, and providing diagnostics to IBM support cases.

The slowRequestThreshold (designated X below) is the main configuration and it should be set to your maximum expected application response time in seconds, based on business requirements and historical analysis. Make sure to include the s after the value to specify seconds.

The hungRequestThreshold is an optional configuration that works similar to slowRequestThreshold, but if it is exceeded, Liberty will produce three thread dumps, one minute apart, starting after a request exceeds this duration.

Performance test the overhead and adjust the thresholds and sampleRate as needed to achieve acceptable overhead.

<featureManager><feature>requestTiming-1.0</feature></featureManager>
<requestTiming slowRequestThreshold="Xs" hungRequestThreshold="600s" sampleRate="1" />

Some benchmarks have shown that requestTiming has an overhead of about 3-4% when using a sampleRate of 1:

The requestTiming-1.0 feature, when activated, has been shown to have a 4% adverse effect on the maximum possible application throughput when measured with the DayTrader application. While the effect on your application might be more or less than that, you should be aware that some performance degradation might be noticeable.

As of this writing, requestTiming does not track asynchronous servlet request runnables, though a feature request has been opened.

Example output:

[10/1/20 13:21:42:235 UTC] 000000df com.ibm.ws.request.timing.manager.SlowRequestManager         W TRAS0112W: Request AAAAqqZnfKN_AAAAAAAAAAB has
been running on thread 000000c2 for at least 5000.936ms. The following stack trace shows what this thread is currently running.
          at java.lang.Thread.sleep(Native Method) [...]

    The following table shows the events that have run during this request.
    Duration     Operation
    5003.810ms + websphere.servlet.service | pd | com.ibm.pd.Sleep?durationms=10000

Event Logging

Consider enabling and tuning event logging to log excessively long component times of application work. This is slightly different from request timing. Consider the following case: the requestTiming threshold is set to 10 seconds, so a tree of events is printed for any request taking more than 10 seconds. However, a request might have three database queries of 1 second, 2 seconds, and 6 seconds. In this case, the total response time is 9 seconds, but the one query that took 6 seconds is presumably concerning, so event logging can granularly monitor for such component-level events.

<featureManager>
  <feature>eventLogging-1.0</feature>
</featureManager>
<eventLogging includeTypes="websphere.servlet.service" minDuration="500ms" logMode="exit" sampleRate="1" />

Example output:

[10/1/15 14:10:57:962 UTC] 00000053 EventLogging I END requestID=AAABqGA0rs2_AAAAAAAAAAA # eventType=websphere.servlet.service # contextInfo=pd
| com.ibm.pd.Sleep?durationms=10000 # duration=10008.947ms

Binary Logging

Consider using Binary Logging. In benchmarks, binaryLogging reduced the overhead of logs and trace by about 50%. Note that binary logging does not use less disk space and in fact will use more disk space; the performance improvements occur for other reasons.

Add the following additional lines to bootstrap.properties for the server (if no such file exists, create the file in the same directory as server.xml):

websphere.log.provider=binaryLogging-1.0
com.ibm.hpel.log.purgeMaxSize=100
com.ibm.hpel.log.outOfSpaceAction=PurgeOld
com.ibm.hpel.trace.purgeMaxSize=2048
com.ibm.hpel.trace.outOfSpaceAction=PurgeOld

Thread Pools

Most application work in Liberty occurs in a single thread pool named "Default Executor" (by default). The <executor /> element in server.xml may be used to configure this pool; however, unless there are observed problems with throughput, it is generally not recommended to tune nor even specify this element. If maxThreads is not configured, Liberty dynamically adjusts the thread pool size between coreThreads and maxThreads based on observed throughput:

In most environments, configurations, and workloads, the Open Liberty thread pool does not require manual configuration or tuning. The thread pool self-tunes to determine how many threads are needed to provide optimal server throughput. [...] However, in some situations, setting the coreThreads or maxThreads attributes might be necessary. The following sections describe these attributes and provide examples of conditions under which they might need to be manually tuned.

  • coreThreads: This attribute specifies the minimum number of threads in the pool. [...]
  • maxThreads: This attribute specifies the maximum number of threads in the pool. The default value is -1, which is equal to MAX_INT, or effectively unlimited.

If there is a throughput problem, test with a maximum number of threads equal to $cpus * 2. If this or another explicit value is better, consider opening a support case to investigate why the auto-tuning is not optimal. For example:

<executor maxThreads="64" />

Diagnostic trace to investigate potential executor issues: com.ibm.ws.threading.internal.ThreadPoolController=all

As of this writing, the default value of coreThreads is $cpus * 2.

HTTP

If the applications don't use resources in META-INF/resources directories of embedded JAR files, then set <webContainer skipMetaInfResourcesProcessing="true" />.

Liberty (starting in 18.0.0.2) uses DirectByteBuffers for HTTP reading and writing just like WAS traditional; however, there is only a global pool rather than ThreadLocal pools, and the DBB sizes and bucket sizes may be configured with, for example:

<bytebuffer poolSizes="32,1024,8192,16384,24576,32768,49152,65536" poolDepths="100,100,100,20,20,20,20,20" />

Keep Alive Connections

Max requests per connection

By default, for Liberty < 21.0.0.6, for HTTP/1.0 and HTTP/1.1 (but not HTTP/2.0), Liberty closes an incoming HTTP keep alive connection after 100 requests (see maxKeepAliveRequests). This may cause a significant throughput impact, particularly with TLS (in one benchmark, ~100%). To disable such closure of sockets, set maxKeepAliveRequests="-1":

<httpOptions maxKeepAliveRequests="-1" />

This is the default as of Liberty 21.0.0.6.

Idle timeouts

In general, for servers with incoming LAN network traffic from clients using persistent TCP connection pools (e.g. a reverse proxy like IHS/httpd or web service client), increase the idle timeout (see persistTimeout) to avoid connections getting kicked out of the client connection pool. Try values less than 575 hours. For example:

<httpOptions persistTimeout="575h" />

Error codes closing keep-alive connections

If an HTTP response returns what's internally considered an "error code" (HTTP 400, 402-417, or 500-505; any StatusCodes instance with the third parameter set to true), then, after the response completes, the socket is closed if it is a keep-alive socket. This may impact throughput if an application is, for example, creating a lot of HTTP 500 error responses: any servers with incoming LAN network traffic from clients using persistent TCP connection pools (e.g. a reverse proxy like IHS/httpd or web service client) will have to churn through more sockets than otherwise (particularly impactful for TLS handshakes). As an alternative to minimizing such responses, test using HTTP/2 instead.

HTTP Access Logs

HTTP access logging with response times may be used to track the number and response times of HTTP(S) requests in an NCSA format using %D (in microseconds). The example below also uses %{R}W, which is the time until the first set of bytes is sent in the response (in microseconds); this is often a good approximation of application response time, as compared to %D, which is end-to-end response time including client and network.

For example:

<httpEndpoint httpPort="9080" httpsPort="9443">
  <accessLogging filepath="${server.output.dir}/logs/http_access.log" maxFileSize="100" maxFiles="4" logFormat="%h %u %t &quot;%r&quot; %s %b %D %{R}W" />
</httpEndpoint>

Starting with Liberty 21.0.0.11, the %{remote}p option is available to print the ephemeral local port of the client which may be used to correlate to network trace:

<httpEndpoint httpPort="9080" httpsPort="9443">
  <accessLogging filepath="${server.output.dir}/logs/http_access.log" maxFileSize="100" maxFiles="4" logFormat="%h %u %t &quot;%r&quot; %s %b %D %{R}W %{remote}p %p" />
</httpEndpoint>

If you are on older versions of Liberty, ensure you have APARs PI20149 and PI34161.

HTTP Sessions

By default, Liberty sets allowOverflow="true" for HTTP sessions, which means that maxInMemorySessionCount is not considered and HTTP sessions are unbounded; this may cause OutOfMemoryErrors in the default configuration without session persistence. If allowOverflow is disabled, maxInMemorySessionCount should be sized taking into account the maximum heap size, the average HTTP session timeout, and the average HTTP session heap usage.
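
For example, a sketch of bounding in-memory sessions (the count and timeout values are illustrative and should be sized as described above):

<httpSession allowOverflow="false" maxInMemorySessionCount="2000" invalidationTimeout="30m" />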

HTTP Session Database Persistence

If using HTTP session database persistence, configure the <httpSessionDatabase /> element along the lines of WAS traditional tuning guidelines.

If using time-based writes, the writeInterval is the amount of time to wait before writing pending session updates. Note that there is a separate timer which controls the period at which a thread checks for any updates exceeding the writeInterval threshold. If the writeInterval is less than 10 seconds, then the timer is set to the value of the writeInterval; otherwise, the timer is set to a fixed 10 seconds, which means that writes may occur up to 10 seconds after the writeInterval threshold has passed.
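
For example, a sketch of time-based writes (the dataSource reference and interval are illustrative):

<httpSessionDatabase id="SessionDB" dataSourceRef="SessionDS" writeFrequency="TIME_BASED_WRITE" writeInterval="10s" />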

HTTP Response Compression

Starting with Liberty 20.0.0.4, HTTP responses may be automatically compressed (gzip, x-gzip, deflate, zlib or identity) with the httpEndpoint compression element. For example, without additional options, the element compresses text/* and application/javascript response MIME types:

<httpEndpoint httpPort="9080" httpsPort="9443">
  <compression />
</httpEndpoint>

This will reduce response body size but it will increase CPU usage.

Web Response Cache

The Web Response Cache configures HTTP response caching using the distributedMap. Example:

<featureManager>
  <feature>webCache-1.0</feature>
</featureManager>

<distributedMap id="baseCache" memorySizeInEntries="2000" />

The configuration is specified in a cachespec.xml in the application. For details, see ConfigManager and the WAS traditional discussion of cachespec.xml. The response cache uses the baseCache.
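
For example, a minimal cachespec.xml sketch that caches a servlet/JSP response for 60 seconds (the element names follow the WAS traditional cachespec.xml format; the servlet name and timeout are illustrative):

<?xml version="1.0" ?>
<!DOCTYPE cache SYSTEM "cachespec.dtd">
<cache>
  <cache-entry>
    <class>servlet</class>
    <name>/index.jsp</name>
    <cache-id>
      <timeout>60</timeout>
    </cache-id>
  </cache-entry>
</cache>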

HTTP/2

Large request bodies

If an HTTP/2 client is sending large request bodies (e.g. file uploads), then consider testing additional tuning in httpOptions under httpEndpoint available since 23.0.0.4:

  • limitWindowUpdateFrames: "Specifies whether the server waits until half of the HTTP/2 connection-level and stream-level windows are exhausted before it sends WINDOW_UPDATE frames."
  • settingsInitialWindowSize: "Specifies the initial window size in octets for HTTP/2 stream-level flow control."
  • connectionWindowSize: "Specifies the window size in octets for HTTP/2 connection-level flow control."
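
For example, a sketch combining these attributes (the window sizes are illustrative and should be derived from testing):

<httpEndpoint httpPort="9080" httpsPort="9443">
  <httpOptions limitWindowUpdateFrames="true" settingsInitialWindowSize="1048576" connectionWindowSize="1048576" />
</httpEndpoint>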

Monitoring

Consider enabling, gathering, and analyzing WAS statistics on, at least, thread pools, connection pools, number of requests, average response time, and CPU and Java heap usage. This is useful for performance analysis and troubleshooting. The following options are not mutually exclusive:

  • mpMetrics: Expose statistics through a REST endpoint:
    1. Enable mpMetrics. For example:
      <featureManager>
        <feature>mpMetrics-4.0</feature>
      </featureManager>
    2. Gather the data through the /metrics REST endpoint through Prometheus or direct requests.
  • monitor-1.0: Expose statistics through Java MXBeans:
    1. Enable monitor-1.0:
      <featureManager>
        <feature>monitor-1.0</feature>
      </featureManager>
    2. Enable JMX access through localConnector-1.0 and/or restConnector-2.0
    3. Gather the data through monitoring products, JConsole, or a Java client (example: libertymon).
    4. Key MXBeans are ThreadPoolStats, JvmStats, ServletStats, SessionStats, and ConnectionPool. If all MXBeans are enabled (as they are by default if monitor-1.0 is configured), benchmarks show about a 4% overhead. This may be reduced by limiting the enabled MXBeans; for example:
      <monitor filter="ThreadPoolStats,ServletStats,ConnectionPool,..." />
  • JAX-RS Distributed Tracing for applications with JAX-RS web services:
    1. Enable mpOpenTracing-1.3:
      <featureManager>
        <feature>mpOpenTracing-1.3</feature>
      </featureManager>
    2. Connect to Jaeger (easier because no user feature is required) or Zipkin.

mpMetrics

When mpMetrics-x.x (or the convenience microProfile-x.x feature) is enabled, the monitor-1.0 feature is implicitly enabled to provide vendor metrics that include metrics for thread pools, connection pools, web applications, HTTP sessions, and JAX-WS. This adds a small performance overhead (especially the thread pool metrics). If only a subset of these metrics is needed, reduce the overhead by adding a filter that explicitly configures the available metrics:

  • Only JAX-RS:
    <monitor filter="REST" />
  • Only some subset of the vendor metrics (pick and choose from the following list):
    <monitor filter="JVM,ThreadPool,WebContainer,Session,ConnectionPool,REST" />
  • No JAX-RS nor vendor metrics and instead only base and application metrics:
    <monitor filter=" "/> <!-- space required -->

Centralized Logging

Consider sending log data to a central service such as Elastic Stack. This is useful for searching many logs at once.

Java Database Connectivity (JDBC)

Review common JDBC tuning, in particular:

  • maxPoolSize: The maximum connections to the DB for this pool.
  • statementCacheSize: Maximum number of cached prepared statements per connection.
  • purgePolicy: Whether to purge all connections when one connection has a fatal error. Defaults to EntirePool. Consider FailingConnectionOnly.
  • numConnectionsPerThreadLocal: Cache DB connections in ThreadLocals on the Default Executor pool. Consider testing with an explicit maxThreads size for the pool.
  • isolationLevel: If application semantics allow it, consider reducing the isolationLevel.
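
For example, a sketch combining these attributes (the JNDI name, pool sizes, and jdbcLib library reference are illustrative; vendor-specific connection properties are omitted):

<dataSource jndiName="jdbc/exampleDS" statementCacheSize="60" isolationLevel="TRANSACTION_READ_COMMITTED">
  <connectionManager maxPoolSize="50" purgePolicy="FailingConnectionOnly" numConnectionsPerThreadLocal="1" />
  <jdbcDriver libraryRef="jdbcLib" />
</dataSource>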

Connection pool idle and aged timeouts

For maximum performance, connections in the pool should not time out due to the idle timeout (maxIdleTime) nor the aged timeout (agedTimeout). To accomplish this, modify connectionManager configuration:

  1. Set minPoolSize to the same value as maxPoolSize or set maxIdleTime="-1", and
  2. Ensure agedTimeout is not specified or is set to -1, and
  3. Ensure any intermediate firewalls to the database do not have idle or age timeouts or configure client operating system TCP keep-alive timeouts to below these values, and
  4. Ensure the database does not have idle or age timeout.

For example:

<connectionManager minPoolSize="50" maxPoolSize="50" reapTime="-1" />

The reason to do this is that connection creation and destruction may be expensive (e.g. TLS, authentication, etc.). Besides increased latency, in some cases this expense may cause a performance tailspin in the database that makes response time spikes worse; for example, something causes an initial database response time spike, incoming load from the clients continues apace, the clients create new connections, and the process of creating those new connections slows the database down more than it otherwise would, causing further backups.

The main potential drawback of this approach is that if there is a firewall between the connection pool and the database, and the firewall has an idle or age timeout, then the connection may be destroyed and cause a stale connection exception the next time it's used. This may fail the request and purge the entire connection pool if purgePolicy="EntirePool". The main ways to avoid this are either to configure the firewall idle or age timeouts similar to above, or tune the TCP keepalive settings in the client or database operating systems below the timeouts.

Similarly, some databases may have their own idle or age timeouts. The database should be tuned similarly. For example, IBM DB2 does not have such connection timeouts by default.

Some people use connection pool usage as a proxy for database response time spikes. Instead, monitor database response times in the database or using Liberty's ConnectionPool MXBean and its InUseTime statistic.

Admin Center

The Admin Center is commonly put on port 9443, for example https://localhost:9443/adminCenter/

<featureManager>
  <feature>adminCenter-1.0</feature>
</featureManager>
<quickStartSecurity userName="wsadmin" userPassword="wsadmin" />

Sampling Profiler

Consider enabling a sampling profiler, even in production. This does have a cost but provides very rich troubleshooting data on what Java code used most of the CPU, what monitors were contended, and periodic thread information. Benchmarks for Health Center showed an overhead of <2%. Gauge the overhead in a performance test environment.

  • OpenJ9 or IBM Java:
    1. Add the following to jvm.options and restart:
      -Xhealthcenter:level=headless
    2. After each time the JVM gracefully stops, a healthcenter*.hcd file is produced in the current working directory (e.g. $LIBERTY/usr/servers/$SERVER/).
  • HotSpot Mission Control:
    1. Add the following to jvm.options and restart:
      -XX:+FlightRecorder
      -XX:StartFlightRecording=filename=jmcrecording.jfr,settings=profile
    2. After each time the JVM gracefully stops, a *.jfr file is produced in the current working directory (e.g. $LIBERTY/usr/servers/$SERVER/).

Start-up

If a cdi feature is used (check the CWWKF0012I message) and the applications don't have CDI annotations in embedded archives, disable such scanning to improve startup times with <cdi12 enableImplicitBeanArchives="false"/>.

Jandex

Consider creating Jandex index files to pre-compute class and annotation data for application archives.

Startup Timeout Warning

Liberty has a default timeout of 30 seconds for applications to start. After the 30-second timeout expires, two things happen: a message is output to the logs saying the application didn't start quickly enough, and during server startup the server stops waiting for the application to start and reports itself as started, even though the application is not yet ready.

To set this value to (for example) one minute add the following to server.xml:

<applicationManager startTimeout="1m"/>

https://www-01.ibm.com/support/docview.wss?uid=swg1PI51375

Startup order

Starting with Liberty 20.0.0.6, the application element has an optional startAfter attribute that allows controlling the order of application startup. Example:

<application id="APP1" location="APP1.war"/>
<webApplication id="APP2" location="APP2.war" startAfterRef="APP1"/>
<enterpriseApplication id="APP3" location="APP3.ear"/>
<application id="APP4" location="APP4.ear" startAfterRef="APP2, APP3"/>

OSGi Feature Startup Times

Edit wlp/usr/servers/<serverName>/bootstrap.properties:

osgi.debug=

Edit wlp/usr/servers/<serverName>/.options:

org.eclipse.osgi/debug/bundleStartTime=true
org.eclipse.osgi/debug/startlevel=true

Example output:

10 ms for total start time event STARTED - osgi.identity; type="osgi.bundle"; version:Version="1.3.42.202006040801"; osgi.identity="com.ibm.ws.kernel.service" [id=5]
13 ms for total start time event STARTED - osgi.identity; type="osgi.bundle"; version:Version="1.0.42.202006040801"; osgi.identity="com.ibm.ws.org.eclipse.equinox.metatype" [id=6]

The bundles have some dependencies so there are a series of start-levels to support the necessary sequencing, but all bundles within a given start-level are started in parallel, not feature-by-feature. You may disable parallel activation to get more accurate times but then you have to figure out all the bundles the feature enables.

Idle CPU

Idle CPU usage may be decreased if dynamic configuration and application updates are not required:

<applicationMonitor dropinsEnabled="false" updateTrigger="disabled"/>
<config updateTrigger="disabled"/>

Alternatively, you may still support dynamic updates through MBean triggers that has lower overhead than the default polling:

<applicationMonitor updateTrigger="mbean" pollingRate="999h" />
<config updateTrigger="mbean" monitorInterval="999h" />

When the MBean is triggered, the update occurs immediately (i.e. the pollingRate and monitorInterval values can be set as high as you like).

Authentication Cache

If using security and the application allows it, consider increasing the authentication cache timeout from the default of 10 minutes to a larger value using, for example, <authCache timeout="30m" />.

If using security and the authentication cache is becoming full, consider increasing the authentication cache maxSize from the default of 25000 to a larger value using, for example, <authCache maxSize="100000" />.

LDAP

If using LDAP, consider increasing various cache values in <attributesCache />, <searchResultsCache />, and <contextPool /> elements. For example:

<ldapRegistry [...]>
  <ldapCache>
    <attributesCache size="4000" sizeLimit="4000" timeout="2400s" />
    <searchResultsCache resultsSizeLimit="4000" size="4000" timeout="2400s" />
  </ldapCache>
  <contextPool preferredSize="6" />
</ldapRegistry>

Web Services

JAX-RS

JAX-RS Client

The JAX-RS 2.0 and 2.1 clients are built on top of Apache CXF. The JAX-RS 3.0 and above clients are built on top of RESTEasy. Upgrading to JAX-RS 3.0 and above may require some application changes.

For applications, it is best to re-use anything that you can. For example, try to re-use Client instances if you are invoking multiple different endpoints (the Client implementation in Liberty is thread safe, although other implementations may not be). If you are invoking the same endpoint, but possibly with different parameters, then re-use the WebTarget. If you are invoking the same request (same parameters, cookies, message body, etc.), then you can re-use the Invocation.Builder. Re-using as much as possible prevents waste and also improves performance because you are no longer creating new clients, targets, and so on. Make sure to close Response objects after using them.
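
A minimal sketch of this re-use pattern using the javax.ws.rs JAX-RS 2.x client API (the URL and class names are hypothetical):

import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.WebTarget;
import javax.ws.rs.core.Response;

public class ExampleRestCaller {
    // Re-use a single Client instance (thread safe in Liberty's implementation)
    private static final Client CLIENT = ClientBuilder.newClient();
    // Re-use the WebTarget when repeatedly invoking the same endpoint
    private static final WebTarget TARGET = CLIENT.target("https://example.com/api/items");

    public String getItem(String id) {
        Response response = TARGET.path(id).request().get();
        try {
            return response.readEntity(String.class);
        } finally {
            // Always close Response objects after use
            response.close();
        }
    }
}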

Connection and Read Timeouts

Timeouts may be specified via the com.ibm.ws.jaxrs.client.connection.timeout and com.ibm.ws.jaxrs.client.receive.timeout properties in JAX-RS 2.0, via the connectTimeout() and readTimeout() methods in JAX-RS 2.1, or via server.xml using the webTarget element and setting the connectionTimeout and/or receiveTimeout attributes.
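
For example, a sketch of the JAX-RS 2.1 approach (the timeout values are illustrative; the commented lines show the JAX-RS 2.0 property-based equivalent, which is an assumption about how those properties are set):

import java.util.concurrent.TimeUnit;
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;

Client client = ClientBuilder.newBuilder()
        .connectTimeout(5, TimeUnit.SECONDS)   // connection timeout
        .readTimeout(30, TimeUnit.SECONDS)     // read timeout
        .build();
// JAX-RS 2.0 (values in milliseconds, set as client properties):
// client.property("com.ibm.ws.jaxrs.client.connection.timeout", "5000");
// client.property("com.ibm.ws.jaxrs.client.receive.timeout", "30000");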

Keep-Alive Connection Pools

The JAX-RS V2 client in synchronous mode uses connection pooling through the JDK's HttpURLConnection. In particular, this means that if the server does not respond with a Keep-Alive response header, then the connection will time-out after 5 seconds (which is tunable in recent versions of Java).

The JAX-RS V2 client in asynchronous mode uses connection pooling through Apache HttpClient. This HttpClient may be tuned with client.setProperty or property calls such as org.apache.cxf.transport.http.async.MAX_CONNECTIONS, org.apache.cxf.transport.http.async.MAX_PER_HOST_CONNECTIONS, org.apache.cxf.transport.http.async.CONNECTION_TTL, and org.apache.cxf.transport.http.async.CONNECTION_MAX_IDLE, although these have high default values.

It may be possible (although potentially unsupported) to switch from a JAX-RS V2 synchronous client into an asynchronous client with a client.setProperty or property call with use.async.http.conduit=true.

JSP

By default, Liberty compiles JSPs on first access (if a cached compile isn't already available). If you would like to compile all JSPs during server startup instead, use the following (these are re-compiled every startup):

<jspEngine prepareJSPs="0"/>
<webContainer deferServletLoad="false"/>

JSF

MyFaces JSF Embedded JAR Search for META-INF/*.faces-config.xml

By default, the Liberty Apache MyFaces JSF implementation searches JSF-enabled applications for META-INF/*.faces-config.xml files in all JARs on the application classpath. A CPU profiler might highlight stack tops of the following form:

  java.util.jar.JarFile$1.nextElement
  java.util.jar.JarFile$1.nextElement
  org.apache.myfaces.view.facelets.util.Classpath._searchJar
  org.apache.myfaces.view.facelets.util.Classpath._searchResource
  org.apache.myfaces.view.facelets.util.Classpath.search
  [...]FacesConfigResourceProvider.getMetaInfConfigurationResources
  [...]

When an embedded faces-config.xml file is found, a message is written to messages.log with a wsjar: prefix, so this would be a simple way to check if such embedded resource searches are needed or not. For example:

Reading config : wsjar:file:[...]/installedApps/[...]/[...].ear/lib/bundled.jar!/META-INF/faces-config.xml

If your applications only use a faces-config.xml within the application itself and do not depend on embedded faces-config.xml files within JARs on the application classpath, then you can disable these searches with org.apache.myfaces.INITIALIZE_SKIP_JAR_FACES_CONFIG_SCAN=true if on JSF >= 2.3. This may be set in WEB-INF/web.xml, META-INF/web-fragment.xml, or globally as a JVM property. For example, in jvm.options:

-Dorg.apache.myfaces.INITIALIZE_SKIP_JAR_FACES_CONFIG_SCAN=true

If you want to disable globally using the JVM property but some applications do require embedded faces-config.xml files, then use the above property and then enable particular applications in WEB-INF/web.xml or META-INF/web-fragment.xml.

EJB

The Liberty ORB used for remote EJB (IIOP) communication is a fork of the Apache Geronimo Yoko ORB.

Yoko Timeouts

  • -Dyoko.orb.policy.connect_timeout=MILLISECONDS (default -1 which is no timeout)
  • -Dyoko.orb.policy.request_timeout=MILLISECONDS (default -1 which is no timeout)
  • -Dyoko.orb.policy.reply_timeout=MILLISECONDS (default -1 which is no timeout)
  • -Dyoko.orb.policy.timeout=MILLISECONDS: Change connect_timeout and request_timeout together (default -1 which is no timeout)

Remote Interface Optimization

If an application uses a remote EJB interface and that EJB component is available within the same JVM, then, as of this writing, the Yoko ORB that Liberty uses does not have an optimization to automatically use the local interface (as WAS traditional does with its "local optimization", which is different from prefer-local), so the remote EJB is processed on a separate thread. If an EJB is not @Asynchronous, consider running such EJBs on the same thread by using the local interface or no-interface view instead of the remote interface.

Messaging

Liberty provides various forms of JMS messaging clients and connectors.

Activation Specifications

If using JMS MDBs, tune activation specifications' maxConcurrency to control the maximum concurrent MDB invocations and maxBatchSize to control message batch delivery size.
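
For example, a sketch assuming the embedded wasJmsClient provider (the activation specification id and destination reference are illustrative):

<jmsActivationSpec id="myApp/myModule/MyMDB">
  <properties.wasJms destinationRef="jndi/MyQueue" maxConcurrency="10" maxBatchSize="1" />
</jmsActivationSpec>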

Embedded Messaging Server

The embedded messaging server (wasJmsServer) is similar to the WAS traditional SIB messaging with the following differences:

  1. There is no messaging bus in Liberty. A single messaging engine can run in each JVM but there is no cluster concept that will present the messaging engines, queues and topics within them as belonging to a single clustered entity.
  2. There is no high availability fail-over for the messaging engine. The client JVM is defined to access the queues and topics in messaging engines running in specific JVMs running at a specific hostname rather than anywhere in the same cell as in WAS traditional.
  3. There are other minor differences where Liberty sticks more closely to the JMS specification. Examples:
    1. JMSContext.createContext allows a local un-managed transaction to be started in WAS traditional (although this is often an application design error) but this is not allowed in Liberty.
  4. WAS traditional SIB APIs (com.ibm.websphere.sib) are not provided in Liberty.

The WebSphere Application Server Migration Toolkit helps discover and resolve differences.

Database Persistence

JPA 2.0 and earlier use OpenJPA. JPA 2.1 and later use EclipseLink.

JakartaEE

Classloading

The directory ${shared.config.dir}/lib/global (commonly, $WLP/usr/shared/lib/global/) is on the global classpath, unless the application specifies a classloader element, in which case a commonLibraryRef of global can be added to still reference that directory.
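
For example, if an application defines a classloader element, the global directory can still be referenced as follows (the application location is illustrative):

<application location="myApp.war">
  <classloader commonLibraryRef="global" />
</application>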

The ContainerClassLoader is the super class of the AppClassLoader. If an application has an EAR with one or more WARs, there will be two AppClassLoaders (one for the WAR module that is a child loader of the EAR's loader). If the application is a standalone WAR, there is only one AppClassLoader.

The WAR/EAR AppClassLoader is a child of a GatewayClassLoader that represents the access to Liberty's OSGI bundle classloaders. The GW loader only allows class loads that are in API packages (i.e. javax.servlet.* if the servlet feature is enabled, javax.ejb.* if the ejb feature is enabled, etc.).

Each WAR/EAR AppClassLoader also has a child loader called the ThreadContextClassLoader, which effectively has two parents: the associated AppClassLoader and its own GatewayClassLoader that can load additional classes from Liberty bundles in packages marked for thread-context. This allows Liberty to load classes using the thread's context classloader without allowing an application class to directly depend on them.

The AppClassLoader will use the Class-Path entry in a MANIFEST.MF on a JAR, but such an entry is not required. The classpath for a WAR is all classes in the WEB-INF/classes directory plus all of the JAR files in WEB-INF/lib; if there are any private shared libraries associated with the application, those class entries are added to the classpath too.

Liberty doesn't allow shared library references from WARs within an EAR. The references are only handled at the application scope (whether that's an EAR or a standalone WAR). Therefore, you can't separately reference a library for each WAR like you could in WAS traditional. The alternative is to put the jars from the shared library into the WARs' WEB-INF/lib directories.

JARs placed in ${server.config.dir}/lib/global (i.e. $LIBERTY/usr/servers/$SERVER/lib/global) should be accessible by all applications through Class.forName.

Passing Configuration

  1. MicroProfile Config (see the sketch after this list)

  2. Use a jndiEntry in server.xml; for example:

    <variable name="myVariable" value="myValue"/>
    <jndiEntry jndiName="myEntry" value="${myVariable}"/>

    Then the application can do new InitialContext().lookup("myEntry");.

    Note that this may also be used to pass Liberty configuration such as, for example, ${com.ibm.ws.logging.log.directory} for the log directory.
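
For the MicroProfile Config option, a minimal sketch assuming the mpConfig and cdi features are enabled (the property name and default value are illustrative; the value may come from any configured config source, such as an environment variable or a microprofile-config.properties file):

import javax.enterprise.context.ApplicationScoped;
import javax.inject.Inject;
import org.eclipse.microprofile.config.inject.ConfigProperty;

@ApplicationScoped
public class MyConfigBean {
    // Injected from the highest-priority MicroProfile Config source that defines the property
    @Inject
    @ConfigProperty(name = "myVariable", defaultValue = "myValue")
    String myVariable;
}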

dnf/yum/apt-get repositories

See https://openliberty.io/blog/2020/04/09/microprofile-3-3-open-liberty-20004.html#yum

Security

Tracing login: *=info:com.ibm.ws.security.*=all:com.ibm.ws.webcontainer.security.*=all and search for performJaasLogin Entry and performJaasLogin Exit

Failed login delays

APAR PH38929, introduced in 21.0.0.10, performs a delay of 0-5 seconds (a different random number in this range each time) on a failed login. Consider enabling and reviewing the HTTP access log to find such delays by checking for delays with an HTTP 401 or 403 response code. If your original concern was about response time spikes in your monitoring, then you may consider leaving this behavior as-is and removing such 401 or 403 responses from your response time monitoring. Otherwise, if your security team reviews the failed logins and the potential for user enumeration attacks and decides it is acceptable to reduce or eliminate the delay, this may be done with failedLoginDelayMin and failedLoginDelayMax, although you should continue to monitor for failed login attempts.

Basic Extensions using Liberty Libraries (BELL)

The bell feature allows packaging a ServletContainerInitializer in a JAR file with META-INF/services and running it for every application that is deployed.

<bell libraryRef="scilib" />
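
For example, the referenced library might be defined as follows, assuming a JAR named my-sci.jar in the server's lib directory (both names are illustrative):

<featureManager>
  <feature>bells-1.0</feature>
</featureManager>
<library id="scilib">
  <fileset dir="${server.config.dir}/lib" includes="my-sci.jar" />
</library>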

JConsole

JConsole is a simple monitoring utility shipped with the JVM.

Note that while JConsole does have some basic capabilities of writing statistics to a CSV, this is limited to a handful of JVM statistics from the main JConsole tabs and is not available for the MXBean data. Therefore, for practical purposes, JConsole is only useful for ad-hoc, live monitoring.

To connect remotely with the restConnector, launch the client JConsole as follows:

jconsole -J-Djava.class.path=$JDK/lib/jconsole.jar:$JDK/lib/tools.jar:$LIBERTY/clients/restConnector.jar -J-Djavax.net.ssl.trustStore=$LIBERTY/usr/servers/server1/resources/security/key.jks -J-Djavax.net.ssl.trustStorePassword=$KEYSTOREPASSWORD -J-Djavax.net.ssl.trustStoreType=jks

Then use a URL such as service:jmx:rest://localhost:9443/IBMJMXConnectorREST and enter the administrator credentials.

  1. Start JConsole: WLP/java/{JAVA}/bin/jconsole
  2. Choose the JVM to connect to.
  3. You may be asked to automatically switch to the secure port.
  4. Review the overview graphs.
  5. Click the MBeans tab to review enabled data.
  6. You may also export some of the data by right clicking and creating a CSV file.
  7. You may also execute various MBeans.

Accessing statistics through code:

// Requires the monitor-1.0 feature and JMX access; RestStatsMXBean comes from Liberty's monitoring API.
import java.lang.management.ManagementFactory;
import javax.management.JMX;
import javax.management.MBeanServer;
import javax.management.ObjectName;
MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
ObjectName name = new ObjectName("WebSphere:type=REST_Stats,name=restmetrics/com.example.MyJaxRsService/endpointName");
RestStatsMXBean stats = JMX.newMXBeanProxy(mbs, name, RestStatsMXBean.class);
long requestCount = stats.getRequestCount();
double responseTime = stats.getResponseTime();

JConsole with HotSpot

You may need options such as the following for local connections with HotSpot:

-Dcom.sun.management.jmxremote.port=9005
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false

Find Release for a Pull Request

  1. Find the line [...] merged commit X into OpenLiberty:integration [...]. Take the value of X. For example, for PR 10332, X is c7eb966.
  2. Click on the link for that commit and review the tags at the top. For example, commit c7eb966 shows tags starting at 20.0.0.2 which is the first release it's available.
  3. Alternatively, from the command line, search for tags with that commit: git tag --contains X. For example, PR 10332 is available starting in 20.0.0.2:
    git tag --contains c7eb966
    gm-20.0.0.2
    gm-20.0.0.3
    gm-20.0.0.4
    gm-20.0.0.5
    gm-20.0.0.6
    gm-20.0.0.7
    gm-20.0.0.8
    gm-20.0.0.9

Java Support

Liberty requires a certain minimum version of Java. As of this writing, the only feature that requires a JDK is localConnector-1.0; all other features only need a JRE.

Quick Testing

If you have Docker installed, you may quickly test some configuration; for example:

$ echo '<server><config updateTrigger="mbean" monitorInterval="60m" /><logging traceSpecification="*=info:com.ibm.ws.config.xml.internal.ConfigFileMonitor=all" traceFormat="BASIC" traceFileName="stdout" maxFileSize="250" /></server>' > configdropin.xml && docker run --rm -v $(pwd)/configdropin.xml:/config/configDropins/overrides/configdropin.xml -it open-liberty:latest | grep -e CWWKF0011I -e "Configuration monitoring is enabled"
[9/14/20 21:16:27:371 GMT] 00000026 ConfigFileMon 3   Configuration monitoring is enabled. Monitoring interval is 3600000
[9/14/20 21:16:31:048 GMT] 0000002a FeatureManage A   CWWKF0011I: The defaultServer server is ready to run a smarter planet. The defaultServer server started in 6.851 seconds.

^C

FFDC

Liberty will keep up to the last 500 FFDC files. These are generally small files, useful for post-mortem debugging, and don't need to be manually cleaned up.

z/OS

Monitoring on z/OS

WebSphere Liberty provides SMF 120 records for understanding performance aspects of various processes in a Liberty server. Note that there is a small throughput overhead for enabling SMF recording of approximately 3% (your mileage may vary).

zIIPs/zAAPs

In general, WebSphere Liberty on z/OS is mostly Java so it mostly offloads to zIIPs (other than small bits such as creating WLM enclaves, SAF, writing SMF records, etc.). Even if application processing hands off to non-zIIP-eligible native code (e.g. third party JNI), recent versions of z/OS (with APAR OA26713) have a lazy-switch design in which short bursts of such native code may stay on the zIIP and not switch to GCPs. For non-zIIP-eligible native code such as the type 2 DB2 driver, some of that may use zAAPs, and total processor usage compared to type 4 depends on various factors and may be lower.

JAXB

JAXB may be used to marshal and unmarshal Java classes to and from XML, most commonly with web service clients or endpoints using JAX-WS such as through the xmlWS, xmlBinding, jaxws, or jaxb features.

If you observe that JAXBContext.newInstance is impacting performance, consider:

  1. Package a jaxb.index file for every package that does not contain an ObjectFactory class.
  2. Consider faster instantiation performance over faster sustained unmarshalling/marshalling performance:
    • If using Liberty's xmlWS/xmlBinding: -Dorg.glassfish.jaxb.runtime.v2.runtime.JAXBContextImpl=true
    • If using Liberty's jaxws/jaxb: -Dcom.sun.xml.bind.v2.runtime.JAXBContextImpl=true
  3. If creating a JAXBContext directly, consider caching it in a singleton, as JAXBContext is thread safe (see the sketch below).
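
A minimal sketch of such a JAXBContext singleton using the javax.xml.bind API (the context path com.example.model is hypothetical):

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;

public final class JaxbContextHolder {
    // JAXBContext is thread safe and expensive to create, so create it once
    private static final JAXBContext CONTEXT;
    static {
        try {
            CONTEXT = JAXBContext.newInstance("com.example.model");
        } catch (JAXBException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Marshaller and Unmarshaller are not thread safe; create one per use
    public static Marshaller createMarshaller() throws JAXBException {
        return CONTEXT.createMarshaller();
    }
}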

Timed Operations

Timed operations was introduced before requestTiming and is largely superseded by requestTiming, although requestTiming only uses simple thresholds. Unless the more complex response time triggering is interesting, use requestTiming instead.

When enabled, the timed operation feature tracks the duration of JDBC operations running in the application server. In cases where operations take more or less time to execute than expected, the timed operation feature logs a warning. Periodically, the timed operation feature will create a report, in the application server log, detailing which operations took longest to execute. If you run the server dump command, the timed operation feature will generate a report containing information about all operations it has tracked.

To enable timed operations, add the timedOperations-1.0 feature to the server.xml file.

The following example shows a sample logged message:

[3/14/13 14:01:25:960 CDT] 00000025 TimedOperatio W TRAS0080W: Operation websphere.datasource.execute: jdbc/exampleDS:insert into cities values ('myHomeCity', 106769, 'myHomeCountry') took 1.541 ms to complete, which was longer than the expected duration of 0.213 ms based on past observations.

http://www.ibm.com/support/knowledgecenter/SSD28V_8.5.5/com.ibm.websphere.wlp.core.doc/ae/cwlp_timeop.html

<featureManager>
  <feature>timedOperations-1.0</feature>
</featureManager>

Education