General WAS Traditional Performance Problem
- Make sure the logs are capturing as much as possible:
- Administrative Console > Troubleshooting > Logs and Trace > server name > JVM Logs (https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/utrb_jvmlogs.html). These can also be changed dynamically on the Runtime tab.
- Maximum size = 100MB
- Maximum Number of Historical Log Files = 5
- Ensure verbose garbage collection is enabled: http://www-01.ibm.com/support/docview.wss?uid=swg21114927. On certain operating systems and WAS versions, you may enable verbosegc dynamically at runtime. Otherwise, you will need to restart to apply the change. See the Java chapters for details.
- Ensure that PMI is enabled either with the "Basic" level (this is the default) or with a "Custom" level (see WAS chapter on which counters are recommended): https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/tprf_prfstartadmin.html
- Enable PMI logging to files, either with a monitoring product such as ITCAM or with the built-in TPV logger (https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/tprf_tpvlogdata.html):
- Important note: all of these steps must be done after every application server restart. This can be automated with wsadmin: https://raw.githubusercontent.com/kgibm/problemdetermination/master/scripts/was/tpvlogging.py
- Log in to the Administrative Console and go to: Monitoring and Tuning > Performance Viewer > View Logs
- Select all relevant application servers and click "Start Monitoring"
- Click each application server
- Click on server > Settings > Log
- Duration = 300000
- Maximum File Size = 50
- Maximum Number of Historical Files = 5
- Log Output Format = XML
- Click Apply
- Click server > Summary Reports > Servlets
- Click "Start Logging"
- For IBM Java, enable IBM Health Center in headless mode:
- Update to the latest Health Center agent in the WAS java directory: https://www.ibm.com/support/knowledgecenter/SS3KLZ/com.ibm.java.diagnostics.healthcenter.doc/topics/installingagent.html
- Choose one of these methods to start Health Center (http://www-01.ibm.com/support/docview.wss?uid=swg21657760):
- Start it dynamically: ${WebSphere}/java/bin/java -jar ${WebSphere}/java/jre/lib/ext/healthcenter.jar ID=${PID} -Dcom.ibm.java.diagnostics.healthcenter.data.collection.level=headless -Dcom.ibm.java.diagnostics.healthcenter.headless.files.max.size=104857600 -Dcom.ibm.java.diagnostics.healthcenter.headless.files.to.keep=10
- Restart the JVM adding the following generic JVM arguments: -Xhealthcenter:level=headless -Dcom.ibm.java.diagnostics.healthcenter.headless.files.max.size=104857600 -Dcom.ibm.java.diagnostics.healthcenter.headless.files.to.keep=10
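The file-size value in both Health Center methods, 104857600, equals 100 x 1024 x 1024, which suggests the property is expressed in bytes. The following is a minimal sketch, under that assumption, of building the generic JVM argument string from friendlier units. The function name and its parameters are hypothetical and are not part of Health Center or WAS.

```python
def health_center_jvm_args(max_file_mib=100, files_to_keep=10):
    """Build the headless Health Center generic JVM arguments shown above."""
    # Assumption: the max.size property is in bytes (100 MiB -> 104857600).
    max_bytes = max_file_mib * 1024 * 1024
    prefix = "-Dcom.ibm.java.diagnostics.healthcenter.headless"
    return " ".join([
        "-Xhealthcenter:level=headless",
        f"{prefix}.files.max.size={max_bytes}",
        f"{prefix}.files.to.keep={files_to_keep}",
    ])


print(health_center_jvm_args())
```

With the defaults, the printed string matches the generic JVM arguments listed above. Changing max_file_mib or files_to_keep adjusts how much Health Center data is retained on disk.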
- If there is a web server in front of WAS, see the Web Server recipes.
- Archive and truncate any existing logs for each server (${WAS}/profiles/${PROFILE}/logs/${SERVER}/*), and archive and remove the FFDC logs (${WAS}/profiles/${PROFILE}/logs/ffdc/*).
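The archive-and-truncate step can be scripted. The sketch below is one possible approach, not an IBM-provided tool: it copies each file in a log directory to a timestamped archive directory, then either truncates the original (for server logs) or removes it (for FFDC logs). The function name, the archive location, and the stamp format are illustrative assumptions.

```python
import shutil
import time
from pathlib import Path


def archive_and_truncate(log_dir, archive_root, stamp=None, remove=False):
    """Copy files from log_dir into an archive directory, then empty or remove them."""
    source = Path(log_dir)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dest = Path(archive_root) / f"{source.name}-{stamp}"
    dest.mkdir(parents=True, exist_ok=True)
    archived = []
    for path in sorted(source.iterdir()):
        if not path.is_file():
            continue  # this sketch skips subdirectories
        shutil.copy2(path, dest / path.name)
        if remove:
            path.unlink()            # FFDC logs: archive and remove
        else:
            open(path, "w").close()  # server logs: archive and truncate
        archived.append(path.name)
    return dest, archived
```

For example, call archive_and_truncate on each server's log directory with the default remove=False, and on the FFDC directory with remove=True. Keep archive_root outside the directories being cleaned.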
- Reproduce the problem.
- Gather the Performance, Hang, or High CPU issue MustGather for your operating system:
- Linux: http://www-01.ibm.com/support/docview.wss?uid=swg21115785
- AIX: http://www-01.ibm.com/support/docview.wss?uid=swg21052641
- Windows: http://www-01.ibm.com/support/docview.wss?uid=swg21111364
- Solaris: http://www-01.ibm.com/support/docview.wss?uid=swg21115625
- HP-UX: http://www-01.ibm.com/support/docview.wss?uid=swg21127574
- Gather periodic thread dumps (see the WAIT tool in Java - Profilers). This is accomplished through the Performance MustGathers above.
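If the MustGather scripts cannot be used, periodic thread dumps can be requested manually on POSIX systems by sending SIGQUIT (the equivalent of kill -3) to the JVM process. The sketch below is a hypothetical helper, not part of the MustGather. The count and interval defaults are illustrative only, and signal.SIGQUIT is not available on Windows.

```python
import os
import signal
import time


def request_thread_dumps(pid, count=3, interval_seconds=30,
                         send=os.kill, sleep=time.sleep):
    """Send SIGQUIT to pid `count` times, pausing between requests."""
    for i in range(count):
        send(pid, signal.SIGQUIT)  # asks the JVM for a thread dump; it keeps running
        if i < count - 1:
            sleep(interval_seconds)
    return count
```

A call such as request_thread_dumps(12345), where 12345 stands in for the application server PID, requests three dumps 30 seconds apart. The send and sleep parameters exist only so the function can be exercised without signalling a real process.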
- After the problem has been reproduced, gracefully stop the application servers (this is needed to produce Health Center logs).
- Gather:
- Server logs under ${WAS}/profiles/${PROFILE}/logs/${SERVER}/: SystemOut*.log, SystemErr*.log, native_stderr.log, native_stdout.log
- FFDC logs under ${WAS}/profiles/${PROFILE}/logs/ffdc/*
- Javacores, heapdumps, and system dumps: ${WAS}/profiles/${PROFILE}/javacore*, ${WAS}/profiles/${PROFILE}/heapdump*, ${WAS}/profiles/${PROFILE}/core*
- PMI logs: ${WAS}/profiles/${PROFILE}/logs/tpv/*
- Health Center logs: ${WAS}/profiles/${PROFILE}/*.hcd
- server.xml for each server: ${WAS}/profiles/${PROFILE}/config/cells/${CELL}/nodes/${NODE}/servers/${SERVER}/server.xml
- The output of the Performance MustGather
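The file locations in the list above can be collected into a single archive. The following is a minimal sketch that expands those patterns for one server and adds the matching files to a gzip-compressed tar file. The function names and the archive path are hypothetical, and the output of the Performance MustGather still has to be added separately.

```python
import glob
import os
import tarfile


def gather_patterns(was, profile, server, cell, node):
    """Return the glob patterns from the list above for one server."""
    base = f"{was}/profiles/{profile}"
    logs = f"{base}/logs/{server}"
    return [
        f"{logs}/SystemOut*.log", f"{logs}/SystemErr*.log",
        f"{logs}/native_stderr.log", f"{logs}/native_stdout.log",
        f"{base}/logs/ffdc/*",
        f"{base}/javacore*", f"{base}/heapdump*", f"{base}/core*",
        f"{base}/logs/tpv/*",
        f"{base}/*.hcd",
        f"{base}/config/cells/{cell}/nodes/{node}/servers/{server}/server.xml",
    ]


def gather(patterns, archive_path):
    """Add every regular file matching the patterns to a .tar.gz archive."""
    files = sorted({p for pattern in patterns
                    for p in glob.glob(pattern) if os.path.isfile(p)})
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            tar.add(path)
    return files
```

Write the archive to a directory outside the profile so that it is not picked up by its own patterns.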
- Review all WAS logs for any errors, warnings, etc. (see WAS - Basics).
- Review IHS messages in access_log, error_log, and plugin log to see if requests are coming in and if there are errors (check response codes). Also review mpmstats to see what the threads are doing.
- Review verbosegc for garbage collection overhead.
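Garbage collection overhead is the proportion of wall-clock time spent in GC pauses. As a rough illustration, the sketch below estimates it from IBM Java verbosegc output, assuming the XML format in which each stop-the-world pause ends with an exclusive-end element carrying timestamp and durationms attributes. Other JVMs and older formats need different parsing, and the function name is hypothetical.

```python
import re
from datetime import datetime

EXCLUSIVE_END = re.compile(r"<exclusive-end\b[^>]*>")
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def gc_overhead_percent(verbosegc_text):
    """Estimate pause time as a percentage of elapsed time between pauses."""
    pauses = []
    for tag in EXCLUSIVE_END.findall(verbosegc_text):
        attrs = dict(ATTRIBUTE.findall(tag))
        # Assumed timestamp form: 2013-07-09T12:13:10.440
        when = datetime.strptime(attrs["timestamp"], "%Y-%m-%dT%H:%M:%S.%f")
        pauses.append((when, float(attrs["durationms"])))
    if len(pauses) < 2:
        return None  # not enough data to define a time window
    elapsed_ms = (pauses[-1][0] - pauses[0][0]).total_seconds() * 1000.0
    if elapsed_ms <= 0:
        return None
    # The window starts when the first pause ends, so only later pauses count.
    paused_ms = sum(duration for _, duration in pauses[1:])
    return 100.0 * paused_ms / elapsed_ms
```

The function returns None when there are fewer than two pauses, because no time window can be measured. Dedicated verbosegc analysis tools give a more complete picture than this estimate.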
- Review thread dumps:
- Review patterns (e.g. WAIT tool) and check for deadlocks and monitor contention (e.g. TMDA tool).
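A quick way to spot patterns across threads is to count how often each top stack frame appears in a thread dump. The sketch below does this for an IBM javacore, assuming the usual layout in which each thread begins with a 3XMTHREADINFO line and each Java stack frame is a 4XESTACKTRACE line. It is a rough first pass with a hypothetical function name, not a replacement for the WAIT or TMDA tools.

```python
from collections import Counter


def top_frames(javacore_text):
    """Count the topmost Java stack frame of each thread in a javacore."""
    counts = Counter()
    awaiting_frame = False
    for line in javacore_text.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue
        tag = parts[0]
        if tag == "3XMTHREADINFO":
            awaiting_frame = True  # a new thread section starts here
        elif tag == "4XESTACKTRACE" and awaiting_frame and len(parts) > 1:
            frame = parts[1].strip()
            if frame.startswith("at "):
                frame = frame[3:]
            counts[frame] += 1
            awaiting_frame = False  # only the first frame per thread is counted
    return counts
```

Calling top_frames(text).most_common(10) on each of several dumps shows whether many threads are stuck in the same method over time.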
- Review operating system data for WAS and IHS nodes:
- If CPU time is high, review if it's user or system.
- Review per-process and per-thread CPU data for details.
- Check virtualization steal time
- Check run queue length and any blocked threads
- Check for hundreds or thousands of swap-ins
- If high, check memory statistics such as file cache, free memory, etc.
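On Linux, the user, system, and steal split of CPU time can be derived from two samples of the aggregate cpu line in /proc/stat. The sketch below illustrates the arithmetic under that assumption. The function names are hypothetical, and other operating systems expose this data through different tools.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Return each field's share of elapsed CPU ticks as a percentage."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}
```

Read the first line of /proc/stat twice, a few seconds apart, and pass both parsed samples to cpu_breakdown. A high system share points toward kernel activity, and a non-trivial steal share points toward contention at the virtualization layer.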
- Review PMI data for the key performance indicators such as the WebContainer thread pool ActiveCount, database connection pool usage, servlet response times, etc. (see WAS - PMI). Try to isolate the problem to particular requests, database queries, etc. (duration or volume).
- Review Health Center data:
- Review hot self and tree methods and monitor contention.
- If using a database, review the response times. Try to isolate the problem to particular queries (duration or volume). Check for lock contention.