General WAS Traditional Performance Problem

  1. Make sure the logs are capturing as much as possible:
    1. Administrative Console > Troubleshooting > Logs and Trace > server name > JVM Logs (https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/utrb_jvmlogs.html). These can also be changed dynamically on the Runtime tab.
    2. Maximum size = 100MB
      Maximum Number of Historical Log Files = 5
  2. Ensure verbose garbage collection is enabled: http://www-01.ibm.com/support/docview.wss?uid=swg21114927. On certain operating systems and WAS versions, you may enable verbosegc dynamically at runtime. Otherwise, you will need to restart to apply the change. See the Java chapters for details.
  3. Ensure that PMI is enabled either with the "Basic" level (this is the default) or with a "Custom" level (see WAS chapter on which counters are recommended): https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/tprf_prfstartadmin.html
    1. Enable PMI logging to files, either with a monitoring product such as ITCAM or with the built-in TPV logger (https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/tprf_tpvlogdata.html):
      1. Important note: all of these steps must be done after every application server restart. This can be automated with wsadmin: https://raw.githubusercontent.com/kgibm/problemdetermination/master/scripts/was/tpvlogging.py
      2. Log in to the Administrative Console and go to: Monitoring and Tuning > Performance Viewer > View Logs
      3. Select all relevant application servers and click "Start Monitoring"
      4. Click each application server
      5. Click on server > Settings > Log
      6. Duration = 300000 (seconds)
        Maximum File Size = 50 (MB)
        Maximum Number of Historical Files = 5
        Log Output Format = XML
      7. Click Apply
      8. Click server > Summary Reports > Servlets
      9. Click "Start Logging"
  4. For IBM Java, enable IBM Health Center in headless mode:
    1. Update to the latest Health Center agent in the WAS java directory: https://www.ibm.com/support/knowledgecenter/SS3KLZ/com.ibm.java.diagnostics.healthcenter.doc/topics/installingagent.html
    2. Choose one of these methods to start Health Center (http://www-01.ibm.com/support/docview.wss?uid=swg21657760):
      1. Start it dynamically: ${WebSphere}/java/bin/java -jar ${WebSphere}/java/jre/lib/ext/healthcenter.jar ID=${PID} -Dcom.ibm.java.diagnostics.healthcenter.data.collection.level=headless -Dcom.ibm.java.diagnostics.healthcenter.headless.files.max.size=104857600 -Dcom.ibm.java.diagnostics.healthcenter.headless.files.to.keep=10
      2. Restart the JVM adding the following generic JVM arguments: -Xhealthcenter:level=headless -Dcom.ibm.java.diagnostics.healthcenter.headless.files.max.size=104857600 -Dcom.ibm.java.diagnostics.healthcenter.headless.files.to.keep=10
  5. If there is a web server in front of WAS, see the Web Server recipes.
  6. Archive and truncate any existing logs for each server (${WAS}/profiles/${PROFILE}/logs/${SERVER}/*), and archive and remove the FFDC logs (${WAS}/profiles/${PROFILE}/logs/ffdc/*).
  7. Reproduce the problem.
  8. Gather the Performance, Hang, or High CPU issue MustGather for your operating system:
    1. Linux: http://www-01.ibm.com/support/docview.wss?uid=swg21115785
    2. AIX: http://www-01.ibm.com/support/docview.wss?uid=swg21052641
    3. Windows: http://www-01.ibm.com/support/docview.wss?uid=swg21111364
    4. Solaris: http://www-01.ibm.com/support/docview.wss?uid=swg21115625
    5. HP-UX: http://www-01.ibm.com/support/docview.wss?uid=swg21127574
  9. Gather periodic thread dumps (see the WAIT tool in Java - Profilers). The Performance MustGathers above collect these.
  10. After the problem has been reproduced, gracefully stop the application servers (this is needed to produce Health Center logs).
  11. Gather:
    1. Server logs under ${WAS}/profiles/${PROFILE}/logs/${SERVER}/: SystemOut*.log, SystemErr*.log, native_stderr.log, native_stdout.log
    2. FFDC logs under ${WAS}/profiles/${PROFILE}/logs/ffdc/*
    3. Javacores, heapdumps, and system dumps: ${WAS}/profiles/${PROFILE}/javacore* ${WAS}/profiles/${PROFILE}/heapdump* ${WAS}/profiles/${PROFILE}/core*
    4. PMI logs: ${WAS}/profiles/${PROFILE}/logs/tpv/*
    5. Health Center logs: ${WAS}/profiles/${PROFILE}/*.hcd
    6. server.xml for each server: ${WAS}/profiles/${PROFILE}/config/cells/${CELL}/nodes/${NODE}/servers/${SERVER}/server.xml
    7. The output of the Performance MustGather
  12. Review all WAS logs for any errors, warnings, etc. (see WAS - Basics).
  13. Review IHS messages in access_log, error_log, and plugin log to see if requests are coming in and if there are errors (check response codes). Also review mpmstats to see what the threads are doing.
  14. Review verbosegc for garbage collection overhead.
  15. Review thread dumps:
    1. Review patterns (e.g. WAIT tool) and check for deadlocks and monitor contention (e.g. TMDA tool).
  16. Review operating system data for WAS and IHS nodes:
    1. If CPU time is high, determine whether it is user or system time.
      1. Review per-process and per-thread CPU data for details.
    2. Check virtualization steal time.
    3. Check run queue length and any blocked threads.
    4. Check for hundreds or thousands of swap-ins.
      1. If high, check memory statistics such as file cache, free memory, etc.
  17. Review PMI data for the key performance indicators such as the WebContainer thread pool ActiveCount, database connection pool usage, servlet response times, etc. (see WAS - PMI). Try to isolate the problem to particular requests, database queries, etc. (duration or volume).
  18. Review Health Center data:
    1. Review hot self and tree methods and monitor contention.
  19. If using a database, review the response times. Try to isolate the problem to particular queries (duration or volume). Check for lock contention.
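
The file collection in step 11 can be scripted so that nothing is missed under time pressure. The following is a minimal sketch, not an official IBM tool: it assumes the directory layout listed in step 11, and the names ARTIFACT_PATTERNS, find_artifacts, and archive_artifacts are hypothetical helpers introduced only for illustration. The output of the Performance MustGather is not covered and must still be attached separately.

```python
import tarfile
from pathlib import Path

# Glob patterns relative to the profile root, copied from the "Gather" step.
# The {cell}, {node}, and {server} placeholders are filled in by the caller.
ARTIFACT_PATTERNS = [
    "logs/{server}/SystemOut*.log",
    "logs/{server}/SystemErr*.log",
    "logs/{server}/native_stderr.log",
    "logs/{server}/native_stdout.log",
    "logs/ffdc/*",
    "javacore*",
    "heapdump*",
    "core*",
    "logs/tpv/*",
    "*.hcd",
    "config/cells/{cell}/nodes/{node}/servers/{server}/server.xml",
]


def find_artifacts(profile_root, cell, node, server):
    """Return the sorted list of existing files that match the gather list."""
    root = Path(profile_root)
    found = set()
    for pattern in ARTIFACT_PATTERNS:
        relative = pattern.format(cell=cell, node=node, server=server)
        # Directories are skipped; only regular files are collected.
        found.update(p for p in root.glob(relative) if p.is_file())
    return sorted(found)


def archive_artifacts(profile_root, cell, node, server, archive_path):
    """Write the matching files to a gzip-compressed tar archive."""
    root = Path(profile_root)
    files = find_artifacts(root, cell, node, server)
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            # Store paths relative to the profile root to preserve the layout.
            tar.add(path, arcname=str(path.relative_to(root)))
    return files
```

A call might look like archive_artifacts("/opt/IBM/WebSphere/AppServer/profiles/AppSrv01", "cell01", "node01", "server1", "/tmp/was-perf.tar.gz"), where every argument is an example value. Write the archive outside the profile directory, and note that system dumps (core*) can be very large, so check free disk space first.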
