- Make sure the logs are capturing as much as possible:
  - Administrative Console > Troubleshooting > Logs and Trace > server
    name > JVM Logs. These can also be changed dynamically on the
    Runtime tab.
  - For example, Maximum size = 100MB and Maximum Number of Historical
    Log Files = 5
- Ensure verbose garbage collection is enabled. This may be enabled at
  runtime. If it cannot be enabled at runtime, you will need to restart
  to apply the change.
- Ensure that PMI is enabled either with the "Basic" level (this is the
  default) or with a "Custom" level (see the WAS chapter for which
  counters are recommended).
- Enable PMI logging to files, either with a monitoring product or with
  the built-in TPV logger:
  - Important note: all of these steps must be done after every
    application server restart. This can be automated with a wsadmin
    script.
  - Log in to the Administrative Console and go to: Monitoring and Tuning
    > Performance Viewer > View Logs
  - Select all relevant application servers and click "Start Monitoring"
  - Click each application server
  - Click on server > Settings > Log
  - Set the following values:
    - Duration = 300000
    - Maximum File Size = 50
    - Maximum Number of Historical Files = 5
    - Log Output Format = XML
  - Click Apply
  - Click server > Summary Reports > Servlets
  - Click "Start Logging"
- For IBM Java, enable IBM Health Center in headless mode:
  - Choose one of these methods to start Health Center:
    - Restart the JVM adding the following generic JVM arguments:
      -Xhealthcenter:level=headless -Dcom.ibm.java.diagnostics.healthcenter.headless.files.max.size=104857600 -Dcom.ibm.java.diagnostics.healthcenter.headless.files.to.keep=10
    - Start it dynamically:
      $WEBSPHERE/java/bin/java -jar $WEBSPHERE/java/jre/lib/ext/healthcenter.jar ID=$PID -Dcom.ibm.java.diagnostics.healthcenter.data.collection.level=headless -Dcom.ibm.java.diagnostics.healthcenter.headless.files.max.size=104857600 -Dcom.ibm.java.diagnostics.healthcenter.headless.files.to.keep=10
- If there is a web server in front of WAS, see the Web Server recipes.
- Archive and truncate any existing logs for each server in
$WEBSPHERE/profiles/$PROFILE/logs/$SERVER/*
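The archive-and-truncate step above could be scripted. The following is a minimal Python sketch, not part of any WAS tooling: the function name `archive_and_truncate`, its parameters, and the archive naming scheme are all hypothetical. It assumes the caller passes the server's log directory and a separate directory for the archive.

```python
import tarfile
import time
from pathlib import Path


def archive_and_truncate(log_dir, archive_dir):
    """Archive every regular file in log_dir into one .tar.gz, then empty each file.

    Hypothetical helper for illustration. Returns the path of the archive written.
    """
    log_dir = Path(log_dir)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    # Snapshot the file list first so the archive itself is never included.
    files = sorted(p for p in log_dir.iterdir() if p.is_file())
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = archive_dir / f"{log_dir.name}-logs-{stamp}.tar.gz"

    # Copy the current contents into the archive.
    with tarfile.open(str(archive), "w:gz") as tar:
        for path in files:
            tar.add(str(path), arcname=path.name)

    # Only after the archive is closed, truncate the originals to zero bytes.
    for path in files:
        with open(path, "w"):
            pass

    return archive
```

It would be called once per server log directory, for example with the expanded value of $WEBSPHERE/profiles/$PROFILE/logs/$SERVER as `log_dir`.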
- Reproduce the problem.
- Gather the Performance, Hang, or High CPU issue MustGather for your
operating system:
  - Linux
  - AIX
  - Windows
  - z/OS
  - Solaris
  - HP-UX
- After the problem has been reproduced, gracefully stop the
application servers (to produce Health Center HCD files).
- Gather:
  - Server logs under $WEBSPHERE/profiles/$PROFILE/logs/$SERVER/:
    SystemOut*.log SystemErr*.log native_stderr.log native_stdout.log
  - FFDC logs under $WEBSPHERE/profiles/$PROFILE/logs/ffdc/*
  - Javacores, heapdumps, and system dumps, if any:
    $WEBSPHERE/profiles/$PROFILE/javacore* $WEBSPHERE/profiles/$PROFILE/heapdump* $WEBSPHERE/profiles/$PROFILE/core*
  - PMI logs:
    $WEBSPHERE/profiles/$PROFILE/logs/tpv/*
  - Health Center logs, if any:
    $WEBSPHERE/profiles/$PROFILE/*.hcd
  - server.xml for each server:
    $WEBSPHERE/profiles/$PROFILE/config/cells/$CELL/nodes/$NODE/servers/$SERVER/server.xml
  - The output of the Performance MustGather
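The file locations in the Gather step can be expressed as glob patterns. The sketch below is one possible way to check which of those files actually exist before packaging them. The function names `gather_patterns` and `find_files` are hypothetical, and the arguments stand in for the expanded values of $WEBSPHERE, $PROFILE, $SERVER, $CELL, and $NODE. The MustGather output is not covered because its location depends on the operating system script.

```python
import glob
import os


def gather_patterns(websphere, profile, server, cell, node):
    """Build glob patterns for the files listed in the Gather step."""
    prof = os.path.join(websphere, "profiles", profile)
    logs = os.path.join(prof, "logs", server)
    return [
        os.path.join(logs, "SystemOut*.log"),
        os.path.join(logs, "SystemErr*.log"),
        os.path.join(logs, "native_stderr.log"),
        os.path.join(logs, "native_stdout.log"),
        os.path.join(prof, "logs", "ffdc", "*"),
        os.path.join(prof, "javacore*"),
        os.path.join(prof, "heapdump*"),
        os.path.join(prof, "core*"),
        os.path.join(prof, "logs", "tpv", "*"),
        os.path.join(prof, "*.hcd"),
        os.path.join(prof, "config", "cells", cell, "nodes", node,
                     "servers", server, "server.xml"),
    ]


def find_files(patterns):
    """Expand the patterns and return the sorted, de-duplicated files that exist."""
    found = set()
    for pattern in patterns:
        found.update(p for p in glob.glob(pattern) if os.path.isfile(p))
    return sorted(found)
```

An empty result for a pattern such as `*.hcd` is a hint that the corresponding data source (here, Health Center) was not enabled or the server was not stopped gracefully.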
- Review all WAS logs for any errors, warnings, etc.
- Review verbosegc for garbage collection overhead.
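Garbage collection overhead is commonly expressed as the share of wall-clock time spent in GC pauses over some interval. The sketch below shows only that arithmetic. It assumes the pause durations have already been extracted from the verbosegc output by a tool or script; the parsing is not shown because the log format depends on the JVM. The function name and the sample numbers are hypothetical.

```python
def gc_overhead_percent(pause_ms, interval_ms):
    """Percentage of a wall-clock interval spent in stop-the-world GC pauses."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return 100.0 * sum(pause_ms) / interval_ms


# Hypothetical pauses (in ms) observed during one 60-second window.
pauses = [120, 80, 250, 150]
print(f"{gc_overhead_percent(pauses, 60_000):.1f}%")  # prints "1.0%"
```

Computing the figure per window, rather than once for the whole run, makes it easier to line up spikes in overhead with the time the problem was reproduced.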
- Review thread dumps:
  - Review patterns and check for deadlocks and monitor contention (e.g.
    with the TMDA tool).
- Review operating system data for WAS and IHS nodes:
  - If CPU time is high, review whether it is user or system time.
    - Review per-process and per-thread CPU data for details.
  - Check virtualization steal time.
  - Check run queue length and any blocked threads.
  - Check for memory swap-ins.
    - If high, check memory statistics such as file cache, free memory,
      etc.
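As an illustration of the user, system, and steal time checks above, the sketch below computes the CPU breakdown between two samples of the aggregate `cpu` line of /proc/stat. It applies to Linux only and is an assumption about how one might do this by hand; the MustGather scripts and standard OS tools already report the same data. The function names are hypothetical. Only the first eight counters are summed, because guest time is already included in user time.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Older kernels report fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Percentage of elapsed CPU ticks per category between two samples."""
    deltas = {k: after[k] - before[k] for k in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {k: 100.0 * v / total for k, v in deltas.items()}
```

To use it, read the first line of /proc/stat twice, a few seconds apart, parse both with `parse_cpu_line`, and pass the results to `cpu_breakdown`.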
- Review PMI data for the key performance indicators such as the
  WebContainer thread pool ActiveCount, database connection pool usage,
  servlet response times, etc. (see WAS - PMI). Try to isolate the
  problem to particular requests, database queries, etc. (by duration or
  volume).
  - If using a database, review the response times in the connection
    pool. Try to isolate the problem to particular queries (by duration
    or volume).
- Review Health Center data
- If using web servers, review IHS messages in access_log, error_log,
  and the plugin log to see if requests are coming in and if there are
  errors (e.g. HTTP error response codes). Also review mpmstats in
  error_log to see what the threads are doing.
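A quick way to see whether requests are arriving and how many of them fail is to count the HTTP status codes in access_log. The following is a minimal sketch that assumes each line starts with the Common Log Format fields (host, ident, user, time, request, status, bytes). If the IHS LogFormat directive has been customized, the pattern would need adjusting. The function names are hypothetical.

```python
import re
from collections import Counter

# Assumes the Common Log Format prefix: host ident user [time] "request" status bytes
LINE_RE = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "(?:[^"\\]|\\.)*" (\d{3}) ')


def status_counts(lines):
    """Count HTTP status codes in access_log lines; unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def error_ratio(counts):
    """Fraction of counted responses with a 4xx or 5xx status."""
    total = sum(counts.values())
    errors = sum(n for code, n in counts.items() if code[0] in "45")
    return errors / total if total else 0.0
```

For example, `status_counts(open("access_log"))` gives the per-code totals for one file, and a rising `error_ratio` across successive log files points at the time window worth correlating with the WAS logs.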