- Periodically monitor WAS logs for warning and error messages.
- Set the maximum size of the JVM logs (SystemOut.log and SystemErr.log) to at least 256MB and the maximum number of historical files to at least 4.
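For example, a minimal wsadmin Jython sketch, assuming placeholder cell/node/server names MyCell/MyNode/server1 and that the JVM logs are controlled by the server's outputStreamRedirect and errorStreamRedirect objects (verify the attribute names on your version before use):
# Hedged sketch: raise the JVM log (SystemOut.log/SystemErr.log) rollover limits
server = AdminConfig.getid('/Cell:MyCell/Node:MyNode/Server:server1/')
for streamName in ['outputStreamRedirect', 'errorStreamRedirect']:
    stream = AdminConfig.showAttribute(server, streamName)
    # rolloverSize is in MB; maxNumberOfBackupFiles is the number of historical files
    AdminConfig.modify(stream, [['rolloverSize', '256'], ['maxNumberOfBackupFiles', '4']])
AdminConfig.save()  # in a Network Deployment cell, also synchronize the nodes
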
- Set the maximum size of the diagnostic trace (trace.log) to at least 256MB and the maximum number of historical files to at least 4.
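Similarly, a hedged wsadmin Jython sketch for the diagnostic trace log, assuming the same placeholder names and that these settings live on the TraceService's traceLog object:
# Hedged sketch: raise the diagnostic trace (trace.log) rollover limits
server = AdminConfig.getid('/Cell:MyCell/Node:MyNode/Server:server1/')
traceService = AdminConfig.list('TraceService', server)
traceLog = AdminConfig.showAttribute(traceService, 'traceLog')
AdminConfig.modify(traceLog, [['rolloverSize', '256'], ['maxNumberOfBackupFiles', '4']])
AdminConfig.save()
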
- Change the hung thread detection threshold and interval to smaller values tuned for each application, and enable a limited number of thread dumps when these events occur. For example:
- com.ibm.websphere.threadmonitor.threshold=30
- com.ibm.websphere.threadmonitor.interval=1
- com.ibm.websphere.threadmonitor.dump.java=15
- com.ibm.websphere.threadmonitor.dump.java.track=3
 
- Unless com.ibm.websphere.threadmonitor.interval has been set very low, consider enabling periodic thread pool statistics logging with the diagnostic trace *=info:Runtime.ThreadMonitorHeartbeat=detail
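A hedged wsadmin Jython sketch for applying that trace specification at runtime (server1 is a placeholder; also update the configured trace string if the change should survive a restart):
# Hedged sketch: enable the ThreadMonitorHeartbeat detail trace without a restart
ts = AdminControl.completeObjectName('type=TraceService,process=server1,*')
AdminControl.setAttribute(ts, 'traceSpecification', '*=info:Runtime.ThreadMonitorHeartbeat=detail')
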
- Monitor for increases in the Count column in the FFDC summary file (${SERVER}_exception.log) for each server, because only the first FFDC will print a warning to the logs.
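As a rough illustration only, a small Python sketch that flags high counts; it assumes the Count value is the second whitespace-separated column of each data row, so adjust the parsing to the actual layout of your ${SERVER}_exception.log:
# Hedged sketch: print FFDC summary rows whose assumed Count column exceeds a threshold
import sys

threshold = 10
with open(sys.argv[1]) as f:
    for line in f:
        parts = line.split()
        # assumption: first column is the Index, second column is the Count
        if len(parts) > 2 and parts[0].isdigit() and parts[1].isdigit() and int(parts[1]) >= threshold:
            print(line.rstrip())
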
- Review relevant timeout values such as JDBC, HTTP, etc.
- A well-tuned WAS is a better-behaving WAS, so also review the WAS traditional tuning
recipes.
- Review the Troubleshooting
Operating System Recipes and Troubleshooting Java
Recipes.
- Review all warnings and errors in System*.log (or using logViewer if HPEL is enabled) before and during the problem. A regular expression search is " [WE] ". One common type of warning is an FFDC warning which points to a matching file in the FFDC logs directory.
- If you're on Linux or use Cygwin, use the following command:
find . -name "*System*" -print0 | xargs -0 grep " [WE] " | grep -v -e supposedly_benign_message1 -e supposedly_benign_message2
 
 
- Review all JVM messages in native_stderr.log before and during the problem. This may include things such as OutOfMemoryErrors. The filename of such artifacts includes a timestamp of the form YYYYMMDD.
- Review any strange messages in native_stdout.log before and during the problem.
- If verbose garbage collection is enabled, review verbosegc in native_stderr.log (IBM Java), native_stdout.log (HotSpot Java), or any verbosegc.log files (if using -Xverbosegclog or -Xloggc) in the IBM Garbage Collection and Memory Visualizer Tool and ensure that the proportion of time in garbage collection for a relevant period before and during the problem is less than 5-10%.
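If verbose GC is not yet enabled, a hedged wsadmin Jython sketch that appends an IBM Java rotating verbose GC log option to the generic JVM arguments (placeholder names again; it assumes a single JVM definition under the server, the rotation counts are just one choice, and the server must be restarted for the change to take effect):
# Hedged sketch: enable rotating verbose GC logs on IBM Java
server = AdminConfig.getid('/Cell:MyCell/Node:MyNode/Server:server1/')
jvm = AdminConfig.list('JavaVirtualMachine', server)
args = AdminConfig.showAttribute(jvm, 'genericJvmArguments')
AdminConfig.modify(jvm, [['genericJvmArguments', args + ' -Xverbosegclog:verbosegc.%Y%m%d.%H%M%S.%pid.log,5,50000']])
AdminConfig.save()
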
- Review any javacore*.txt files in the IBM Thread and Monitor Dump Analyzer tool. Review the causes of the thread dump (e.g. user-generated, OutOfMemoryError, etc.) and review threads with large stacks and any monitor contention.
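If a javacore needs to be produced on demand (for example, to capture the current thread state), kill -3 <pid> works on IBM Java, or a hedged wsadmin Jython sketch using the JVM MBean (server1 is a placeholder):
# Hedged sketch: request a javacore (thread dump) from a running server
jvmMBean = AdminControl.completeObjectName('type=JVM,process=server1,*')
AdminControl.invoke(jvmMBean, 'dumpThreads')
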
- Review any heapdump*.phd and core*.dmp files in the Eclipse Memory Analyzer Tool.
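To produce a heap dump on demand for this analysis, a hedged wsadmin Jython sketch; it assumes the JVM MBean on your level exposes a generateHeapDump operation when running the IBM SDK, so confirm it is listed before invoking it:
# Hedged sketch: request a heap dump from a running server (IBM SDK)
jvmMBean = AdminControl.completeObjectName('type=JVM,process=server1,*')
print(Help.operations(jvmMBean))  # confirm generateHeapDump is listed on this level
AdminControl.invoke(jvmMBean, 'generateHeapDump')
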
- Consider increasing the value of server_region_stalled_thread_threshold_percent so that a servant is only abended when a large percentage of threads are taking a long time. Philosophies on this differ, but consider a value of 10.
- Set control_region_timeout_delay to give some time for work to finish before the servant is abended; for example, 5.
- Set control_region_timeout_dump_action to gather useful diagnostics when a servant is abended; for example, IEATDUMP.
- Consider reducing the control_region_$PROTOCOL_queue_timeout_percent values so that requests time out earlier if they queue for a long time; for example, 10.
- If necessary, apply granular timeouts to particular requests.
- Run listTimeoutsV85.py to review and tune timeouts.