Troubleshooting Memory Leaks

  1. Analyze verbosegc in GCMV. If there is a positive slope in the plot "Used heap (after global collection)", then there may be a leak.
    1. The default plot of "Used heap (after collection)" for generational collectors may sometimes look like a leak if there hasn't been a global collection recently, which is why it's best to look only at heap usage after global collections.
    2. There are cases where a positive slope after global collections is not a leak, such as SoftReference caches.
    3. Consider the magnitude of the heap growth relative to the heap size. Small relative growths may be reasonable. Caches may need to be populated up to some limit before they stabilize.
  2. If there's evidence of a leak, take an OS core dump (IBM Java) or HPROF dump (HotSpot Java) and load it into the Eclipse Memory Analyzer Tool (MAT). Things to consider:
    1. Review the largest objects (e.g. a leak in some cache)
    2. Run the leak suspect report
    3. Run the IBM Extensions for Memory Analyzer Classloader Leak Detection under WAS > ClassLoaders
    4. Perform a general review of the dump (class histogram, top consumers, etc.)
  3. If a single dump is inconclusive, take two or more OS core dumps (IBM Java) or HPROF dumps (HotSpot Java) from the same process and compare them in MAT to find the growth(s). The more time that passes between dumps, the easier it is to find the growth(s). Ideally, use a monitoring tool to track heap usage after full GC and take the second dump after a relative growth of more than 10%.
  4. The most common leaks are:
    1. Large objects (byte arrays, etc.)
    2. Java collections such as Maps and Lists, often due to a bug in removing items or due to a cache. One technique that the leak suspect report uses, and which can also be run manually under Leak Identification > Big Drops in Dominator Tree, is to look for a large difference between the retained heap of an object and the retained heap of its largest retained reference. For example, imagine a HashMap that retains 1GB where the leak is caused by a bug in removing objects, so objects continue to be added to the HashMap. It is common in such a case for every individual object to be small, so no single child accounts for much of the 1GB that the HashMap retains.
  5. Proactive:
    1. Use a monitoring tool to track heap usage after full GC; alert and gather dumps if heap usage is above 70%.
    2. If using WAS traditional, consider the Memory Leak and Excessive Memory Usage health conditions.
    3. If using Java ODR, configure Memory Overload Protection and put a server into maintenance mode to investigate.
    4. If using WAS traditional, consider Application ClassLoader Leak Detection.
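
The numeric checks in the steps above (a positive slope after global collections, a relative growth of more than 10% between dumps, and heap usage above 70%) can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `HeapTrendCheck` and its methods are hypothetical names and not part of GCMV, MAT, or WAS, and the samples are assumed to have been extracted already from verbosegc or a monitoring tool as "used heap after global collection" values. The numbers in `main` are made up for demonstration.

```java
/**
 * Hypothetical helper, not part of GCMV, MAT, or WAS.
 * Inputs are assumed to be used-heap values (in bytes) sampled
 * immediately after global (full) collections.
 */
class HeapTrendCheck {

    /** Least-squares slope of used heap over time, in bytes per second. */
    static double slope(double[] timeSeconds, long[] usedAfterGlobalGc) {
        int n = timeSeconds.length;
        if (n < 2 || n != usedAfterGlobalGc.length) {
            throw new IllegalArgumentException("need two or more paired samples");
        }
        double meanT = 0, meanU = 0;
        for (int i = 0; i < n; i++) {
            meanT += timeSeconds[i];
            meanU += usedAfterGlobalGc[i];
        }
        meanT /= n;
        meanU /= n;
        double num = 0, den = 0;
        for (int i = 0; i < n; i++) {
            double dt = timeSeconds[i] - meanT;
            num += dt * (usedAfterGlobalGc[i] - meanU);
            den += dt * dt;
        }
        if (den == 0) {
            throw new IllegalArgumentException("samples must differ in time");
        }
        return num / den;
    }

    /** Growth from the first to the last sample, as a fraction of the first. */
    static double relativeGrowth(long[] usedAfterGlobalGc) {
        long first = usedAfterGlobalGc[0];
        long last = usedAfterGlobalGc[usedAfterGlobalGc.length - 1];
        return (double) (last - first) / first;
    }

    /** Used heap after a global collection as a fraction of the maximum heap. */
    static double utilization(long usedAfterGlobalGc, long maxHeapBytes) {
        return (double) usedAfterGlobalGc / maxHeapBytes;
    }

    public static void main(String[] args) {
        // Illustrative, made-up samples: one per hour, in bytes.
        double[] t = {0, 3600, 7200, 10800};
        long[] used = {400L << 20, 420L << 20, 445L << 20, 470L << 20};
        long maxHeap = 1024L << 20; // assumed maximum heap of 1GB

        System.out.printf("slope: %.1f bytes/s%n", slope(t, used));
        // Threshold from the text: second dump after more than 10% relative growth.
        System.out.println("take second dump: " + (relativeGrowth(used) > 0.10));
        // Threshold from the text: alert when usage after full GC is above 70%.
        System.out.println("alert: " + (utilization(used[used.length - 1], maxHeap) > 0.70));
    }
}
```

A positive slope on its own is not proof of a leak; as noted above, weigh the growth against the heap size and against caches that are still being populated before acting on it.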