IBM Maximo

MBO PhantomReferences

Maximo may use PhantomReferences to clean up MBO-related objects that are indirectly created by application database requests. PhantomReferences are similar to finalizers in that they allow cleanup code to run for an object that is about to be garbage collected. However, the Java API specification for PhantomReference states that PhantomReference processing is non-deterministic:

If the garbage collector determines at a certain point in time that the referent of a phantom reference is phantom reachable, then at that time or at some later time it will enqueue the reference.

Therefore, PhantomReferences may be generated faster than they can be marked, queued, and cleared, in which case the PhantomReferences themselves build up and put pressure on the Java heap and on garbage collection. There are no IBM Java tuning options to control the aggressiveness of PhantomReference marking, queuing, and clearing; however, since OpenJ9 0.35 (IBM Java 8.0.7.20), phantom reference processing is more aggressive.
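The mechanism described above can be sketched with the standard java.lang.ref API. This is a minimal illustration under stated assumptions, not the Maximo implementation: the class PhantomTracker and its methods track, drain, and backlog are hypothetical names. It shows why a backlog can form: the tracker must hold strong references to the PhantomReference objects until the garbage collector enqueues them, and the timing of that enqueue is up to the collector.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; this is not the Maximo code.
final class PhantomTracker {
    // The garbage collector appends a reference here at or after the time
    // it determines that the referent is phantom reachable.
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    // Strong references to the PhantomReference objects themselves.
    // Every entry stays on the heap until it has been enqueued and drained.
    private final Set<Reference<?>> tracked = new HashSet<>();

    PhantomReference<Object> track(Object referent) {
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);
        tracked.add(ref);
        return ref;
    }

    // Removes every reference enqueued so far and returns how many were removed.
    int drain() {
        int drained = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            tracked.remove(ref);
            drained++;
        }
        return drained;
    }

    // Number of phantom references still waiting to be enqueued and drained.
    int backlog() {
        return tracked.size();
    }

    public static void main(String[] args) throws InterruptedException {
        PhantomTracker tracker = new PhantomTracker();
        for (int i = 0; i < 100_000; i++) {
            tracker.track(new byte[64]); // referent is unreachable right away
        }
        System.out.println("backlog before GC: " + tracker.backlog());
        System.gc();       // only a hint; enqueueing may happen now or later
        Thread.sleep(500); // give reference processing some time
        System.out.println("drained: " + tracker.drain()
                + ", backlog: " + tracker.backlog());
    }
}
```

The numbers printed by main vary from run to run and from JVM to JVM, which is the point: if track is called faster than the collector enqueues, backlog keeps growing.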

If you are experiencing long GC times due to this issue, here are some ideas:

  1. Review whether the application activity is expected. For example, are there excessively large or unbounded database queries that create large or complex MBO object graphs, which in turn indirectly create many PhantomReferences?
  2. Run a test with -Xgc:concurrentSlack=macrofrag to see if it helps
  3. Run a test with -Xgc:concurrentSlack=macrofrag -Xgc:concurrentSlackFragmentationAdjustmentWeight=50 to see if it helps
  4. Horizontally scale to more nodes+JVMs to distribute the PhantomReference processing
  5. Test reducing -Xmx to induce cleaning up PhantomReferences more often so that the worst case pause time of cleaning up a lot of queued PhantomReferences is not too high
  6. If the above steps do not help or are not feasible in the short term, then the customer could periodically restart JVMs to clean up the PhantomReferences
  7. Test increasing -Xmx (if there is available RAM) if the JVM needs to run for longer before restarting
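When running the tests above, it can help to confirm that the JVM under test actually picked up the intended options. The following is a minimal sketch using the standard java.lang.management API; the class name GcOptionCheck and the helper matching are hypothetical, and the exact set of arguments reported can differ between JVMs.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for checking which options a test JVM was started with.
final class GcOptionCheck {
    // Returns the JVM input arguments that start with the given prefix.
    static List<String> matching(List<String> jvmArgs, String prefix) {
        List<String> out = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                out.add(arg);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("-Xgc options: " + matching(jvmArgs, "-Xgc:"));
        System.out.println("-Xmx options: " + matching(jvmArgs, "-Xmx"));
        // Effective maximum heap as seen by the running JVM, in bytes.
        System.out.println("max heap bytes: " + Runtime.getRuntime().maxMemory());
    }
}
```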

Notes for identifying this issue:

  1. Verbosegc will show spikes in the PhantomReference count some time before the GC pause time spikes, and large numbers of PhantomReferences cleared during those GC spikes
  2. In a core dump, the class histogram will show large retained sets for java.util.Hashtable. Running merge shortest paths to GC roots on these objects and excluding weak references will show a lot of memory through phantom references in the phantomList static object in the class psdi.mbo.Mbo. This phantomList may have a lot of objects.
  3. In a core dump, the static field Mbo.phantomList, for example, holds strong references to phantom references, but the actual referents are not strongly reachable. Each such phantom reference should therefore be put onto the Mbo.phantomQueue (which then drives the Maximo code to remove the phantom reference from the phantomList), but Java may be slow to mark the referents as only phantom reachable, as evidenced by the reference queue being empty. This may drive the accumulation of phantom references and referents in the heap, which in turn drives high GC pause times. As an example, if we look at one phantom reference, psdi.mbo.MboCounter, its referent may be some object, but checking whether that referent is strongly referenced shows that it is not strongly reachable (disregarding WeakReference paths).
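For the verbosegc check in the first note, the phantom reference counts can be pulled out of the log with a small scanner. The sketch below assumes the OpenJ9 verbosegc XML reports phantom references on lines of the form <references type="phantom" candidates="..." cleared="..." ... />; verify this against your own log before relying on it. The class name PhantomVerboseGcScan is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical scanner; the line format matched here is an assumption.
final class PhantomVerboseGcScan {
    private static final Pattern PHANTOM = Pattern.compile(
            "<references type=\"phantom\" candidates=\"(\\d+)\" cleared=\"(\\d+)\"");

    // Returns one {candidates, cleared} pair per matching line, in log order.
    static List<long[]> scan(List<String> lines) {
        List<long[]> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = PHANTOM.matcher(line);
            if (m.find()) {
                out.add(new long[] {
                        Long.parseLong(m.group(1)),
                        Long.parseLong(m.group(2)) });
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a verbosegc log file.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (long[] row : scan(lines)) {
            System.out.println("candidates=" + row[0] + " cleared=" + row[1]);
        }
    }
}
```

A steadily rising candidates value followed by a cycle with a very large cleared value would match the pattern described in the first note.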

Ideally, PhantomReference usage should be minimized or eliminated.