Sizing OpenJ9 Native Memory
If running in a memory-constrained environment, review the following diagnostic and sizing guidance for OpenJ9 native memory.
It's not uncommon for a heavy application to use 500MB or more of native memory outside the Java heap. This footprint can often be reduced as discussed below, though that may come at a performance cost.
Diagnostics
- On Linux, consider limiting malloc arenas with the environment variable `MALLOC_ARENA_MAX=1` and restart.
- If using IBM Java 8 and there's an opportunity to restart the JVM, restart with the following option for additional "Standard Class Libraries" native memory accounting in javacores (minimal performance overhead): `-Dcom.ibm.dbgmalloc=true`
- Gather operating system statistics on resident process memory usage. A single snapshot at peak workload is an okay start, but periodic snapshots over time using a script provide a better picture (a combined collection sketch follows this list).
  - Linux examples:
    - With `/proc`:

      ```
      $ PID=...
      $ grep VmRSS /proc/$PID/status
      VmRSS:    201936 kB
      ```

    - With `ps` (in KB):

      ```
      $ PID=...
      $ ps -p $PID -o rss
        RSS
      201936
      ```

    - With `top`, review the `RES` column in KB (change `-d` for the interval between outputs in seconds and `-n` for the number of intervals):

      ```
      $ PID=...
      $ top -b -d 1 -n 1 -p $PID
      top - 19:10:46 up  5:43,  0 users,  load average: 0.01, 0.05, 0.02
      Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
      %Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
      MiB Mem :  15950.9 total,  13429.6 free,    473.5 used,   2047.8 buff/cache
      MiB Swap:      0.0 total,      0.0 free,      0.0 used.  15239.3 avail Mem

          PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
            1 default   20   0 7832760 201936  55304 S   6.2   1.2   0:06.86 java
      ```
- Gather javacores of the process. A single javacore at peak workload is an okay start, but periodic javacores over time using a script provide a better picture.
  - Linux example:
    - Use `kill -QUIT` (assuming default settings that only produce a javacore):

      ```
      $ PID=...
      $ kill -QUIT $PID
      ```
- Gather detailed per-process memory mappings. A single snapshot at peak workload is an okay start, but periodic snapshots over time using a script provide a better picture.
  - Linux example:

    ```
    $ PID=...
    $ cat /proc/$PID/smaps
    ```
- If possible, ensure verbose garbage collection is enabled; for example: `-Xverbosegclog:verbosegc.%seq.log,20,50000`
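Several of the items above suggest collecting snapshots periodically with a script. The following is a minimal sketch, assuming a Linux host and default `-Xdump` settings; the script name, interval, snapshot count, and output file names are illustrative and should be adapted:

```sh
#!/bin/sh
# Periodic native-memory snapshot sketch (illustrative; adjust INTERVAL and COUNT).
# Usage: ./collect.sh <java-pid>
PID=${1:?usage: $0 <java-pid>}
INTERVAL=60   # seconds between snapshots (assumption)
COUNT=60      # number of snapshots (assumption)

i=0
while [ "$i" -lt "$COUNT" ]; do
  TS=$(date +%Y%m%d_%H%M%S)
  grep VmRSS /proc/$PID/status > "rss_$TS.txt"   # resident set size
  cat /proc/$PID/smaps > "smaps_$TS.txt"         # detailed per-process memory mappings
  kill -QUIT "$PID"                              # javacore, assuming default -Xdump settings
  sleep "$INTERVAL"
  i=$((i + 1))
done
```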
Review and Sizing
- Review the `NATIVEMEMINFO` section in the javacores. Note that these are virtual memory allocations and not necessarily resident. Review all of them with a particular focus on:
  - `Java Heap`: The native allocation(s) for the Java heap itself (`-Xmx`/`-XX:MaxRAMPercentage`). Even if `-Xms` (or `-XX:InitialRAMPercentage`) is less than `-Xmx`/`-XX:MaxRAMPercentage` and heap usage is less than `-Xmx`/`-XX:MaxRAMPercentage`, you should always assume the entire `-Xmx`/`-XX:MaxRAMPercentage` native memory will be touched (and thus resident) at some point: even if the application workload never reaches that amount of live Java heap usage, most modern garbage collectors are generational, which almost always means trash accumulates in the tenured region until a full GC, so most or all of the Java heap is likely to become resident given enough time.
  - `Classes`: The native backing of classes loaded in the Java heap. If this is excessively large, gather a system dump of the process and check for classloader memory leaks with the Eclipse Memory Analyzer Tool.
  - `Threads`: Each thread has two stacks, both of which live in native memory. In some cases, very large stacks and/or a customized maximum stack size (`-Xss`) can inflate this number, but more often a large value here simply reflects a large number of threads that can be reduced, or may be due to a thread leak. Review threads and thread stacks in the javacore and consider reducing thread pool maximums. To investigate a thread leak, gather a system dump of the process and review it with the Eclipse Memory Analyzer Tool.
  - `JIT`: Some growth here is expected over time, but it should level out at the maximums specified by `-Xcodecachetotal`, `-Xjit:dataTotal`, and `-Xjit:scratchSpaceLimit` (see below). Note that defaults in recent versions are relatively large (upwards of 550MB or more at peak, primarily driven by the code cache and spikes in JIT compilation).
  - `Direct Byte Buffers`: Native memory allocations driven by Java code, which may have different drivers. To investigate what's holding on to DirectByteBuffers, gather a system dump of the process, review it with the Eclipse Memory Analyzer Tool, and run the query IBM Extensions } Java SE Runtime } DirectByteBuffers.
  - `Unused <32bit allocation regions`: Available space within the `-Xmcrs` value (for compressed references).
- Review the `1STSEGMENT` lines in the javacores using `get_memory_use.pl` to break down the resident amounts behind some of the above virtual amounts.
- Review verbose garbage collection for a lot of phantom reference processing, which may be a symptom of spikes in DirectByteBuffers. Even if Direct Byte Buffer usage in `NATIVEMEMINFO` above is relatively low, there may have been a spike in DirectByteBuffer memory usage which, in general, will only be returned to libc free lists rather than going back to the operating system.
- Check the javacore and per-process memory mappings for non-standard native libraries (e.g. loaded with `-agentpath`, `-agentlib`, and `-Xrun`) and consider testing without each library.
- Consider tuning the following options (a combined, illustrative command line follows this list):
  - Maximum Java heap size: `-Xmx_m` or `-XX:MaxRAMPercentage=_`
  - Maximum size of threads and thread pools (e.g. `<executor coreThreads="_" />` for Liberty, or maximum thread pool sizes for WebSphere Application Server traditional)
  - Maximum JIT code cache: `-XX:codecachetotalMaxRAMPercentage=_` or `-Xcodecachetotal_m` (default 256MB)
  - Maximum JIT data size (in KB): `-Xjit:dataTotal=_`
  - JIT scratch space limit (in KB): `-Xjit:scratchSpaceLimit=_` (default 256MB)
  - Maximum shared class cache size (though it must be destroyed first to reduce an existing one): `-Xscmx_m`
  - Number of garbage collection helper threads: `-Xgcthreads_`
  - Number of JIT compilation threads: `-XcompilationThreads_`
  - Maximum stack size: `-Xss_k`
- Consider using a JITServer (a.k.a. IBM Semeru Cloud Compiler) to move most JIT compilation memory demands to another process.
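As a purely illustrative combination of the tuning knobs above (every value is a placeholder rather than a recommendation, each limit trades native memory for potential throughput, startup, or pause-time cost, and `app.jar` is a stand-in for your application), a constrained startup might look like:

```sh
# Illustrative only: verify each option against your Java version before use.
# -Xjit suboptions are in KB and are combined into one -Xjit: option because,
# in general, only the last -Xjit option on the command line takes effect.
MALLOC_ARENA_MAX=1 java \
  -Xmx512m \
  -Xss512k \
  -Xgcthreads4 \
  -XcompilationThreads2 \
  -Xcodecachetotal128m \
  -Xjit:dataTotal=65536,scratchSpaceLimit=131072 \
  -Xscmx60m \
  -Xverbosegclog:verbosegc.%seq.log,20,50000 \
  -jar app.jar
```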
Detailed Diagnostics
- If you suspect a leak, monitor unfreed native memory allocations:
  - Linux >= 4.1: eBPF `memleak.py`
  - IBM Java 8: LinuxNativeTracker
    - For IBM Semeru Runtimes, open a support case to ask for a custom build of LinuxNativeTracker.
- If unaccounted memory remains, gather all of the same information above as well as a system dump of the process, and cross-reference the per-process memory maps against known JVM virtual memory allocations to find what is unaccounted for.
  - Note that `NATIVEMEMINFO` may be dumped from a system dump using the `!nativememinfo` command in `jdmpview` (see the sketch after this list).
- Fragmentation in C libraries is also possible. Use a native debugger script (e.g. for Linux glibc) to walk the in-use and free lists and search for holes in memory.
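As a sketch of the cross-referencing step (the awk one-liner is an assumption about how you might tally the mappings, and the core file name will vary by dump configuration), the resident totals from `smaps` can be summed and compared against the JVM's own `NATIVEMEMINFO` view from the system dump:

```sh
# Sum resident memory across all mappings of the process (smaps values are in kB).
PID=...   # PID of the java process, as in the examples above
awk '/^Rss:/ { total += $2 } END { printf "Total RSS: %d kB\n", total }' /proc/$PID/smaps

# Open the system dump with jdmpview (shipped with OpenJ9 / IBM Semeru Runtimes) and run
# !nativememinfo at its interactive prompt; the core file name below is illustrative:
#   $ jdmpview -core core.dmp
#   > !nativememinfo
```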