Linux perf Recipe
- Install perf if it's not installed.
- Prepare the Java process:
  - For an IBM Java or IBM Semeru Runtimes JVM:
    - For IBM Java >= 8.0.7.20 or IBM Semeru Runtimes >= v8.0.352 / 11.0.17.0 / 17.0.5.0, restart the Java process with -XX:+PerfTool
    - For older versions of IBM Java or IBM Semeru Runtimes, restart the Java process with -Xjit:perfTool while making sure to combine with commas with any pre-existing -Xjit options
  - For a HotSpot JVM:
    - Restart the Java process with -XX:+UnlockDiagnosticVMOptions -XX:+PreserveFramePointer -XX:+ShowHiddenFrames
    - For a HotSpot JVM >= Java 16, also add the option -XX:+DumpPerfMapAtExit
    - For a HotSpot JVM < Java 16:
      - Also add the option -XX:-UseCodeCacheFlushing
      - Compile perf-map-agent into some directory
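For older IBM Java or IBM Semeru Runtimes versions, the step that is easiest to get wrong is merging perfTool into an -Xjit option that is already present. The following is a minimal sketch of that merge, not part of the original recipe. The function name add_perftool is hypothetical, and it assumes the JVM arguments are available as one space-separated string with no embedded spaces or glob characters.

```shell
#!/bin/sh
# Hypothetical helper: return the JVM arguments with perfTool enabled.
# If an -Xjit option already exists, perfTool is appended with a comma;
# otherwise a new -Xjit:perfTool option is added at the end.
add_perftool() {
  # $1: current JVM arguments as a single space-separated string
  out=""
  found=0
  for arg in $1; do
    case "$arg" in
      -Xjit:*perfTool*)
        # perfTool is already part of this -Xjit option; leave it alone
        found=1
        ;;
      -Xjit:*)
        # Combine with the pre-existing -Xjit options using a comma
        arg="$arg,perfTool"
        found=1
        ;;
    esac
    out="${out:+$out }$arg"
  done
  if [ "$found" -eq 0 ]; then
    out="${out:+$out }-Xjit:perfTool"
  fi
  printf '%s\n' "$out"
}

# Example: an existing -Xjit:verbose option becomes -Xjit:verbose,perfTool
add_perftool "-Xmx2g -Xjit:verbose"
```

The helper only rewrites a string; where the JVM arguments actually live (a start script, jvm.options, and so on) depends on your environment.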
- Run all of the following commands as root
- During the performance problem, run one of the following commands. Change 60 to the number of seconds you want to gather data for:
  - For IBM Java and IBM Semeru Runtimes running on top of an Intel processor that is Haswell or later (see cat /proc/cpuinfo and reference Intel Processor names), use the following, although note that LBR has a limited stack depth, so use the next option if you need longer stacks:
      date +'%Y-%m-%d %H:%M:%S.%N %Z' &>> diag_starttimes_$(hostname).log; cat /proc/uptime &>> diag_starttimes_$(hostname).log; perf record --call-graph lbr -F 99 -a -g -- sleep 60
  - For IBM Java and IBM Semeru Runtimes running on any other processor or if you're not sure what the processor is:
      date +'%Y-%m-%d %H:%M:%S.%N %Z' &>> diag_starttimes_$(hostname).log; cat /proc/uptime &>> diag_starttimes_$(hostname).log; perf record --call-graph dwarf,65528 -F 99 -a -g -- sleep 60
  - For a HotSpot JVM:
      date +'%Y-%m-%d %H:%M:%S.%N %Z' &>> diag_starttimes_$(hostname).log; cat /proc/uptime &>> diag_starttimes_$(hostname).log; perf record --call-graph fp -F 99 -a -g -- sleep 60
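The three perf record variants above differ only in the --call-graph mode. As an illustration, the choice can be written as a small decision function. This is a sketch under assumptions: the function names and the "ibm" / "hotspot" / "yes" arguments are hypothetical conventions, and the script only prints the command so that you can review it before running it as root.

```shell
#!/bin/sh
# Hypothetical helper: pick the --call-graph mode described in the recipe.
call_graph_mode() {
  # $1: JVM kind, "ibm" (IBM Java / Semeru) or "hotspot"
  # $2: "yes" if the CPU is an Intel processor that is Haswell or later
  case "$1" in
    hotspot)
      echo "fp"
      ;;
    ibm)
      if [ "$2" = "yes" ]; then
        echo "lbr"            # limited stack depth
      else
        echo "dwarf,65528"    # any other or unknown processor
      fi
      ;;
    *)
      echo "unknown JVM kind: $1" >&2
      return 1
      ;;
  esac
}

# Hypothetical helper: print (not run) the matching perf record command.
perf_record_cmd() {
  # $3: number of seconds to gather data for (defaults to 60)
  mode=$(call_graph_mode "$1" "$2") || return 1
  echo "perf record --call-graph $mode -F 99 -a -g -- sleep ${3:-60}"
}

# Example: IBM JVM on a processor you are not sure about, 60 seconds
perf_record_cmd ibm no 60
```

If you need longer stacks than LBR provides on an IBM JVM, pass anything other than "yes" as the second argument to get the dwarf variant.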
- Wait for the above command to complete
- If running a HotSpot JVM:
  - For a HotSpot JVM >= Java 16, stop the JVM gracefully to produce the /tmp/perf-$PID.map file. If you can't do this, then, instead, follow the instructions above for a HotSpot JVM < Java 16 using perf-map-agent
  - For a HotSpot JVM < Java 16, run bin/create-java-perf-map.sh $PID from the perf-map-agent directory to create the /tmp/perf-$PID.map file
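For HotSpot, JIT-compiled frames can only be resolved if the /tmp/perf-$PID.map file exists, so it may be worth checking for it before moving on. The following is a minimal sketch of such a check. The function name check_perf_map and its optional second argument (an alternative directory, added only to make the function easy to try out) are assumptions, not part of the recipe.

```shell
#!/bin/sh
# Hypothetical helper: verify that the perf map file for a JVM exists
# and is not empty.
check_perf_map() {
  # $1: JVM process ID
  # $2: directory holding the map file (defaults to /tmp)
  map="${2:-/tmp}/perf-$1.map"
  if [ -s "$map" ]; then
    echo "found $map"
  else
    echo "missing or empty: $map" >&2
    return 1
  fi
}

# Usage (replace 4242 with the real process ID):
#   check_perf_map 4242 || echo "create the map file before continuing"
```

A non-zero return means the map file still needs to be produced, either by stopping the JVM gracefully (Java 16 and later with -XX:+DumpPerfMapAtExit) or with perf-map-agent.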
- Run the following command from the directory where perf record was run:
    perf script > diag_perfscript_$(hostname)_$(date +%Y%m%d_%H%M%S_%N).txt
- Gather a thread dump. This is very low overhead, with the process generally pausing for about 10ms to 100ms.
    kill -3 $PID
- Optionally, for IBM Java and IBM Semeru Runtimes processes, gather an operating system core dump of the process if the security, disk and performance risks are acceptable (the process may pause for up to 30 seconds or more) and the process and operating system are configured for it (e.g. core and file ulimits, kernel.core_pattern truncation settings, etc.) using one of various mechanisms, and then run jextract (IBM Java) or jpackcore (Semeru) on it; for example:
    $JDK/bin/jpackcore core*.dmp
- Run the following command from the directory where perf record was run; replace $LOGS_DIR with the location of the logs of the JVM (including stdout/stderr, verbosegc, server/application logs, etc.), $THREAD_DUMPS_DIR with the location where thread dumps were produced, and $OS_CORE_DUMPS_DIR with the location of the core dump, if one was produced:
    tar czvf diag_perf_$(hostname)_$(date +%Y%m%d_%H%M%S).tar.gz perf.data* diag_perfscript* /proc/kallsyms /boot/System.map-$(uname -r) /tmp/perf*map $LOGS_DIR $THREAD_DUMPS_DIR/javacore*.txt $OS_CORE_DUMPS_DIR/core*.dmp.zip
- Upload diag_perf_*.tar.gz
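Not every path in the tar command exists on every system (for example, there is no core*.dmp.zip if no core dump was taken, and no /tmp/perf*map for an IBM JVM), and tar will typically report an error for paths it cannot find. One way to avoid that is to filter the list first. This is a sketch only: the function name existing_paths is hypothetical, and it assumes none of the paths contain spaces.

```shell
#!/bin/sh
# Hypothetical helper: print only those arguments that exist on disk,
# one per line. Glob patterns that match nothing stay literal in sh,
# fail the existence test, and are dropped.
existing_paths() {
  for p in "$@"; do
    if [ -e "$p" ]; then
      printf '%s\n' "$p"
    fi
  done
  return 0
}

# Usage, with the same variables as in the recipe:
#   tar czvf diag_perf_$(hostname)_$(date +%Y%m%d_%H%M%S).tar.gz \
#     $(existing_paths perf.data* diag_perfscript* /proc/kallsyms \
#       /boot/System.map-$(uname -r) /tmp/perf*map $LOGS_DIR \
#       $THREAD_DUMPS_DIR/javacore*.txt $OS_CORE_DUMPS_DIR/core*.dmp.zip)
```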
If you want to do basic analysis of the perf output
yourself:
- Top 10 CPU-using stack frames:
    cat diag_perfscript*txt | awk 'go { go=0; print; } /cpu-clock:/ || /cycles:/ { go=1; }' | sort | uniq -c | sort -nr | head
- Create FlameGraphs:
    git clone https://github.com/brendangregg/FlameGraph
    cd FlameGraph
    cat diag_perfscript*txt | ./stackcollapse-perf.pl > out.perf-folded
    ./flamegraph.pl --width 1024 out.perf-folded > perf.svg
    ./flamegraph.pl --reverse --width 1024 out.perf-folded > perf-reverse.svg
  - Open perf.svg and perf-reverse.svg in your browser
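To see what the "Top 10" pipeline does: the awk program prints the line that follows each sample header (a line containing cpu-clock: or cycles:), which is the top frame of that sample's stack, and the rest of the pipeline counts identical frames. The sketch below wraps the same pipeline in a function and feeds it a tiny sample. The sample is made up and simplified for illustration, not real perf script output, and the function names are hypothetical.

```shell
#!/bin/sh
# Same pipeline as in the recipe, reading perf script output from stdin.
top_frames() {
  awk 'go { go=0; print; } /cpu-clock:/ || /cycles:/ { go=1; }' \
    | sort | uniq -c | sort -nr | head
}

# Made-up, simplified sample: three samples, two with the same top frame.
sample_perf_script() {
  cat <<'EOF'
java 4242 [000] 100.000001: 10101010 cpu-clock:
        7f0000001000 hotMethod (/tmp/perf-4242.map)
        7f0000002000 caller (/tmp/perf-4242.map)

java 4242 [001] 100.010001: 10101010 cpu-clock:
        7f0000001000 hotMethod (/tmp/perf-4242.map)
        7f0000002000 caller (/tmp/perf-4242.map)

java 4242 [000] 100.020001: 10101010 cpu-clock:
        7f0000003000 coldMethod (/tmp/perf-4242.map)
EOF
}

# hotMethod is listed first with a count of 2, then coldMethod with 1.
sample_perf_script | top_frames
```

Because only the first frame after each header is counted, this view shows where the CPU was executing, not which callers led there; the FlameGraphs cover the full stacks.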
Notes:
- For IBM Java and IBM Semeru Runtimes, if not all symbols are
resolved, try again with the additional option
-Xlp:codecache:pagesize=4k
For background, see Linux perf.