J9 Health Center Enable at Runtime of Limited Duration on z/OS
The Health Center agent is shipped with IBM Java and IBM Semeru Runtimes on z/OS.
The following instructions are for z/OS (in this context, z/Linux is not considered z/OS). For non-z/OS platforms, see alternate instructions.
- By default, late attach is disabled. Restart the target process with
  the additional generic JVM argument:
      -Dcom.ibm.tools.attach.enable=yes
- Find the decimal PID of the target JVM. WebSphere traditional shows
  the PID in SYSPRINT in the BBOJ0051I message. In the following
  example, it is 16843066:
      Trace: 2024/01/04 16:18:54.968 02 t=7E5E78 c=UNK key=P8 tag= (13007004)
        SourceId: com.ibm.ws390.orb.CommonBridge
        ExtendedMessage: BBOJ0051I: PROCESS INFORMATION: STC00089/BBOS001S, ASID=76(0x4c), PID=16843066(0x101013a)
- Find the path to Java of the target JVM. WebSphere traditional shows
  this path in SYSPRINT in a BBOJ0077I message for java.home. In the
  following example, it is /WebSphere/ND/AppServer/java64:
      Trace: 2024/01/04 16:18:54.972 02 t=7E5E78 c=UNK key=P8 tag= (13007004)
        SourceId: com.ibm.ws390.orb.CommonBridge.printProperties
        ExtendedMessage: BBOJ0077I: java.home = /WebSphere/ND/AppServer/java64
- Find the owner of the started task of the target JVM. In the
  following example in D.DA, it is ASSR1:
      NP   JOBNAME  StepName ProcStep JobID    Owner    C Pos DP Real Paging   SIO
           BBOS001S BBOS001S BBOPASR  STC00089 ASSR1      IN  C9 93T    0.00 35.64
- Create the following JCL, replacing USER_REPLACEME with the owner of
  the started task, both instances of JAVAPATH_REPLACEME with the path
  to Java, PID_REPLACEME with the PID, and 30 with the number of minutes
  to run (see notes below on duration considerations). Ensure CAPS OFF
  when editing and that every line of the STDPARM excluding the last
  line ends with a space:
      //SHLLJOB1 JOB (ACCOUNT),NOTIFY=&SYSUID,REGION=0M,CLASS=A,MSGCLASS=H,
      // MSGLEVEL=(1,1),USER=USER_REPLACEME
      //SHLLSTEP EXEC PGM=BPXBATCH
      //BPXPRINT DD SYSOUT=*
      //STDOUT DD SYSOUT=*
      //STDERR DD SYSOUT=*
      //STDPARM DD *
      SH /JAVAPATH_REPLACEME/bin/java -jar
      /JAVAPATH_REPLACEME/lib/ext/healthcenter.jar
      ID=PID_REPLACEME level=headless
      -Dcom.ibm.java.diagnostics.healthcenter.headless.run.number.of.runs=1
      -Dcom.ibm.java.diagnostics.healthcenter.headless.run.duration=30
      /*
- Submit the job.
  - If you receive the error
    LOGON/JOB INITIATION - SUBMITTER IS NOT AUTHORIZED BY USER
    then consider allowing surrogate job submission. For example, to
    allow user MSTONE1 to submit jobs that run as ASSR1:
        RDEFINE SURROGAT ASSR1.SUBMIT UACC(NONE) OWNER(ASSR1)
        PERMIT ASSR1.SUBMIT CLASS(SURROGAT) ID(MSTONE1) ACCESS(READ)
        SETROPTS RACLIST(SURROGAT) REFRESH
- Confirm in D.ST in the SHLLJOB1 job that the output at the bottom
  looks similar to the following, specifically the
  Successfully enabled Health Center agent line:
      IEF033I JOB/SHLLJOB1/STOP 2024004.1259
        CPU: 0 HR 00 MIN 00.00 SEC SRB: 0 HR 00 MIN 00.00 SEC
      Successfully enabled Health Center agent in VM: 16843066
      Health Center properties used by agent in target VM:
      -- listing properties --
      com.ibm.java.diagnostics.healthcenter.agent.port=1972
      com.ibm.java.diagnostics.healthcenter.data.collection.level=HEADLESS
- In D.DA, confirm in the SYSOUT of the target JVM that Health Center
  has started. For example:
      [Thu Jan 4 17:59:09 2024] com.ibm.diagnostics.healthcenter.headless INFO: 4.0.7
      [Thu Jan 4 17:59:09 2024] com.ibm.diagnostics.healthcenter.headless INFO: Headless data collection has started
      [Thu Jan 4 17:59:09 2024] com.ibm.diagnostics.healthcenter.headless INFO: Each data collection run will last for 30 minutes
      [Thu Jan 4 17:59:09 2024] com.ibm.diagnostics.healthcenter.headless INFO: Agent will run for 1 collections
      [Thu Jan 4 17:59:09 2024] com.ibm.diagnostics.healthcenter.headless INFO: Agent will keep last 5 hcd files
      [Thu Jan 4 17:59:09 2024] com.ibm.diagnostics.healthcenter.headless INFO: Headless collection output directory is /SY1/var/WebSphere/home/WSSR1
- After the time has elapsed, refresh the JVM's SYSOUT to confirm that
  the HCD file was created. For example:
      [Thu Jan 4 18:29:09 2024] com.ibm.diagnostics.healthcenter.headless INFO: Creating hcd import file /SY1/var/WebSphere/home/WSSR1/healthcenter040124_175909_16843066_1.hcd
      [Thu Jan 4 18:29:09 2024] com.ibm.diagnostics.healthcenter.headless INFO: hcd import file /SY1/var/WebSphere/home/WSSR1/healthcenter040124_175909_16843066_1.hcd created
- FTP the HCD file(s) in BIN mode.
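The lookup and substitution steps above can be sketched in a small script. This is only an illustration, not part of Health Center or WebSphere: the function names find_pid, find_java_home, and render_jcl are hypothetical, the MINUTES_REPLACEME placeholder is introduced here in place of the literal 30, and the line breaks chosen inside STDPARM are an assumption. The sample SYSPRINT text is copied from the examples above.

```python
import re

# SYSPRINT messages copied from the examples in the steps above.
SYSPRINT = (
    "Trace: 2024/01/04 16:18:54.968 02 t=7E5E78 c=UNK key=P8 tag= (13007004) "
    "SourceId: com.ibm.ws390.orb.CommonBridge ExtendedMessage: BBOJ0051I: "
    "PROCESS INFORMATION: STC00089/BBOS001S, ASID=76(0x4c), "
    "PID=16843066(0x101013a)\n"
    "Trace: 2024/01/04 16:18:54.972 02 t=7E5E78 c=UNK key=P8 tag= (13007004) "
    "SourceId: com.ibm.ws390.orb.CommonBridge.printProperties "
    "ExtendedMessage: BBOJ0077I: java.home = /WebSphere/ND/AppServer/java64\n"
)

JCL_HEADER = [
    "//SHLLJOB1 JOB (ACCOUNT),NOTIFY=&SYSUID,REGION=0M,CLASS=A,MSGCLASS=H,",
    "// MSGLEVEL=(1,1),USER=USER_REPLACEME",
    "//SHLLSTEP EXEC PGM=BPXBATCH",
    "//BPXPRINT DD SYSOUT=*",
    "//STDOUT DD SYSOUT=*",
    "//STDERR DD SYSOUT=*",
    "//STDPARM DD *",
]

# Assumption: where STDPARM is split into lines is a layout choice of this sketch.
STDPARM = [
    "SH /JAVAPATH_REPLACEME/bin/java -jar",
    "/JAVAPATH_REPLACEME/lib/ext/healthcenter.jar",
    "ID=PID_REPLACEME level=headless",
    "-Dcom.ibm.java.diagnostics.healthcenter.headless.run.number.of.runs=1",
    "-Dcom.ibm.java.diagnostics.healthcenter.headless.run.duration=MINUTES_REPLACEME",
]


def find_pid(sysprint):
    """Return the decimal PID from the BBOJ0051I message, or None."""
    match = re.search(r"BBOJ0051I:.*\bPID=(\d+)", sysprint)
    return match.group(1) if match else None


def find_java_home(sysprint):
    """Return the java.home path from the BBOJ0077I message, or None."""
    match = re.search(r"BBOJ0077I: java\.home = (\S+)", sysprint)
    return match.group(1) if match else None


def render_jcl(owner, java_home, pid, minutes):
    """Fill in the placeholders and return the JCL as one string."""
    values = {
        "USER_REPLACEME": owner,
        # The template already has a slash on each side of the placeholder,
        # so surrounding slashes are dropped to avoid doubling them.
        "JAVAPATH_REPLACEME": java_home.strip("/"),
        "PID_REPLACEME": str(pid),
        "MINUTES_REPLACEME": str(minutes),
    }

    def fill(line):
        for placeholder, value in values.items():
            line = line.replace(placeholder, value)
        return line

    header = [fill(line) for line in JCL_HEADER]
    parms = [fill(line) for line in STDPARM]
    # Every STDPARM line except the last must end with a space.
    parms = [line + " " for line in parms[:-1]] + parms[-1:]
    return "\n".join(header + parms + ["/*"]) + "\n"


if __name__ == "__main__":
    print(render_jcl("ASSR1", find_java_home(SYSPRINT), find_pid(SYSPRINT), 30))
```

Generating the job this way makes the trailing-space rule mechanical rather than something to check by eye in the editor. The owner (ASSR1 here) still has to be read from D.DA by hand.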
Warnings and notes:
- Every time a new HCD collection starts, the agent begins looking up
  method address to method name mappings for all loaded methods. By
  default, the agent queries up to 3,000 unresolved method name mappings
  every 5 seconds. Therefore, for profiled method names to be resolved
  properly, choose the minimum duration per HCD based on the number of
  loaded methods. It is common for an enterprise application to load
  hundreds of thousands of methods, so a minimum of 5-10 minutes is a
  good start. To trace this behavior and the number of methods, add
      -Dcom.ibm.diagnostics.healthcenter.logging.methodlookup=debug -Dcom.ibm.diagnostics.healthcenter.logging.MethodLookupProvider=debug
  and search for
      com.ibm.diagnostics.healthcenter.methodlookup.debug DEBUG: N methods to lookup
  Methods created during the HCD interval are captured separately.
- If the JVM could not be stopped gracefully, gather the temporary
  files from a subdirectory of the output directory called
      tmp_${STARTDAY}${STARTMONTH}${STARTYEAR}_${STARTHOUR}${STARTMINUTES}${STARTSECONDS}_
- Use the additional JVM option
      -Dcom.ibm.java.diagnostics.healthcenter.headless.output.directory=$DIR
  to redirect Health Center files to a different directory instead of
  the working directory.
  - Note that this does not work with Liberty if some jndi-1.0-related
    features are loaded; there is a request for enhancement for this.
- If using Liberty and you specify -Xtrace:buffers={2m,dynamic} to
  minimize Health Center method metadata loss, note that Liberty
  defaults to an unlimited maximum thread pool designed to maximize
  throughput, and trace buffers are allocated per thread. To avoid
  native memory exhaustion, consider capping the thread pool with
  <executor maxThreads="N" /> based on available native memory, or use a
  smaller -Xtrace buffer size such as -Xtrace:buffers={128k,dynamic}
  (or lower).
- We have observed that some monitoring agents cause problems with
  Health Center. Consider removing other monitoring agents that use
  -agentpath while using Health Center, engage IBM and the agent
  company support teams to investigate, or use -Xbootclasspath/p to
  healthcenter.jar and -agentpath to libhealthcenter.so.
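As a rough worked example of the duration note above, the following sketch estimates how long the initial method name lookup takes at the default rate of 3,000 mappings every 5 seconds. The function name min_lookup_seconds is hypothetical, and the model assumes the lookup rate is the only limiting factor, so treat the result as a lower bound rather than a measured value.

```python
import math


def min_lookup_seconds(loaded_methods, batch_size=3000, interval_seconds=5):
    """Estimate the seconds needed to look up all loaded methods.

    Assumes one batch of up to batch_size lookups per interval, matching
    the defaults described in the note above.
    """
    batches = math.ceil(loaded_methods / batch_size)
    return batches * interval_seconds


if __name__ == "__main__":
    for methods in (100_000, 300_000, 500_000):
        seconds = min_lookup_seconds(methods)
        print(f"{methods} methods: about {seconds / 60:.1f} minutes")
```

For example, 300,000 loaded methods need 100 batches, or 500 seconds (a little over 8 minutes), which is consistent with the suggested 5-10 minute minimum. The N reported by the method lookup debug trace can be used as the input.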
For details, see the Health Center chapter.