IBM i
IBM i product documentation: http://www-01.ibm.com/support/knowledgecenter/ssw_ibm_i/welcome
IBM Java on IBM i runs in PASE mode, so most of its behavior is the same as on AIX: http://www-01.ibm.com/support/knowledgecenter/ssw_ibm_i_72/rzalf/rzalfwhatispase.htm?lang=en
IBM i Recipe
- CPU core(s) should not be consistently saturated.
- Generally, physical memory should never be saturated and the operating system should not page memory out to disk.
- Input/Output interfaces such as network cards and disks should not be saturated, and should not have poor response times.
- TCP/IP and network tuning, whilst sometimes complicated to investigate, may have dramatic effects on performance.
- Operating system level statistics and optionally process level statistics should be periodically monitored and saved for historical analysis.
- Review operating system logs for any errors, warnings, or high volumes of messages.
- Review snapshots of process activity, and for the largest users of resources, review per thread activity.
- If the operating system is running in a virtualized guest, review the configuration and whether or not resource allotments are changing dynamically.
- Enable Collection Services for performance data.
- If there is sufficient network capacity for the additional packets, consider reducing the default TCP keepalive timer (CHGTCPA TCPKEEPALV) from 2 hours to a value less than intermediate device idle timeouts (e.g. firewalls).
- Test disabling delayed ACKs.
Also review the general topics in the Operating Systems chapter.
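The keepalive recommendation above applies system-wide through CHGTCPA TCPKEEPALV; an individual Java application can additionally request keepalive probes on a per-socket basis. A minimal, self-contained sketch (the loopback server exists only to make the example runnable; the probe interval itself is still governed by the operating system setting):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveExample {
    // Opens a loopback connection and enables TCP keepalive on it.
    // Returns the effective keepalive flag as reported by the socket.
    static boolean enableKeepAlive() throws IOException {
        try (ServerSocket server = new ServerSocket(0); // any free port
             Socket client = new Socket("localhost", server.getLocalPort())) {
            // Ask the OS to send keepalive probes on this connection.
            // On IBM i, the probe timer comes from CHGTCPA TCPKEEPALV.
            client.setKeepAlive(true);
            return client.getKeepAlive();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("keepAlive=" + enableKeepAlive());
    }
}
```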
Central Processing Unit (CPU)
IBM Systems Workload Estimator and processStats: https://www-912.ibm.com/estimator
Use Collection Services performance data to gather detailed performance information: http://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.multiplatform.doc/ae/tprf_collectionservices.html
IBM iDoctor for IBM i PEX Analyzer
- Simplifies the collection and enhances the analysis of all types of PEX data, including PROFILE, STATS, and TRACE data.
- Provides the details necessary for the low-level analysis of processor utilization, DASD operations, file space usage, waits, file opens and much more.
IBM iDoctor for IBM i Job Watcher
- Provides real-time, non-intrusive, detailed and summarized views of job/thread/task performance data.
- It's the first step to take to avoid a system-wide trace, or to ensure that a trace will yield useful data.
- It's a super WRKSYSACT that displays both "running" and "waiting" components for a job.
More information on CPU and call stacks can be found using the iDoctor tools including Performance Explorer and Job Watcher: https://www-912.ibm.com/i_dir/idoctor.nsf and http://www-01.ibm.com/support/docview.wss?uid=nas8N1012932
OS CPU Profiling
Profiling the CPU on IBM i can be done on a global or individual job (JVM) basis using the Performance Explorer (PEX) tool. The process to gather the data is as follows:
1. Add the PEX definition with the *PMCO events (http://www-01.ibm.com/support/knowledgecenter/ssw_ibm_i_72/rzaha/profperf.htm?lang=en):
ADDPEXDFN DFN(JVMCPU) TYPE(*TRACE) JOB((*ALL/*ALL/JVMNAME *ALL)) MAXSTG(1000000) INTERVAL(1) TRCTYPE(*SLTEVT) SLTEVT(*YES) BASEVT((*PMCO *NONE *FORMAT2))
2. Gather the data using the above PEX definition:
STRPEX SSNID(TRACE1) DFN(JVMCPU)
3. Wait 5-10 minutes while the JVM is using high CPU, and then end the collection:
ENDPEX SSNID(TRACE1) DTALIB(QPEXDATA)
4. Print the PEX report, first by program, and next by statement:
PRTPEXRPT MBR(TRACE1) LIB(QPEXDATA) TYPE(*PROFILE) PROFILEOPT(*SAMPLECOUNT *PROGRAM)
PRTPEXRPT MBR(TRACE1) LIB(QPEXDATA) TYPE(*PROFILE) PROFILEOPT(*SAMPLECOUNT *STATEMENT)
5. This produces two spool files showing the breakdown of CPU usage. Example report header and histogram:
All Jobs/Tasks CPU. . . : 41212922
Jobs in Collection CPU : 41212922
Job CPU . . . . . . . . : 38813410 94.2 %
Task CPU. . . . . . . . : 2399512 5.8 %
Task ID Job/Task Name Pool Priority Existence Elapsed Time (us) CPU (us) CPU %
00000000000008E1 WQLWI7 QWEBQRYADM 976015 2 171 Y Y 211403580 37479915 90.94
Cnt % %
-
16116 23.1 23.1 libj9jit23.so MethodMetaData.c/getNextMap
4539 6.5 29.5 JITC/java/util/StringTokenizer.scanToken(I)I
4081 5.8 35.4 libj9vm23.so strsup.s/copyCharsToUnicode
3226 4.6 40.0 libj9jit23.so MethodMetaData.c/findMapsAtPC
2788 4.0 44.0 libj9jit23.so MethodMetaData.c/matchingRange
1975 2.8 46.8 JITC/java/io/PrintWriter.println(Ljava/lang/String;)V
1740 2.5 49.3 JITC/java/lang/Throwable.printStackTrace(Ljava/io/PrintWriter;)V
1605 2.3 51.6 libj9jit23.so ArrayCopy.s/__arrayCopy
1058 1.5 53.1 libj9gc23.so ScavengerWriteBarrier.cpp/J9WriteBarrierStore
1049 1.5 54.6 libpthreads.a(shr_xpg5.o) pth_mutex.c/_mutex_lock
1041 1.5 56.1 libj9vm23.so strsup.s/convertCharsToString
1002 1.4 57.5 libj9gc23.so j9memclr.c/J9ZeroMemory
996 1.4 59.0 JITC/ibi/webfoc/wfutil/WFTracestackLocation.vectorizeStringBuffer(LjZ)Ljava/util/Vector;
904 1.3 60.2 libpthreads.a(shr_xpg5.o) pth_locks_ppc_mp.s/global_unlock_ppc_mp
869 1.2 61.5 libj9vm23.so optinfo.c/getLineNumberForROMClassFromROMMethod
859 1.2 62.7 libj9gc23.so MarkingScheme.cpp/scanObject__16MM_MarkingSchemeFP14MM_EnvironmentP8J9Object
623 0.9 63.6 libj9vm23.so segment.c/segmentSearchComparator
559 0.8 64.4 libpthreads.a(shr_xpg5.o) pth_locks_ppc_mp.s/global_lock_ppc_mp
553 0.8 65.2 libj9vm23.so strsup.s/convertCharsToString
543 0.8 66.0 libj9vm23.so findmethod.c/findROMClassInSegment
522 0.7 66.7 libj9vm23.so mthutil.s/nextROMMethod
515 0.7 67.5 libj9vm23.so strsup.s/convertCharsToString
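The Job CPU and Task CPU percentages in the report header are simply each component's microseconds divided by the all jobs/tasks total. A quick arithmetic check of the figures above (the class name is illustrative):

```java
public class PexCpuBreakdown {
    // CPU share of a component as a percentage of the total,
    // rounded to one decimal place as in the PRTPEXRPT output.
    static double sharePercent(long componentUs, long totalUs) {
        return Math.round(componentUs * 1000.0 / totalUs) / 10.0;
    }

    public static void main(String[] args) {
        long total = 41212922L;   // All Jobs/Tasks CPU from the report
        long jobCpu = 38813410L;  // Job CPU
        long taskCpu = 2399512L;  // Task CPU
        System.out.println("Job CPU %:  " + sharePercent(jobCpu, total));  // 94.2
        System.out.println("Task CPU %: " + sharePercent(taskCpu, total)); // 5.8
    }
}
```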
Per-thread CPU usage
Gathering per-thread CPU usage can be done in a variety of ways. The simplest is to use the WRKJVMJOB command. Example:
1. WRKJVMJOB
2. Take option 11 to display threads. This shows the total CPU (seconds) for each thread.
Job . . . . . . : TJH80EXP PID . . . . . . : 82839
User . . . . . . : QEJBSVR JDK . . . . . . : 1.6.0
Number . . . . . : 946396 Bits . . . . . . : 32
Type options, press Enter.
10=Display call stack
Total Aux
Thread Name Status CPU I/O
00000087 P=704863:O=0:CT THDW 10.336 8277
0000008A JIT Compilatio > THDW 76.830 809
0000008B JIT Compilatio > THDW 67.357 6
0000008C JIT Compilatio > THDW 42.743 3
0000008E IProfiler THDW 4.275 0
0000008F Signal Dispatc > THDW 64.984 0
00000091 Concurrent Mar > THDW 7.643 2790
00000092 GC Slave THDW 3.263 44
00000093 GC Slave THDW 3.172 38
00000094 GC Slave THDW 3.665 46
More...
Another option would be to take option 13 instead of 11. This produces a spool file that can be displayed and sent to support.
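As a complement to WRKJVMJOB option 11, the same per-thread CPU figures can be sampled from inside the JVM through the standard JMX thread bean. A minimal sketch:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

public class PerThreadCpu {
    // Snapshot of CPU time in milliseconds for every live thread.
    static Map<String, Long> cpuMillisByThread() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            long ns = mx.getThreadCpuTime(id); // -1 if unsupported or thread ended
            ThreadInfo info = mx.getThreadInfo(id);
            if (ns >= 0 && info != null) {
                result.put(info.getThreadName(), ns / 1_000_000L);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        cpuMillisByThread().forEach((name, ms) ->
            System.out.printf("%-40s %d ms%n", name, ms));
    }
}
```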
Physical Memory (RAM)
Memory Pool Tuning and Faulting Guidelines: http://www-01.ibm.com/support/docview.wss?uid=nas8N1014941
Input/Output (I/O)
WRKDSKSTS shows the status of the disk drives. Look for "hot" drives indicating high %Busy. Units consistently above 30% busy will have slow IO response times.
Work with Disk Status RCHM199B
09/09/13 12:27:05
Elapsed time: 00:00:22
Size % I/O Request Read Write Read Write %
Unit Type (M) Used Rqs Size (K) Rqs Rqs (K) (K) Busy
1 4327 61744 87.7 .0 4.0 .0 .0 4.0 .0 0
2 4327 61744 87.7 .3 4.5 .2 .0 4.0 6.0 0
3 4327 61744 87.7 .0 4.0 .0 .0 .0 4.0 0
4 4327 61744 87.7 .0 4.0 .0 .0 .0 4.0 0
5 4327 61744 87.7 .0 4.0 .0 .0 .0 4.0 0
6 4327 61744 87.7 .1 8.0 .0 .1 .0 8.0 0
7 4327 61744 87.7 .1 4.0 .0 .0 4.0 4.0 0
8 4327 61744 87.7 .1 4.0 .0 .1 4.0 4.0 0
F11 shows another view and the current status. Look for DEGRADED or FAILED units. This example shows they are all ACTIVE. No issues.
--Protection--
Unit ASP Type Status Compression
1 1 DPY ACTIVE
2 1 DPY ACTIVE
3 1 DPY ACTIVE
4 1 DPY ACTIVE
5 1 DPY ACTIVE
6 1 DPY ACTIVE
7 1 DPY ACTIVE
8 1 DPY ACTIVE
Networking
Tune TCP/IP buffer sizes. Use CHGTCPA to increase the TCP send and receive buffer sizes up to 8096 KB: http://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.multiplatform.doc/ae/tprf_tunetcpip.html
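CHGTCPA sets the system-wide defaults; a Java application can also request larger buffers on individual sockets before connecting. A minimal sketch (the 256 KB figure is purely illustrative; the OS may round or cap the value it actually grants):

```java
import java.io.IOException;
import java.net.Socket;

public class SocketBufferExample {
    // Requests the given buffer sizes on an unconnected socket and
    // returns the receive buffer size the OS reports back.
    static int requestedReceiveBuffer(int bytes) throws IOException {
        try (Socket s = new Socket()) {
            // Set buffers before connecting so they take effect on the
            // connection; the system-wide CHGTCPA settings still apply.
            s.setReceiveBufferSize(bytes);
            s.setSendBufferSize(bytes);
            return s.getReceiveBufferSize();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Granted receive buffer: "
            + requestedReceiveBuffer(256 * 1024) + " bytes");
    }
}
```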
Using Collection Services Performance Data
WAS provides scripts to enable collection services for performance data: http://www-01.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.multiplatform.doc/ae/tprf_collectionservices.html
Gathering Javacores using WRKJVMJOB
Gathering javacores, core dumps, heap dumps, and JVM summary data is straightforward on IBM i. The WRKJVMJOB command allows you to do all of this.
1. Execute WRKJVMJOB
2. This produces a list of all the JVMs active on the system
Work with JVM Jobs RCHM199B
09/09/13 12:11:42
Active JVMs on system: 22
Type options, press Enter.
5=Work with 7=Display job log 8=Work with spooled files
9=Display GC information 11=Display threads 12=Dump 13=Print
Opt Job Name User Number Function Status
QSRVMON QSYS 842707 JVM-ServiceMon THDW
QP0ZSPWT HENDERAN 862730 JVM-WSPreLaunc TIMW
BENNIEDMGR QEJBSVR 911766 PGM-jvmStartPa THDW
NODEAGENT QEJBSVR 911778 PGM-jvmStartPa THDW
BENNIENODE QEJBSVR 911779 PGM-jvmStartPa THDW
SHU85EONE QEJBSVR 916849 PGM-jvmStartPa THDW
STIMSERVER QEJBSVR 934284 PGM-jvmStartPa THDW
BENNIE QEJBSVR 937798 PGM-jvmStartPa THDW
DMGR QEJBSVR 941298 PGM-jvmStartPa THDW
12 TJH80EXP QEJBSVR 946396 PGM-jvmStartPa THDW
3. From this list, you can select option 12 to dump. By default, option 12 produces a javacore. To produce a different type of dump, select 12 next to the JVM, then press F4 to prompt the command, which allows you to change the dump type (*JAVA = javacore, *SYSTEM = core dump, *HEAP = heapdump.phd file).
Generate JVM Dump (GENJVMDMP)
Type choices, press Enter.
Job name . . . . . . . . . . . . > TJH80EXP Name
User . . . . . . . . . . . . . > QEJBSVR Name
Number . . . . . . . . . . . . > 946396 000000-999999
Type . . . . . . . . . . . . . . *JAVA *JAVA, *SYSTEM, *HEAP
+ for more values
4. The dumps produced (javacore, heapdump, core dump) will be placed in the JVM's user's home directory. The job log for the JVM shows the location of the file. For example:
DSPJOBLOG JOB(946396/QEJBSVR/TJH80EXP)
JVMDUMP010I Java dump written to /QIBM/UserData/WebSphere/AppServer/V8/Express/profiles/tjh80exp/javacore.20130909.121650.82839.0001.txt
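On IBM JDKs, the same dumps can be triggered programmatically from inside the application via the com.ibm.jvm.Dump class. The reflective lookup below keeps the sketch runnable on non-IBM JVMs, where that class simply is not present:

```java
import java.lang.reflect.Method;

public class TriggerJavacore {
    // Returns true if a javacore was requested, false on JVMs without
    // the IBM-specific com.ibm.jvm.Dump class.
    static boolean requestJavacore() {
        try {
            Class<?> dump = Class.forName("com.ibm.jvm.Dump");
            Method javaDump = dump.getMethod("JavaDump");
            javaDump.invoke(null); // writes a javacore*.txt as shown in the job log
            return true;
        } catch (ReflectiveOperationException e) {
            return false; // not an IBM JDK; fall back to GENJVMDMP / WRKJVMJOB
        }
    }

    public static void main(String[] args) {
        System.out.println("javacore requested: " + requestJavacore());
    }
}
```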
JVM Monitoring
Viewing the Application Server JVM can be done through WRKACTJOB. This command shows the total CPU seconds, CPU %, and IO for the job based on the elapsed time:
Work with Active Jobs RCHM199B
09/09/13 11:40:35
CPU %: 2.2 Elapsed time: 00:00:10 Active jobs: 339
Type options, press Enter.
2=Change 3=Hold 4=End 5=Work with 6=Release 7=Display message
8=Work with spooled files 13=Disconnect ...
--------Elapsed---------
Opt Subsystem/Job Type Pool Pty CPU Int Rsp AuxIO CPU %
QWAS8 SBS 2 0 .0 0 .0
TJH80EXP BCH 2 20 3454.9 0 .0
F11 shows further views, including number of threads, status, and function.
Opt Subsystem/Job User Number Type CPU % Threads
QWAS8 QSYS 894103 SBS .0 2
TJH80EXP QEJBSVR 946396 BCH .0 74
Current
Opt Subsystem/Job User Type CPU % Function Status
QWAS8 QSYS SBS .0 DEQW
TJH80EXP QEJBSVR BCH .0 PGM-jvmStartPa THDW
WRKSYSSTS shows the memory pool activity for the JVM. The WRKACTJOB output above shows the WebSphere server "TJH80EXP" running in system pool 2. The example output of WRKSYSSTS below shows system pool 2 with 28,626 MB allocated. The page faults are in faults/second, split between DB and Non-DB faults, and are based on elapsed time.
WRKSYSSTS ASTLVL(*ADVANCED)
Work with System Status RCHM199B
09/09/13 11:51:52
% CPU used . . . . . . . : 2.0 System ASP . . . . . . . : 493.9 G
% DB capability . . . . : .0 % system ASP used . . . : 87.7574
Elapsed time . . . . . . : 00:07:58 Total aux stg . . . . . : 493.9 G
Jobs in system . . . . . : 3211 Current unprotect used . : 15970 M
% perm addresses . . . . : .032 Maximum unprotect . . . : 22252 M
% temp addresses . . . . : .569
Sys Pool Reserved Max ----DB----- --Non-DB--- Act- Wait- Act-
Pool Size M Size M Act Fault Pages Fault Pages Wait Inel Inel
1 1187.55 606.00 +++++ .0 .0 .0 .0 10.5 .0 .0
2 28626.03 11.30 820 .0 .4 .0 .2 39221 .0 .0
3 13319.48 .56 1140 .0 .0 .0 .0 558.7 .0 .0
4 .25 .00 5 .0 .0 .0 .0 .0 .0 .0
The above shows a very low page fault rate over almost 8 minutes of elapsed time. Also note that the Wait-Inel and Act-Inel counts are 0. A higher value indicates that the Max Act value is too low for the number of threads active in the pool, which would cause performance problems.
F11 again shows the pool names. System pool 2 is the *BASE pool, which is the default pool for IBM i batch processes, including WebSphere.
Sys Pool Reserved Max Paging
Pool Size M Size M Act Pool Subsystem Library Option
1 1187.55 606.00 +++++ *MACHINE *FIXED
2 28626.03 11.30 820 *BASE *CALC
3 13319.48 .56 1140 *INTERACT *CALC
4 .25 .00 5 *SPOOL *FIXED
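Alongside the pool-level view from WRKSYSSTS, heap consumption inside a particular JVM can be sampled with the standard JMX memory bean. A minimal sketch:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapSnapshot {
    // Currently used Java heap in megabytes for this JVM.
    static long usedHeapMB() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Used heap: " + usedHeapMB() + " MB");
    }
}
```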