WAS Classic on z/OS
For background, see https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102030
This chapter outlines any z/OS specific performance notes for WAS. See the z/OS operating systems chapter for prerequisite knowledge. Unless otherwise noted, everything covered in the general WAS section still applies.
Keep the number of nodes per logical partition (LPAR) to one or two, with a maximum of four nodes per LPAR. Spread a cell or cluster over at least two LPARs. Using multiple LPARs provides hardware redundancy while still allowing the cluster to be upgraded on a per-node basis.
IBM recommends that you install as much of the WebSphere Application Server for z/OS code in the Link Pack Area (LPA) as is reasonable. Also, ensure that you have eliminated any unnecessary STEPLIBs which can affect performance. If you must use STEPLIBs, verify that any STEPLIB DDs in the controller and servant procs do not point to any unnecessary libraries. (http://pic.dhe.ibm.com/infocenter/wasinfo/v8r5/topic/com.ibm.websphere.nd.multiplatform.doc/ae/rprf_tunezprogloc.html)
The first place to review is your CTRACE configuration. Ensure that all components are either set to MIN or OFF. To display the CTRACE options for all components on your system, issue the following command from the operator console: D TRACE,COMP=ALL
To change the setting for an individual component to its minimum tracing value, use the following command, where xxx is the component ID: TRACE CT,OFF,COMP=xxx
This configuration change eliminates the unnecessary overhead of collecting trace information that is not needed. Often during debug, CTRACE is turned on for a component and not shut off when the problem is resolved.
Ensure that you are not collecting more SMF data than you need. Review the SMFPRMxx settings to ensure that only the minimum number of records is collected.
The Transaction Service RLS_LOGSTREAM_COMPRESS_INTERVAL custom property can be set to a value larger than the default value if the Transaction Service is the only application component using a logstream. If none of your components are configured to use a logstream, you can set this property to 0 (zero) to disable this function. (http://pic.dhe.ibm.com/infocenter/wasinfo/v8r5/topic/com.ibm.websphere.nd.multiplatform.doc/ae/tprf_tuneappserv.html)
If you find long garbage collection pause times but the normal components of a pause (mark, sweep, compact, exclusiveaccess) do not add up to the total time, then this is usually caused by the Virtual Lookaside Facility (VLF) caching being disabled or not working efficiently.
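The check described above is simple arithmetic: subtract the sum of the reported pause components from the total pause time and see how much is left over. A minimal sketch follows; the function name and the sample numbers are hypothetical and are not taken from any real verbose GC log.

```python
def unaccounted_pause_ms(total_ms, components_ms):
    """Return the part of a GC pause not explained by its listed components."""
    return total_ms - sum(components_ms.values())


# Hypothetical values for a single pause, in milliseconds.
components = {"exclusiveaccess": 2.0, "mark": 40.0, "sweep": 5.0, "compact": 0.0}
gap = unaccounted_pause_ms(900.0, components)
print(f"unaccounted pause time: {gap:.1f} ms")
```

A large gap relative to the total pause is the pattern that points toward VLF caching rather than toward the collector itself.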
"Ensure that ras_trace_defaultTracingLevel=0 or 1, and that ras_trace_basic and ras_trace_detail are not set." (http://pic.dhe.ibm.com/infocenter/wasinfo/v8r5/topic/com.ibm.websphere.nd.multiplatform.doc/ae/rprf_tuneztrace.html).
Address Spaces
Each application server is split into two or more address spaces: a control region and one or more servant regions. The control region handles incoming traffic and distributes it to the servant regions where the application work is performed. It is a best practice to use ${X} as the control region name and ${X}S for the servant region names. For example, WBESR12 and WBESR12S.
Servants
WebSphere allows you to configure a minimum and maximum number of servants for a server. WLM will dynamically adjust the number of servants within the specified range, up or down based on what’s needed to meet the goals for the system. WLM does this for work running in WebSphere and for work elsewhere on the system.
To set the minimum value, consider how many servants you want to start automatically when the server is started and how many you want WLM to keep available. In determining the maximum value, consider how many servants you can support on your system. Also, consider the number of available connectors for applications in WebSphere and elsewhere in the system.
But what if something changes someday and the minimum is no longer enough, or you reach the configured maximum and need more servants? To change the values, you must update the configuration and recycle the server. But if you are running at peak utilization and decide you need to increase the maximum number of servants, recycling the whole server will probably hurt more than not having enough servants. It would be nice to be able to change the number of servants dynamically without a recycle.
In Version 7, we introduced a new MODIFY command to let you do that. If the server is not configured as single-servant, you can change the current minimum and maximum number of servants. You enter the command as follows:
MODIFY server,WLM_MIN_MAX=(minimum,maximum)
Specify these values as decimal numbers. Obviously, the minimum must be less than the maximum.
Your changes are in effect until the next time you recycle the server, in which case, the values in the configuration are used instead. To make your changes permanent, you need to update the configuration.
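For automation that issues this command, it can help to validate the values before building the command text. The sketch below is only an illustration: the helper name is hypothetical, it only builds the string (it does not submit anything to the console), and the validation follows the rule stated above that the minimum must be less than the maximum.

```python
def wlm_min_max_command(server, minimum, maximum):
    """Build the MODIFY command text that changes the servant range."""
    if minimum < 0 or maximum < 1:
        raise ValueError("servant counts must be non-negative, with maximum >= 1")
    if minimum >= maximum:
        raise ValueError("minimum must be less than maximum")
    # Values are written as plain decimal numbers.
    return f"MODIFY {server},WLM_MIN_MAX=({minimum},{maximum})"


# WBESR12 is the example control region name used earlier in this chapter.
print(wlm_min_max_command("WBESR12", 2, 6))
```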
In general, WLM responds quickly to your request. If the minimum number of servants is not already running, WLM starts more. Increasing the maximum value, however, might not have any immediate effect. Further, decreases in values might also not cause an immediate change because of WLM’s opinion as to how many servants it needs. Some situations, such as session data pinned to the servant, might prevent WLM from reducing the number of currently active servants. Of course, unless you’ve committed your min and max values to memory, you would probably like to have a quick way to see what you are currently configured for. We added a new command to allow you to do that.
MODIFY server,DISPLAY,WLM
Start servants in parallel: wlm_servant_start_parallel=1 (http://www-03.ibm.com/support/techdocs/atsmastr.nsf/5cb5ed706d254a8186256c71006d2e0a/da939fa8cdf48510862575a1007461d2/$FILE/WP101464%20-%20WebSphere%20zOS%20Hidden%20Gems2.pdf)
All of the various custom properties, environment variables, etc. that are set in the WAS configuration (e.g. through the admin console) ultimately get generated into the was.env file (located under profiles/default/config/cells/cellname/nodes/nodename/servers/servername/was.env) which is read when the address space starts.
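When checking which values a server actually started with, it can be convenient to read was.env programmatically. The following is a minimal sketch that assumes the file consists of simple name=value lines; that layout, the treatment of lines starting with '#' as comments, and the helper name are assumptions for illustration. The property names in the sample are ones mentioned in this chapter.

```python
def parse_was_env(text):
    """Parse name=value lines into a dict, skipping blank and '#' lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


# Hypothetical excerpt; a real was.env contains many more entries.
sample = "ras_trace_defaultTracingLevel=1\nwlm_servant_start_parallel=1\n"
env = parse_was_env(sample)
if env.get("ras_trace_defaultTracingLevel") not in ("0", "1"):
    print("tracing level is higher than recommended")
if env.get("wlm_servant_start_parallel") != "1":
    print("servants are not configured to start in parallel")
```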
Control Region
The default number of worker threads in a control region is 25. This can be changed to a higher value as required by setting the custom property was.controlThreads as follows:
Application servers > server_name > Container Services > ORB Service > Custom Properties > was.controlThreads
To verify how many control region threads you are using, you can check the following message in the control region joblog:
BBOM0001I control_region_thread_pool_size: 25.
Starting in WAS 8.0.0.8 and 8.5.5.2 (PM85194), use the property control_region_thread_pool_maximum_size to allow growth of this pool, or set to 0 to allow dynamic calculation of the size (see also PI50098).
Daemon
"Stopping a Daemon server will stop all servers for that cell on that... image. This is because of the way servers for that cell... access key LPA modules. It's done "through" the Daemon server. Stopping the Daemon server means the servers can no longer access those modules, so they too stop." (http://www-03.ibm.com/support/techdocs/atsmastr.nsf/5cb5ed706d254a8186256c71006d2e0a/133ee6209f082b978625711d006dbd65/$FILE/WP100396%20-%20Essentials%20of%20Test%20Production%20and%20Maintenance.pdf)
Thread Pools
Most work in the servant is handled by the ORB thread pool. The maximum size of this pool is controlled by the ORB workload profile setting: http://pic.dhe.ibm.com/infocenter/wasinfo/v8r5/topic/com.ibm.websphere.nd.multiplatform.doc/ae/tgrid_zosthreads.html
- IOBOUND: Default - Number of threads is 3 * Number of processors. Specifies more threads in applications that perform I/O-intensive processing on the z/OS operating system. The calculation of the thread number is based on the number of processors. IOBOUND is used by most applications that have a balance of processor intensive and remote operation calls. A batch job is an example that uses the IOBOUND profile.
- CPUBOUND: Number of threads is the number of processors. Specifies that the application performs processor-intensive operations on the z/OS operating system, and therefore, would not benefit from more threads than the number of processors. The calculation of the thread number is based on the number of processors. Use the CPUBOUND profile setting in processor intensive applications, like compute-intensive (CI) jobs, XML parsing, and XML document construction, where the vast majority of the application response time is spent using the processor.
- LONGWAIT: Number of threads is 40. Specifies more threads than IOBOUND for application processing. LONGWAIT spends most of its time waiting for network or remote operations to complete. Use this setting when the application makes frequent calls to another application system, like Customer Information Control System (CICS®) screen scraper applications, but does not do much of its own processing.
- In WebSphere Application Server for z/OS V7 you can choose Workload profile CUSTOM and then set property servant_region_custom_thread_count to the number of servant threads you want up to a limit of 100.
The resulting thread count is reported in the joblog by the following message:
BBOO0234I SERVANT PROCESS THREAD COUNT IS X
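The workload profile rules above can be expressed as a small function, which is handy for estimating the servant thread count before changing a profile. This is a sketch of the simplified rules exactly as stated in this section; the product may apply additional lower or upper bounds, so treat the BBOO0234I message as the authoritative value. The function name is hypothetical.

```python
def servant_thread_count(profile, processors, custom_count=None):
    """Estimate servant threads for an ORB workload profile (simplified rules)."""
    profile = profile.upper()
    if profile == "IOBOUND":
        return 3 * processors
    if profile == "CPUBOUND":
        return processors
    if profile == "LONGWAIT":
        return 40
    if profile == "CUSTOM":
        # servant_region_custom_thread_count is limited to 100.
        if custom_count is None or not 1 <= custom_count <= 100:
            raise ValueError("CUSTOM requires a thread count between 1 and 100")
        return custom_count
    raise ValueError(f"unknown workload profile: {profile}")


for name in ("IOBOUND", "CPUBOUND", "LONGWAIT"):
    print(name, servant_thread_count(name, processors=4))
```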
WAS 7 on z/OS introduced the ability to interrupt hung threads: http://www-03.ibm.com/support/techdocs/atsmastr.nsf/5cb5ed706d254a8186256c71006d2e0a/3c02b79e79ea32fd8625751a005d7f63/$FILE/WP101374%20-%20WebSphere%20zOS%20V7%20Dispatch%20Timeout%20Improvements.pdf. This was improved in WAS 8: http://w3-03.ibm.com/support/techdocs/atsmastr.nsf/3af3af29ce1f19cf86256c7100727a9f/d7bb7aa1f7be24128625791e00830412/$FILE/WP101992%20-%20WebSphere%20zOS%20V8%20Hidden%20Gems.pdf
Joblogs
Type ? next to the WAS servant region in the SDSF.DA or SDSF.ST panels. Roughly speaking, SYSPRINT is equivalent to SystemOut.log and SYSOUT is equivalent to SystemErr.log + native_stderr.log
Common things to look for in WAS joblogs:
- Search for the word HOST by typing F HOST and F5 to repeat search
- Hostname: com.ibm.CORBA.LocalHost = ZTESTB2.PDL.POK.IBM.COM
- Search for the word LEVEL by typing F LEVEL and F5 to repeat search
- WAS Level: BBOM0007I CURRENT CB SERVICE LEVEL IS build level 6.1.0.32 (AM24112) release WAS61.ZNATV date 10/10/10 19:40:16.
- Search for the word cell_name
- Cell name: cell_name: wbecell.
- Search for the word PROCEDURE by typing F PROCEDURE and F5 to repeat
- PROCLIB: PROCEDURE WBESS62 WAS EXPANDED USING SYSTEM LIBRARY USER.S12.PROCLIB
- Search for the word WAS_HOME by typing F WAS_HOME and F5 to repeat
- WAS_HOME: BBOM0001I adjunct_region_jvm_properties_file: /S12/wbe61/wbes12/AppServer/profiles/default/config/cells/ws/wbenode2/servers/wbesr12/adjunct.jvm.options.
Timeouts
WLM
WebSphere has several different types of work running in its address spaces. It is classified using classification rules under 3 different workloads:
- For STC workloads the WebSphere address spaces control regions and servant regions would be given an aggressive Velocity goal equal to or slightly less than DB2, IMS, or MQ and a goal equal to or slightly higher than CICS.
- For OMVS workloads the WebSphere address spaces control regions and servant regions would be given an aggressive Velocity goal so that at start-up the BPXBATCH facility used to run our applyPTF.sh script does not slow startup of the server.
- For CB workloads the WebSphere servant regions are given a response time goal with a percentile, close to but not exceeding 90% of the work in 0.5 seconds. Even though WebSphere servers are long-running tasks, and Velocity goals are typically used for long-running tasks, the actual transactions within WebSphere are very short-lived HTTP-type transactions. Response time goals with percentiles are used for these short-lived transactions.
The report classes associated with the classification rule for each workload would be unique.
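To make the percentile goal for CB work concrete, the sketch below checks whether a set of measured response times would satisfy a goal such as "90% of the work in 0.5 seconds". It only illustrates the arithmetic of a percentile goal; it is not how WLM itself evaluates goals, and the function name and sample data are hypothetical.

```python
def meets_percentile_goal(response_times_s, percentile=90.0, goal_s=0.5):
    """Return True if at least `percentile` percent of requests finished within goal_s."""
    if not response_times_s:
        raise ValueError("no response times supplied")
    within = sum(1 for t in response_times_s if t <= goal_s)
    return 100.0 * within / len(response_times_s) >= percentile


# Hypothetical samples in seconds: nine fast requests and one slow outlier.
samples = [0.1, 0.2, 0.15, 0.3, 0.25, 0.4, 0.12, 0.2, 0.45, 3.0]
print(meets_percentile_goal(samples))
```

The single slow request does not break the goal here, which is why a percentile goal suits short transactions better than an average response time goal would.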
Workload CB is enclave work or WLM queue managed WebSphere work. Almost all WebSphere work happens here after the initial startup of the address spaces.
STC work also occurs in the WebSphere address spaces:
- The processing necessary to start the address spaces before the first enclave is created is STC workload.
- Any spawned threads from the application will not be enclave, WLM, or CB managed work and will run under STC.
- Address space functions such as JES related activities will be STC workload.
- An argument can be made that says Garbage Collection activities run under STC workload.
Rarely use discretionary classification for WebSphere.
OMVS work also occurs in the WebSphere Address Spaces. During startup a program called BPXBATCH is executed to run a script in the JCL. This script called applyPTF.sh checks to see if any service has been applied. If service has been applied this script executes any post install actions necessary. If startup is very slow, you may want to investigate a classification rule for OMVS. If the combination of applyPTF.sh and lack of classification are the cause of the slow startup, adding a classification rule may fix the slow start problem.
More on WAS with WLM: http://www-03.ibm.com/support/techdocs/atsmastr.nsf/5cb5ed706d254a8186256c71006d2e0a/f1ec690b6bee04cd8625778800595c69/$FILE/WP101740%20-%20WAS_and_zOS_WLM_v8.pdf
WebSphere creates a WLM enclave for all requests that get dispatched in a servant. An enclave has an associated WLM service class and report class. The service class is used by WLM to help make decisions about assigning resources to the servant to ensure the requests meet the goals defined in the service class. The report class is used to separate information about the requests in reports generated by RMF (or other similar products). To determine which service class and report class to assign to the enclave, WLM looks at classification information provided by WebSphere when the enclave is created.
One piece of information provided is called a transaction class. This is just an eight character name assigned to the request. WebSphere supports an XML file pointed to by the variable wlm_classification_file to determine what transaction class to use. The XML file allows you to specify a different transaction class (and thus indirectly a different service class and report class) for different applications or even parts of applications.
The XML file also allows you to specify a transaction class to be used when classifying requests that are internally generated. Sometimes the controller needs to dispatch something in its own servant. For example, this could be the dispatch of a management bean (MBean). To separate internal work from application work, you might want to put these requests in their own report class. To do that you simply specify the ‘internal’ clause in the XML file and provide a transaction class name that WLM will recognize (based on the rules you provide in the WLM configuration) and assign service and report classes appropriately.
SMF 120
Details on 120-9 records: http://www-03.ibm.com/support/techdocs/atsmastr.nsf/5cb5ed706d254a8186256c71006d2e0a/52cd84f130ab4165862574e2004b4d54/$FILE/ATT957XG.pdf/WP101342%20-%20Understanding%20SMF%20Record%20Type%20120%20Subtype%209.pdf
MODIFY Command
/F ADDRESSSPACE,... COMMANDS... or /MODIFY ADDRESSSPACE,... COMMANDS (http://pic.dhe.ibm.com/infocenter/wasinfo/v8r5/topic/com.ibm.websphere.nd.multiplatform.doc/ae/rxml_mvsmodify.html)
- Display Java heap information: /F ADDRESSSPACE,JVMHEAP
- Generate a javacore: /F ADDRESSSPACE,JAVACORE
The joblog will show where it is written: JVMDUMP007I JVM Requesting Java Dump using '/var/WebSphere/home/ZPSRG/javacore.20090309.205027.50397255.txt'
In version 8 we added an option to these commands to specify the ASID of the servant region you want dumped. Just add a “ASIDX=” after the command with the appropriate ASID (in hex) of the servant region you want to dump. For example (http://www-03.ibm.com/support/techdocs/atsmastr.nsf/5cb5ed706d254a8186256c71006d2e0a/2e8a87fddebbb3328625791e0083041f/$FILE/WP101992%20-%20WebSphere%20zOS%20V8%20Hidden%20Gems.pdf):
/F server,JAVACORE,ASIDX=F4
- PAUSELISTENERS will cause the target server to close its listener ports and stop taking new requests.
- DISPLAY,SERVERS modify command has been enhanced to also report the 'state' of the server. There are four possibilities: ACTIVE, ENDING, PAUSED/STOPPING, and RECOVERY. ACTIVE seems pretty obvious. Basically ACTIVE means it isn't any of the other states; it could be up or it could be initializing. ENDING means that the server is on its way down. PAUSED/STOPPING means either you have issued PAUSELISTENERS or STOPped the server. It is kind of the same thing. In both cases the server is not taking new work, but there is a possibility work is still in-flight inside the server. The only difference is if we are stopping, then once the work completes the server will end. Finally, RECOVERY means that the server has been started to recover in-flight transactions and will automatically shut down once that is done. No new work will be taken.
BBOO0182I SERVER ASID SYSTEM LEVEL STATE
BBOO0183I WAS00 /ZWASAXXX 6Fx SY1 8.0.0.0 (ff1106.32) ACTIVE
BBOO0183I BBON001 /BBON001 58x SY1 8.0.0.0 (ff1106.32) ACTIVE
BBOO0183I BBOC001 /BBOS001 5Bx SY1 8.0.0.0 (ff1106.32) PAUSED/STOPPING
BBOO0183I BBODMGR /BBODMGR 57x SY1 8.0.0.0 (ff1106.32) ACTIVE
- Way back in WebSphere Version 5 we introduced the DISPLAY,WORK command (underneath the MVS 'Modify' command for the WAS controller). This pretty cool command lets you see how much work had been processed by the server since it started and how much work was actually in the server at the time. You could even look server region by server region and see how work was spreading (or not) across them. (http://www-03.ibm.com/support/techdocs/atsmastr.nsf/5cb5ed706d254a8186256c71006d2e0a/ec31a38f42faf8c486257c27005f7a64/$FILE/WP102371%20-%20New%20Functions%20in%20WAS%20zOS%20Maintenance%20Stream.pdf)
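If the DISPLAY,SERVERS output shown earlier is captured from a joblog or syslog, the BBOO0183I lines can be parsed to find servers that are not ACTIVE. The sketch below assumes the whitespace-separated layout of the sample output in this section; the function name is hypothetical, and the column layout may differ between releases.

```python
def parse_display_servers(output):
    """Parse BBOO0183I lines into dicts with server, asid, system, level, state."""
    servers = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 8 or fields[0] != "BBOO0183I":
            continue  # skips the BBOO0182I header and unrelated messages
        servers.append({
            "server": f"{fields[1]} {fields[2]}",
            "asid": int(fields[3].rstrip("x"), 16),  # ASID is shown in hex, e.g. 6Fx
            "system": fields[4],
            "level": fields[5],
            "state": fields[-1],
        })
    return servers


sample = """\
BBOO0182I SERVER ASID SYSTEM LEVEL STATE
BBOO0183I WAS00 /ZWASAXXX 6Fx SY1 8.0.0.0 (ff1106.32) ACTIVE
BBOO0183I BBOC001 /BBOS001 5Bx SY1 8.0.0.0 (ff1106.32) PAUSED/STOPPING
"""
for entry in parse_display_servers(sample):
    if entry["state"] != "ACTIVE":
        print(entry["server"], "is", entry["state"])
```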