Object Request Broker (ORB) and Remote Method Invocation (RMI)

For IBM JVMs, additionally see the ORB section in the IBM Java chapter.

ORB pass by reference (com.ibm.CORBA.iiop.noLocalCopies) may provide a significant throughput improvement (50-60% in one benchmark).

The Object Request Broker (ORB) pass by reference option determines if pass by reference or pass by value semantics should be used when handling parameter objects involved in an EJB request. This option can be found in the administrative console by navigating to Servers => Application Servers => server_name => Object Request Broker (ORB). By default, this option is disabled and a copy of each parameter object is made and passed to the invoked EJB method. This is considerably more expensive than passing a simple reference to the existing parameter object.

In short, the ORB pass by reference option treats the invoked EJB method as a local call (even for EJBs with remote interfaces) and avoids the requisite object copy. If remote interfaces are not absolutely necessary, a slightly simpler alternative that does not require tuning is to use EJBs with local interfaces. However, by using local instead of remote interfaces, you lose the benefits commonly associated with remote interfaces, such as location transparency in distributed environments and workload management capabilities.

The ORB pass by reference option only provides a benefit when the EJB client (for example, a servlet) and the invoked EJB module are located within the same classloader. This requirement means that both the EJB client and the EJB module must be deployed in the same EAR file and running on the same application server instance. If the EJB client and EJB modules are mapped to different application server instances (often referred to as split-tier), then the EJB modules must be invoked remotely using pass by value semantics.
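
The semantic difference is worth verifying before enabling this option. The following minimal sketch (the interface and class names are hypothetical, not tied to any particular application) illustrates it: with pass by value, mutations the EJB makes to a parameter object are invisible to the caller; with pass by reference, they become visible.

    import java.io.Serializable;
    import java.util.ArrayList;
    import java.util.List;

    public class PassByReferenceExample {

        // Hypothetical remote business interface; enrich() adds entries to order.notes.
        public interface OrderService {
            void enrich(OrderData order);
        }

        // Hypothetical parameter object passed on the EJB call.
        public static class OrderData implements Serializable {
            private static final long serialVersionUID = 1L;
            public final List<String> notes = new ArrayList<String>();
        }

        public static void invoke(OrderService service) {
            OrderData order = new OrderData();
            service.enrich(order);
            // Pass by value (default): the EJB worked on a copy, so order.notes is still empty here.
            // Pass by reference (noLocalCopies): the EJB worked on this instance, so its notes are visible.
            System.out.println("Notes visible to caller: " + order.notes.size());
        }
    }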

Set com.ibm.CORBA.ServerSocketQueueDepth to 511 (https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/tprf_tuneappserv.html). If this limit is reached, subsequent connection attempts receive connection refused errors after a connection timeout period (and potentially implicit retries).

The thread pool size is dependent on your workload and system. In typical configurations, applications need 10 or fewer threads per processor. (Servers > Server Types > Application servers > server_name > Container services > ORB service > Thread pool)

Each inbound and outbound request through the ORB requires a thread from the ORB thread pool. In heavy load scenarios, or scenarios where ORB requests nest deeply, it is possible for a Java virtual machine (JVM) to have all threads from the ORB thread pool attempting to send requests while the remote JVM's ORB, which processes these requests, has all threads from its ORB thread pool attempting to send requests as well. As a result, progress is never made, threads are not released back to the ORB thread pool, the ORB is unable to process requests, and a deadlock is possible. Using the administrative console, you can adjust this behavior through the com.ibm.websphere.orb.threadPoolTimeout ORB custom property (http://www.ibm.com/support/knowledgecenter/SSAW57_8.0.0/com.ibm.websphere.nd.doc/info/ae/ae/rorb_tims.html).

Monitor and tune the ORB service thread pool: http://www.ibm.com/support/knowledgecenter/SSAW57_8.0.0/com.ibm.websphere.nd.doc/info/ae/ae/rorb_tims.html

Monitor and tune the connection cache size (com.ibm.CORBA.MaxOpenConnections): http://www.ibm.com/support/knowledgecenter/SSAW57_8.0.0/com.ibm.websphere.nd.doc/info/ae/ae/rorb_tims.html. Ideally, this should be greater than or equal to the maximum number of concurrent connections, but not so large that it creates too many threads (in which case, JNI Reader Threads could be used instead).

By default, the option to "prefer local" (meaning to prefer sending requests to EJBs on the same node, if available) is enabled; however, the deployment manager must be running for it to function: http://www-01.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.multiplatform.doc/ae/urun_rwlm_cluster_create1.html?lang=en

Running with Java security enabled will reduce performance. For example: http://www-01.ibm.com/support/docview.wss?uid=swg21661691

EJBs

If the Performance Monitoring Infrastructure (PMI) counters show a high rate of ejbStore methods being called, then the EJB container cache size may need to be increased: https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/rprf_ejbcontainer.html

Run the EJB Cache trace to ensure the cache sizes are tuned optimally: https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/tejb_tunecash.html

If there is significant heap pressure from stateful session beans (check heapdumps), consider specifying a timeout that the application can handle using -Dcom.ibm.websphere.ejbcontainer.defaultStatefulSessionTimeout=$MINUTES (https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/rprf_ejbcontainer.html)

If PMI shows that most bean instances are being used in the pool, consider increasing the pool size for that application: https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/rprf_ejbcontainer.html. For example, com.ibm.websphere.ejbcontainer.poolSize="*=,3000"

JNI Reader Threads

In general, switching to JNI reader threads is only recommended when a very large number of concurrent users or connections is required. Instead of the default of one thread per connection (and each client will have at least two to three connections: one for bootstrap, one for the listener, and potentially one for TLS), JNI reader threads require only a handful of threads (usually the default of 4 is enough), each of which handles up to 1,024 connections simultaneously using asynchronous I/O.

By default, the ORB uses a Java thread for processing each inbound connection request it receives. As the number of concurrent requests increases, the storage consumed by a large number of reader threads increases and can become a bottleneck in resource-constrained environments. Eventually, the number of Java threads created can cause out-of-memory exceptions if the number of concurrent requests exceeds the system's available resources.

To help address this potential problem, you can configure the ORB to use JNI reader threads where a finite number of reader threads, implemented using native OS threads instead of Java threads, are created during ORB initialization. JNI reader threads rely on the native OS TCP/IP asynchronous mechanism that enables a single native OS thread to handle I/O events from multiple sockets at the same time. The ORB manages the use of the JNI reader threads and assigns one of the available threads to handle the connection request, using a round robin algorithm. Ordinarily, JNI reader threads should only be configured when using Java threads is too memory-intensive for your application environment.
Each JNI thread can handle up to 1024 socket connections and interacts directly with the asynchronous I/O native OS mechanism, which might provide enhanced performance of network I/O processing.

http://www.ibm.com/support/knowledgecenter/SSAW57_8.0.0/com.ibm.websphere.nd.doc/info/ae/ae/rorb_tims.html

If JNI reader threads are enabled, the default number (com.ibm.CORBA.numJNIReaders) is 4, which can handle up to 4,096 concurrent connections: http://www-01.ibm.com/support/knowledgecenter/SS7K4U_8.5.5/com.ibm.websphere.nd.multiplatform.doc/ae/rorb_setg.html?cp=SSAW57_8.5.5&lang=en

Workload Management (WLM)

"Multiple application servers can be clustered with the EJB containers, enabling the distribution of enterprise bean requests between EJB containers on different application servers... EJB client requests are routed to available EJB containers in a round robin fashion based on assigned server weights." (http://www-01.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/crun_srvgrp.html?lang=en)

WLM balances requests in the form of method calls/invocations. The "pattern problem" occurs when there is a pattern of method calls that correlates with the number of cluster members. For example, if there are two cluster members and an alternating pattern of method calls such as "create" followed by "invoke," it is possible that all the lightweight create requests execute on one server and all the heavyweight invoke requests execute on the other. In that case, the load on the servers (for example, measured in CPU utilization) is not equal across the cluster. Workarounds to this problem include 1) changing the number of cluster members (for example, from even to odd), and 2) adjusting the weights of the cluster members to non-equal values (cluster weights such as 19 and 23 are typically recommended because they do not normalize down to equal values).

Java Naming and Directory Interface (JNDI)

By default, the JNDI naming caches are unbounded and persist for the life of the JVM. There is one cache per provider URL. If applications use a large variety of names or large named objects, then the caches may use significant amounts of memory. Each cache can be made to time out (on next access) using the -Dcom.ibm.websphere.naming.jndicache.maxcachelife=$minutes property: http://www-01.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.multiplatform.doc/ae/rnam_jndi_settings.html?cp=SSAW57_8.5.5&lang=en. The caches can be completely disabled with -Dcom.ibm.websphere.naming.jndicache.cacheobject=none. These properties can also be placed into the properties Hashtable used when creating the InitialContext.
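
As a minimal sketch (the helper class and lookup name are hypothetical), the same cache properties can be supplied per context through the environment Hashtable rather than as -D system properties:

    import java.util.Hashtable;

    import javax.naming.InitialContext;
    import javax.naming.NamingException;

    public class JndiCacheTuningExample {

        public static Object lookupWithBoundedCache(String name) throws NamingException {
            Hashtable<String, String> env = new Hashtable<String, String>();
            // Time the cache out (checked on next access) after 60 minutes.
            env.put("com.ibm.websphere.naming.jndicache.maxcachelife", "60");
            // Alternatively, disable caching for this context entirely:
            // env.put("com.ibm.websphere.naming.jndicache.cacheobject", "none");
            InitialContext context = new InitialContext(env);
            try {
                return context.lookup(name);
            } finally {
                context.close();
            }
        }
    }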

You can find the size of all JNDI caches by gathering a heapdump or coredump. Open the IBM Memory Analyzer Tool, and click Open Query Browser > Show Retained Set. For the class row, type com.ibm.ws.naming.jcache.Cache and click OK. Review the sum of shallow heaps in the bottom right.

InitialContext

A javax.naming.InitialContext is the starting point for performing naming operations. Creating an InitialContext involves significant processing, so it is recommended to cache and reuse instances. However, an InitialContext is not thread safe:

An InitialContext instance is not synchronized against concurrent access by multiple threads. (http://docs.oracle.com/javase/8/docs/api/javax/naming/InitialContext.html)

It is recommended to use a ThreadLocal so that each thread creates its InitialContext only once. For example:

    // Requires imports of javax.naming.InitialContext and javax.naming.NamingException.
    private final ThreadLocal<InitialContext> jndiContext = new ThreadLocal<InitialContext>() {
        @Override
        protected InitialContext initialValue() {
            try {
                // Invoked at most once per thread, on the first call to jndiContext.get().
                return new InitialContext();
            } catch (NamingException e) {
                throw new RuntimeException(e);
            }
        }
    };

InitialContexts are often used only once at application startup to bind or look up resources (in which case a thread local is not needed); however, it is common practice to catch exceptions on object invocations and re-look up the resource at runtime, in which case ThreadLocals should be used to avoid the cost of repeatedly creating InitialContexts.
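
As a minimal sketch of that re-lookup pattern (the resource name, retry policy, and exception handling are assumptions), the following reuses the per-thread jndiContext from the example above so that only the lookup, not the InitialContext creation, is repeated on a retry:

    // Re-lookup sketch: reuses the cached per-thread InitialContext from the
    // previous example, so only the JNDI lookup (not context creation) is repeated.
    public Object relookup(String jndiName) throws NamingException {
        return jndiContext.get().lookup(jndiName);
    }

    // Hypothetical caller: on a failed invocation, re-look up and retry once.
    public void invokeWithRelookup(String jndiName) throws NamingException {
        Object resource = relookup(jndiName);
        try {
            // ... invoke a business method on the resource ...
        } catch (RuntimeException e) {
            resource = relookup(jndiName);
            // ... retry the business method with the freshly looked-up resource ...
        }
    }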