Resolving problems with too many open files

Background

Unix and Linux systems have a per-process limit on the number of open files. In some configurations, particularly with a large number of threads per child process, web server operations can fail due to reaching the limit. Sockets can no longer be created, client connections can no longer be accepted, files can no longer be opened, etc.

The problem is usually intermittent; the web server continually closes files and sockets that it has opened, so the limit is typically reached only briefly under higher load, after which file descriptor usage drops again, in a cycle.
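
To see the per-process limits of your current interactive shell (which processes started from that shell typically inherit), use the shell's built-in ulimit command:

    # Soft (current) limit on the number of open file descriptors
    ulimit -n
    # Hard limit: the ceiling up to which the soft limit can be raised
    ulimit -H -n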

Here are just a few examples of messages which can appear in the error log when this limit is reached:

    [crit] (24)Too many open files: SSL0600S: Unable to connect to session ID cache
    [error] (24)Too many open files: apr_accept: (client socket)
    [error] (23)Too many open files: apr_accept: (client socket)
    (24)Too many open files: file permissions deny server access: /opt/IBMHttpServer/htdocs/en_US/index.html

Steps to resolve the problem:

Linux OS with "(23)" in the error message

  1. Stop the web server.

  2. Linux Only: If "(23)" is in your error message, it means the system-wide file descriptor limits are exceeded. You'll need to (permanently) tune /proc/sys/fs/file-max to accomodate the total number of open files across all processes on the system: https://stackoverflow.com/questions/24862733/difference-between-linux-errno-23-and-linux-errno-24

    Consult your operating system documentation to determine how to make the change permanent (e.g. /etc/sysctl.conf). A sketch of the commands involved follows these steps.

  3. Start IBM HTTP Server.
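
As a minimal sketch of the commands involved, run as root (the value 524288 is purely illustrative; size it for your system's total workload):

    # Show allocated, unused, and maximum file handles system-wide
    cat /proc/sys/fs/file-nr

    # Raise the system-wide limit for the running system
    sysctl -w fs.file-max=524288

    # Persist the change across reboots, then reload the settings
    echo "fs.file-max = 524288" >> /etc/sysctl.conf
    sysctl -p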

Other operating systems, and Linux with "(24)" in the error message

  1. Stop the web server.

  2. Set the web server's limits to higher values in $IHSROOT/bin/envvars. Note: it can be difficult to determine the current limits, but they may match those of your current interactive shell. One way to verify the change on Linux is shown after these steps.

        ulimit -H -n 4096
        ulimit -n 4096
    
  3. Start IBM HTTP Server.
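
On Linux, one way to confirm that the new limit took effect is to inspect a running server process through /proc (the process name "httpd" is an assumption; adjust it for your installation):

    # Show the open-file limits of the oldest matching server process;
    # "Max open files" should report the new soft and hard values
    PID=$(pgrep -o httpd)
    grep "Max open files" /proc/$PID/limits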

Additional notes

  • When the limit is hit during the setup for a new client connection (the apr_accept error messages above), the affected child process performs a graceful shutdown, which allows any in-flight client requests to complete uninterrupted.

  • Take the sum of the following terms for a rough estimate of the number of open files that may be required by a given web server child process (a worked example follows the list):

    Hint: The value can regularly be well over ThreadsPerChild*2

    • ThreadsPerChild

    • Number of Listen directives

    • Number of CustomLog/ErrorLog directives

    • Maximum number of backend connections via the WebSphere Plugin

    • Maximum outgoing connections for mod_proxy or mod_ibm_ldap

    • 5 (fixed overhead)
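
    For example, a child process with ThreadsPerChild 100, two Listen directives, three CustomLog/ErrorLog directives, up to 100 WebSphere Plugin backend connections, and no other outgoing connections (all figures illustrative) would need roughly:

        100 + 2 + 3 + 100 + 0 + 5 = 210 open files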

If your problem is a system-wide limit on open files, rather than a per-process or per-user limit, the calculation is much more difficult, because it must account for every process on the system.
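
On Linux, a hedged starting point for the system-wide case is to count open descriptors per process and look for the heaviest consumers (reading other users' /proc entries requires suitable privileges):

    # List the ten processes holding the most open file descriptors
    for p in /proc/[0-9]*; do
        echo "$(ls "$p/fd" 2>/dev/null | wc -l) ${p#/proc/}"
    done | sort -rn | head -10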