# Coredumps on Unix

A coredump is a special file which represents the memory image of a process. Many operating systems can save a core dump when an application crashes. The core dump is an important part of diagnosing the cause of the crash, since it contains the data which the application was accessing at the time, along with information about which part of the application was running at the time of the crash.

Several configuration requirements must be met before the operating system will save a core dump when IBM HTTP Server crashes. This document describes the common ones.

## Quick checklist for selected platforms

Later sections of this document provide more information. Here is a quick checklist to consider.

For z/OS information, refer to [this document](zos_dumps.html).

### AIX

1. Modify httpd.conf and set the CoreDumpDirectory directive to point to a location where the web server user id (e.g., *nobody* or *www*) can create files. Usually this is sufficient:

   ```
   CoreDumpDirectory /tmp
   ```

   Note: In some rare situations, core files will be larger than 2GB. They will be truncated unless the filesystem has large file support. By default, JFS filesystems don't support such files; large file support has to be enabled explicitly when the filesystem is created. Also check `ulimit -f` if your IHS processes are larger than 1GB to prevent the core files from being truncated to 1GB (the `ulimit -f` default).
2. Full core handling: Current versions of IBM HTTP Server do not enable full core files automatically because of the potential size. [AIX tuning](#FULLCORE) is required to enable them.
3. Stop IBM HTTP Server.
4. Open $IHS\_ROOT/bin/envvars in an editor and append:

   ```
   ulimit -c unlimited
   ulimit -f unlimited
   ```
5. Start IBM HTTP Server as normal.
6. Check the IBM HTTP Server error log and make sure you don't see one of these messages, which would indicate that one of the steps above was skipped:

   ```
   [notice] Core file limit is 0; core dumps will be not be written for server crashes
   [notice] CoreDumpDirectory not set; core dumps may not be written for child process crashes
   ```

   (Levels of IBM HTTP Server prior to mid-2004 do not report these potential configuration problems.)

### Solaris

1. Modify httpd.conf and set the CoreDumpDirectory directive to point to a location where the web server user id (e.g., *nobody* or *www*) can create files. Usually this is sufficient:

   ```
   CoreDumpDirectory /tmp
   ```

   Note: On Solaris, /tmp is often mounted on paging space (swap device). If there is a potential paging space shortage, create another directory on a physical file system, make sure that the web server user id can write to it, and set CoreDumpDirectory to point to that new directory.
2. Run the coreadm program to configure the operating system to write core dumps for programs like IBM HTTP Server which switch identity at startup:

   ```
   # coreadm -e global-setid -e proc-setid -e global
   ```
3. Stop IBM HTTP Server.
4. Open $IHS\_ROOT/bin/envvars in an editor and append:

   ```
   ulimit -c unlimited
   ulimit -f unlimited
   ```
5. Start IBM HTTP Server as normal.
6. Check the IBM HTTP Server error log and make sure you don't see one of these messages, which would indicate that one of the steps above was skipped:

   ```
   [notice] Core file limit is 0; core dumps will be not be written for server crashes
   [notice] CoreDumpDirectory not set; core dumps may not be written for child process crashes
   ```

   (Levels of IBM HTTP Server prior to mid-2004 do not report these potential configuration problems.)

### Linux

##### systemd-coredump

If `/proc/sys/kernel/core_pattern` passes the core to /usr/lib/systemd/systemd-coredump, the core file only exists as part of a larger archive and cannot directly be read with tools such as gdb.
Prior to running any IHS collector, extract the native core using the following procedure:

1. Determine the crashing IHS process ID. This is usually noted in the IHS error log.
2. List the available systemd-coredumps with `coredumpctl list` and confirm that there is an entry for the affected PID.
3. Extract the core file to a location with enough space: `coredumpctl dump $PID --output /tmp/core.$PID`
4. Run ihsdiag against /tmp/core.$PID rather than the .lz4/.zst files under /var/lib/systemd/coredump/.

##### Basic enablement in IHS

1. Modify httpd.conf and set the CoreDumpDirectory directive to point to a location where the web server user id (e.g., *nobody* or *www*) can create files. Usually this is sufficient:

   ```
   CoreDumpDirectory /tmp
   ```

   Note: If `/proc/sys/kernel/core_pattern` is set, it can override the core dump location. It may also hand the core to an application, which can put the core file anywhere or stop writing cores if they occur too rapidly. If a program is listed, you may need to look at its output in the system log or `dmesg` to find the location of IHS cores.
2. Stop IBM HTTP Server.
3. Open $IHS\_ROOT/bin/envvars in an editor and append:

   ```
   ulimit -c unlimited
   ulimit -f unlimited
   ```
4. Start IBM HTTP Server as normal.
5. Check the IBM HTTP Server error log and make sure you don't see one of these messages, which would indicate that one of the steps above was skipped:

   ```
   [notice] Core file limit is 0; core dumps will be not be written for server crashes
   [notice] CoreDumpDirectory not set; core dumps may not be written for child process crashes
   ```

   (Levels of IBM HTTP Server prior to mid-2004 do not report these potential configuration problems.)

## General configuration issues

- file permission

  Make sure that the process for which a coredump is needed has permission to write a coredump. For example, with Apache/IHS, the default location of the coredump is the Apache/IHS install directory or the directory specified by the CoreDumpDirectory directive.
  The user id associated with Apache/IHS must have permission to write files there. For most processes created by Apache/IHS, that user id and group id are specified by the User and Group directives in httpd.conf. The user is often "nobody." A quick work-around for permission problems is to specify "CoreDumpDirectory /tmp" in httpd.conf.
- available disk space

  Make sure there is plenty of room (possibly many megabytes) available on the partition/mount/volume where you expect the core file to be placed. If you get a core dump which is unusable for some reason, check available disk space with `df -k` on the partition/mount/volume containing the core after the core dump has been written, to confirm that the system did not run out of space. Note that with Apache/IHS, the core file will almost always be placed in the directory specified by the CoreDumpDirectory directive.
- operating system file size and core size limits (ulimits)

  Make sure your ulimit is set appropriately so that you don't hit a limit on the size of the core file (some default configurations limit the size to zero bytes). There are two parts: 1) the hard limits imposed by your system or system administrator and 2) the soft limits you can manipulate via the shell.

  Please note that the limits in force for the user that starts the server (usually `root`) are what matter. When a server started as `root` switches user ids, the limits in force do not change.

### hard system limits

On AIX, a hard limit can be set per user in smit:

```
smitty user
select "Change / Show Characteristics of a User"
enter user name
set "Hard CORE file size"
```

### soft limits manipulated by your shell

On all systems, soft limits can be manipulated by the shell. For bash or ksh, **`ulimit -a`** will display the limits and **`ulimit -c unlimited`** will let you get as much as your system \[administrator\] allows.

On AIX, a soft limit can be set per user in smit:
```
smitty user
select "Change / Show Characteristics of a User"
enter user name
set "Soft CORE file size"
```

Note that ulimit manipulation in the shell is still effective.

## Alternate locations for coredumps

The default location for coredumps is the directory specified by the `ServerRoot` directive. When the web server is started as `root`, the child processes run under a different user id, which does not have permission to write to that directory. This is handled by using the `CoreDumpDirectory` directive to specify an alternate location, such as `/tmp`.

Some platforms provide a mechanism for specifying an alternate coredump location. This will override the value of the `CoreDumpDirectory` directive.

#### `syscorepath` command on AIX 5.2 and above

AIX 5.2 and above provides the `syscorepath` command for specifying an alternate coredump directory which affects all applications on the system. If the web server was started without the `CoreDumpDirectory` directive and that is preventing core dumps from being written because the default directory has unsuitable permissions, the `syscorepath` command can be used to specify a directory with the appropriate permissions, and coredumps can then be written without restarting the web server.

When `syscorepath` is used to specify an alternate directory, the file name of the coredump is no longer `core`, but instead includes the process id of the process which crashed and the time of day that the crash occurred.

Refer to the `syscorepath` manpage for further information.

#### `coreadm` command on Solaris

Solaris provides the `coreadm` command which controls several coredump settings, including an alternate coredump directory and the format of the name of the coredump. Refer to the `coreadm` manpage for further information.

## Issues with threaded programs

### Linux

If a thread takes a synchronous signal (e.g., SIGSEGV, SIGABRT, SIGBUS) on Linux kernels earlier than 2.4, the kernel won't take a coredump. A patch is available.
With the Linux 2.4 kernel, if a thread crashes you'll get two coredumps: one for the main process, named core.pid, and one for the bad thread, named core.fakepid.

### AIX

Make sure the "full core" option is enabled (see below).

## setuid() Issues

When IHS or Apache starts as root on Unix-like systems, it switches identity to the user and group specified in the configuration file. Setuid programs (those installed with the set-user-ID bit) and programs which start as root and then set their user id to something else have special issues for getting coredumps on some operating systems.

### Solaris

By default, Solaris does not create coredumps for setuid() programs. Look at the documentation for the coreadm program (`man coreadm`). When all types of core dumps are enabled, it will display something like this:

```
% coreadm
global core file pattern: /coredumps/core.%f.%p
init core file pattern: /coredumps/init-core.%f.%p
global core dumps: enabled
per-process core dumps: enabled
global setid core dumps: enabled
per-process setid core dumps: enabled
global core dump logging: enabled
```

This will turn on most types of core dumps:

```
coreadm -e global-setid -e proc-setid -e global
```

This will set the global core file pattern:

```
coreadm -g /tmp/core.%f.%p
```

Note: when you include a directory in the core file pattern, Apache's CoreDumpDirectory directive cannot override that.

### FreeBSD

You need to set the kern.sugid\_coredump variable via sysctl.

### Linux

When an application switches identity this way on Linux, the kernel normally disables coredumps. The application can make a special syscall, `prctl(PR_SET_DUMPABLE, 1)`, which will enable coredumps for that application. This syscall works only on Linux 2.4 and later kernels.

Important note: There are reports that some 2.4 kernels from some vendors may have the prctl() feature broken, such that a core dump is not written even when the prctl() call is issued.

### AIX

No known setuid() issues.

### HP-UX

HP-UX prior to 11i has no known setuid() issues.
With 11i, some extra configuration is required. Here is some information a customer received from HP-UX technical support:

This is an issue involving programs that run first as root and then switch to another user. The solution is to poke the kernel. Specifically, set an undocumented kernel parameter called dump\_all (works for 11.11, but not for 11.0). Here's how to activate dump\_all:

```
# echo "dump_all/W 1" | adb -w /stand/vmunix /dev/kmem
dump_all: 0 = 1
```

To deactivate use:

```
# echo "dump_all/W 0" | adb -w /stand/vmunix /dev/kmem
dump_all: 1 = 0
```

## Special Operating System Considerations

### AIX

#### full core option

AIX has a system-wide "full core" option which must be enabled in order for "user data" areas of memory to be written to the coredump. Without these areas of memory in the coredump, many types of problems cannot be diagnosed, and `dbx` will have problems analyzing the coredump of a threaded process.

It is very important to enable the "full core" option so that all the necessary information is in the coredump.

Here is an example scenario from a dump which was not recorded properly because `Enable full CORE dump` was `false`:

```
[trawick@gorthaur platform_test]$ dbx ./a.out /tmp/core
Type 'help' for help.
warning: The core file is truncated. You may need to increase
the ulimit for file and coredump, or free some space on the filesystem.
reading symbolic information ...
[using memory image in /tmp/core]
warning: Unable to access address 0xf0203a48 from core
pthdb_session.c, 487: 1 PTHDB_CALLBACK (callback failed)
k_thread.c, 2124: PTHDB_CALLBACK (callback failed)
Segmentation fault in sig_coredump at line 24
24 kill(ap_my_pid, sig);
(dbx) up
warning: Unable to access address 0x8 from core
not that many levels
(dbx)
```

Once the "full core" option was enabled, the proper information was recorded and dbx could be used to determine the cause of the segfault.
##### checking if full core is set

Run this command:

`lsattr -El sys0 -a fullcore`

The desired output is:

```
fullcore true Enable full CORE dump True
```

If either the second word or the last word of the output is not "true", then the full core option is not currently enabled. (Under some conditions, the full core option may not take effect immediately if set from `smitty chgsys`.)

##### enabling full core

Run this command to enable the option immediately:

`chdev -l sys0 -a fullcore=true`

**Important note:** If the full core option took effect after the crashing application was started, the application should be stopped and then started again so that full core dumps are written.

Again, verify with the `lsattr` command above that the setting took effect.

#### Large file support

Occasionally, core dumps will exceed 2GB in size. Thus, the filesystem which holds the coredump directory must support large files. This is specified during the creation of the JFS filesystem.

`# lsfs -q` should report *bf: true*

#### an easy way to set all limits to unlimited for a user

As root, edit `/etc/security/limits` and set everything to -1, then have the user log out and back in. Since the server is normally started as `root`, the user of interest is normally `root`.

Here is what the settings for the user should look like:

```
trawick:
        fsize = -1
        core = -1
        cpu = -1
        data = -1
        rss = -1
        stack = -1
        nofiles = -1
```

`ulimit -a` should show something like this:

```
trawick@tetra:~/wrk/port/testtool/platform_test% ulimit -a
core file size (blocks)     unlimited
data seg size (kbytes)      unlimited
file size (blocks)          unlimited
max memory size (kbytes)    unlimited
open files                  unlimited
pipe size (512 bytes)       64
stack size (kbytes)         unlimited
cpu time (seconds)          unlimited
max user processes          128
virtual memory (kbytes)     unlimited
```
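
The limit and permission checks described in this document can also be scripted. The following is a minimal sketch, assuming Python 3 with the standard `resource` module is available on the host. The function names are illustrative only and are not part of IBM HTTP Server or ihsdiag. The script reports on the user and shell that run it, so the limit check is only meaningful when run as the user that starts the server (usually `root`), and the directory check only when run as the web server user id.

```python
import os
import resource


def limit_problems(core_limit, fsize_limit):
    """Return warnings for the given soft limits (in bytes, as reported by getrlimit)."""
    problems = []
    if core_limit == 0:
        problems.append("core file size limit is 0; no core dump will be written")
    elif core_limit != resource.RLIM_INFINITY:
        problems.append("core file size is limited to %d bytes; cores may be truncated" % core_limit)
    if fsize_limit != resource.RLIM_INFINITY:
        problems.append("file size is limited to %d bytes; cores may be truncated" % fsize_limit)
    return problems


def current_limit_problems():
    """Check the soft limits in force for the current process."""
    core_soft, _core_hard = resource.getrlimit(resource.RLIMIT_CORE)
    fsize_soft, _fsize_hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    return limit_problems(core_soft, fsize_soft)


def directory_problems(directory):
    """Check that the current user could create a file in the coredump directory."""
    if not os.path.isdir(directory):
        return ["%s is not a directory" % directory]
    if not os.access(directory, os.W_OK | os.X_OK):
        return ["%s is not writable by the current user" % directory]
    return []


if __name__ == "__main__":
    # /tmp matches the CoreDumpDirectory example used in this document.
    for problem in current_limit_problems() + directory_problems("/tmp"):
        print("WARNING:", problem)
```

An empty result does not guarantee that a core dump will be written. Settings such as the AIX full core option, Solaris `coreadm`, and the Linux `core_pattern` are outside what this sketch examines.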