To manage checkpoints efficiently, the database server designates a set of fuzzy buffers that can be flushed in the background between checkpoints, thus reducing the number of buffers that must be flushed during checkpoints.
A checkpoint occurs in the following circumstances:
To specify checkpoint intervals in terms of time or the number of logical log records created, set the configuration parameters described in Specifying the Interval Between Checkpoints.
For information, see Managing the Logical Log.
For information, see Managing the Physical Log.
Make sure that pages are cleaned often enough for the sqlexec thread that executes a query or transaction to find available pages in shared memory buffers. If the sqlexec thread cannot find available pages in the buffer pool, it writes its data to disk (a foreground write) and waits for buffer pages to be freed.
Foreground writes should be eliminated or kept to a minimum. If foreground writes occur, increase the number of LRU queues or increase the size of the buffer pool. For information, see Specifying the Number of Least Recently Used Queues. To monitor the frequency of foreground writes, use xctl onstat -F. Use onstat -g ckp for checkpoint data including fuzzy writes.
The output of this command includes the following information:
At checkpoints, the page cleaners should be writing to chunks. Generally, database servers that run OLTP applications should generate higher numbers in the LRU Writes column and database servers that run DSS applications should generate higher numbers in the Chunk Writes column.
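The comparison of write counters described above can be sketched as a small helper. This is an illustrative sketch only: the function name, its arguments, and the returned keys are hypothetical, and the three counters are assumed to have been read by hand from the monitoring output rather than parsed from it.

```python
def summarize_write_mix(foreground_writes, lru_writes, chunk_writes):
    """Summarize how page writes are split among the three write types.

    All three arguments are counters copied manually from the monitoring
    output (an assumption of this sketch, not a server interface).
    """
    total = foreground_writes + lru_writes + chunk_writes
    # Share of all writes that were foreground writes; ideally zero.
    foreground_pct = 100.0 * foreground_writes / total if total else 0.0

    if lru_writes > chunk_writes:
        dominant = "LRU Writes"      # pattern the text associates with OLTP
    elif chunk_writes > lru_writes:
        dominant = "Chunk Writes"    # pattern the text associates with DSS
    else:
        dominant = "balanced"

    return {
        "foreground_pct": foreground_pct,
        "foreground_writes_present": foreground_writes > 0,
        "dominant": dominant,
    }
```

A nonzero foreground_pct suggests revisiting the number of LRU queues or the size of the buffer pool, as described earlier.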
You adjust the interval between checkpoints primarily to manage tradeoffs between performance and fast recovery after emergency shutdown. Backups, restores, and fast recovery after emergency shutdown take less time if checkpoints occur often, but transaction performance improves if frequently used pages in buffers are flushed to disk less often.
To specify the interval between checkpoints, use the CKPTINTVL configuration parameter. Specify this interval as a number of seconds.
In most instances, fuzzy checkpoints are performed instead of full checkpoints to improve transaction throughput.
For information about when fuzzy checkpoints are performed and when full checkpoints are performed, refer to the discussion of checkpoints in the IBM Informix: Extended Parallel Server Administrator's Guide.
The database server writes a message to the message log to note the time that it completes a checkpoint. To read these messages, use onstat -m. Checkpoints also occur whenever the physical log becomes 75 percent full. However, with fuzzy checkpoints the physical log does not fill as rapidly because fuzzy operations in pages are not physically logged. If you set CKPTINTVL to a long interval, you can use physical-log capacity to trigger checkpoints based on actual database activity instead of at a fixed interval. Nevertheless, a long checkpoint interval can increase the time that is needed for recovery if the system fails.
Depending on your throughput and data-availability requirements, you can choose an initial checkpoint interval of five, ten, or fifteen minutes, with the understanding that checkpoints might occur more often if physical-logging activity requires them because checkpoints must occur whenever the physical log becomes 75 percent full. The default interval is five minutes.
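The two checkpoint triggers discussed above (the configured interval and the 75 percent physical-log threshold) can be modeled as a simple check. This is a minimal sketch for reasoning about the behavior, not a server interface: the function name and its arguments are hypothetical, and only the 75 percent threshold and the interval in seconds come from the text.

```python
def checkpoint_trigger(seconds_since_last, ckptintvl_seconds,
                       physlog_used_pages, physlog_size_pages):
    """Return the reason a checkpoint would be triggered, or None.

    Hypothetical model: the inputs are values an administrator might
    observe, not values returned by any database server call.
    """
    if physlog_size_pages <= 0:
        raise ValueError("physlog_size_pages must be positive")

    # Physical log at least 75 percent full (integer arithmetic avoids
    # floating-point rounding at the threshold).
    if physlog_used_pages * 4 >= physlog_size_pages * 3:
        return "physical log 75 percent full"

    # Configured checkpoint interval has elapsed.
    if seconds_since_last >= ckptintvl_seconds:
        return "interval elapsed"

    return None
```

With a long interval such as 900 seconds, a busy system would typically reach the physical-log condition first, which is the activity-based behavior the text describes.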
Consider that a normal OLTP workload requires about one minute of recovery time for about five minutes of work. A very heavy OLTP workload might require one minute of recovery time for only three minutes of work. DSS workloads, because they are read-intensive, can achieve a one-minute recovery time with less frequent checkpoints.
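The rules of thumb above can be turned into a rough worst-case estimate. The sketch below assumes that recovery time scales linearly with the amount of work done since the last checkpoint and that the failure occurs just before the next checkpoint; both are simplifying assumptions, and the function and dictionary names are hypothetical.

```python
# Minutes of work that correspond to one minute of recovery time,
# taken from the rules of thumb in the text.
WORK_MINUTES_PER_RECOVERY_MINUTE = {
    "normal_oltp": 5.0,
    "heavy_oltp": 3.0,
}


def estimated_recovery_minutes(checkpoint_interval_minutes,
                               workload="normal_oltp"):
    """Estimate worst-case recovery time for a given checkpoint interval.

    Assumes linear scaling and a failure just before the next checkpoint.
    """
    ratio = WORK_MINUTES_PER_RECOVERY_MINUTE[workload]
    return checkpoint_interval_minutes / ratio
```

For example, under these assumptions a 15-minute interval corresponds to about 3 minutes of recovery for a normal OLTP workload and about 5 minutes for a very heavy one.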
The buffer pool is distributed among least-recently-used (LRU) queues. Each LRU queue consists of a set of dirty pages and a set of unchanged pages. The dirty pages are flushed to disk either by a background flusher thread or during a checkpoint.
The LRUS configuration parameter specifies the number of LRU queue pairs to set up within the shared-memory buffer pool. Configuring more LRU queues allows more page cleaners to operate. Unless you also increase the value of the BUFFERS parameter correspondingly, increasing the number of LRU queues reduces the size of each queue. For a single-processor system, the recommended setting for the LRUS parameter is a minimum of 4. For multiprocessor systems, set the LRUS parameter to a minimum of 4 or NUMCPUVPS, whichever is greater.
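The relationship between the number of queues and the size of each queue can be illustrated with simple arithmetic. The sketch below assumes that buffers are spread evenly across the LRU queue pairs, which is an approximation made for illustration; the function name is hypothetical.

```python
def buffers_per_lru_queue(buffers, lrus):
    """Approximate number of buffers per LRU queue pair.

    Assumes an even distribution of BUFFERS across LRUS queue pairs.
    """
    if lrus < 1:
        raise ValueError("lrus must be at least 1")
    return buffers // lrus
```

For example, with BUFFERS set to 20000, raising LRUS from 4 to 16 reduces the approximate queue size from 5000 buffers to 1250 buffers unless BUFFERS is raised as well.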
For information about the function and structure of the LRU queue pairs, see the IBM Informix: Extended Parallel Server Administrator's Reference.
To increase write-cache rates and bring them up to at least 85 percent, adjust the LRU_MAX_DIRTY and LRU_MIN_DIRTY configuration parameters. For information about increasing the corresponding read-cache rate, see Tuning BUFFERS to Improve the Read-Cache Rate.
The LRU_MAX_DIRTY and LRU_MIN_DIRTY configuration parameters control how often pages are flushed to disk between full checkpoints.
To monitor the percentage of dirty pages in LRU queues, use xctl onstat -R. If the percentage of dirty pages consistently exceeds the LRU_MAX_DIRTY limit, you have too few LRU queues or too few page cleaners. First use the LRUS parameter to increase the number of LRU queues. If the percentage of dirty pages still exceeds LRU_MAX_DIRTY, use the CLEANERS parameter to increase the number of page cleaners.
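The tuning order described above can be sketched as a small decision helper. This is an illustration only: the function name and arguments are hypothetical, the samples are assumed to be dirty-page percentages noted by hand over time, and "consistently" is interpreted here as every sample exceeding the limit.

```python
def next_tuning_step(dirty_pct_samples, lru_max_dirty, lrus_already_increased):
    """Suggest the next tuning step from observed dirty-page percentages.

    dirty_pct_samples: percentages of dirty pages observed over time.
    lru_max_dirty: the configured LRU_MAX_DIRTY percentage.
    lrus_already_increased: True if LRUS has already been raised.
    """
    # "Consistently exceeds" is modeled as: every sample is over the limit.
    consistently_over = bool(dirty_pct_samples) and all(
        pct > lru_max_dirty for pct in dirty_pct_samples
    )
    if not consistently_over:
        return "no change"
    if not lrus_already_increased:
        return "increase LRUS"
    return "increase CLEANERS"
```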
The CLEANERS configuration parameter specifies the number of page-cleaner threads to run during checkpoints. Because the database server writes to chunks during checkpoints, the number of cleaners is determined by the average number of chunks to which checkpoints write.
For installations that support fewer than 20 disks, one page-cleaner thread is recommended for each disk that contains database server data. For installations that support between 20 and 100 disks, one page-cleaner thread is recommended for every two disks. For larger installations, one page-cleaner thread is recommended for every four disks.
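The sizing guideline above can be expressed as a short function. This is a sketch of the stated rule of thumb, not an official formula: the function name is hypothetical, and rounding up when the number of disks does not divide evenly is an assumption.

```python
import math


def recommended_cleaners(disks):
    """Suggest a CLEANERS value from the number of disks holding server data.

    Rounding up for odd disk counts is an assumption of this sketch.
    """
    if disks < 1:
        raise ValueError("disks must be at least 1")
    if disks < 20:
        return disks                   # one cleaner per disk
    if disks <= 100:
        return math.ceil(disks / 2)    # one cleaner per two disks
    return math.ceil(disks / 4)        # one cleaner per four disks
```

Note that the guideline is not continuous at its boundaries: 19 disks yield 19 cleaners, while 20 disks yield 10. Treat the result as a starting point and adjust it according to the observed percentage of dirty pages.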