The optimizer uses the data-distribution statistics that are stored in the sysdistrib system catalog table to determine the lowest-cost query plan. Make sure that you keep these statistics up-to-date so that the optimizer can choose the best query plan.
Run UPDATE STATISTICS as follows when tables are updated to ensure that statistics are also updated.
For detailed information about the syntax and options of the UPDATE STATISTICS statement, refer to the IBM Informix: Guide to SQL Syntax.
For unindexed columns in tables, follow these general guidelines when you run UPDATE STATISTICS:
UPDATE STATISTICS MEDIUM uses a statistical sampling method to create distributions. In MEDIUM mode, UPDATE STATISTICS samples the columns with a default resolution of 1 and a default confidence level of 0.95. In HIGH mode, the default resolution is 0.5 and the confidence level is 0.99.
To bring the MEDIUM mode output closer to the HIGH mode output, use the RESOLUTION clause in the UPDATE STATISTICS statement. HIGH mode statistics are more significant than MEDIUM mode statistics, but in HIGH mode the database server scans the table and records data for each row, which might take a very long time on a large table. The statistical sampling method used by MEDIUM mode is faster. Experiment, and see if UPDATE STATISTICS MEDIUM output data is adequate to produce good query plans.
Occasionally, you might find that you need to run UPDATE STATISTICS in HIGH mode, but in general adjusting the RESOLUTION clause for MEDIUM mode produces good statistics faster.
If you know that a column is likely to contain many duplicate values or skewed data, run UPDATE STATISTICS MEDIUM with a lower resolution. You can specify a resolution as low as 0.005. The RESOLUTION you specify depends on the number of rows in the table and the amount of skewed or duplicated values you expect in the data.
You do not need to collect statistics on columns that are not used in joins and filters. Although in an environment in which users enter ad hoc queries, you cannot be completely sure what columns are used as filters, common sense and an understanding of the kind of information that users probably want to derive from the table data will help you identify such columns.
Make the UPDATE STATISTICS statement part of the script that you run to attach and detach fragments or to execute the batch job.
If columns are indexed, follow these guidelines:
For example, if index ix_1 is defined on columns a, b, c, and d, and index ix_2 is defined on columns a, b, e, and f, run UPDATE STATISTICS HIGH on column a by itself. Then run UPDATE STATISTICS HIGH on columns c and e. In addition, you can run UPDATE STATISTICS HIGH on column b, but this step is usually not necessary.
Because the statement constructs the index information statistics only once for each index, these steps ensure that UPDATE STATISTICS executes rapidly.