You can use the LOW keyword with the UPDATE STATISTICS statement for all non-index leading columns that are part of an index.
You can also use the MEDIUM or HIGH keywords with the UPDATE STATISTICS statement to specify the mode for data distributions on specific columns. These keywords indicate that the database server is to generate statistics about the distribution of data values for each specified column and place that information in a system catalog table called sysdistrib. If a distribution has been generated for a column, the optimizer uses that information to estimate the number of rows that match a query against a column. Data distributions in sysdistrib supersede values in the colmin and colmax column of the syscolumns system catalog table when the optimizer estimates the number of rows returned.
When you use data-distribution statistics for the first time, try to update statistics in MEDIUM mode for all your tables and then update statistics in HIGH mode for all columns that head indexes. This strategy produces statistical query estimates for the columns that you specify. These estimates, on average, have a margin of error less than percent of the total number of rows in the table, where percent is the value that you specify in the RESOLUTION clause in the MEDIUM mode. The default percent value for MEDIUM mode is 2.5 percent. (For columns with HIGH mode distributions, the default resolution is 0.5 percent.) For each table that your query accesses, build data distributions according to the following guidelines.
This step is a single UPDATE STATISTICS statement. The default parameters are sufficient unless the table is very large, in which case you should use a resolution of 1.0 and confidence of 0.99.
With the DISTRIBUTIONS ONLY option, you can execute UPDATE STATISTICS MEDIUM at the table level or for the entire system because the overhead of the extra columns is not large. Run the following UPDATE STATISTICS statement to create distributions for non-index join columns and non-index filter columns:
UPDATE STATISTICS MEDIUM DISTRIBUTIONS ONLY;
For example, suppose you have a table t1 with columns a, b, c, d, e, and f with the following indexes:
CREATE INDEX ix_1 ON t1 (a, b, c, d) ... CREATE INDEX ix_3 ON t1 (f) ...
Run the following UPDATE STATISTICS statements for the columns that head an index:
UPDATE STATISTICS HIGH FOR TABLE t1(a); UPDATE STATISTICS HIGH FOR TABLE t1(f);
These UPDATE STATISTICS HIGH statements replace the medium distributions that the previous step creates with high distributions for index columns.
For example, suppose you have the following indexes on table t1:
CREATE INDEX ix_1 ON t1 (a, b, c, d) ... CREATE INDEX ix_2 ON t1 (a, b, e, f) ... CREATE INDEX ix_3 ON t1 (f) ...
Step 2 executes UPDATE STATISTICS HIGH on column a and column f by. Then run UPDATE STATISTICS HIGH on columns c and e.
UPDATE STATISTICS HIGH FOR TABLE t1(c); UPDATE STATISTICS HIGH FOR TABLE t1(e);
In addition, you can run UPDATE STATISTICS HIGH on column b, but this step is usually not necessary.
UPDATE STATISTICS FOR TABLE t1(a,b,c,d); UPDATE STATISTICS FOR TABLE t1(a,b,e,f);
UPDATE STATISTICS HIGH FOR TABLE t2;
Because the statement constructs the statistics only once for each index, these steps ensure that UPDATE STATISTICS executes rapidly.
For additional information about data distributions and the UPDATE STATISTICS statement, see the IBM Informix: Guide to SQL Syntax.
Enterprise Edition Home | Express Edition Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]