Enterprise Edition Home | Express Edition Home | Previous Page | Next Page   Effect of Configuration on Memory Utilization > Parameters That Affect Memory Caches >

Data-Distribution Cache

The optimizer uses distribution statistics generated by the UPDATE STATISTICS statement in the MEDIUM or HIGH mode to determine the query plan with the lowest cost. The first time that the optimizer accesses the distribution statistics for a column, the database server retrieves the statistics from the sysdistrib system catalog table on disk. Once the database server has accessed the distribution statistics, it places that information in the data-distribution cache in memory.

Figure 8 shows how the database server accesses the data-distribution cache for multiple users. When the optimizer accesses the column distribution statistics for User 1 for the first time, the database server puts the distribution statistics in the data-distribution cache. When the optimizer determines the query plan for user 2, user 3 and user 4 who access the same column, the database server does not have to read from disk to access the data-distribution information for the table. Instead, it reads the distribution statistics from the data-distribution cache in shared memory.

Figure 8. Data-Distribution Cache
begin figure description - This figure is described in the surrounding text. - end figure description

The database server initially places pages for the sysdistrib system catalog table in the buffer pool as it does all other data and index pages. However, the data-distribution cache offers additional performance advantages. It:

Data-Distribution Configuration

The database server uses a hashing algorithm to store and locate information within the data-distribution cache. The DS_POOLSIZE controls the size of the data-distribution cache and specifies the total number of column distributions that can be stored in the data-distribution cache. To modify the number of buckets in the data-distribution cache, use the DS_HASHSIZE configuration parameter. The following formula determines the number of column distributions that can be stored in one bucket.

Distributions_per_bucket = DS_POOLSIZE / DS_HASHSIZE

To modify the number of distributions per bucket, change either the DS_POOLSIZE or DS_HASHSIZE configuration parameter.

For example, with the default values of 127 for DS_POOLSIZE and 31for DS_HASHSIZE, you can potentially store distributions for about 127 columns in the data-distribution cache. The cache has 31 hash buckets, and each hash bucket can have an average of 4 entries.

The values that you set for DS_HASHSIZE and DS_POOLSIZE, depend on the following factors:

If the size of the data-distribution cache is too small, the following performance problems can occur:

Monitoring the Data-Distribution Cache

To monitor the size and use of the data-distribution cache, run onstat -g dsc or use the ISA Performance -> Cache menu options. You might want to change the values of DS_HASHSIZE and DS_POOLSIZE if you see the following situations:

Figure 7 shows sample output for onstat -g dsc.

Figure 9. onstat -g dsc Output
onstat -g dsc



Distribution Cache:
    Number of lists             : 31
    DS_POOLSIZE                 : 127

Distribution Cache Entries:

list#id ref_cnt dropped? heap_ptr  distribution name
-----------------------------------------------------------------

5     0       0       0   aa8f820  vjp_stores@gilroy:virginia.orders.order_num

12    0       0       0   aa90820  vjp_stores@gilroy:virginia.items.order_num

15    0       0       0   a7e9a38  vjp_stores@gilroy:virginia.customer.customer_num

19    0       0       0   aa3bc20  vjp_stores@gilroy:virginia.customer.lname

21    0       0       0   aa3cc20  vjp_stores@gilroy:virginia.orders.customer_num

28    0       0       0   aa91820  vjp_stores@gilroy:virginia.customer.company


Total number of distribution entries: 6.
    Number of entries in use    : 0

The onstat -g dsc output has the following fields.

Field
Description
Number of Lists
Number of buckets or lists that DS_HASHSIZE specifies
DS_POOLSIZE
Number of column distributions allowed in data-distribution cache
Number of entries
Number of column distributions currently in the data-distribution cache
Number of entries in use
Number of column distributions currently in use
List #
Hash bucket number
Id
Not used
Ref cnt
Number of SQL statements currently referencing the data-distribution information for this column from the cache
Dropped?
Designation if the column distribution has been dropped with the DROP DISTRIBUTIONS keyword on the UPDATE STATISTICS statement.
Heap ptr
Heap pointer
Distribution name
Name of the table and column that the data-distribution information describes
Enterprise Edition Home | Express Edition Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]