Home | Previous Page | Next Page   Appendix A. Estimating Table and Index Size > Index Page-Size Estimates >

Estimating Bitmap Index Size

You can create bitmap versions of B-tree indexes, which can be used for tables that are updated, and of GK indexes, which can be used only for STATIC tables, which are not updated.

Conventional B-tree index and B-tree indexes with compressed bitmap leaf pages differ only in the way in which the duplicate list for each key is stored at the leaf level. In a B-tree index, duplicate values are stored in the same way as unique values, one in each slot of an index leaf page. In a bitmap index, duplicate values are stored as a bit map in a leaf page, with a bit mapped to each row of the table. For rows that contain the specific value, the corresponding bit is marked ON.

You might create a bitmap index if the column to be indexed contains many identical values. The more identical values that the column contains, the more efficient the index is, both in storage and processing. The more accurately you can estimate the number of identical values in the key columns, the more accurate your estimate of the index size and storage efficiency will be.

For example, if the indexed column can contain only two values, such as M and F, calculating the storage efficiency of a bitmap index is easy. On the other hand, if a column can contain fifty or sixty values, such as the ages of employees, calculating the efficiency of the index is harder, but a bitmap index on this column might still store key values more efficiently than a conventional B-tree index.

Follow these steps to determine the efficiency of a bitmap index on a table column and to estimate the disk space that the index will require:

  1. Use the maximum number of table rows that can fit in a table page of 255 slots to calculate the integer i, which is the first power of 2 that is greater than or equal to the number of rows on a data page.

    For example, if a data page can contain a maximum of 35 rows, i is 6, because 25 < 35 <= 26. In fact, the value of i is 6 for any number of rows between 33 and 64.

  2. Estimate the average distance between two rows that have keys with the same value, assuming that keys with identical values are distributed evenly in the table fragment:
  3. To calculate the integer, N, for a data fragment that contains a total number of pages, P, use the following formula:
    N = P * 2i

    For example, if P is 10,000 and i is 6, then N is 640,000.

  4. Given these formulas, use the estimate of the number of rows in the table and the number of duplicate key values to estimate the minimum space that is required for each bitmap at the leaf nodes of the index:

Storage efficiency increases as the distance between rows with the same key decreases. For example, if the average row identifier distance is 75, the space required for a key with 100 duplicates is 824 bytes.

bitmap_size = 24 + (8 * 100) = 824 bytes

In this example, the 824 bytes to store the bitmap is more than the space needed for the key in a conventional index (5 * 100 = approximately 500 bytes plus the size of the key value).

However, if the value of N is 640,000, and the row identifier distance between duplicate key records is 40, the space required for a key is 80,028 bytes.

bitmap_size = 28 + 640,000/8 = 80,028 bytes

In this case, the bitmap is much more efficient than the 3,200,000 bytes needed to store the duplicate row identifiers in a conventional index.

If the bitmap would be larger than the row identifier list of a conventional B-tree index, the database server uses the row identifier list representation for the key instead of the bitmap representation.

As with conventional B-tree indexes, only one bitmap value is permitted on an overflow page.

In addition, bitmaps are aligned on the leaf nodes on 4-byte boundaries. On each page, gaps of 0 to 3 bytes might exist between the end of the key and the beginning of its bitmap. Consider the previous example, where N is 640,000, the row identifier distance between duplicate-key records is 40, and the space that is required for the bitmap is 80,028 bytes. If each index leaf page contains 4,000 free bytes and if the key size is 5 bytes, the bitmap is broken up into 3992-byte pieces because the space required for the key is rounded up to 8 bytes.

Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]