Home | Previous Page | Next Page   Appendix A. Estimating Table and Index Size > Table Size Estimates >

Estimating Data Page Size

How you estimate the data pages of a table depends on whether the length of the table rows is fixed or variable.

Estimating the Size of Tables with Fixed-Length Rows

Perform the following steps to estimate the number of disk pages required for a table with fixed-length rows. A table with fixed-length rows has no columns of type VARCHAR or NVARCHAR.

To estimate the page size, row size, number of rows, and number of data pages
  1. Subtract 60 bytes from the page size for the header on each data page. The resulting amount is referred to as pageuse.

    The page size can be set to 2048, 4096, or 8092 bytes, as described in the IBM Informix: Administrator's Reference.

  2. To calculate the size of a row, add the widths of all the columns in the table definition. TEXT and BYTE columns each use 56 bytes. If you have already created the table, use the following SQL statement to obtain the size of a row:
    SELECT rowsize FROM systables
       WHERE tabname = 'table-name'; 
  3. Estimate the number of rows that the table should contain. This number is referred to as rows.

    The procedure for how to calculate the number of data pages that a table requires differs depending on whether the row size is less than or greater than pageuse.

  4. If the size of the row is less than or equal to pageuse, use the following formula to calculate the number of data pages. The trunc() function notation indicates that you should round down to the nearest integer.
    data_pages = rows / trunc(pageuse/(rowsize + 4))

    The maximum number of rows per page is 255, regardless of the size of the row.

  5. If the size of the row is greater than pageuse, the database server divides the row among pages. The page that contains the initial portion of a row is called the home page. Pages that contain subsequent portions of a row are called remainder pages. If a row spans more than two pages, some of the remainder pages are completely filled with data from that row. When the trailing portion of a row uses less than a page, it can be combined with the trailing portions of other rows to fill out the partial remainder page. The number of data pages is the sum of the home pages, the full remainder pages, and the partial remainder pages.
    1. Calculate the number of home pages. The number of home pages is the same as the number of rows.
      homepages = rows
    2. Calculate the number of full remainder pages. First, to calculate the size of the row remainder, use the following formula:
      remsize = rowsize - (pageuse + 8)

      If remsize is less than pageuse - 4, you have no full remainder pages. If remsize is greater than pageuse - 4, you can use remsize in the following formula to obtain the number of full remainder pages:

      fullrempages = rows * trunc(remsize/(pageuse - 8))
    3. Calculate the number of partial remainder pages. First calculate the size of a partial row remainder left after the home and full remainder pages for an individual row have been accounted for. In the following formula, the remainder() function notation indicates that you should take the remainder after division:
      partremsize = remainder(rowsize/(pageuse - 8)) + 4

      The database server uses page-size thresholds to determine how many partial remainder pages to use. If remsize is greater than pageuse - 4, use the following formula to calculate the ratio of the partial remainder to the page:

      partratio = partremsize/pageuse 

      If remsize is less than pageuse - 4, use remsize instead of partremsize in the formula.

      Use the appropriate formula from the following chart to calculate the number of partial remainder pages (partrempages) by using the value of partratio that you obtain.

      Partratio
      Formula to calculate the number of partial remainder pages
      Less than .1
      partrempages = rows/(trunc((pageuse/10)/remsize) + 1)
      Less than .33
      partrempages = rows /(trunc((pageuse/3)/remsize) + 1)
      .33 or larger
      partrempages = rows
    4. To calculate the total number of pages, use the following formula:
      tablesize = homepages + fullrempages + partrempages
Important:
Although the maximum size of a row that the database server accepts is approximately 32 kilobytes, performance deteriorates when a row exceeds the size of a page. For information on breaking up wide tables for improved performance, refer to IBM Informix: Database Design and Implementation Guide.

Estimating the Size of Tables with Variable-Length Rows

When a table contains one or more VARCHAR or NVARCHAR columns, its rows usually have varying lengths that reduce the precision of the calculations. You must estimate the typical size of each VARCHAR column, based on your understanding of the data, and use that value when you make your estimates.

Important:
When the database server allocates space to rows of varying size, it considers a page to be full when it does not contain enough space for an additional row of the maximum size.

To estimate the size of a table with variable-length rows, make the following estimates and choose a value between them, based on your understanding of the data:

The database server stores the size of the column in an extra byte that is added to each variable-length column.

To estimate the maximum number of data pages
  1. To calculate rowsize, add together the maximum values for all column widths.
  2. Use this value for rowsize and perform the calculations described in Estimating the Size of Tables with Fixed-Length Rows. The resulting value is called maxsize.
To estimate the projected number of data pages
  1. To calculate rowsize, add together typical values for each of your variable-width columns. It is suggested that you use the most frequently occurring width in a column as the typical width for that column. If you do not have access to the data or do not want to tabulate widths, you might choose a fraction of the maximum width, such as two-thirds.
  2. Use this value for rowsize and perform the calculations described in Estimating the Size of Tables with Fixed-Length Rows. The resulting value is called projsize.
Selecting an Intermediate Value for the Size of the Table

The actual table size should fall somewhere between projsize and maxsize. Based on your knowledge of the data, choose a value in that range that seems reasonable to you. The less familiar you are with the data, the higher your estimate should be.

Estimating Dbspace Pages for Simple Large Objects

When you estimate space required for a table, include space for simple large objects that are to be stored in that dbspace. Simple large objects include only TEXT and BYTE data.

To estimate the number of pages for simple large objects
  1. Calculate the usable portion of the page with the following formula:
    bpuse = 4096 - 32

    The 4096 bytes is the default size of a page. If you have set the PAGESIZE configuration parameter to a different number, use that number. The 32 bytes is for overhead.

  2. For each simple large object of size n, calculate the number of pages that the simple large object occupies (bpages_n) with the following formula:
    bpages1 = ceiling(bsize1/bpuse)
    bpages2 = ceiling(bsize2/bpuse)
    ...
    bpages_n = ceiling(bsize_n/bpuse)

    The ceiling() function notation indicates that you should round up to the nearest integer value.

  3. Add up the total number of pages for all simple large objects, as follows:
    total_bpages = bpages1 + bpages2 + ... + bpages_n 

Alternatively, you can base your estimate on the median size of a simple large object, which is the large object size that occurs most frequently. This method is less precise, but it is easier to calculate.

To estimate the number of pages based on the median size of the simple large objects
  1. Average the sizes of the Text or bYTE data to obtain the median size of a simple large object.
    msize = avg(bsize1 + bsize2 +
                ... + bsizen) / n
  2. Calculate the number of pages required for a simple large object of median size as follows:
    mpages = ceiling(msize/bpuse)
  3. Multiply this amount by the total number of simple large objects, as follows:
    total_bpages = bcount * mpages
Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]