Enterprise Edition Home | Express Edition Home | Previous Page | Next Page   Table Performance Considerations > Estimating Table Size >

Estimating Data Pages

How you estimate the data pages of a table depends on whether that table contains fixed-length or variable-length rows.

Estimating Tables with Fixed-Length Rows

Perform the following steps to estimate the size (in pages) of a table with fixed-length rows. A table with fixed-length rows has no columns of VARCHAR or NVARCHAR data type.

To estimate the page size, row size, number of rows, and number of data pages
  1. Use onstat -b to obtain the size of a page.

    The buffer size field in the last line of this output displays the page size.

  2. Subtract 28 from this amount to account for the header that appears on each data page.

    The resulting amount is referred to as pageuse.

  3. To calculate the size of a row, add the widths of all the columns in the table definition. TEXT and BYTE columns each use 56 bytes.

    If you have already created your table, you can use the following SQL statement to obtain the size of a row:

    SELECT rowsize FROM systables WHERE tabname = 
       'table-name'; 
  4. Estimate the number of rows that the table is expected to contain.

    This number is referred to as rows.

    The procedure for calculating the number of data pages that a table requires differs depending on whether the row size is less than or greater than pageuse.

  5. If the size of the row is less than or equal to pageuse, use the following formula to calculate the number of data pages.

    The trunc() function notation indicates that you are to round down to the nearest integer.

    data_pages = rows / trunc(pageuse/(rowsize + 4))

    The maximum number of rows per page is 255, regardless of the size of the row.

    Important:
    Although the maximum size of a row that the database server accepts is approximately 32 kilobytes, performance degrades when a row exceeds the size of a page. For information about breaking up wide tables for improved performance, see Denormalizing the Data Model to Improve Performance.
  6. If the size of the row is greater than pageuse, the database server divides the row between pages.

    The page that contains the initial portion of a row is called the home page. Pages that contains subsequent portions of a row are called remainder pages. If a row spans more than two pages, some of the remainder pages are completely filled with data from that row. When the trailing portion of a row uses less than a page, it can be combined with the trailing portions of other rows to fill out the partial remainder page. The number of data pages is the sum of the home pages, the full remainder pages, and the partial remainder pages.

    1. Calculate the number of home pages.

      The number of home pages is the same as the number of rows:

      homepages = rows
    2. Calculate the number of full remainder pages.

      First calculate the size of the row remainder with the following formula:

      remsize = rowsize - (pageuse + 8)

      If remsize is less than pageuse - 4, you have no full remainder pages.

      If remsize is greater than pageuse - 4, use remsize in the following formula to obtain the number of full remainder pages:

      fullrempages = rows * trunc(remsize/(pageuse - 8))
    3. Calculate the number of partial remainder pages.

      First calculate the size of a partial row remainder left after you have accounted for the home and full remainder pages for an individual row. In the following formula, the remainder() function notation indicates that you are to take the remainder after division:

      partremsize = remainder(rowsize/(pageuse - 8)) + 4

      The database server uses certain size thresholds with respect to the page size to determine how many partial remainder pages to use. Use the following formula to calculate the ratio of the partial remainder to the page:

      partratio = partremsize/pageuse 

      Use the appropriate formula in the following table to calculate the number of partial remainder pages.

      partratio Value
      Formula to Calculate the Number of Partial Remainder Pages
      Less than .1
      partrempages =
      rows/(trunc((pageuse/10)/remsize) + 1)
      Less than .33
      partrempages =
      rows/(trunc((pageuse/3)/remsize) + 1)
      .33 or larger
      partrempages = rows
    4. Add up the total number of pages with the following formula:
      tablesize = homepages + fullrempages + partrempages

Estimating Tables with Variable-Length Rows

When a table contains one or more VARCHAR or NVARCHAR columns, its rows can have varying lengths. These varying lengths introduce uncertainty into the calculations. You must form an estimate of the typical size of each VARCHAR column, based on your understanding of the data, and use that value when you make the estimates.

Important:
When the database server allocates space to rows of varying size, it considers a page to be full when no room exists for an additional row of the maximum size.

To estimate the size of a table with variable-length rows, you must make the following estimates and choose a value between them, based on your understanding of the data:

To estimate the maximum number of data pages
  1. To calculate rowsize, add together the maximum values for all column widths.
  2. Use this value for rowsize and perform the calculations described in Estimating Tables with Fixed-Length Rows. The resulting value is called maxsize.
To estimate the projected number of data pages
  1. To calculate rowsize, add together typical values for each of your variable-width columns. It is suggested that you use the most frequently occurring width within a column as the typical width for that column. If you do not have access to the data or do not want to tabulate widths, you might choose to use some fractional portion of the maximum width, such as 2/3 (.67).
  2. Use this value for rowsize and perform the calculations described in Estimating Tables with Fixed-Length Rows. The resulting value is called projsize.

Selecting an Intermediate Value for the Size of the Table

The actual table size should fall somewhere between projsize and maxsize. Based on your knowledge of the data, choose a value within that range that seems most reasonable to you. The less familiar you are with the data, the more conservative (higher) your estimate should be.

Enterprise Edition Home | Express Edition Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]