Many data warehouse databases use a star schema. A star schema consists of a fact table and a number of dimensional tables. The fact table is generally large and contains the quantitative or factual information about the subject. A dimensional table describes an attribute in the fact table.
When a dimension needs lower-level information, the dimension is modeled by a hierarchy of tables, called a snowflake schema.
For more information on star schemas and snowflake schemas, see the IBM Informix: Database Design and Implementation Guide.
Queries that use tables in a star schema or snowflake schema can benefit from the proper index on the fact table. Consider the example of a star schema with one fact table named orders and four dimensional tables named customers, suppliers, products, and clerks. The orders table describes the details of each sale order, which includes the customer ID, supplier ID, product ID, and sales clerk ID. Each dimensional table describes an ID in detail. The orders table is large, and the four dimensional tables are small.
The following query finds the total direct sales revenue in the Menlo Park region (postal code 94025) for hard drives supplied by the Johnson supplier:
SELECT sum(orders.price) FROM orders, customers, suppliers,product,clerks WHERE orders.custid = customers.custid AND customers.zipcode = 94025 AND orders.suppid = suppliers.suppid AND suppliers.name = 'Johnson' AND orders.prodid = product.prodid AND product.type = 'hard drive' AND orders.clerkid = clerks.clerkid AND clerks.dept = 'Direct Sales'
This query uses a typical star join, in which the fact table joins with all dimensional tables on a foreign key. Each dimensional table has a selective table filter.
An optimal plan for the star join is to perform a cartesian product on the four dimensional tables and then join the result with the fact table. The following index on the fact table allows the optimizer to choose the optimal query plan:
CREATE INDEX ON orders(custid,suppid,prodid,clerkid)
Without this index, the optimizer might choose to first join the fact table with one dimensional table and then join the result with the remaining dimensional tables. The optimal plan provides better performance.
Enterprise Edition Home | Express Edition Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]