Example of Query-Plan Execution

Assume also that no indexes are on any of the three tables. Suppose that the optimizer chooses the customer-orders-items path and the nested-loop join for both joins (in reality, the optimizer usually chooses a hash join for two tables without indexes on the join columns). Figure 49 shows the query plan, expressed in pseudocode. For information about interpreting query plan information, see Query Plan Report.

Figure 49. A Query Plan in Pseudocode

for each row in the customer table do:
   read the row into C
   for each row in the orders table do:
      read the row into O
      if O.customer_num = C.customer_num then
         for each row in the items table do:
            read the row into I
            if I.order_num = O.order_num then
               accept the row and send to user
            end if
         end for
      end if
   end for
end for

This example does not describe the only possible query plan. Another plan merely reverses the roles of customer and orders: for each row of orders, it reads all rows of customer, looking for a matching customer_num. It reads the same number of rows in a different order and produces the same set of rows in a different order. In this example, no difference exists in the amount of work that the two possible query plans need to do.

Join with Column Filters

The presence of a column filter changes things. A column filter is a WHERE expression that reduces the number of rows that a table contributes to a join. The following example shows the preceding query with a filter added:

The expression O.paid_date IS NULL filters out some rows, reducing the number of rows that are used from the orders table. Consider a plan that starts by reading from orders. Figure 50 displays this sample plan in pseudocode.

Figure 50. Query Plan That Uses a Column Filter

for each row in the orders table do:
   read the row into O
   if O.paid_date is null then
      for each row in the customer table do:
         read the row into C
         if O.customer_num = C.customer_num then
            for each row in the items table do:
               read the row into I
               if I.order_num = O.order_num then
                  accept row and return to user
               end if
            end for
         end if
      end for
   end if
end for

Let pdnull represent the number of rows in orders that pass the filter. It is the value of COUNT(*) that results from the following query:

If one customer exists for every order, the plan in Figure 50 reads the following rows:

Figure 51 shows an alternative execution plan that reads from the customer table first.

Figure 51. The Alternative Query Plan in Pseudocode

for each row in the customer table do:
   read the row into C
   for each row in the orders table do:
      read the row into O
      if O.paid_date is null and
           O.customer_num = C.customer_num then
         for each row in the items table do:
            read the row into I
            if I.order_num = O.order_num then
               accept row and return to user
            end if
         end for
      end if
   end for

Because the filter is not applied in the first step that Figure 51 shows, this plan reads the following rows:

The query plans in Figure 50 and Figure 51 produce the same output in a different sequence. They differ in that one reads a table pdnull times, and the other reads a table SELECT COUNT(*) FROM customer times. By choosing the appropriate plan, the optimizer can save thousands of disk accesses in a real application.

Join with Indexes

The preceding examples do not use indexes or constraints. The presence of indexes and constraints provides the optimizer with options that can greatly improve query-execution time. Figure 52 shows the outline of a query plan for the previous query as it might be constructed using indexes.

Figure 52. Query Plan with Indexes

for each row in the customer table do:
   read the row into C
   look up C.customer_num in index on orders.customer_num
   for each matching row in the orders index do:
      read the table row for O
      if O.paid_date is null then
         look up O.order_num in index on items.order_num
         for each matching row in the items index do:
            read the row for I
            construct output row and return to user
         end for
      end if
   end for
end for

The keys in an index are sorted so that when the database server finds the first matching entry, it can read any other rows with identical keys without further searching, because they are located in physically adjacent positions. This query plan reads only the following rows:

This query plan achieves a great reduction in cost compared with plans that do not use indexes. An inverse plan, reading orders first and looking up rows in the customer table by its index, is also feasible by the same reasoning.

The physical order of rows in a table also affects the cost of index use. To the degree that a table is ordered relative to an index, the overhead of accessing multiple table rows in index order is reduced. For example, if the orders table rows are physically ordered according to the customer number, multiple retrievals of orders for a given customer would proceed more rapidly than if the table were ordered randomly.

In some cases, using an index might incur additional costs. For more information, see Index Lookup Costs.