The degree of parallelism for a query refers to the number of subplans that the database server executes in parallel to run the query. For example, a two-table join that six threads execute (with each thread executing one sixth of the required processing) has a higher degree of parallelism than one that two threads execute.
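The idea of splitting one unit of work among several threads can be sketched in a few lines of Python. This is only a toy model of the concept, not how the database server schedules threads; the names `split_evenly` and `parallel_sum` are hypothetical and exist only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def split_evenly(rows, degree):
    """Split rows into `degree` contiguous slices of near-equal size."""
    size, extra = divmod(len(rows), degree)
    slices, start = [], 0
    for i in range(degree):
        # The first `extra` slices take one additional row each.
        end = start + size + (1 if i < extra else 0)
        slices.append(rows[start:end])
        start = end
    return slices

def parallel_sum(rows, degree):
    """Sum rows using `degree` worker threads, one slice per thread."""
    with ThreadPoolExecutor(max_workers=degree) as pool:
        partials = list(pool.map(sum, split_evenly(rows, degree)))
    return sum(partials)

rows = list(range(60))
# Degree 6: each thread handles one sixth of the rows.
# Degree 2: each thread handles one half of the rows.
assert parallel_sum(rows, 6) == parallel_sum(rows, 2) == sum(rows)
```

Both calls return the same result; the only difference is how many threads share the work, which is what the degree of parallelism measures.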
The database server determines the best degree of parallelism for each component of a PDQ query, based on various considerations: the number of available coservers, the number of virtual processors (VPs) on each coserver, the fragmentation of the tables that are being queried, the complexity of the query, and so forth.
The database server achieves a high degree of parallelism because SQL operations are completely parallel. Completely parallel means that Extended Parallel Server runs multiple threads simultaneously on all CPU VPs across all coservers to speed execution of a single query.
The value of PDQPRIORITY does not determine when to use PDQ to process a query in parallel. Even when the value of PDQPRIORITY is 0, the database server executes a query in parallel across all CPU VPs on all coservers.
PDQ provides performance advantages on parallel-processing platforms composed of multiple computers. On a parallel-processing platform, PDQ distributes the execution of a query across available processors on all nodes that support coservers, and takes full advantage of the memory on each of those nodes.
When the connection coserver determines that a query requires access to data that is fragmented across coservers, the database server determines which additional coservers are required to participate in the query. It then divides the query plan into subplans for each of the participating coservers. This division is based on the fragmentation scheme of the tables and the availability of resources on the connection coserver and the participating coservers.
Extended Parallel Server distributes each subplan to the pertinent coserver and executes the subplans in parallel, so that every subplan is processed simultaneously with the others. Because each subplan represents a smaller amount of processing time than the original query plan, the database server can drastically reduce the time required to process the query, compared with the time that would be required if each portion of the query had to be performed consecutively.
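The division into subplans and the merge of their results can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the server's actual algorithm: the coserver names, the `FRAGMENTS` data, and the functions `run_subplan` and `run_query` are all hypothetical, and each fragment is assumed to hold rows that are already joined.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments: coserver name -> (geo_id, dollars) rows stored there.
FRAGMENTS = {
    "coserver_1": [("east", 10), ("west", 5)],
    "coserver_2": [("east", 7), ("north", 3)],
    "coserver_3": [("west", 20)],
}

def run_subplan(rows):
    """Subplan: aggregate only the rows held on one coserver."""
    partial = Counter()
    for geo_id, dollars in rows:
        partial[geo_id] += dollars
    return partial

def run_query(fragments):
    """Run one subplan per participating coserver, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        partials = list(pool.map(run_subplan, fragments.values()))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds the partial sums
    # Order the merged groups by their summed value, ascending.
    return sorted(total.items(), key=lambda item: item[1])

result = run_query(FRAGMENTS)
```

Each subplan touches only its own fragment, so no subplan waits on another; only the final merge and sort run in a single place.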
Parallel execution is extremely useful for decision support queries in which large volumes of data are scanned, joined, and sorted across multiple coservers.
For example, consider the following SQL request:
SELECT geo_id, SUM(dollars)
    FROM customer a, cash b
    WHERE a.cust_id = b.cust_id
    GROUP BY geo_id
    ORDER BY SUM(dollars)
In this example, the connection and participating coservers perform the following tasks: