A common data warehouse problem is loading new rows into populated tables. New rows have the following two possibilities:
The older method of updating tables is to open a cursor and check each row to be loaded against the existing data. However, this method is inefficient in a database server that provides table fragmentation and parallel execution of operations.
Consider an example with an external table new_data and an existing table ship_data. The external table new_data has been created with the SAME AS ship_data clause so that the keys and column definitions match. For information about creating external tables and loading data from the, refer to the IBM Informix: Extended Parallel Server Administrator's Guide.
The efficiency of the method described in this section requires new_data to have significantly fewer rows than ship_data. However, this is usually the case when you add new rows to populated tables.
Follow these steps to update ship_data with new data from new_data:
INSERT INTO inter_data SELECT new_data.* FROM new_data, ship_data WHERE new_data.k = ship_data.k UPDATE ship_data SET ship_data.c1 = CASE WHEN inter_data.c1 is NULL THEN ship_data.c1 ELSE inter_data.c1 END, ... ship_data.cn = CASE WHEN inter_data.cn is NULL THEN ship_data.cn ELSE inter_data.cn END FROM ship_data, inter_data WHERE ship_data.k = inter_data.k;
INSERT INTO ship_data SELECT * from load_data WHERE NOT EXISTS ( SELECT 1 from inter_data WHERE inter_data.k = load_data.k);
This method requires two scans of the large table ship_data, one for the first insert statement and the second for the update statement. As a result, the total cost is approximately the same as two sequential scans of ship_data. If the number of rows in load_data is less than twenty times the number of rows in ship_data, it might help to have an index on the key in ship_data.
If the tables are fragmented, all of the operations are executed in parallel. For best performance, fragment tables by hash or hybrid expression and hash on the key column, so that no data is shipped across the network.
Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]