The user-defined statistics information in sysdistrib system catalog table is used in the following ways:
The dbschema utility with its -hd option displays statistics data for tables in your database. It can automatically display statistics for all columns with built-in data types (except TEXT and BYTE). It cannot automatically collect statistics for columns with user-defined data types because it does not know the structure of these data types.
For dbschema -hd to display statistics for a column with a user-defined data type, you must write a user-defined function named statprint( ) that generates text output of the statistics collected for your user-defined data type. The dbschema -hd command obtains the user-defined statistics from the encdat column of the sysdistrib system catalog table. The encdat column stores the statistics data in the stat opaque type. Therefore, dbschema must call the statprint( ) function for your user-defined data type to convert the statistics data from the stat data type to an LVARCHAR value that can be displayed.
To provide display statistics for your user-defined data type, you must:
When you declare your statistics-display function, it must have the following signature:
mi_lvarchar *statprint(udt_arg, stat_arg) udt_type *udt_arg; mi_statret *stat_arg;
BladeSmith automatically generates an OpaqueStatPrint( ) function (in which Opaque is the name of your opaque data type) with the following declaration:
mi_lvarchar *OpaqueStatCollect(Gen_dummy, Gen_bvin)
void *Gen_dummy;
mi_lvarchar *Gen_bvin;
If this declaration is not appropriate for your opaque type, you must customize the OpaqueStatPrint( ) function.
The statprint( ) function converts the statistics data stored in the stat data type to an LVARCHAR value that the database server can use to display information. The stat data type is a multirepresentational data type that the database server uses to store statistics data in the encdat column of the sysdistrib system catalog table.
As with any user-defined function, you register the statistics-display function with the CREATE FUNCTION statement. The registration of this function has the following requirements:
The database server handles routine resolution based on the data type of the first argument to statprint( ). If the name of your C statistics-collection function is not statprint( ), specify the C function name in the EXTERNAL NAME clause.
The following CREATE FUNCTION statement registers a statistics-collection function:
CREATE FUNCTION statprint(ll_arg longlong, num_rows stat) RETURNING LVARCHAR EXTERNAL NAME '/usr/udrs/bin/longlong.so(stat_print_ll)' LANGUAGE C;
After you register the statprint( ) function, make sure those with the DBA privilege and the table owner have the Execute privilege for the function.
For SQL statements that use user-defined data types, the optimizer can call custom selectivity and cost functions. Selectivity and cost functions might need to use statistics about the nature of the data in a column. When you create the statcollect( ) function that collects statistics for a UDT, the database server executes this function automatically when a user runs the UPDATE STATISTICS statement with the MEDIUM or HIGH keyword.
The statistics that the database server collects might require a smart large object for storage. The configuration parameter SBSSPACENAME specifies an sbspace for storing this information. If SBSSPACENAME is not set, the database server might not be able to collect the specified statistics.
The query optimizer can use data distributions when it assesses the selectivity of a query filter. The selectivity is the number of rows that the filter will return. For queries that involve columns with built-in data types, the database server uses data distributions to automatically determine selectivity for the following kinds of filters:
However, if the query involves columns with user-defined data types, you must provide the following information for the query optimizer to be able to determine the filter selectivity:
For user-defined types, these built-in operator functions do not automatically exist. You must write versions of these functions that handle your user-defined type.
Selectivity and cost functions might need to use statistics about the nature of the data in a column. If you want these selectivity functions to use data distributions, take the following actions:
For more information on how to write and register selectivity functions, see Writing Selectivity and Cost Functions.