The VARIANCE function returns an estimate of the population variance, as the standard deviation squared. VARIANCE calculates the following value:
(SUM(Xi2) - (SUM(Xi)2)/N)/(N - 1)
In this formula, Xi is each value in the column and N is the total number of non-NULL values in the column (unless all values are NULL, in which case the variance is logically undefined, and the VARIANCE function returns NULL).
You can apply the VARIANCE function only to numeric columns.
The following query estimates the variance of age values for a population:
SELECT VARIANCE(age) FROM u_pop WHERE u_pop.age > 0
As with the other aggregates, the VARIANCE function applies to the rows of a group when the query includes a GROUP BY clause, as in this example:
SELECT VARIANCE(age) FROM u_pop GROUP BY birth WHERE VARIANCE(age) > 0
As previously noted, VARIANCE ignores NULL values unless every qualified row is NULL for a specified column. If every value is NULL, then VARIANCE returns a NULL result for that column. (This typically indicates missing data, and is not necessarily a good estimate of underlying population variance.)
If N, the total number of qualified non-NULL column values, equals one, then the VARIANCE function returns zero (another implausible estimate of the true population variance). To omit this special case, you can modify the query. For example, you might include a HAVING COUNT(*) > 1 clause.
Although DATE values are stored internally as an integer, you cannot use the VARIANCE function on columns of data type DATE.
Enterprise Edition Home | Express Edition Home | [ Top of Page | Previous Page | Next Page | Contents | Index ]