APPROX_COUNT_DISTINCT() — Returns an approximate count of the number of distinct values for the specified column expression.
APPROX_COUNT_DISTINCT( column-expression )
The APPROX_COUNT_DISTINCT() function returns an approximation of the number of distinct values for the specified
column expression. APPROX_COUNT_DISTINCT(column-expression) is an alternative to the SQL
expression "COUNT(DISTINCT column-expression)
".
The reason for using APPROX_COUNT_DISTINCT() is because it can be significantly faster and use less temporary memory than performing a precise COUNT DISTINCT operation. This is particularly true when calculating a distinct count of a partitioned table across all of the partitions. The approximation usually falls within ±1% of the actual count.
You can use the APPROX_COUNT_DISTINCT() function on column expressions of decimal, timestamp, or any size integer datatype. You cannot use the function on floating point (FLOAT) or variable length (VARCHAR and VARBINARY) columns.