polars.DataFrame.approx_n_unique#

DataFrame.approx_n_unique() → DataFrame[source]#

Approximate count of unique values.

Deprecated since version 0.20.11: Use select(pl.all().approx_n_unique()) instead.

This is done using the HyperLogLog++ algorithm for cardinality estimation.

Examples

>>> df = pl.DataFrame(
...     {
...         "a": [1, 2, 3, 4],
...         "b": [1, 2, 1, 1],
...     }
... )
>>> df.approx_n_unique()  
shape: (1, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ u32 ┆ u32 │
╞═════╪═════╡
│ 4   ┆ 2   │
└─────┴─────┘