Approximate count of unique values

Description

This function is syntactic sugar for pl$col(...)$approx_n_unique(). It uses the HyperLogLog++ algorithm for cardinality estimation, so the result is an estimate of the number of unique values rather than an exact count.
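
The equivalence described above can be checked with a small sketch. The column aliases "sugar" and "explicit" are illustrative names chosen here, not part of the API; the expectation is that both columns hold the same estimate.

```r
library(polars)

df = pl$DataFrame(a = c(1, 8, 1))

# Build the same estimate two ways: via the shortcut and via pl$col().
# The aliases are hypothetical labels so both results fit in one frame.
res = df$select(
  pl$approx_n_unique("a")$alias("sugar"),
  pl$col("a")$approx_n_unique()$alias("explicit")
)

res
```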

Usage

pl$approx_n_unique(...)

Arguments

...: Character strings giving the column names, passed to pl$col(). See ?pl_col for details.

Value

Expr

See Also

  • <Expr>$approx_n_unique()

Examples

library("polars")

df = pl$DataFrame(
  a = c(1, 8, 1),
  b = c(4, 5, 2),
  c = c("foo", "bar", "foo")
)

df$select(pl$approx_n_unique("a"))
#> shape: (1, 1)
#> ┌─────┐
#> │ a   │
#> │ --- │
#> │ u32 │
#> ╞═════╡
#> │ 2   │
#> └─────┘
df$select(pl$approx_n_unique("b", "c"))
#> shape: (1, 2)
#> ┌─────┬─────┐
#> │ b   ┆ c   │
#> │ --- ┆ --- │
#> │ u32 ┆ u32 │
#> ╞═════╪═════╡
#> │ 3   ┆ 2   │
#> └─────┴─────┘
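
Because the result is an estimate, it can help to compare it with the exact count from $n_unique() on a larger column. The following is a minimal sketch under assumed data: the frame `big`, the column name `x`, and the aliases are made up for illustration, and no particular error figure is implied.

```r
library(polars)

# Synthetic data (an assumption for illustration): 200,000 draws from 1..50,000.
set.seed(1)
big = pl$DataFrame(x = sample.int(50000, 200000, replace = TRUE))

# Exact and approximate distinct counts side by side.
cmp = big$select(
  pl$col("x")$n_unique()$alias("exact"),
  pl$approx_n_unique("x")$alias("approx")
)

cmp
```

On inputs like this the two counts are expected to be close but not necessarily equal; use $n_unique() when an exact count is required.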