Run any polars aggregation expression against the array’s elements
Description
This is similar to $arr$eval(). The key difference is that
$arr$agg() automatically explodes the result when the expression
inside returns a scalar, so each sub-array yields a single value,
whereas $arr$eval() always returns an array.
Usage
<Expr>$arr$agg(expr)
Arguments
expr: Expression to run. Note that you can select an element with
pl$element(), pl$first(), and more. See Examples.
Value
A polars expression
Examples
library("polars")
df <- pl$DataFrame(a = list(c(1, NA), c(42, 13), c(NA, NA)))$
  cast(pl$Array(pl$Int64, 2))
# The column "null_count" has dtype u32 because `$null_count()` returns a
# scalar for each sub-array. Using `$arr$eval()` instead would return a
# column with dtype arr(u32).
df$with_columns(
  null_count = pl$col("a")$arr$agg(pl$element()$null_count())
)
#> shape: (3, 2)
#> ┌───────────────┬────────────┐
#> │ a             ┆ null_count │
#> │ ---           ┆ ---        │
#> │ array[i64, 2] ┆ u32        │
#> ╞═══════════════╪════════════╡
#> │ [1, null]     ┆ 1          │
#> │ [42, 13]      ┆ 0          │
#> │ [null, null]  ┆ 2          │
#> └───────────────┴────────────┘
# The column "no_nulls" has dtype list(i64) because the expression is not
# guaranteed to return a scalar.
df$with_columns(
  no_nulls = pl$col("a")$arr$agg(pl$element()$drop_nulls())
)
#> shape: (3, 2)
#> ┌───────────────┬───────────┐
#> │ a             ┆ no_nulls  │
#> │ ---           ┆ ---       │
#> │ array[i64, 2] ┆ list[i64] │
#> ╞═══════════════╪═══════════╡
#> │ [1, null]     ┆ [1]       │
#> │ [42, 13]      ┆ [42, 13]  │
#> │ [null, null]  ┆ []        │
#> └───────────────┴───────────┘