Apply one or more aggregations to the grouped columns. This can be combined with the full lazy API and is considered idiomatic polars.
// lazy API, rest-parameter style
> df.groupBy('foo', 'bar')
> .agg(pl.sum('ham'), pl.col('spam').tail(4).sum())
// lazy API, array style
> df.groupBy('foo', 'bar')
> .agg([pl.sum('ham'), pl.col('spam').tail(4).sum()])
// use a mapping
> df.groupBy('foo', 'bar')
> .agg({'spam': ['sum', 'min']})
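The mapping style above can be read as "for each group, apply every listed aggregation to the named column". A minimal sketch of that reading in plain TypeScript follows. It does not import polars, and the names `Row`, `aggregators`, `aggByMapping` and the `column_aggregation` output naming are assumptions made for illustration, not the library's implementation.

```typescript
// Illustrative only: plain TypeScript, no polars import.
// Row, aggregators and aggByMapping are hypothetical names.
type Row = Record<string, string | number>;

// Assumed subset of aggregation names understood by this sketch.
const aggregators: Record<string, (xs: number[]) => number> = {
  sum: (xs) => xs.reduce((a, b) => a + b, 0),
  min: (xs) => Math.min(...xs),
};

function aggByMapping(
  rows: Row[],
  keys: string[],
  mapping: Record<string, string[]>
): Row[] {
  // Bucket rows by the JSON-encoded tuple of their key values.
  const buckets: Record<string, Row[]> = {};
  for (const row of rows) {
    const id = JSON.stringify(keys.map((k) => row[k]));
    (buckets[id] = buckets[id] || []).push(row);
  }
  // One output row per group: key columns plus one column per aggregation.
  return Object.keys(buckets).map((id) => {
    const group = buckets[id];
    const out: Row = {};
    for (const k of keys) out[k] = group[0][k];
    for (const column of Object.keys(mapping)) {
      const values = group.map((r) => Number(r[column]));
      for (const aggName of mapping[column]) {
        // Output column naming is an assumption of this sketch.
        out[`${column}_${aggName}`] = aggregators[aggName](values);
      }
    }
    return out;
  });
}

const aggRows: Row[] = [
  { foo: "a", bar: "x", spam: 1 },
  { foo: "a", bar: "x", spam: 4 },
  { foo: "b", bar: "y", spam: 2 },
];
// Two groups: ("a", "x") with spam_sum 5 and spam_min 1,
// and ("b", "y") with spam_sum 2 and spam_min 2.
const aggResult = aggByMapping(aggRows, ["foo", "bar"], { spam: ["sum", "min"] });
```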
Return the first n rows of each group.
Optional n: number
Number of rows of each group to select.
> df = pl.DataFrame({
> "letters": ["c", "c", "a", "c", "a", "b"],
> "nrs": [1, 2, 3, 4, 5, 6]
> })
> df
shape: (6, 2)
╭─────────┬─────╮
│ letters ┆ nrs │
│ ---     ┆ --- │
│ str     ┆ i64 │
╞═════════╪═════╡
│ "c"     ┆ 1   │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "c"     ┆ 2   │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "a"     ┆ 3   │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "c"     ┆ 4   │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "a"     ┆ 5   │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "b"     ┆ 6   │
╰─────────┴─────╯
> df.groupBy("letters")
> .head(2)
> .sort("letters");
shape: (5, 2)
╭─────────┬─────╮
│ letters ┆ nrs │
│ ---     ┆ --- │
│ str     ┆ i64 │
╞═════════╪═════╡
│ "a"     ┆ 3   │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "a"     ┆ 5   │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "b"     ┆ 6   │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "c"     ┆ 1   │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "c"     ┆ 2   │
╰─────────┴─────╯
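The selection in the example above can be sketched without the library: walk the rows in order and keep a row only while its group has contributed fewer than n rows. The helper `headPerGroup` and the type `LetterRow` are hypothetical names used for illustration, and the sketch makes no claim about how polars implements the operation.

```typescript
// Illustrative only: plain TypeScript, no polars import.
// LetterRow and headPerGroup are hypothetical names.
type LetterRow = { letters: string; nrs: number };

function headPerGroup(rows: LetterRow[], n: number): LetterRow[] {
  // Number of rows seen so far for each group key.
  const seen: Record<string, number> = {};
  // Keep a row only while its group has contributed fewer than n rows.
  return rows.filter((row) => {
    const count = seen[row.letters] || 0;
    seen[row.letters] = count + 1;
    return count < n;
  });
}

const letterRows: LetterRow[] = [
  { letters: "c", nrs: 1 },
  { letters: "c", nrs: 2 },
  { letters: "a", nrs: 3 },
  { letters: "c", nrs: 4 },
  { letters: "a", nrs: 5 },
  { letters: "b", nrs: 6 },
];

// Sort by group key afterwards, as the example does with .sort("letters").
const firstTwo = headPerGroup(letterRows, 2).sort((a, b) =>
  a.letters < b.letters ? -1 : a.letters > b.letters ? 1 : 0
);
// firstTwo holds the same five rows as the table above; the row ("c", 4) is dropped.
```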
Pivot the grouped data: the group key forms the rows, the distinct values of the pivot column become the output columns, and an aggregation function is applied to the values column.
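As a rough illustration of what a pivot computes, the sketch below builds one output row per group key and one output column per distinct pivot value, using sum as the aggregation. It is plain TypeScript with no polars import. `PivotRow` and `pivotSum` are hypothetical names, and the choice of sum is an assumption made for the example.

```typescript
// Illustrative only: plain TypeScript, no polars import.
// PivotRow and pivotSum are hypothetical names.
type PivotRow = Record<string, string | number>;

function pivotSum(
  rows: PivotRow[],
  groupKey: string,
  pivotCol: string,
  valuesCol: string
): PivotRow[] {
  // One accumulator row per distinct group key, in order of first appearance.
  const table: Record<string, PivotRow> = {};
  for (const row of rows) {
    const key = String(row[groupKey]);
    const column = String(row[pivotCol]);
    const out = table[key] || (table[key] = { [groupKey]: row[groupKey] });
    // Sum the values that fall into the same (group key, pivot value) cell.
    out[column] = Number(out[column] || 0) + Number(row[valuesCol]);
  }
  // Cells with no matching input rows are simply absent in this sketch.
  return Object.keys(table).map((k) => table[k]);
}

const pivotRows: PivotRow[] = [
  { foo: "one", bar: "k", N: 1 },
  { foo: "one", bar: "l", N: 2 },
  { foo: "one", bar: "k", N: 3 },
  { foo: "two", bar: "l", N: 4 },
];
// Group key "foo", pivot column "bar", values column "N":
// "one" gets k = 4 and l = 2, "two" gets l = 4.
const pivoted = pivotSum(pivotRows, "foo", "bar", "N");
```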
Starts a new GroupBy operation.