polars.LazyFrame.groupby

LazyFrame.groupby(by: Union[str, List[str], polars.internals.expr.Expr, List[polars.internals.expr.Expr]], maintain_order: bool = False) → polars.internals.lazy_frame.LazyGroupBy[polars.internals.lazy_frame.LDF]

Start a groupby operation.

Parameters
by

Column(s) to group by.

maintain_order

Keep the groups in an order consistent with the input data. This is more expensive than a default groupby, which does not guarantee the order of the returned groups.

Examples

>>> import polars as pl
>>> df = pl.DataFrame(
...     {
...         "a": ["a", "b", "a", "b", "b", "c"],
...         "b": [1, 2, 3, 4, 5, 6],
...         "c": [6, 5, 4, 3, 2, 1],
...     }
... ).lazy()
# does NOT work:
# df.groupby("a")["b"].sum().collect()
#                ^^^^ TypeError: 'LazyGroupBy' object is not subscriptable
# instead, use .agg():
>>> df.groupby("a").agg(pl.col("b").sum()).collect()
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ c   ┆ 6   │
├╌╌╌╌╌┼╌╌╌╌╌┤
│ b   ┆ 11  │
├╌╌╌╌╌┼╌╌╌╌╌┤
│ a   ┆ 4   │
└─────┴─────┘