polars.LazyFrame.groupby

LazyFrame.groupby(by: str | list[str] | polars.internals.expr.expr.Expr | list[polars.internals.expr.expr.Expr], maintain_order: bool = False) -> LazyGroupBy[LDF]

Start a groupby operation.

Parameters:
by

Column(s) to group by.

maintain_order

Ensure that the order of the groups remains consistent with the input data. This is more expensive than a default groupby.

Examples

>>> df = pl.DataFrame(
...     {
...         "a": ["a", "b", "a", "b", "b", "c"],
...         "b": [1, 2, 3, 4, 5, 6],
...         "c": [6, 5, 4, 3, 2, 1],
...     }
... ).lazy()

The following does NOT work:

# df.groupby("a")["b"].sum().collect()
#                ^^^^ TypeError: 'LazyGroupBy' object is not subscriptable

Instead, use .agg():

>>> df.groupby(by="a", maintain_order=True).agg(pl.col("b").sum()).collect()
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ a   ┆ 4   │
├╌╌╌╌╌┼╌╌╌╌╌┤
│ b   ┆ 11  │
├╌╌╌╌╌┼╌╌╌╌╌┤
│ c   ┆ 6   │
└─────┴─────┘
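To make the effect of maintain_order=True concrete, the grouped sum from the example can be sketched in plain Python. This is only an illustration of the expected result, not how Polars computes it: the helper name grouped_sum is hypothetical, and the sketch assumes that a "consistent" order means groups appear in the order their keys first occur in the input.

```python
# Plain-Python illustration of a grouped sum with a stable group order.
# This is NOT Polars code; `grouped_sum` is a hypothetical helper.


def grouped_sum(keys, values):
    """Sum `values` per key, keeping groups in first-appearance order."""
    totals = {}  # dicts preserve insertion order (Python 3.7+)
    for key, value in zip(keys, values):
        totals[key] = totals.get(key, 0) + value
    return list(totals.items())


# Same data as columns "a" and "b" in the example.
a = ["a", "b", "a", "b", "b", "c"]
b = [1, 2, 3, 4, 5, 6]

result = grouped_sum(a, b)
print(result)  # [('a', 4), ('b', 11), ('c', 6)]
```

With the default maintain_order=False, the same three groups and sums are produced, but the order of the rows should not be relied on.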