Compute aggregations for each group of a group by operation.
const df = pl.DataFrame({
  foo: [1, 2, 2, 3, 3],
  ham: [6.0, 6, 7, 8.0, 8.0],
  bar: ["a", "b", "c", "c", "c"],
  spam: ["a", "b", "c", "c", "c"],
});
// Pass lazy API expressions as rest parameters
> df.groupBy('foo', 'bar').agg(pl.count('ham'), pl.col('spam')).sort(["foo", "bar"]);
shape: (4, 4)
┌─────┬─────┬─────┬────────────┐
│ foo ┆ bar ┆ ham ┆ spam │
│ --- ┆ --- ┆ --- ┆ --- │
│ f64 ┆ str ┆ u32 ┆ list[str] │
╞═════╪═════╪═════╪════════════╡
│ 1.0 ┆ a ┆ 1 ┆ ["a"] │
│ 2.0 ┆ b ┆ 1 ┆ ["b"] │
│ 2.0 ┆ c ┆ 1 ┆ ["c"] │
│ 3.0 ┆ c ┆ 2 ┆ ["c", "c"] │
└─────┴─────┴─────┴────────────┘
> df.groupBy("bar").agg(pl.col("foo"), pl.col("ham"), pl.col("spam") ).sort("bar");
shape: (3, 4)
┌─────┬─────────────────┬─────────────────┬─────────────────┐
│ bar ┆ foo ┆ ham ┆ spam │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ list[f64] ┆ list[f64] ┆ list[str] │
╞═════╪═════════════════╪═════════════════╪═════════════════╡
│ a ┆ [1.0] ┆ [6.0] ┆ ["a"] │
│ b ┆ [2.0] ┆ [6.0] ┆ ["b"] │
│ c ┆ [2.0, 3.0, 3.0] ┆ [7.0, 8.0, 8.0] ┆ ["c", "c", "c"] │
└─────┴─────────────────┴─────────────────┴─────────────────┘
> const h = pl.col("ham");
> df.groupBy("bar").agg(h.sum().as("sum_ham"), h.min().as("min_ham"), h.max().as("max_ham")).sort("bar");
shape: (3, 4)
┌─────┬─────────┬─────────┬─────────┐
│ bar ┆ sum_ham ┆ min_ham ┆ max_ham │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ f64 ┆ f64 ┆ f64 │
╞═════╪═════════╪═════════╪═════════╡
│ a ┆ 6.0 ┆ 6.0 ┆ 6.0 │
│ b ┆ 6.0 ┆ 6.0 ┆ 6.0 │
│ c ┆ 23.0 ┆ 7.0 ┆ 8.0 │
└─────┴─────────┴─────────┴─────────┘
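To make the last example concrete, the following plain-JavaScript sketch reproduces the same sum/min/max result by hand. It does not call polars and is not how the library is implemented; `groupValues` is a hypothetical helper introduced only for this illustration, and the input rows are the `bar` and `ham` columns of the frame above.

```javascript
// Plain-JavaScript illustration of group-wise aggregation; not the polars implementation.
const rows = [
  { bar: "a", ham: 6 },
  { bar: "b", ham: 6 },
  { bar: "c", ham: 7 },
  { bar: "c", ham: 8 },
  { bar: "c", ham: 8 },
];

// Hypothetical helper: collect the `value` entries of each `key` group,
// keeping groups in order of first appearance.
function groupValues(records, key, value) {
  const groups = new Map();
  for (const record of records) {
    if (!groups.has(record[key])) groups.set(record[key], []);
    groups.get(record[key]).push(record[value]);
  }
  return groups;
}

// Reduce every group to a single row holding its sum, min and max.
const summary = [...groupValues(rows, "bar", "ham")].map(([bar, values]) => ({
  bar,
  sum_ham: values.reduce((acc, v) => acc + v, 0),
  min_ham: Math.min(...values),
  max_ham: Math.max(...values),
}));

console.log(summary);
```

Each expression passed to `agg` plays the role of one property in the mapped object: it is evaluated once per group and contributes one output column.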
Return first n rows of each group.
n (optional, number): Number of values of each group to select.
> df = pl.DataFrame({
> "letters": ["c", "c", "a", "c", "a", "b"],
> "nrs": [1, 2, 3, 4, 5, 6]
> })
> df
shape: (6, 2)
╭─────────┬─────╮
│ letters ┆ nrs │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════════╪═════╡
│ "c" ┆ 1 │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "c" ┆ 2 │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "a" ┆ 3 │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "c" ┆ 4 │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "a" ┆ 5 │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "b" ┆ 6 │
╰─────────┴─────╯
> df.groupBy("letters")
>   .head(2)
>   .sort("letters");
shape: (5, 2)
╭─────────┬─────╮
│ letters ┆ nrs │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════════╪═════╡
│ "a" ┆ 3 │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "a" ┆ 5 │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "b" ┆ 6 │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "c" ┆ 1 │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ "c" ┆ 2 │
╰─────────┴─────╯
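The selection rule can be sketched in plain JavaScript, without polars: keep a row only while fewer than `n` rows of its group have been seen. `headPerGroup` is a hypothetical helper written for this illustration, and its default of 5 is an assumption of the sketch rather than a documented value.

```javascript
// Plain-JavaScript illustration of "first n rows of each group"; not the polars implementation.
const records = [
  { letters: "c", nrs: 1 },
  { letters: "c", nrs: 2 },
  { letters: "a", nrs: 3 },
  { letters: "c", nrs: 4 },
  { letters: "a", nrs: 5 },
  { letters: "b", nrs: 6 },
];

// Hypothetical helper: keep a row while its group has contributed fewer than n rows.
// The default n = 5 is an assumption made for this sketch.
function headPerGroup(rows, key, n = 5) {
  const seen = new Map();
  return rows.filter((row) => {
    const count = seen.get(row[key]) ?? 0;
    seen.set(row[key], count + 1);
    return count < n;
  });
}

// Take two rows per group, then order by group key as in the example above.
// Array.prototype.sort is stable, so rows keep their original order within a group.
const firstTwo = headPerGroup(records, "letters", 2)
  .sort((a, b) => a.letters.localeCompare(b.letters));

console.log(firstTwo);
```

The third "c" row (`nrs` 4) is dropped because two rows of that group were already kept, which is why the result has five rows instead of six.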
Do a pivot operation based on the group key, a pivot column and an aggregation function on the values column.
Column to pivot.
Column that will be aggregated.
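As a rough illustration of what a pivot does, the plain-JavaScript sketch below turns each distinct value of the pivot column into its own output column and sums the values column per group. It does not use polars; `pivotSum` and the sample rows are invented for this example, and sum is only one possible aggregation.

```javascript
// Plain-JavaScript illustration of a group-wise pivot with a sum aggregation;
// not the polars implementation. The sample rows are made up for this sketch.
const sales = [
  { region: "north", product: "apple", units: 3 },
  { region: "north", product: "pear", units: 2 },
  { region: "south", product: "apple", units: 5 },
  { region: "north", product: "apple", units: 1 },
];

// Hypothetical helper: one output row per group key, one output column per
// distinct value of pivotCol, each holding the sum of valuesCol.
function pivotSum(rows, groupKey, pivotCol, valuesCol) {
  const out = new Map();
  for (const row of rows) {
    if (!out.has(row[groupKey])) {
      out.set(row[groupKey], { [groupKey]: row[groupKey] });
    }
    const target = out.get(row[groupKey]);
    const column = row[pivotCol];
    target[column] = (target[column] ?? 0) + row[valuesCol];
  }
  return [...out.values()];
}

const wide = pivotSum(sales, "region", "product", "units");
console.log(wide);
```

In this sketch a group that never sees a given pivot value simply lacks that property; a DataFrame would need a placeholder such as null in that cell so that every row has the same columns.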
Starts a new GroupBy operation.