
Pivot data from long to wide


Description

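Reshape a DataFrame from long to wide format: the unique values of the on column(s) become the new column headers, and the cells are filled from the values column(s).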

Usage

<DataFrame>$pivot(
  on,
  ...,
  index,
  values,
  aggregate_function = NULL,
  maintain_order = TRUE,
  sort_columns = FALSE,
  separator = "_"
)

Arguments

on Name of the column(s) whose values will be used as the header of the output DataFrame.
... Not used.
index One or multiple keys to group by.
values Column values to aggregate. Can be multiple columns if the on argument contains multiple columns as well.
aggregate_function One of:
  • a string naming the aggregation to apply, such as ‘first’, ‘sum’, ‘max’, ‘min’, ‘mean’, ‘median’, ‘last’, ‘count’,
  • an Expr e.g. pl$element()$sum()
maintain_order Sort the grouped keys so that the output order is predictable.
sort_columns Sort the transposed columns by name. Default is by order of discovery.
separator Used as separator/delimiter in generated column names.
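
As a minimal sketch of what sort_columns changes, the snippet below pivots a small made-up data frame in which the values of bar first appear in the order C, A, B. The names df_unsorted, discovery_order, and sorted_order are illustrative only, and the calls assume the signature shown under Usage.

```r
library("polars")

# Illustrative data: values of `bar` are first seen in the order C, A, B
df_unsorted = pl$DataFrame(
  foo = c("one", "one", "one", "two", "two", "two"),
  bar = c("C", "A", "B", "C", "A", "B"),
  baz = c(1, 2, 3, 4, 5, 6)
)

# Default (sort_columns = FALSE): new columns follow the order of discovery
discovery_order = df_unsorted$pivot(
  on = "bar", index = "foo", values = "baz"
)

# sort_columns = TRUE: new columns are sorted by name instead
sorted_order = df_unsorted$pivot(
  on = "bar", index = "foo", values = "baz", sort_columns = TRUE
)
```

Only the column order should differ between the two results; the cell values are the same.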

Value

DataFrame

Examples

library("polars")

df = pl$DataFrame(
  foo = c("one", "one", "one", "two", "two", "two"),
  bar = c("A", "B", "C", "A", "B", "C"),
  baz = c(1, 2, 3, 4, 5, 6)
)
df
#> shape: (6, 3)
#> ┌─────┬─────┬─────┐
#> │ foo ┆ bar ┆ baz │
#> │ --- ┆ --- ┆ --- │
#> │ str ┆ str ┆ f64 │
#> ╞═════╪═════╪═════╡
#> │ one ┆ A   ┆ 1.0 │
#> │ one ┆ B   ┆ 2.0 │
#> │ one ┆ C   ┆ 3.0 │
#> │ two ┆ A   ┆ 4.0 │
#> │ two ┆ B   ┆ 5.0 │
#> │ two ┆ C   ┆ 6.0 │
#> └─────┴─────┴─────┘
df$pivot(
  values = "baz", index = "foo", on = "bar"
)
#> shape: (2, 4)
#> ┌─────┬─────┬─────┬─────┐
#> │ foo ┆ A   ┆ B   ┆ C   │
#> │ --- ┆ --- ┆ --- ┆ --- │
#> │ str ┆ f64 ┆ f64 ┆ f64 │
#> ╞═════╪═════╪═════╪═════╡
#> │ one ┆ 1.0 ┆ 2.0 ┆ 3.0 │
#> │ two ┆ 4.0 ┆ 5.0 ┆ 6.0 │
#> └─────┴─────┴─────┴─────┘
# Use an expression as the aggregation function
df = pl$DataFrame(
  col1 = c("a", "a", "a", "b", "b", "b"),
  col2 = c("x", "x", "x", "x", "y", "y"),
  col3 = c(6, 7, 3, 2, 5, 7)
)
df
#> shape: (6, 3)
#> ┌──────┬──────┬──────┐
#> │ col1 ┆ col2 ┆ col3 │
#> │ ---  ┆ ---  ┆ ---  │
#> │ str  ┆ str  ┆ f64  │
#> ╞══════╪══════╪══════╡
#> │ a    ┆ x    ┆ 6.0  │
#> │ a    ┆ x    ┆ 7.0  │
#> │ a    ┆ x    ┆ 3.0  │
#> │ b    ┆ x    ┆ 2.0  │
#> │ b    ┆ y    ┆ 5.0  │
#> │ b    ┆ y    ┆ 7.0  │
#> └──────┴──────┴──────┘
df$pivot(
  index = "col1",
  on = "col2",
  values = "col3",
  aggregate_function = pl$element()$tanh()$mean()
)
#> shape: (2, 3)
#> ┌──────┬──────────┬──────────┐
#> │ col1 ┆ x        ┆ y        │
#> │ ---  ┆ ---      ┆ ---      │
#> │ str  ┆ f64      ┆ f64      │
#> ╞══════╪══════════╪══════════╡
#> │ a    ┆ 0.998347 ┆ null     │
#> │ b    ┆ 0.964028 ┆ 0.999954 │
#> └──────┴──────────┴──────────┘
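
The aggregate_function argument also accepts a string. As a hedged sketch, the snippet below reuses the same data as the expression example and passes "sum", so each (col1, col2) group with several rows collapses to a single cell. The names df_dup and summed are illustrative, and the calls assume the signature shown under Usage.

```r
library("polars")

# Same data as the expression example: several rows share a (col1, col2) pair
df_dup = pl$DataFrame(
  col1 = c("a", "a", "a", "b", "b", "b"),
  col2 = c("x", "x", "x", "x", "y", "y"),
  col3 = c(6, 7, 3, 2, 5, 7)
)

# Each group is collapsed by summing col3:
# a/x = 6 + 7 + 3, b/x = 2, b/y = 5 + 7; there are no a/y rows to sum.
summed = df_dup$pivot(
  index = "col1",
  on = "col2",
  values = "col3",
  aggregate_function = "sum"
)
summed
```

A string is the shorter spelling for a common aggregation; an Expr built from pl$element() is only needed when the values must be transformed before they are aggregated, as in the tanh example.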