Apply eager functions to columns of a DataFrame

Description

[Experimental]

A higher-order function that applies a function to selected columns of a DataFrame, similar to purrr::map_at(). Each selected column is materialized as a Series before the function is applied, and the function's return value is converted back to a Series by as_polars_series().

Prefer <dataframe>$with_columns() unless the operation you need is only available on Series and not on Expr. That is rarely the case; the exceptions are a small number of functions that cannot determine the output data type without inspecting the data.
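
As a rough illustration of the recommendation above, the sketch below converts a string column to integers with an expression inside with_columns() instead of a function applied to a materialized Series. The data frame `df` and the choice of `pl$Int32` as the target type are assumptions made for this example only.

```r
library("polars")

# Hypothetical example data, defined here so the snippet is self-contained
df <- pl$DataFrame(
  a = 1:4,
  b = c("10", "20", "30", "40")
)

# Expression-based conversion: cast the string column "b" to a 32-bit integer.
# No Series is materialized by the caller; the cast is expressed as an Expr.
out <- df$with_columns(pl$col("b")$cast(pl$Int32))
out
```

Because the cast is an ordinary expression, this form is usually the better choice; map_columns() is intended for the cases where no equivalent expression exists.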

Usage

<DataFrame>$map_columns(column_names, lambda)

Arguments

column_names: Column names or selectors specifying the columns to apply the function to.
lambda: A function that will receive a Series as the first argument.

Value

A polars DataFrame

Examples

library("polars")

df1 <- pl$DataFrame(
  a = 1:4,
  b = c("10", "20", "30", "40"),
)

# Apply `<series>$shrink_dtype()` to the "a" column
df1$map_columns("a", \(s) s$shrink_dtype())
#> shape: (4, 2)
#> ┌─────┬─────┐
#> │ a   ┆ b   │
#> │ --- ┆ --- │
#> │ i8  ┆ str │
#> ╞═════╪═════╡
#> │ 1   ┆ 10  │
#> │ 2   ┆ 20  │
#> │ 3   ┆ 30  │
#> │ 4   ┆ 40  │
#> └─────┴─────┘

# Convert the "b" column to integer with the base R function `as.integer()`
df1$map_columns("b", \(s) s$to_r_vector() |> as.integer())
#> shape: (4, 2)
#> ┌─────┬─────┐
#> │ a   ┆ b   │
#> │ --- ┆ --- │
#> │ i32 ┆ i32 │
#> ╞═════╪═════╡
#> │ 1   ┆ 10  │
#> │ 2   ┆ 20  │
#> │ 3   ┆ 30  │
#> │ 4   ┆ 40  │
#> └─────┴─────┘

df2 <- pl$DataFrame(
  a = c('{"x":"a"}', NA, '{"x":"b"}', NA),
  b = c('{"a":1, "b": true}', NA, '{"a":2, "b": false}', NA),
)

# Apply `<series>$str$json_decode()` to both the "a" and "b" columns
df2$map_columns(c("a", "b"), \(s) s$str$json_decode())
#> shape: (4, 2)
#> ┌───────────┬───────────┐
#> │ a         ┆ b         │
#> │ ---       ┆ ---       │
#> │ struct[1] ┆ struct[2] │
#> ╞═══════════╪═══════════╡
#> │ {"a"}     ┆ {1,true}  │
#> │ null      ┆ null      │
#> │ {"b"}     ┆ {2,false} │
#> │ null      ┆ null      │
#> └───────────┴───────────┘

# Use a selector to apply the function to all columns
df2$map_columns(cs$all(), \(s) s$str$json_decode())
#> shape: (4, 2)
#> ┌───────────┬───────────┐
#> │ a         ┆ b         │
#> │ ---       ┆ ---       │
#> │ struct[1] ┆ struct[2] │
#> ╞═══════════╪═══════════╡
#> │ {"a"}     ┆ {1,true}  │
#> │ null      ┆ null      │
#> │ {"b"}     ┆ {2,false} │
#> │ null      ┆ null      │
#> └───────────┴───────────┘