Apply eager functions to columns of a DataFrame
Description
A higher-order function to apply functions to selected columns of a DataFrame, similar to purrr::map_at. The selected columns will be materialized as Series before the function is applied, and the return value of the function will be converted back to a Series by as_polars_series.
In most cases, <dataframe>$with_columns() with expressions is recommended
instead. Use this method only when you need an operation that is available
on Series but not on Expr. This is almost never the case, except for a
select few functions that cannot know the output datatype without looking
at the data.
Usage
<DataFrame>$map_columns(column_names, lambda)
Arguments
column_names
Column names or selectors specifying columns to apply the function to.
lambda
A function that will receive a Series as the first argument.
Value
A polars DataFrame
Examples
library("polars")
df1 <- pl$DataFrame(
  a = 1:4,
  b = c("10", "20", "30", "40"),
)
# Apply `<series>$shrink_dtype()` to the "a" column
df1$map_columns("a", \(s) s$shrink_dtype())
#> shape: (4, 2)
#> ┌─────┬─────┐
#> │ a   ┆ b   │
#> │ --- ┆ --- │
#> │ i8  ┆ str │
#> ╞═════╪═════╡
#> │ 1   ┆ 10  │
#> │ 2   ┆ 20  │
#> │ 3   ┆ 30  │
#> │ 4   ┆ 40  │
#> └─────┴─────┘
# Convert the "b" column to integer with the base R function `as.integer()`
df1$map_columns("b", \(s) s$to_r_vector() |> as.integer())
#> shape: (4, 2)
#> ┌─────┬─────┐
#> │ a   ┆ b   │
#> │ --- ┆ --- │
#> │ i32 ┆ i32 │
#> ╞═════╪═════╡
#> │ 1   ┆ 10  │
#> │ 2   ┆ 20  │
#> │ 3   ┆ 30  │
#> │ 4   ┆ 40  │
#> └─────┴─────┘
df2 <- pl$DataFrame(
  a = c('{"x":"a"}', NA, '{"x":"b"}', NA),
  b = c('{"a":1, "b": true}', NA, '{"a":2, "b": false}', NA),
)
# Apply `<series>$str$json_decode()` to both the "a" and "b" columns
df2$map_columns(c("a", "b"), \(s) s$str$json_decode())
#> shape: (4, 2)
#> ┌───────────┬───────────┐
#> │ a         ┆ b         │
#> │ ---       ┆ ---       │
#> │ struct[1] ┆ struct[2] │
#> ╞═══════════╪═══════════╡
#> │ {"a"}     ┆ {1,true}  │
#> │ null      ┆ null      │
#> │ {"b"}     ┆ {2,false} │
#> │ null      ┆ null      │
#> └───────────┴───────────┘
# Use a selector to apply the function to all columns
df2$map_columns(cs$all(), \(s) s$str$json_decode())
#> shape: (4, 2)
#> ┌───────────┬───────────┐
#> │ a         ┆ b         │
#> │ ---       ┆ ---       │
#> │ struct[1] ┆ struct[2] │
#> ╞═══════════╪═══════════╡
#> │ {"a"}     ┆ {1,true}  │
#> │ null      ┆ null      │
#> │ {"b"}     ┆ {2,false} │
#> │ null      ┆ null      │
#> └───────────┴───────────┘
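# For comparison, a minimal sketch of the expression-based route that the
# Description recommends: the same string-to-integer conversion done with
# `<dataframe>$with_columns()` and a cast instead of a Series-level function.
# `df3` and `res` are hypothetical names used only for this illustration, and
# `pl$Int32` is assumed to be the intended target type.
library("polars")
df3 <- pl$DataFrame(
  a = 1:4,
  b = c("10", "20", "30", "40")
)
# Cast "b" from string to 32-bit integer through an expression
res <- df3$with_columns(pl$col("b")$cast(pl$Int32))
res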