Convert a Series of type List to Struct

Description

Convert a Series of type List to Struct

Usage

<Expr>$list$to_struct(
  n_field_strategy = c("first_non_null", "max_width"),
  fields = NULL,
  upper_bound = 0
)

Arguments

n_field_strategy Strategy to determine the number of fields of the struct. If "first_non_null" (default), the number of fields equals the length of the first non-empty list. If "max_width", the number of fields equals the length of the longest list.
fields If the names and number of the desired fields are known in advance, a list of field names can be given, which will be assigned by index. Otherwise, to assign field names dynamically, a custom R function that takes an R double (the field index) and returns a string can be used. If NULL (default), fields will be named field_0, field_1, ..., field_n.
upper_bound A LazyFrame needs to know the schema at all times, so the caller must provide an upper bound on the number of struct fields that will be created. If it is set incorrectly, downstream operations may fail. For instance, an all()$sum() expression looks in the current schema to determine which columns to select. When operating on a DataFrame, the schema does not need to be tracked or pre-determined because the result is evaluated eagerly, so this parameter can be left unset.

Value

Expr

Examples

library("polars")

df = pl$DataFrame(list(a = list(1:2, 1:3)))

# this discards the third value of the second list as the struct length is
# determined based on the length of the first non-empty list
df$with_columns(
  struct = pl$col("a")$list$to_struct()
)
#> shape: (2, 2)
#> ┌───────────┬───────────┐
#> │ a         ┆ struct    │
#> │ ---       ┆ ---       │
#> │ list[i32] ┆ struct[2] │
#> ╞═══════════╪═══════════╡
#> │ [1, 2]    ┆ {1,2}     │
#> │ [1, 2, 3] ┆ {1,2}     │
#> └───────────┴───────────┘
# we can use "max_width" to keep all values
df$with_columns(
  struct = pl$col("a")$list$to_struct(n_field_strategy = "max_width")
)
#> shape: (2, 2)
#> ┌───────────┬────────────┐
#> │ a         ┆ struct     │
#> │ ---       ┆ ---        │
#> │ list[i32] ┆ struct[3]  │
#> ╞═══════════╪════════════╡
#> │ [1, 2]    ┆ {1,2,null} │
#> │ [1, 2, 3] ┆ {1,2,3}    │
#> └───────────┴────────────┘
# pass a custom function that will name all fields by adding a prefix
df2 = df$with_columns(
  pl$col("a")$list$to_struct(
    fields = \(idx) paste0("col_", idx)
  )
)
df2
#> shape: (2, 1)
#> ┌───────────┐
#> │ a         │
#> │ ---       │
#> │ struct[2] │
#> ╞═══════════╡
#> │ {1,2}     │
#> │ {1,2}     │
#> └───────────┘
df2$unnest()
#> shape: (2, 2)
#> ┌───────┬───────┐
#> │ col_0 ┆ col_1 │
#> │ ---   ┆ ---   │
#> │ i32   ┆ i32   │
#> ╞═══════╪═══════╡
#> │ 1     ┆ 2     │
#> │ 1     ┆ 2     │
#> └───────┴───────┘
df2$to_list()
#> $a
#> $a$col_0
#> [1] 1 1
#> 
#> $a$col_1
#> [1] 2 2
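
The examples above all run eagerly on a DataFrame, where upper_bound can be left unset. As a minimal sketch of the lazy case described under Arguments, the snippet below converts the same data to a LazyFrame with df$lazy() and passes upper_bound explicitly. The value 2 is an assumption chosen to match the width that the default "first_non_null" strategy produces for this data, and the name result is only illustrative.

```r
library("polars")

df = pl$DataFrame(list(a = list(1:2, 1:3)))

# In lazy mode the schema is resolved before any data is read, so we state
# up front how many struct fields to expect (2 here, matching the default
# "first_non_null" strategy for this data).
result = df$lazy()$with_columns(
  struct = pl$col("a")$list$to_struct(upper_bound = 2)
)$collect()

result
```

If n_field_strategy = "max_width" were used instead, upper_bound would presumably need to be at least 3 for this data so that the declared schema covers every field.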