Get the difference of two list variables

Description

Returns the "asymmetric difference": the elements of the first list that are not in the second list. To get the elements that appear in exactly one of the two lists, use $list$set_symmetric_difference().

Usage

<Expr>$list$set_difference(other)

Arguments

other Other list variable. Can be an Expr or something coercible to an Expr.

Details

Note that the datatypes inside the two lists must have a common supertype. For example, the first column can be list[i32] and the second one can be list[i8], because list[i8] can be cast to list[i32]. However, the second column cannot be, e.g., list[f32].
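
The supertype rule above can be tried out with a minimal sketch like the following. The data and the explicit cast of column b to list[i8] are illustrative assumptions, not part of the original example; the point is only that the two list columns have different but compatible integer inner types.

```r
library("polars")

# Hypothetical data: both columns start out as list[i32]
df = pl$DataFrame(
  a = list(1:3, 5:7),
  b = list(2:4, c(6L, 8L))
)$with_columns(
  # Narrow the inner type of `b` so the two columns differ: list[i8] vs list[i32]
  b = pl$col("b")$cast(pl$List(pl$Int8))
)

# list[i8] has list[i32] as a common supertype with `a`,
# so the set operation is expected to be accepted
df$with_columns(difference = pl$col("a")$list$set_difference("b"))
```

Per the note above, a list[f32] column in place of b would not be expected to work in the same way.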

Value

Expr

Examples

library("polars")

df = pl$DataFrame(
  a = list(1:3, NA_integer_, c(NA_integer_, 3L), 5:7),
  b = list(2:4, 3L, c(3L, 4L, NA_integer_), c(6L, 8L))
)

df$with_columns(difference = pl$col("a")$list$set_difference("b"))
#> shape: (4, 3)
#> ┌───────────┬──────────────┬────────────┐
#> │ a         ┆ b            ┆ difference │
#> │ ---       ┆ ---          ┆ ---        │
#> │ list[i32] ┆ list[i32]    ┆ list[i32]  │
#> ╞═══════════╪══════════════╪════════════╡
#> │ [1, 2, 3] ┆ [2, 3, 4]    ┆ [1]        │
#> │ [null]    ┆ [3]          ┆ [null]     │
#> │ [null, 3] ┆ [3, 4, null] ┆ []         │
#> │ [5, 6, 7] ┆ [6, 8]       ┆ [5, 7]     │
#> └───────────┴──────────────┴────────────┘
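
To contrast the asymmetric difference with the symmetric one mentioned in the description, a small sketch along these lines may help. The data frame here is a reduced, null-free variant of the one above, chosen for illustration only.

```r
library("polars")

# Illustrative data without nulls
df = pl$DataFrame(
  a = list(1:3, 5:7),
  b = list(2:4, c(6L, 8L))
)

df$with_columns(
  # Elements of `a` that are not in `b`
  difference = pl$col("a")$list$set_difference("b"),
  # Elements that are in exactly one of `a` and `b`
  symmetric_difference = pl$col("a")$list$set_symmetric_difference("b")
)
```

For the first row, the asymmetric difference holds only 1, while the symmetric difference holds both 1 and 4. The sketch does not rely on the order of elements within each result list.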