Bin continuous values into discrete categories

Description

Bin continuous values into discrete categories

Usage

<Expr>$cut(
  breaks,
  ...,
  labels = NULL,
  left_closed = FALSE,
  include_breaks = FALSE
)

Arguments

breaks Unique cut points.
... Ignored.
labels Names of the categories. The number of labels must be equal to the number of cut points plus one.
left_closed Set the intervals to be left-closed instead of right-closed.
include_breaks Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a Categorical to a Struct.

Value

Expr of data type Categorical if include_breaks is FALSE and of data type Struct if include_breaks is TRUE.

See Also

$qcut()

Examples

library("polars")

df = pl$DataFrame(foo = c(-2, -1, 0, 1, 2))

df$with_columns(
  cut = pl$col("foo")$cut(c(-1, 1), labels = c("a", "b", "c"))
)
#> shape: (5, 2)
#> ┌──────┬─────┐
#> │ foo  ┆ cut │
#> │ ---  ┆ --- │
#> │ f64  ┆ cat │
#> ╞══════╪═════╡
#> │ -2.0 ┆ a   │
#> │ -1.0 ┆ a   │
#> │ 0.0  ┆ b   │
#> │ 1.0  ┆ b   │
#> │ 2.0  ┆ c   │
#> └──────┴─────┘
# Add both the category and the breakpoint
df$with_columns(
  cut = pl$col("foo")$cut(c(-1, 1), include_breaks = TRUE)
)$unnest("cut")
#> shape: (5, 3)
#> ┌──────┬────────────┬────────────┐
#> │ foo  ┆ breakpoint ┆ category   │
#> │ ---  ┆ ---        ┆ ---        │
#> │ f64  ┆ f64        ┆ cat        │
#> ╞══════╪════════════╪════════════╡
#> │ -2.0 ┆ -1.0       ┆ (-inf, -1] │
#> │ -1.0 ┆ -1.0       ┆ (-inf, -1] │
#> │ 0.0  ┆ 1.0        ┆ (-1, 1]    │
#> │ 1.0  ┆ 1.0        ┆ (-1, 1]    │
#> │ 2.0  ┆ inf        ┆ (1, inf]   │
#> └──────┴────────────┴────────────┘
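
The examples above use the default right-closed intervals. As a minimal sketch of the left_closed argument, the same data can be binned with left-closed intervals. Assuming the bins then become [-inf, -1), [-1, 1) and [1, inf), a value equal to a break should fall in the bin above it, so -1 would be labelled "b" and 1 would be labelled "c". The variable name `out` is only illustrative.

```r
library("polars")

df = pl$DataFrame(foo = c(-2, -1, 0, 1, 2))

# Same breaks and labels as the first example, but with left-closed bins:
# a value equal to a break is assigned to the bin that starts at that break.
out = df$with_columns(
  cut = pl$col("foo")$cut(
    c(-1, 1),
    labels = c("a", "b", "c"),
    left_closed = TRUE
  )
)
out
```

Comparing this result with the first example shows which rows change category when the closed side of the interval is switched.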