# Bin continuous values into discrete categories based on their quantiles

## Description

Bin continuous values into discrete categories based on their quantiles

## Usage

``````
<Expr>$qcut(
  quantiles,
  ...,
  labels = NULL,
  left_closed = FALSE,
  allow_duplicates = FALSE,
  include_breaks = FALSE
)
``````

## Arguments

- `quantiles`: Either a vector of quantile probabilities between 0 and 1 or a positive integer determining the number of bins with uniform probability.
- `...`: Ignored.
- `labels`: Names of the categories. The number of labels must be equal to the number of cut points plus one.
- `left_closed`: Set the intervals to be left-closed instead of right-closed.
- `allow_duplicates`: If set to `TRUE`, duplicates in the resulting quantiles are dropped, rather than raising an error. This can happen even with unique probabilities, depending on the data.
- `include_breaks`: Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a `Categorical` to a `Struct`.
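
The effect of `allow_duplicates` is easiest to see on data with many ties. The following is a minimal sketch, not taken from the package documentation: the data frame `df_ties` and the column names `x` and `bin` are made up for illustration. Four of the five values are identical, so both requested quantiles are expected to evaluate to the same cut point.

``````r
library(polars)

# Hypothetical data with heavy ties: four of five values are identical.
df_ties = pl$DataFrame(x = c(1, 1, 1, 1, 2))

# With the default allow_duplicates = FALSE this call is expected to fail,
# because the 0.25 and 0.5 quantiles coincide. The error is captured
# instead of stopping the script.
strict = tryCatch(
  df_ties$with_columns(bin = pl$col("x")$qcut(c(0.25, 0.5))),
  error = function(e) e
)
inherits(strict, "error")

# With allow_duplicates = TRUE the repeated cut point is dropped, so fewer
# bins are produced than the number of probabilities suggests.
relaxed = df_ties$with_columns(
  bin = pl$col("x")$qcut(c(0.25, 0.5), allow_duplicates = TRUE)
)
relaxed
``````

Because dropping a cut point changes the number of bins, combining `allow_duplicates = TRUE` with a fixed `labels` vector may no longer satisfy the "cut points plus one" rule, so it is probably safest to omit `labels` in that case.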

## Value

Expr of data type `Categorical` if `include_breaks` is `FALSE` and of data type `Struct` if `include_breaks` is `TRUE`.
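
One way to confirm the resulting data type is to inspect the schema of the output. This is a small sketch that assumes the `$schema` field of a polars DataFrame; the variable names `s_cat` and `s_struct` are only illustrative.

``````r
library(polars)

df = pl$DataFrame(foo = c(-2, -1, 0, 1, 2))

# Schema of the result with the default include_breaks = FALSE.
s_cat = df$select(pl$col("foo")$qcut(2))$schema
s_cat

# Schema of the result with include_breaks = TRUE.
s_struct = df$select(pl$col("foo")$qcut(2, include_breaks = TRUE))$schema
s_struct
``````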

## See also

`$cut()`

## Examples

``````r
library(polars)

df = pl$DataFrame(foo = c(-2, -1, 0, 1, 2))

# Divide a column into three categories according to pre-defined quantile
# probabilities
df$with_columns(
  qcut = pl$col("foo")$qcut(c(0.25, 0.75), labels = c("a", "b", "c"))
)
``````
``````
#> shape: (5, 2)
#> ┌──────┬──────┐
#> │ foo  ┆ qcut │
#> │ ---  ┆ ---  │
#> │ f64  ┆ cat  │
#> ╞══════╪══════╡
#> │ -2.0 ┆ a    │
#> │ -1.0 ┆ a    │
#> │ 0.0  ┆ b    │
#> │ 1.0  ┆ b    │
#> │ 2.0  ┆ c    │
#> └──────┴──────┘
``````
``````r
# Divide a column into two categories using uniform quantile probabilities.
df$with_columns(
  qcut = pl$col("foo")$qcut(2, labels = c("low", "high"), left_closed = TRUE)
)
``````
``````
#> shape: (5, 2)
#> ┌──────┬──────┐
#> │ foo  ┆ qcut │
#> │ ---  ┆ ---  │
#> │ f64  ┆ cat  │
#> ╞══════╪══════╡
#> │ -2.0 ┆ low  │
#> │ -1.0 ┆ low  │
#> │ 0.0  ┆ high │
#> │ 1.0  ┆ high │
#> │ 2.0  ┆ high │
#> └──────┴──────┘
``````
``````r
# Add both the category and the breakpoint
df$with_columns(
  qcut = pl$col("foo")$qcut(c(0.25, 0.75), include_breaks = TRUE)
)$unnest("qcut")
``````
``````
#> shape: (5, 3)
#> ┌──────┬────────────┬────────────┐
#> │ foo  ┆ breakpoint ┆ category   │
#> │ ---  ┆ ---        ┆ ---        │
#> │ f64  ┆ f64        ┆ cat        │
#> ╞══════╪════════════╪════════════╡
#> │ -2.0 ┆ -1.0       ┆ (-inf, -1] │
#> │ -1.0 ┆ -1.0       ┆ (-inf, -1] │
#> │ 0.0  ┆ 1.0        ┆ (-1, 1]    │
#> │ 1.0  ┆ 1.0        ┆ (-1, 1]    │
#> │ 2.0  ┆ inf        ┆ (1, inf]   │
#> └──────┴────────────┴────────────┘
``````