Create Categorical DataType
Description
Create a Categorical DataType, optionally choosing how its values are ordered when sorting.
Usage
DataType_Categorical(ordering = "physical")
Arguments
ordering
Either "physical" (default) or "lexical".
Details
When a categorical variable is created, its string values (the "lexical" values) are stored and encoded as integers (the "physical" values) in order of first appearance. A categorical column can therefore be sorted either on its lexical values or on its physical values. See Examples.
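The difference between the two orderings can be sketched in base R, without polars. This is only an analogy: it assumes that a factor whose levels follow order of first appearance behaves like the physical encoding described above, and it does not show how polars stores categoricals internally. The variable names are illustrative.

```r
x <- c("z", "z", "k", "a", "z")

# Assumption: mimic the "physical" encoding with a factor whose levels
# follow order of first appearance ("z" = 1, "k" = 2, "a" = 3).
f <- factor(x, levels = unique(x))
physical <- as.integer(f)

# Sorting on the physical (integer) codes keeps first-appearance order.
by_physical <- x[order(physical)]

# Sorting on the lexical (string) values gives alphabetical order.
by_lexical <- x[order(x)]

print(physical)     # 1 1 2 3 1
print(by_physical)  # "z" "z" "z" "k" "a"
print(by_lexical)   # "a" "k" "z" "z" "z"
```

The two results correspond to the "physical" and "lexical" orderings shown in Examples.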
Value
A Categorical DataType
Examples
library("polars")
# by default, sorting uses the physical values (order of first appearance)
df = pl$DataFrame(x = c("z", "z", "k", "a", "z"), schema = list(x = pl$Categorical()))
df$sort("x")
#> shape: (5, 1)
#> ┌─────┐
#> │ x   │
#> │ --- │
#> │ cat │
#> ╞═════╡
#> │ z   │
#> │ z   │
#> │ z   │
#> │ k   │
#> │ a   │
#> └─────┘
# with ordering = "lexical", sorting uses the string values
df_lex = pl$DataFrame(
  x = c("z", "z", "k", "a", "z"),
  schema = list(x = pl$Categorical("lexical"))
)
df_lex$sort("x")
#> shape: (5, 1)
#> ┌─────┐
#> │ x   │
#> │ --- │
#> │ cat │
#> ╞═════╡
#> │ a   │
#> │ k   │
#> │ z   │
#> │ z   │
#> │ z   │
#> └─────┘