Return the k largest rows
Description
Non-null elements are always preferred over null elements, regardless of
the value of reverse. The output is not guaranteed to be in
any particular order; call sort() after this function if
you need the output to be sorted.
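Because the row order is not guaranteed, a common pattern is to chain sort() directly after top_k(). The following is a minimal sketch of that pattern, assuming the same example data used in the Examples section and that sort() accepts a column name plus a descending argument; the variable name top_sorted is only illustrative.

```r
library(polars)

lf <- pl$LazyFrame(
  a = c("a", "b", "a", "b", "b", "c"),
  b = c(2, 1, 1, 3, 2, 1)
)

# top_k() selects the 4 rows with the largest values in b, in no promised order.
# Sorting afterwards makes the row order deterministic with respect to b.
top_sorted <- lf$top_k(4, by = "b")$sort("b", descending = TRUE)$collect()
top_sorted
```

Sorting only the k selected rows is usually cheaper than sorting the whole frame and then taking the head.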
Usage
<LazyFrame>$top_k(k, ..., by, reverse = FALSE)
Arguments
k
Number of rows to return.
...
These dots are for future extensions and must be empty.
by
Column(s) used to determine the top rows. Accepts expression input. Strings are parsed as column names.
reverse
Consider the k smallest elements of the by column(s) (instead of the k largest). This can be specified per column by passing a logical vector.
Value
A polars LazyFrame
Examples
library("polars")
lf <- pl$LazyFrame(
  a = c("a", "b", "a", "b", "b", "c"),
  b = c(2, 1, 1, 3, 2, 1)
)
# Get the rows which contain the 4 largest values in column b.
lf$top_k(4, by = "b")$collect()
#> shape: (4, 2)
#> ┌─────┬─────┐
#> │ a ┆ b │
#> │ --- ┆ --- │
#> │ str ┆ f64 │
#> ╞═════╪═════╡
#> │ b ┆ 3.0 │
#> │ a ┆ 2.0 │
#> │ b ┆ 2.0 │
#> │ b ┆ 1.0 │
#> └─────┴─────┘
# Get the rows which contain the 4 largest values when sorting on columns a
# and b
lf$top_k(4, by = c("a", "b"))$collect()
#> shape: (4, 2)
#> ┌─────┬─────┐
#> │ a ┆ b │
#> │ --- ┆ --- │
#> │ str ┆ f64 │
#> ╞═════╪═════╡
#> │ c ┆ 1.0 │
#> │ b ┆ 3.0 │
#> │ b ┆ 2.0 │
#> │ b ┆ 1.0 │
#> └─────┴─────┘
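The examples above leave reverse at its default. The sketch below illustrates the reverse argument under two assumptions: that reverse = TRUE selects the smallest values of a single by column, and that a logical vector is matched to the by columns in order, as the argument description states. The names smallest_b and mixed are only illustrative, and the trailing sort() calls are there just to make the row order predictable.

```r
library(polars)

lf <- pl$LazyFrame(
  a = c("a", "b", "a", "b", "b", "c"),
  b = c(2, 1, 1, 3, 2, 1)
)

# reverse = TRUE: take the 3 rows with the smallest values in b.
smallest_b <- lf$top_k(3, by = "b", reverse = TRUE)$collect()
smallest_b

# Per-column reverse: prefer the largest a, and among equal a the smallest b.
mixed <- lf$top_k(3, by = c("a", "b"), reverse = c(FALSE, TRUE))$
  sort("a", "b", descending = c(TRUE, FALSE))$
  collect()
mixed
```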