polars.DataFrame.unique#

DataFrame.unique(maintain_order: bool = True, subset: str | Sequence[str] | None = None, keep: UniqueKeepStrategy = 'first') DF[source]#

Drop duplicate rows from this DataFrame.

Parameters:
maintain_order

Keep the same order as the original DataFrame. This requires more work to compute.

subset

Subset to use to compare rows.

keep{‘first’, ‘last’}

Which of the duplicate rows to keep (in conjunction with subset).

Returns:
DataFrame with unique rows

Warning

Note that this fails if there is a column of type List in the DataFrame or subset.

Examples

>>> df = pl.DataFrame(
...     {
...         "a": [1, 1, 2, 3, 4, 5],
...         "b": [0.5, 0.5, 1.0, 2.0, 3.0, 3.0],
...         "c": [True, True, True, False, True, True],
...     }
... )
>>> df.unique()
shape: (5, 3)
┌─────┬─────┬───────┐
│ a   ┆ b   ┆ c     │
│ --- ┆ --- ┆ ---   │
│ i64 ┆ f64 ┆ bool  │
╞═════╪═════╪═══════╡
│ 1   ┆ 0.5 ┆ true  │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 2   ┆ 1.0 ┆ true  │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 3   ┆ 2.0 ┆ false │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 4   ┆ 3.0 ┆ true  │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 5   ┆ 3.0 ┆ true  │
└─────┴─────┴───────┘