polars.DataFrame.drop_nulls#

DataFrame.drop_nulls(subset: str | Sequence[str] | None = None) Self[source]#

Drop all rows that contain null values.

Returns a new DataFrame.

Parameters:
subset

Column name(s) for which null values are considered. If set to None (default), use all columns.

Examples

>>> df = pl.DataFrame(
...     {
...         "foo": [1, 2, 3],
...         "bar": [6, None, 8],
...         "ham": ["a", "b", "c"],
...     }
... )
>>> df.drop_nulls()
shape: (2, 3)
┌─────┬─────┬─────┐
│ foo ┆ bar ┆ ham │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ str │
╞═════╪═════╪═════╡
│ 1   ┆ 6   ┆ a   │
│ 3   ┆ 8   ┆ c   │
└─────┴─────┴─────┘

This method drops rows where any single value of the row is null.

Below are some example snippets that show how you could drop null values based on other conditions

>>> df = pl.DataFrame(
...     {
...         "a": [None, None, None, None],
...         "b": [1, 2, None, 1],
...         "c": [1, None, None, 1],
...     }
... )
>>> df
shape: (4, 3)
┌──────┬──────┬──────┐
│ a    ┆ b    ┆ c    │
│ ---  ┆ ---  ┆ ---  │
│ f64  ┆ i64  ┆ i64  │
╞══════╪══════╪══════╡
│ null ┆ 1    ┆ 1    │
│ null ┆ 2    ┆ null │
│ null ┆ null ┆ null │
│ null ┆ 1    ┆ 1    │
└──────┴──────┴──────┘

Drop a row only if all values are null:

>>> df.filter(~pl.all(pl.all().is_null()))
shape: (3, 3)
┌──────┬─────┬──────┐
│ a    ┆ b   ┆ c    │
│ ---  ┆ --- ┆ ---  │
│ f64  ┆ i64 ┆ i64  │
╞══════╪═════╪══════╡
│ null ┆ 1   ┆ 1    │
│ null ┆ 2   ┆ null │
│ null ┆ 1   ┆ 1    │
└──────┴─────┴──────┘

Drop a column if all values are null:

>>> df[[s.name for s in df if not (s.null_count() == df.height)]]
shape: (4, 2)
┌──────┬──────┐
│ b    ┆ c    │
│ ---  ┆ ---  │
│ i64  ┆ i64  │
╞══════╪══════╡
│ 1    ┆ 1    │
│ 2    ┆ null │
│ null ┆ null │
│ 1    ┆ 1    │
└──────┴──────┘