Skip to content

Version 0.19

Breaking changes

Aggregation functions no longer support horizontal computation

This impacts aggregation functions like sum, min, and max. These functions were overloaded to support both vertical and horizontal computation. Recently, new dedicated functionality for horizontal computation was released, and horizontal computation was deprecated.

Restore the old behavior by using the horizontal variant, e.g. sum_horizontal.

Example

Before:

>>> df = pl.DataFrame({'a': [1, 2], 'b': [11, 12]})
>>> df.select(pl.sum('a', 'b'))  # horizontal computation
shape: (2, 1)
┌─────┐
│ sum │
│ --- │
│ i64 │
╞═════╡
│ 12  │
│ 14  │
└─────┘

After:

>>> df = pl.DataFrame({'a': [1, 2], 'b': [11, 12]})
>>> df.select(pl.sum('a', 'b'))  # vertical computation
shape: (1, 2)
┌─────┬─────┐
│ a    b   │
│ ---  --- │
│ i64  i64 │
╞═════╪═════╡
│ 3    23  │
└─────┴─────┘

Update to all / any

all will now ignore null values by default, rather than treat them as False.

For both any and all, the drop_nulls parameter has been renamed to ignore_nulls and is now keyword-only. Also fixed an issue when setting this parameter to False would erroneously result in None output in some cases.

To restore the old behavior, set ignore_nulls to False and check for None output.

Example

Before:

>>> pl.Series([True, None]).all()
False

After:

>>> pl.Series([True, None]).all()
True

Improved error types for many methods

Improving our error messages is an ongoing effort. We did a sweep of our Python code base and made many improvements to error messages and error types. Most notably, many ValueErrors were changed to TypeErrors.

If your code relies on handling Polars exceptions, you may have to make some adjustments.

Example

Before:

>>> pl.Series(values=15)
...
ValueError: Series constructor called with unsupported type; got 'int'

After:

>>> pl.Series(values=15)
...
TypeError: Series constructor called with unsupported type 'int' for the `values` parameter

Updates to expression input parsing

Methods like select and with_columns accept one or more expressions. But they also accept strings, integers, lists, and other inputs that we try to interpret as expressions. We updated our internal logic to parse inputs more consistently.

Example

Before:

>>> pl.DataFrame({'a': [1, 2]}).with_columns(None)
shape: (2, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 1   │
│ 2   │
└─────┘

After:

>>> pl.DataFrame({'a': [1, 2]}).with_columns(None)
shape: (2, 2)
┌─────┬─────────┐
│ a    literal │
│ ---  ---     │
│ i64  null    │
╞═════╪═════════╡
│ 1    null    │
│ 2    null    │
└─────┴─────────┘

shuffle / sample now use an internal Polars seed

If you used the built-in Python random.seed function to control the randomness of Polars expressions, this will no longer work. Instead, use the new set_random_seed function.

Example

Before:

import random

random.seed(1)

After:

import polars as pl

pl.set_random_seed(1)

Deprecations

Creating a consistent and intuitive API is hard; finding the right name for each function, method, and parameter might be the hardest part. The new version comes with several naming changes, and you will most likely run into deprecation warnings when upgrading to 0.19.

If you want to upgrade without worrying about deprecation warnings right now, you can add the following snippet to your code:

import warnings

warnings.filterwarnings("ignore", category=DeprecationWarning)

groupby renamed to group_by

This is not a change we make lightly, as it will impact almost all our users. But "group by" is really two different words, and our naming strategy dictates that these should be separated by an underscore.

Most likely, a simple search and replace will be enough to take care of this update:

  • Search: .groupby(
  • Replace: .group_by(

apply renamed to map_*

apply is probably the most misused part of our API. Many Polars users come from pandas, where apply has a completely different meaning.

We now consolidate all our functionality for user-defined functions under the name map. This results in the following renaming:

Before After
Series/Expr.apply map_elements
Series/Expr.rolling_apply rolling_map
DataFrame.apply map_rows
GroupBy.apply map_groups
apply map_groups
map map_batches