Polars Expressions

Polars has a powerful concept called expressions that is central to its very fast performance.

Expressions are at the core of many data science operations:

  • taking a sample of rows from a column
  • multiplying values in a column
  • extracting a column of years from dates
  • convert a column of strings to lowercase
  • and so on!

However, expressions are also used within other operations:

  • taking the mean of a group in a groupby operation
  • calculating the size of groups in a groupby operation
  • taking the sum horizontally across columns

Polars performs these core data transformations very quickly by:

  • automatic query optimization on each expression
  • automatic parallelization of expressions on many columns

Polars expressions are a mapping from a series to a series (or mathematically Fn(Series) -> Series). As expressions have a Series as an input and a Series as an output then it is straightforward to do a sequence of expressions (similar to method chaining in Pandas).

This has all been a bit abstract, so let's start with some examples.