polars.Series.apply#

Series.apply(function: Callable[[Any], Any], return_dtype: PolarsDataType | None = None, *, skip_nulls: bool = True) Self[source]#

Apply a custom/user-defined function (UDF) over elements in this Series.

If the function returns a different datatype, the return_dtype arg should be set, otherwise the method will fail.

Implementing logic using a Python function is almost always _significantly_ slower and more memory intensive than implementing the same logic using the native expression API because:

  • The native expression engine runs in Rust; UDFs run in Python.

  • Use of Python UDFs forces the DataFrame to be materialized in memory.

  • Polars-native expressions can be parallelised (UDFs typically cannot).

  • Polars-native expressions can be logically optimised (UDFs cannot).

Wherever possible you should strongly prefer the native expression API to achieve the best performance.

Parameters:
function

Custom function or lambda.

return_dtype

Output datatype. If none is given, the same datatype as this Series will be used.

skip_nulls

Nulls will be skipped and not passed to the python function. This is faster because python can be skipped and because we call more specialized functions.

Returns:
Series

Notes

If your function is expensive and you don’t want it to be called more than once for a given input, consider applying an @lru_cache decorator to it. With suitable data you may achieve order-of-magnitude speedups (or more).

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.apply(lambda x: x + 10)
shape: (3,)
Series: 'a' [i64]
[
        11
        12
        13
]