- Series.apply(function: Callable[[Any], Any], return_dtype: PolarsDataType | None = None, *, skip_nulls: bool = True) → Self
Apply a custom/user-defined function (UDF) over elements in this Series.
If the function returns a different datatype than this Series, set the return_dtype argument; otherwise the method will fail.
Implementing logic using a Python function is almost always _significantly_ slower and more memory intensive than implementing the same logic using the native expression API because:
- The native expression engine runs in Rust; UDFs run in Python.
- Use of Python UDFs forces the DataFrame to be materialized in memory.
- Polars-native expressions can be parallelised (UDFs typically cannot).
- Polars-native expressions can be logically optimised (UDFs cannot).
Wherever possible you should strongly prefer the native expression API to achieve the best performance.
function: Custom function or lambda.
return_dtype: Output datatype. If none is given, the same datatype as this Series will be used.
skip_nulls: If True, nulls are skipped and not passed to the Python function. This is faster because the Python call is avoided for those elements and because more specialized functions can be used.
If your function is expensive and you don’t want it to be called more than once for a given input, consider applying an @lru_cache decorator to it. With suitable data you may achieve order-of-magnitude speedups (or more).
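The caching idea can be sketched with the standard library alone. Here `slow_square` is a hypothetical stand-in for an expensive function, and a plain list comprehension stands in for the element-wise calls that `Series.apply` would make; passing the decorated function to `apply` works the same way.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def slow_square(x):
    # Hypothetical stand-in for an expensive computation.
    return x * x


# Many repeated inputs: only the distinct values trigger a real call.
values = [3, 3, 3, 7, 7, 3]
results = [slow_square(v) for v in values]

info = slow_square.cache_info()
print(results)
print(info.hits, info.misses)
```

The benefit depends on how often inputs repeat: six calls here result in only two actual evaluations, whereas data with all-distinct values would gain nothing from the cache.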
>>> s = pl.Series("a", [1, 2, 3])
>>> s.apply(lambda x: x + 10)
shape: (3,)
Series: 'a' [i64]
[
        11
        12
        13
]