LazyFrame.map(function: Callable[[DataFrame], DataFrame], *, predicate_pushdown: bool = True, projection_pushdown: bool = True, slice_pushdown: bool = True, no_optimizations: bool = False, schema: None | SchemaDict = None, validate_output_schema: bool = True, streamable: bool = False) Self

Apply a custom function.

It is important that the function returns a Polars DataFrame.


function – Lambda/function to apply.


predicate_pushdown – Allow predicate pushdown optimization to pass this node.


projection_pushdown – Allow projection pushdown optimization to pass this node.


slice_pushdown – Allow slice pushdown optimization to pass this node.


no_optimizations – Turn off all optimizations past this point.


schema – Output schema of the function. If set to None, the schema is assumed to remain unchanged by the applied function.


validate_output_schema – It is paramount that Polars' schema is correct. This flag ensures that the output schema of the function is checked against the expected schema. Setting it to False skips the check, but may lead to hard-to-debug bugs.


streamable – Whether the given function is eligible to run with the streaming engine. This means the function must produce the same result whether it is executed in batches or on the full dataset.


The schema of a LazyFrame must always be correct. It is up to the caller of this function to ensure that this invariant is upheld.

It is important that the optimization flags are correct. If the custom function performs an aggregation of a column, for instance, predicate_pushdown should not be allowed: predicate pushdown prunes rows and would therefore influence the aggregation result.


>>> lf = pl.LazyFrame(
...     {
...         "a": [1, 2],
...         "b": [3, 4],
...     }
... )
>>> lf.map(lambda x: 2 * x).collect()
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 2   ┆ 6   │
│ 4   ┆ 8   │
└─────┴─────┘