polars.LazyFrame.collect

LazyFrame.collect(*, type_coercion: bool = True, predicate_pushdown: bool = True, projection_pushdown: bool = True, simplify_expression: bool = True, no_optimization: bool = False, slice_pushdown: bool = True, common_subplan_elimination: bool = True, streaming: bool = False) → DataFrame

Collect into a DataFrame.

Note: use fetch() if you only want to run your query on the first n rows. This can save a lot of time when debugging queries.

Parameters:
type_coercion

Apply the type coercion optimization.

predicate_pushdown

Apply the predicate pushdown optimization.

projection_pushdown

Apply the projection pushdown optimization.

simplify_expression

Apply the expression simplification optimization.

no_optimization

Turn off (certain) optimizations.

slice_pushdown

Apply the slice pushdown optimization.

common_subplan_elimination

Try to cache branching subplans that occur on self-joins or unions.

streaming

Run parts of the query in a streaming fashion. This feature is in an alpha state.

Returns:
DataFrame

Examples

>>> lf = pl.LazyFrame(
...     {
...         "a": ["a", "b", "a", "b", "b", "c"],
...         "b": [1, 2, 3, 4, 5, 6],
...         "c": [6, 5, 4, 3, 2, 1],
...     }
... )
>>> lf.groupby("a", maintain_order=True).agg(pl.all().sum()).collect()
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ a   ┆ 4   ┆ 10  │
│ b   ┆ 11  ┆ 10  │
│ c   ┆ 6   ┆ 1   │
└─────┴─────┴─────┘