polars.scan_parquet
- polars.scan_parquet(file: Union[str, pathlib.Path], n_rows: Optional[int] = None, cache: bool = True, parallel: bool = True, rechunk: bool = True, row_count_name: Optional[str] = None, row_count_offset: int = 0, **kwargs: Any) → polars.internals.lazy_frame.LazyFrame
Lazily read from a parquet file or multiple files via glob patterns.
This allows the query optimizer to push down predicates and projections to the scan level, thereby potentially reducing memory overhead.
- Parameters
- file
Path to a file, or a glob pattern matching multiple files.
- n_rows
Stop reading from the parquet file after reading n_rows rows.
- cache
Cache the result after reading.
- parallel
Read the parquet file in parallel. The single-threaded reader consumes less memory.
- rechunk
When reading multiple files via a glob pattern, rechunk the final DataFrame into contiguous memory chunks.
- row_count_name
If not None, insert a row count column with the given name into the DataFrame.
- row_count_offset
Offset at which the row count column starts (only used if row_count_name is set).