polars.scan_ndjson

polars.scan_ndjson(file: str | pathlib.Path, infer_schema_length: int | None = 100, batch_size: int | None = 1024, n_rows: Optional[int] = None, low_memory: bool = False, rechunk: bool = True, row_count_name: Optional[str] = None, row_count_offset: int = 0) → LazyFrame

Lazily read from a newline-delimited JSON file.

This allows the query optimizer to push down predicates and projections to the scan level, thereby potentially reducing memory overhead.
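For example, a filter and a column selection applied to the lazy result can be executed at scan time rather than after the whole file is loaded. A minimal sketch, assuming a hypothetical data.ndjson file with id and status columns:

>>> import polars as pl
>>> lf = pl.scan_ndjson("data.ndjson")  # lazy; nothing is read yet
>>> (
...     lf.filter(pl.col("status") == "active")  # predicate pushed down to the scan
...     .select(["id", "status"])  # projection pushdown: only these columns are read
...     .collect()  # execution happens here
... )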

Parameters:
file

Path to a file.

infer_schema_length

Infer the schema from the first infer_schema_length rows.

batch_size

Number of rows to read in each batch.

n_rows

Stop reading from the JSON file after reading n_rows.

low_memory

Reduce memory pressure at the expense of performance.

rechunk

Reallocate to contiguous memory when all chunks/files are parsed.

row_count_name

If not None, this will insert a row count column with the given name into the DataFrame.

row_count_offset

Offset to start the row count column at (only used if the name is set).
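
The row count and row limit parameters can be combined. A minimal sketch, assuming a hypothetical events.ndjson file:

>>> import polars as pl
>>> lf = pl.scan_ndjson(
...     "events.ndjson",
...     n_rows=1_000,  # stop after the first 1000 rows
...     row_count_name="row_nr",  # insert a row count column named "row_nr"
...     row_count_offset=100,  # start counting at 100 instead of 0
... )
>>> lf.collect().head()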