New LazyFrame from NDJSON
Description
Read an NDJSON (newline-delimited JSON) file from a path or URL into a polars LazyFrame.
Usage
pl_scan_ndjson(
  source,
  ...,
  infer_schema_length = 100,
  batch_size = NULL,
  n_rows = NULL,
  low_memory = FALSE,
  rechunk = FALSE,
  row_index_name = NULL,
  row_index_offset = 0,
  reuse_downloaded = TRUE,
  ignore_errors = FALSE
)
Arguments
source
Path to a file or URL. Multiple paths can be provided as long as all NDJSON files have the same schema. Providing several URLs is not supported.
...
Ignored.
infer_schema_length
Maximum number of rows to read to infer the column types. If set to 0, all columns will be read as UTF-8. If NULL, a full table scan will be done (slow).
batch_size
Number of rows that will be processed per thread.
n_rows
Maximum number of rows to read.
low_memory
Reduce memory usage at the cost of performance.
rechunk
Reallocate to contiguous memory when all chunks / files are parsed.
row_index_name
If not NULL, insert a row index column with the given name into the LazyFrame.
row_index_offset
Offset at which the row index column starts (only used if row_index_name is set).
reuse_downloaded
If TRUE (default) and a URL was provided, cache the downloaded files for the duration of the session so they can be reused.
ignore_errors
Keep reading the file even if some lines yield errors. You can also use infer_schema_length = 0 to read all columns as UTF-8 and check which values might cause an issue.
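The row-limiting and row-index arguments can be combined. The following is a minimal sketch, assuming the polars and jsonlite packages are installed; the temporary file and the column name "idx" are chosen purely for illustration.

```r
library(polars)

# Write the built-in iris data set to a temporary NDJSON file
# (one JSON object per line), as in the Examples section.
ndjson_path = tempfile(fileext = ".ndjson")
jsonlite::stream_out(iris, file(ndjson_path), verbose = FALSE)

# Read at most 5 rows and add a row index column named "idx"
# (a name picked for illustration) that starts counting at 1.
first_rows = pl$scan_ndjson(
  ndjson_path,
  n_rows = 5,
  row_index_name = "idx",
  row_index_offset = 1
)$collect()

print(first_rows)
```

Without row_index_name, the row_index_offset argument has no effect.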
Value
A LazyFrame
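Since the return value is a LazyFrame, further operations can be chained before the query is executed with $collect(). A minimal sketch, assuming the polars and jsonlite packages are installed; the filter and the column selection are illustrative only.

```r
library(polars)

# Write iris to a temporary NDJSON file to have something to scan.
ndjson_path = tempfile(fileext = ".ndjson")
jsonlite::stream_out(iris, file(ndjson_path), verbose = FALSE)

# Build the query lazily: keep the setosa rows and two columns.
lazy_query = pl$scan_ndjson(ndjson_path)$
  filter(pl$col("Species") == "setosa")$
  select(pl$col("Sepal.Length"), pl$col("Species"))

# Execute the query and materialise the result as a DataFrame.
setosa = lazy_query$collect()
print(setosa)
```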
Examples
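library("polars")
if (require("jsonlite", quietly = TRUE)) {
  # Write iris to a temporary NDJSON file (one JSON object per line)
  ndjson_filename = tempfile()
  jsonlite::stream_out(iris, file(ndjson_filename), verbose = FALSE)
  # Scan the file lazily, then collect the result into a DataFrame
  pl$scan_ndjson(ndjson_filename)$collect()
}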
#> shape: (150, 5)
#> ┌──────────────┬─────────────┬──────────────┬─────────────┬───────────┐
#> │ Sepal.Length ┆ Sepal.Width ┆ Petal.Length ┆ Petal.Width ┆ Species   │
#> │ ---          ┆ ---         ┆ ---          ┆ ---         ┆ ---       │
#> │ f64          ┆ f64         ┆ f64          ┆ f64         ┆ str       │
#> ╞══════════════╪═════════════╪══════════════╪═════════════╪═══════════╡
#> │ 5.1          ┆ 3.5         ┆ 1.4          ┆ 0.2         ┆ setosa    │
#> │ 4.9          ┆ 3.0         ┆ 1.4          ┆ 0.2         ┆ setosa    │
#> │ 4.7          ┆ 3.2         ┆ 1.3          ┆ 0.2         ┆ setosa    │
#> │ 4.6          ┆ 3.1         ┆ 1.5          ┆ 0.2         ┆ setosa    │
#> │ 5.0          ┆ 3.6         ┆ 1.4          ┆ 0.2         ┆ setosa    │
#> │ …            ┆ …           ┆ …            ┆ …           ┆ …         │
#> │ 6.7          ┆ 3.0         ┆ 5.2          ┆ 2.3         ┆ virginica │
#> │ 6.3          ┆ 2.5         ┆ 5.0          ┆ 1.9         ┆ virginica │
#> │ 6.5          ┆ 3.0         ┆ 5.2          ┆ 2.0         ┆ virginica │
#> │ 6.2          ┆ 3.4         ┆ 5.4          ┆ 2.3         ┆ virginica │
#> │ 5.9          ┆ 3.0         ┆ 5.1          ┆ 1.8         ┆ virginica │
#> └──────────────┴─────────────┴──────────────┴─────────────┴───────────┘
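The source argument is documented as accepting several paths when all files share the same schema. The sketch below assumes that a character vector of paths is the accepted form, and that the polars and jsonlite packages are installed; for simplicity, both temporary files hold the same iris data.

```r
library(polars)

# Two temporary NDJSON files with an identical schema
# (each one contains the full iris data set).
part_1 = tempfile(fileext = ".ndjson")
part_2 = tempfile(fileext = ".ndjson")
jsonlite::stream_out(iris, file(part_1), verbose = FALSE)
jsonlite::stream_out(iris, file(part_2), verbose = FALSE)

# Assumption: multiple paths are passed as a character vector.
combined = pl$scan_ndjson(c(part_1, part_2))$collect()

# Rows and columns of the combined result.
print(combined$shape)
```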