Function scanJson

  • Lazily read a JSON file into a LazyDataFrame.

    Note: Currently only newline-delimited JSON (one JSON object per line) is supported.
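
To illustrate the newline-delimited format the note refers to, here is a minimal sketch that serializes an array of records as one JSON object per line. The helper name `toNdjson` is hypothetical and is not part of the library; the sample records simply mirror the example output further down.

```typescript
// Hypothetical helper (not part of nodejs-polars): serialize each record
// as a single-line JSON object, terminating every line with "\n".
function toNdjson(records: Record<string, unknown>[]): string {
  return records.map((record) => JSON.stringify(record) + "\n").join("");
}

// Two sample records with columns a, b and c.
const ndjson: string = toNdjson([
  { a: 1, b: "foo", c: 3 },
  { a: 2, b: "bar", c: 6 },
]);

// ndjson now holds two lines:
// {"a":1,"b":"foo","c":3}
// {"a":2,"b":"bar","c":6}
```

Text in this shape, saved to a file, is the kind of input the note describes; a single top-level JSON array is not newline-delimited JSON.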

    Parameters

    • path: string

      Path to a JSON file. Any valid file path can be used. Example: ./file.json.
    • Optional options: Partial<ScanJsonOptions>
      • inferSchemaLength

        Maximum number of lines to read to infer schema. If set to 0, all columns will be read as pl.Utf8. If set to null, a full table scan will be done (slow).

      • nThreads

        Maximum number of threads to use when reading JSON.

      • lowMemory

        Reduce memory usage at the expense of performance.

      • batchSize

        Number of lines to read into the buffer at once. Modify this to change performance.

      • numRows

        Stop reading from the JSON file after reading numRows rows.

      • skipRows

        Start reading after skipping the first skipRows rows.

      • rowCount

        Add a row count column.

    Returns LazyDataFrame

    (DataFrame)

    > const df = pl.scanJson('path/to/file.json', {numRows: 2}).collectSync()
    > console.log(df)
    shape: (2, 3)
    ╭─────┬─────┬─────╮
    │ a   ┆ b   ┆ c   │
    │ --- ┆ --- ┆ --- │
    │ i64 ┆ str ┆ i64 │
    ╞═════╪═════╪═════╡
    │ 1   ┆ foo ┆ 3   │
    ├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┤
    │ 2   ┆ bar ┆ 6   │
    ╰─────┴─────┴─────╯