Function readCSV

  • Read a CSV file or string into a DataFrame.


    Parameters

    • pathOrBody: string | Buffer

      A file path, or CSV content supplied as a string or Buffer.

      • path: Path to a file or a file-like string. Any valid file path can be used. Example: file.csv.
      • body: String or Buffer to be read as CSV.
    • Optional options: Partial<ReadCsvOptions>
      • inferSchemaLength

        Maximum number of lines to read to infer schema. If set to 0, all columns will be read as pl.Utf8. If set to null, a full table scan will be done (slow).

      • nRows

        After n rows are read from the CSV, it stops reading. During multi-threaded parsing, an upper bound of n rows cannot be guaranteed.

      • batchSize

        Number of lines to read into the buffer at once. Modify this to change performance.

      • hasHeader

        Indicates whether the first row of the dataset is a header. If set to false, the columns are named column_x, where x enumerates the columns in the dataset.

      • ignoreErrors

        Try to keep reading lines if some lines yield errors.

      • endRows

        After n rows are read from the CSV, it stops reading. During multi-threaded parsing, an upper bound of n rows cannot be guaranteed.

      • startRows

        Start reading after startRows position.

      • projection

        Indices of columns to select. Note that column indices start at zero.

      • sep

        Character to use as delimiter in the file.

      • columns

        Columns to select.

      • rechunk

        Make sure that all columns are contiguous in memory by aggregating the chunks into a single array.

      • encoding

        Allowed encodings: utf8, utf8-lossy. Lossy means that invalid utf8 values are replaced with character.

      • numThreads

        Number of threads to use in CSV parsing. Defaults to the number of physical CPUs of your system.

      • dtype

        Overwrite the dtypes during inference.

      • schema

        Set the CSV file's schema. This only accepts datatypes that are implemented in the csv parser and expects a complete Schema.

      • lowMemory

        Reduce memory usage at the expense of performance.

      • commentChar

        Character that indicates the start of a comment line, for instance '#'.

      • quoteChar

        Character used for CSV quoting, default = '"'. Set to null to turn off special handling and escaping of quotes.

      • nullValues

        Values to interpret as null values. You can provide a:

        • string -> All values encountered equal to this string will be null.
        • Array<string> -> A null value per column.
        • Record<string,string> -> An object or map that maps column name to a null value string. Example: {"column_1": 0}

      • parseDates

        Whether to attempt to parse dates.

    Returns pl.DataFrame

    DataFrame
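
A minimal usage sketch of the options above, assuming the package is imported as pl from "nodejs-polars". The sample data and the option values chosen here are illustrative and not taken from this reference. Because the string does not name a CSV file, it is treated as the CSV body itself.

```typescript
import pl from "nodejs-polars";

// Illustrative inline CSV body; a file path such as "file.csv" could be passed instead.
const body = [
  "id;name;score",
  "1;alice;9.5",
  "2;bob;NA",
  "3;carol;7.25",
].join("\n");

const df = pl.readCSV(body, {
  sep: ";",               // the sample uses semicolons rather than commas
  hasHeader: true,        // the first row holds the column names
  nullValues: "NA",       // every "NA" cell is read as null
  inferSchemaLength: 100, // infer dtypes from at most the first 100 lines
});

console.log(df.shape);
console.log(df.columns);
```

Passing a Buffer, for example the result of fs.readFileSync, in place of the string should behave the same way, since pathOrBody accepts either type.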