Path, buffer, or string, e.g. file.csv.
Optional
options: Partial<ReadCsvOptions>
Maximum number of lines to read to infer the schema. If set to 0, all columns will be read as pl.Utf8. If set to null, a full table scan will be done (slow).
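The three regimes of the inference limit (a positive number, 0, and null) can be illustrated with a minimal sketch. This is not the library's parser: `inferSchema`, `inferCell`, `widen`, and the three type labels are hypothetical names chosen for illustration, and the inference rules are deliberately simplified.

```typescript
// Hypothetical type labels; a real CSV parser supports more data types.
type InferredType = "Int64" | "Float64" | "Utf8";

// Classify a single cell. Empty cells carry no type information.
function inferCell(cell: string): InferredType | null {
  if (cell === "") return null;
  if (/^[+-]?\d+$/.test(cell)) return "Int64";
  if (/^[+-]?(\d+\.\d*|\.\d+)$/.test(cell)) return "Float64";
  return "Utf8";
}

// Combine two observations, widening Int64 -> Float64 -> Utf8.
function widen(a: InferredType | null, b: InferredType | null): InferredType | null {
  if (a === null) return b;
  if (b === null) return a;
  if (a === "Utf8" || b === "Utf8") return "Utf8";
  if (a === "Float64" || b === "Float64") return "Float64";
  return "Int64";
}

// rows: data rows without the header. limit mirrors the option described above:
//   0    -> skip inference, every column is Utf8
//   null -> inspect every row (full scan)
//   n    -> inspect only the first n rows
function inferSchema(rows: string[][], width: number, limit: number | null): InferredType[] {
  if (limit === 0) return new Array<InferredType>(width).fill("Utf8");
  const sample = limit === null ? rows : rows.slice(0, limit);
  const seen: (InferredType | null)[] = new Array<InferredType | null>(width).fill(null);
  for (const row of sample) {
    for (let i = 0; i < width; i++) {
      seen[i] = widen(seen[i] ?? null, inferCell(row[i] ?? ""));
    }
  }
  // Columns with no evidence fall back to Utf8 (an assumption of this sketch).
  return seen.map((t) => t ?? "Utf8");
}

const sampleRows = [["1", "2.5"], ["2", "3"], ["x", "4"]];
const limited = inferSchema(sampleRows, 2, 2);   // only the first two rows are inspected
const full = inferSchema(sampleRows, 2, null);   // every row is inspected
const skipped = inferSchema(sampleRows, 2, 0);   // no inference at all
```

With a limit of 2, the first column looks like an integer column even though the third row holds `x`. That is the trade-off between a small limit (fast, may mis-infer) and a full scan (slow, sees every value).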
After n rows are read from the CSV, it stops reading. During multi-threaded parsing, an upper bound of n rows cannot be guaranteed.
Number of lines to read into the buffer at once. Modify this to change performance.
Indicate whether the first row of the dataset is a header. If set to false, the column names will be set to column_x, x being an enumeration over every column in the dataset.
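As a rough illustration of the header behaviour, the sketch below separates column names from data rows. `defaultColumnNames` and `splitHeader` are hypothetical helpers, not library functions, and the 1-based numbering is an assumption (it matches the `column_1` key used in the null-values example on this page).

```typescript
// Generate placeholder names for a header-less file.
// The 1-based numbering is an assumption of this sketch.
function defaultColumnNames(width: number): string[] {
  return Array.from({ length: width }, (_, i) => `column_${i + 1}`);
}

// Split raw rows into column names and data rows, depending on hasHeader.
function splitHeader(
  rows: string[][],
  hasHeader: boolean,
): { names: string[]; data: string[][] } {
  const [first, ...rest] = rows;
  if (first === undefined) return { names: [], data: [] };
  if (hasHeader) return { names: first, data: rest };
  // Without a header, the first row is ordinary data and names are generated.
  return { names: defaultColumnNames(first.length), data: rows };
}

const raw = [["a", "b"], ["1", "2"]];
const withHeader = splitHeader(raw, true);
const withoutHeader = splitHeader(raw, false);
```

In the header-less case the first row stays in the data, so the resulting frame has one more row than in the header case.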
Try to keep reading lines if some lines yield errors.
After n rows are read from the CSV, it stops reading. During multi-threaded parsing, an upper bound of n rows cannot be guaranteed.
Start reading after startRows position.
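The interaction between a start offset and a row limit can be pictured as a window over the data rows. The sketch below is a single-threaded approximation under stated assumptions: `rowWindow` and its `nRows` parameter are hypothetical names, the offset is assumed to count data rows only, and the exact upper bound shown here is what multi-threaded parsing may not guarantee.

```typescript
// Keep the rows in the window [startRows, startRows + nRows).
// nRows === undefined means "read to the end".
function rowWindow<T>(rows: T[], startRows: number, nRows?: number): T[] {
  const end = nRows === undefined ? rows.length : startRows + nRows;
  return rows.slice(startRows, end);
}

const allRows = ["r0", "r1", "r2", "r3", "r4"];
const bounded = rowWindow(allRows, 2, 2); // skip two rows, then read two
const openEnded = rowWindow(allRows, 3);  // skip three rows, read the rest
```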
Indices of columns to select. Note that column indices start at zero.
Character to use as delimiter in the file.
Columns to select.
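Index-based and name-based column selection differ only in how the target columns are identified. The following sketch shows both on already-parsed rows. `selectColumns` and the `projection` and `columns` keys are hypothetical names used for illustration, not confirmed option names.

```typescript
// Select columns either by zero-based index (projection) or by name (columns).
function selectColumns(
  names: string[],
  rows: string[][],
  selection: { projection?: number[]; columns?: string[] },
): { names: string[]; rows: string[][] } {
  let indices: number[];
  if (selection.projection !== undefined) {
    indices = selection.projection;
  } else if (selection.columns !== undefined) {
    // Translate each requested name into its zero-based position.
    indices = selection.columns.map((name) => {
      const idx = names.indexOf(name);
      if (idx === -1) throw new Error(`unknown column: ${name}`);
      return idx;
    });
  } else {
    indices = names.map((_, i) => i); // no selection: keep everything
  }
  return {
    // Out-of-range indices yield empty strings in this sketch.
    names: indices.map((i) => names[i] ?? ""),
    rows: rows.map((row) => indices.map((i) => row[i] ?? "")),
  };
}

const header = ["a", "b", "c"];
const body = [["1", "2", "3"]];
const byIndex = selectColumns(header, body, { projection: [0, 2] });
const byName = selectColumns(header, body, { columns: ["c", "a"] });
```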
Make sure that all columns are contiguous in memory by aggregating the chunks into a single array.
Allowed encodings: utf8, utf8-lossy. Lossy means that invalid utf8 values are replaced with the � character.
Number of threads to use in CSV parsing. Defaults to the number of physical CPUs of your system.
Overwrite the dtypes during inference.
Set the CSV file's schema. This only accepts datatypes that are implemented in the csv parser and expects a complete Schema.
Reduce memory usage at the expense of performance.
Character that indicates the start of a comment line, for instance '#'.
Character used for CSV quoting, default = '"'. Set to null to turn off special handling and escaping of quotes.
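To show what turning quote handling off means in practice, here is a minimal field splitter. It is a sketch, not the library's tokenizer: `splitLine` is a hypothetical function, and treating a doubled quote as an escaped quote is an assumption borrowed from common CSV practice.

```typescript
// Split one CSV line into fields.
// quoteChar === null disables quote handling: quotes become ordinary characters.
function splitLine(line: string, sep: string, quoteChar: string | null): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (quoteChar !== null && ch === quoteChar) {
      if (inQuotes && line.charAt(i + 1) === quoteChar) {
        current += quoteChar; // doubled quote inside a quoted field -> literal quote
        i++;
      } else {
        inQuotes = !inQuotes; // opening or closing quote
      }
    } else if (ch === sep && !inQuotes) {
      fields.push(current); // separator outside quotes ends the field
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

const quoted = splitLine('a,"b,c",d', ",", '"');     // quotes protect the inner comma
const unquoted = splitLine('a,"b,c",d', ",", null);  // quotes are plain characters
const escaped = splitLine('"say ""hi""",x', ",", '"');
```

With quoting enabled, the separator inside the quoted field is preserved as data. With `null`, the same line splits into four fields and the quote characters remain in the values.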
Values to interpret as null values. You can provide a
- string -> all values encountered equal to this string will be null
- Array<string> -> a null value per column
- Record<string,string> -> an object or map that maps a column name to a null value string, e.g. {"column_1": "0"}
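The three accepted shapes listed above can be sketched as a single predicate that decides whether a cell becomes null. `NullValues` and `isNull` are hypothetical names, and the per-position reading of the array form follows the description on this page rather than any verified library behaviour.

```typescript
type NullValues = string | string[] | Record<string, string>;

// Decide whether a cell should become null, given its column position and name.
function isNull(
  cell: string,
  columnIndex: number,
  columnName: string,
  nullValues: NullValues,
): boolean {
  if (typeof nullValues === "string") {
    return cell === nullValues; // one marker shared by every column
  }
  if (Array.isArray(nullValues)) {
    return cell === nullValues[columnIndex]; // one marker per column, by position
  }
  // Object form: only the listed columns have a marker.
  return (
    Object.prototype.hasOwnProperty.call(nullValues, columnName) &&
    cell === nullValues[columnName]
  );
}

const single = isNull("NA", 0, "a", "NA");
const perColumnMiss = isNull("NA", 1, "b", ["NA", "-"]);
const perColumnHit = isNull("-", 1, "b", ["NA", "-"]);
const mappedHit = isNull("0", 0, "column_1", { column_1: "0" });
const mappedMiss = isNull("0", 1, "column_2", { column_1: "0" });
```

The object form is the most selective: a `0` in `column_1` is treated as null, while the same `0` in any other column is kept as data.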
Whether to attempt to parse dates.
DataFrame
Read a CSV file or string into a DataFrame.