Reading & Writing
Polars supports reading and writing all common file formats (e.g. CSV, JSON, Parquet), cloud storage (S3, Azure Blob, BigQuery), and databases (e.g. Postgres, MySQL). The examples below cover the most common file formats and all use the following DataFrame:
import polars as pl
from datetime import datetime

df = pl.DataFrame(
    {
        "integer": [1, 2, 3],
        "date": [
            datetime(2022, 1, 1),
            datetime(2022, 1, 2),
            datetime(2022, 1, 3),
        ],
        "float": [4.0, 5.0, 6.0],
    }
)

print(df)
use std::fs::File;
use chrono::prelude::*;
use polars::prelude::*;

let mut df: DataFrame = df!(
    "integer" => &[1, 2, 3],
    "date" => &[
        NaiveDate::from_ymd_opt(2022, 1, 1).unwrap().and_hms_opt(0, 0, 0).unwrap(),
        NaiveDate::from_ymd_opt(2022, 1, 2).unwrap().and_hms_opt(0, 0, 0).unwrap(),
        NaiveDate::from_ymd_opt(2022, 1, 3).unwrap().and_hms_opt(0, 0, 0).unwrap(),
    ],
    "float" => &[4.0, 5.0, 6.0]
)
.expect("should not fail");
println!("{}", df);
const pl = require("nodejs-polars");

// JavaScript months are zero-based, so 0 is January.
let df = pl.DataFrame({
  integer: [1, 2, 3],
  date: [
    new Date(2022, 0, 1, 0, 0),
    new Date(2022, 0, 2, 0, 0),
    new Date(2022, 0, 3, 0, 0),
  ],
  float: [4.0, 5.0, 6.0],
});
console.log(df);
shape: (3, 3)
┌─────────┬─────────────────────┬───────┐
│ integer ┆ date                ┆ float │
│ ---     ┆ ---                 ┆ ---   │
│ i64     ┆ datetime[μs]        ┆ f64   │
╞═════════╪═════════════════════╪═══════╡
│ 1       ┆ 2022-01-01 00:00:00 ┆ 4.0   │
│ 2       ┆ 2022-01-02 00:00:00 ┆ 5.0   │
│ 3       ┆ 2022-01-03 00:00:00 ┆ 6.0   │
└─────────┴─────────────────────┴───────┘
CSV
Polars has its own fast CSV reader with many configuration options.
CsvReader · CsvWriter · Available on feature csv
let mut file = File::create("output.csv").expect("could not create file");
CsvWriter::new(&mut file).has_header(true).with_delimiter(b',').finish(&mut df)?;
let df_csv = CsvReader::from_path("output.csv")?.infer_schema(None).has_header(true).finish()?;
println!("{}", df_csv);
shape: (3, 3)
┌─────────┬────────────────────────────┬───────┐
│ integer ┆ date                       ┆ float │
│ ---     ┆ ---                        ┆ ---   │
│ i64     ┆ str                        ┆ f64   │
╞═════════╪════════════════════════════╪═══════╡
│ 1       ┆ 2022-01-01T00:00:00.000000 ┆ 4.0   │
│ 2       ┆ 2022-01-02T00:00:00.000000 ┆ 5.0   │
│ 3       ┆ 2022-01-03T00:00:00.000000 ┆ 6.0   │
└─────────┴────────────────────────────┴───────┘
As we can see above, Polars read the datetimes back as strings. We can tell Polars to parse dates when reading the CSV, so that the date column becomes a datetime again:
df_csv = pl.read_csv("output.csv", try_parse_dates=True)
print(df_csv)
CsvReader · Available on feature csv
let mut file = File::create("output.csv").expect("could not create file");
CsvWriter::new(&mut file).has_header(true).with_delimiter(b',').finish(&mut df)?;
let df_csv = CsvReader::from_path("output.csv")?.infer_schema(None).has_header(true).with_parse_dates(true).finish()?;
println!("{}", df_csv);
let df_csv = pl.readCSV("output.csv", { parseDates: true });
console.log(df_csv);
shape: (3, 3)
┌─────────┬─────────────────────┬───────┐
│ integer ┆ date                ┆ float │
│ ---     ┆ ---                 ┆ ---   │
│ i64     ┆ datetime[μs]        ┆ f64   │
╞═════════╪═════════════════════╪═══════╡
│ 1       ┆ 2022-01-01 00:00:00 ┆ 4.0   │
│ 2       ┆ 2022-01-02 00:00:00 ┆ 5.0   │
│ 3       ┆ 2022-01-03 00:00:00 ┆ 6.0   │
└─────────┴─────────────────────┴───────┘
JSON
df.write_json("output.json")
df_json = pl.read_json("output.json")
print(df_json)
JsonReader · JsonWriter · Available on feature json
let mut file = File::create("output.json").expect("could not create file");
JsonWriter::new(&mut file).finish(&mut df)?;
let f = File::open("output.json")?;
let df_json = JsonReader::new(f).with_json_format(JsonFormat::JsonLines).finish()?;
println!("{}", df_json);
shape: (3, 3)
┌─────────┬─────────────────────┬───────┐
│ integer ┆ date                ┆ float │
│ ---     ┆ ---                 ┆ ---   │
│ i64     ┆ datetime[μs]        ┆ f64   │
╞═════════╪═════════════════════╪═══════╡
│ 1       ┆ 2022-01-01 00:00:00 ┆ 4.0   │
│ 2       ┆ 2022-01-02 00:00:00 ┆ 5.0   │
│ 3       ┆ 2022-01-03 00:00:00 ┆ 6.0   │
└─────────┴─────────────────────┴───────┘
Parquet
df.write_parquet("output.parquet")
df_parquet = pl.read_parquet("output.parquet")
print(df_parquet)
ParquetReader · ParquetWriter · Available on feature parquet
let mut file = File::create("output.parquet").expect("could not create file");
ParquetWriter::new(&mut file).finish(&mut df)?;
let f = File::open("output.parquet")?;
let df_parquet = ParquetReader::new(f).finish()?;
println!("{}", df_parquet);
df.writeParquet("output.parquet");
let df_parquet = pl.readParquet("output.parquet");
console.log(df_parquet);
shape: (3, 3)
┌─────────┬─────────────────────┬───────┐
│ integer ┆ date                ┆ float │
│ ---     ┆ ---                 ┆ ---   │
│ i64     ┆ datetime[μs]        ┆ f64   │
╞═════════╪═════════════════════╪═══════╡
│ 1       ┆ 2022-01-01 00:00:00 ┆ 4.0   │
│ 2       ┆ 2022-01-02 00:00:00 ┆ 5.0   │
│ 3       ┆ 2022-01-03 00:00:00 ┆ 6.0   │
└─────────┴─────────────────────┴───────┘
For more examples and other data formats, see the IO section of the User Guide.