Reading & writing
Polars supports reading and writing all common file formats (e.g. CSV, JSON, Parquet), cloud storage (S3, Azure Blob, BigQuery) and databases (e.g. Postgres, MySQL). The following examples show how to work with the most common file formats, using this dataframe:
import polars as pl
from datetime import datetime
df = pl.DataFrame(
{
"integer": [1, 2, 3],
"date": [
datetime(2022, 1, 1),
datetime(2022, 1, 2),
datetime(2022, 1, 3),
],
"float": [4.0, 5.0, 6.0],
}
)
print(df)
use chrono::prelude::*;
use polars::prelude::*; // brings DataFrame, the df! macro and the readers/writers into scope
use std::fs::File;
let mut df: DataFrame = df!(
"integer" => &[1, 2, 3],
"date" => &[
NaiveDate::from_ymd_opt(2022, 1, 1).unwrap().and_hms_opt(0, 0, 0).unwrap(),
NaiveDate::from_ymd_opt(2022, 1, 2).unwrap().and_hms_opt(0, 0, 0).unwrap(),
NaiveDate::from_ymd_opt(2022, 1, 3).unwrap().and_hms_opt(0, 0, 0).unwrap(),
],
"float" => &[4.0, 5.0, 6.0]
)
.expect("should not fail");
println!("{}", df);
shape: (3, 3)
┌─────────┬─────────────────────┬───────┐
│ integer ┆ date ┆ float │
│ --- ┆ --- ┆ --- │
│ i64 ┆ datetime[μs] ┆ f64 │
╞═════════╪═════════════════════╪═══════╡
│ 1 ┆ 2022-01-01 00:00:00 ┆ 4.0 │
│ 2 ┆ 2022-01-02 00:00:00 ┆ 5.0 │
│ 3 ┆ 2022-01-03 00:00:00 ┆ 6.0 │
└─────────┴─────────────────────┴───────┘
CSV
Polars has its own fast implementation for CSV reading, with many flexible configuration options.
df.write_csv("docs/data/output.csv")
df_csv = pl.read_csv("docs/data/output.csv")
print(df_csv)
CsvReader · CsvWriter · Available on feature csv
let mut file = File::create("docs/data/output.csv").expect("could not create file");
CsvWriter::new(&mut file)
.has_header(true)
.with_delimiter(b',')
.finish(&mut df)?; // propagate write errors instead of silently dropping the Result
let df_csv = CsvReader::from_path("docs/data/output.csv")?
.infer_schema(None)
.has_header(true)
.finish()?;
println!("{}", df_csv);
shape: (3, 3)
┌─────────┬────────────────────────────┬───────┐
│ integer ┆ date ┆ float │
│ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ f64 │
╞═════════╪════════════════════════════╪═══════╡
│ 1 ┆ 2022-01-01T00:00:00.000000 ┆ 4.0 │
│ 2 ┆ 2022-01-02T00:00:00.000000 ┆ 5.0 │
│ 3 ┆ 2022-01-03T00:00:00.000000 ┆ 6.0 │
└─────────┴────────────────────────────┴───────┘
As we can see above, Polars read the datetimes back as strings. We can tell Polars to parse dates when reading the CSV, so that the date column becomes a datetime. The example can be found below:
df_csv = pl.read_csv("docs/data/output.csv", try_parse_dates=True)
print(df_csv)
CsvReader · Available on feature csv
let mut file = File::create("docs/data/output.csv").expect("could not create file");
CsvWriter::new(&mut file)
.has_header(true)
.with_delimiter(b',')
.finish(&mut df)?; // propagate write errors instead of silently dropping the Result
let df_csv = CsvReader::from_path("docs/data/output.csv")?
.infer_schema(None)
.has_header(true)
.with_parse_dates(true)
.finish()?;
println!("{}", df_csv);
shape: (3, 3)
┌─────────┬─────────────────────┬───────┐
│ integer ┆ date ┆ float │
│ --- ┆ --- ┆ --- │
│ i64 ┆ datetime[μs] ┆ f64 │
╞═════════╪═════════════════════╪═══════╡
│ 1 ┆ 2022-01-01 00:00:00 ┆ 4.0 │
│ 2 ┆ 2022-01-02 00:00:00 ┆ 5.0 │
│ 3 ┆ 2022-01-03 00:00:00 ┆ 6.0 │
└─────────┴─────────────────────┴───────┘
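The date column came back as str in the first read because CSV is plain text and stores no type information, so a reader has to infer each column's type or be told what it is. As a rough illustration of that point, the sketch below uses only Python's standard csv module rather than Polars. The two sample rows and all variable names are made up for the example.

```python
import csv
import io
from datetime import datetime

# Two illustrative rows with the same column names as the dataframe above.
rows = [
    {"integer": 1, "date": datetime(2022, 1, 1), "float": 4.0},
    {"integer": 2, "date": datetime(2022, 1, 2), "float": 5.0},
]

# Write to an in-memory text buffer; every value is serialised as text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["integer", "date", "float"])
writer.writeheader()
writer.writerows(rows)

# Read it back: the csv module returns every field as a plain string.
buffer.seek(0)
raw = list(csv.DictReader(buffer))
raw_types = {name: type(value).__name__ for name, value in raw[0].items()}
print(raw_types)

# Recovering a datetime requires an explicit parsing step.
parsed_date = datetime.fromisoformat(raw[0]["date"])
print(type(parsed_date).__name__)
```

In the Polars example above, try_parse_dates=True asks the reader to attempt this parsing step for you while the file is being read.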
JSON
df.write_json("docs/data/output.json")
df_json = pl.read_json("docs/data/output.json")
print(df_json)
JsonReader · JsonWriter · Available on feature json
let mut file = File::create("docs/data/output.json").expect("could not create file");
JsonWriter::new(&mut file).finish(&mut df)?; // propagate write errors
let mut f = File::open("docs/data/output.json")?;
let df_json = JsonReader::new(f)
.with_json_format(JsonFormat::JsonLines)
.finish()?;
println!("{}", df_json);
shape: (3, 3)
┌─────────┬─────────────────────┬───────┐
│ integer ┆ date ┆ float │
│ --- ┆ --- ┆ --- │
│ i64 ┆ datetime[μs] ┆ f64 │
╞═════════╪═════════════════════╪═══════╡
│ 1 ┆ 2022-01-01 00:00:00 ┆ 4.0 │
│ 2 ┆ 2022-01-02 00:00:00 ┆ 5.0 │
│ 3 ┆ 2022-01-03 00:00:00 ┆ 6.0 │
└─────────┴─────────────────────┴───────┘
Parquet
df.write_parquet("docs/data/output.parquet")
df_parquet = pl.read_parquet("docs/data/output.parquet")
print(df_parquet)
ParquetReader · ParquetWriter · Available on feature parquet
let mut file = File::create("docs/data/output.parquet").expect("could not create file");
ParquetWriter::new(&mut file).finish(&mut df)?; // propagate write errors
let mut f = File::open("docs/data/output.parquet")?;
let df_parquet = ParquetReader::new(f).finish()?;
println!("{}", df_parquet);
shape: (3, 3)
┌─────────┬─────────────────────┬───────┐
│ integer ┆ date ┆ float │
│ --- ┆ --- ┆ --- │
│ i64 ┆ datetime[μs] ┆ f64 │
╞═════════╪═════════════════════╪═══════╡
│ 1 ┆ 2022-01-01 00:00:00 ┆ 4.0 │
│ 2 ┆ 2022-01-02 00:00:00 ┆ 5.0 │
│ 3 ┆ 2022-01-03 00:00:00 ┆ 6.0 │
└─────────┴─────────────────────┴───────┘
For more examples and other data formats, see the IO section of the User Guide.