Blazingly Fast DataFrame Library
Polars is a highly performant DataFrame library for manipulating structured data. The core is written in Rust, but the library is also available in Python. Its key features are:
- Fast: Polars is written from the ground up, designed close to the machine and without external dependencies.
- I/O: First-class support for all common data storage layers: local, cloud storage, and databases.
- Easy to use: Write your queries the way they were intended. Polars, internally, will determine the most efficient way to execute using its query optimizer.
- Out of Core: Polars supports out-of-core data transformation with its streaming API, allowing you to process your results without requiring all your data to be in memory at the same time.
- Parallel: Polars fully utilises the power of your machine by dividing the workload among the available CPU cores without any additional configuration.
- Vectorized Query Engine: Polars uses Apache Arrow, a columnar data format, to process your queries in a vectorized manner. It uses SIMD to optimize CPU usage.
Polars is very fast, and in fact is one of the best performing solutions available. See the results in h2oai's db-benchmark, revived by the DuckDB project.
Polars TPC-H Benchmark results are now available on the official website.
import polars as pl

q = (
    pl.scan_csv("docs/data/iris.csv")
    .filter(pl.col("sepal_length") > 5)
    .group_by("species")
    .agg(pl.all().sum())
)

df = q.collect()

use polars::prelude::*;

let q = LazyCsvReader::new("docs/data/iris.csv")
    .has_header(true)
    .finish()?
    .filter(col("sepal_length").gt(lit(5)))
    .group_by(vec![col("species")])
    .agg([col("*").sum()]);

let df = q.collect()?;
Polars has a very active community with frequent releases (approximately weekly). Below are some of the top contributors to the project:
We appreciate all contributions, from reporting bugs to implementing new features. Read our contributing guide to learn more.
This project is licensed under the terms of the MIT license.