Polars

Blazingly Fast DataFrame Library

Polars is a highly performant DataFrame library for manipulating structured data. The core is written in Rust, but the library is also available in Python. Its key features are:

  • Fast: Polars is written from the ground up, designed close to the machine and without external dependencies.
  • I/O: First-class support for all common data storage layers: local, cloud storage & databases.
  • Easy to use: Write your queries the way they were intended. Polars, internally, will determine the most efficient way to execute using its query optimizer.
  • Out of Core: Polars supports out of core data transformation with its streaming API, allowing you to process your results without requiring all your data to be in memory at the same time.
  • Parallel: Polars fully utilises the power of your machine by dividing the workload among the available CPU cores without any additional configuration.
  • Vectorized Query Engine: Polars uses Apache Arrow, a columnar data format, to process your queries in a vectorized manner. It uses SIMD to optimize CPU usage.

About this guide

The Polars user guide is intended to live alongside the API documentation. Its purpose is to explain to (new) users how to use Polars and to provide meaningful examples. The guide is split into two parts:

  • Getting Started: A 10-minute helicopter view of the library and its primary function.
  • User Guide: A detailed explanation of how the library is set up and how to use it most effectively.

If you are looking for details on a specific level / object, it is probably best to go to the API documentation: Python | Rust.

Performance 🚀 🚀

Polars is very fast, and in fact is one of the best performing solutions available. See the results in h2oai's db-benchmark, revived by the DuckDB project.

Polars TPC-H benchmark results are now available on the official website.

Example

Python (scan_csv, filter, group_by, collect)

import polars as pl

q = (
    pl.scan_csv("docs/data/iris.csv")
    .filter(pl.col("sepal_length") > 5)
    .group_by("species")
    .agg(pl.all().sum())
)

df = q.collect()

Rust (LazyCsvReader, filter, group_by, collect; available on features csv and streaming)

use polars::prelude::*;

let q = LazyCsvReader::new("docs/data/iris.csv")
    .has_header(true)
    .finish()?
    .filter(col("sepal_length").gt(lit(5)))
    .group_by(vec![col("species")])
    .agg([col("*").sum()]);

// collect() returns a Result; `?` propagates the error like finish()? above.
let df = q.collect()?;

Community

Polars has a very active community with frequent releases (approximately weekly). Below are some of the top contributors to the project:

ritchie46 stinodego alexander-beedie MarcoGorelli zundertj ghuls universalmind303 reswqa orlp matteosantama Dandandan magarick mcrumiller ibENPC jorgecarleitao moritzwilksch marcvanheerden borchero cjermain jonashaag josh ryanrussell cnpryer marioloko thatlittleboy illumination-k jakob-keller braaannigan c-peters mhconradt rben01 sorhawell chitralverma YuRiTan cmdlineluser elbaro messense nickray adamgreg CloseChoice paq olivier-lacroix owrior romanovacca avimallu SeanTroyUWO fsimkovic slonik-az nmandery

Contribute

Thanks for taking the time to contribute! We appreciate all contributions, from reporting bugs to implementing new features. If you're unclear on how to proceed, read our contribution guide or contact us on Discord.

License

This project is licensed under the terms of the MIT license.