Polars SQL
Starting the SQL Context
You can query a Polars
LazyFrame
with SQL.
The first step is to initialize a SQL context, and register a LazyFrame
with it.
Let's load some data and initialize the SQL context:
import polars as pl
# convert 'pokemon' into a Lazyframe by calling the .lazy() method
pokemon = pl.read_csv(
"https://gist.githubusercontent.com/ritchie46/cac6b337ea52281aa23c049250a4ff03/raw/89a957ff3919d90e6ef2d34235e6bf22304f3366/pokemon.csv"
).lazy()
# initialize the SQL context and register the lazyframe
sql = pl.SQLContext()
sql.register("pokemon", pokemon)
Polars supports a single SQL context per thread, and the registered dataframe should be a LazyFrame
.
You can call the register function multiple time for each of your LazyFrame.
Running your SQL queries 🚀🚀
You run your SQL queries with SQLContext.query
.
out = sql.query(
"""
SELECT
"Type 1",
COUNT(DISTINCT "Type 2") AS count_type_2,
AVG(Attack) AS avg_attack_by_type,
MAX(Speed) AS max_speed
FROM pokemon
GROUP BY "Type 1"
"""
)
shape: (15, 4)
┌──────────┬──────────────┬────────────────────┬───────────┐
│ Type 1 ┆ count_type_2 ┆ avg_attack_by_type ┆ max_speed │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ u32 ┆ f64 ┆ i64 │
╞══════════╪══════════════╪════════════════════╪═══════════╡
│ Electric ┆ 3 ┆ 62.0 ┆ 140 │
│ Normal ┆ 3 ┆ 70.625 ┆ 121 │
│ Ground ┆ 2 ┆ 81.875 ┆ 120 │
│ … ┆ … ┆ … ┆ … │
│ Ghost ┆ 1 ┆ 53.75 ┆ 130 │
│ Rock ┆ 3 ┆ 87.5 ┆ 150 │
│ Fairy ┆ 1 ┆ 57.5 ┆ 60 │
└──────────┴──────────────┴────────────────────┴───────────┘