Lazy API
The Lazy API page is under construction.
To demonstrate Polars' lazy capabilities, we'll explore two medium-large datasets of usernames:
the Reddit usernames dataset, containing 69+ million rows,
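import polars as pl
from ..paths import DATA_DIR  # relative import: works only when run as part of a package

# Read only the first 10 rows, then show the first 5 with head().
dataset = pl.read_csv(f"{DATA_DIR}/reddit.csv", n_rows=10)
print(dataset.head())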
shape: (5, 6)
┌─────┬──────────────────────────┬─────────────┬────────────┬───────────────┬────────────┐
│ id  ┆ name                     ┆ created_utc ┆ updated_on ┆ comment_karma ┆ link_karma │
│ --- ┆ ---                      ┆ ---         ┆ ---        ┆ ---           ┆ ---        │
│ i64 ┆ str                      ┆ i64         ┆ i64        ┆ i64           ┆ i64        │
╞═════╪══════════════════════════╪═════════════╪════════════╪═══════════════╪════════════╡
│ 1   ┆ truman48lamb_jasonbroken ┆ 1397113470  ┆ 1536527864 ┆ 0             ┆ 0          │
│ 2   ┆ johnethen06_jasonbroken  ┆ 1397113483  ┆ 1536527864 ┆ 0             ┆ 0          │
│ 3   ┆ yaseinrez_jasonbroken    ┆ 1397113483  ┆ 1536527864 ┆ 0             ┆ 1          │
│ 4   ┆ Valve92_jasonbroken      ┆ 1397113503  ┆ 1536527864 ┆ 0             ┆ 0          │
│ 5   ┆ srbhuyan_jasonbroken     ┆ 1397113506  ┆ 1536527864 ┆ 0             ┆ 0          │
└─────┴──────────────────────────┴─────────────┴────────────┴───────────────┴────────────┘
and the Runescape username dataset, containing 55+ million records.
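import polars as pl
from ..paths import DATA_DIR  # relative import: works only when run as part of a package

# The file has no header row, so Polars names the single column column_1.
dataset = pl.read_csv(f"{DATA_DIR}/runescape.csv", has_header=False, n_rows=10)
print(dataset.head())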
shape: (5, 1)
┌─────────────┐
│ column_1    │
│ ---         │
│ str         │
╞═════════════╡
│ a000        │
│ a0000       │
│ a000000     │
│ a0000000    │
│ a0000000000 │
└─────────────┘