Timestamp parsing

Polars offers 4 time datatypes. Besides pl.Duration, which stores the difference between two points in time, these are:

  • pl.Date, to be used for date objects: the number of days since the UNIX epoch as a 32 bit signed integer.
  • pl.Datetime, to be used for datetime objects: the number of nanoseconds since the UNIX epoch as a 64 bit signed integer.
  • pl.Time, encoded as the number of nanoseconds since midnight.
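To make these encodings concrete, the sketch below computes the same kind of integers using only Python's standard library, not Polars itself. The helper names days_since_epoch and nanos_since_midnight are hypothetical and exist only for this illustration.

```python
from datetime import date, time

UNIX_EPOCH = date(1970, 1, 1)


def days_since_epoch(d: date) -> int:
    """Number of days between the UNIX epoch and d (the pl.Date encoding described above)."""
    return (d - UNIX_EPOCH).days


def nanos_since_midnight(t: time) -> int:
    """Number of nanoseconds between midnight and t (the pl.Time encoding described above)."""
    seconds = t.hour * 3600 + t.minute * 60 + t.second
    # datetime.time only carries microsecond precision, so scale it up to nanoseconds.
    return seconds * 1_000_000_000 + t.microsecond * 1_000


print(days_since_epoch(date(2020, 1, 2)))   # 18263
print(nanos_since_midnight(time(0, 0, 1)))  # 1000000000
```

Storing plain integers like these is what lets date arithmetic and comparisons run as ordinary integer operations.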

Polars string (pl.Utf8) columns can be parsed as any of pl.Date, pl.Datetime, or pl.Time. You can let Polars try to guess the format of the date[time], or explicitly provide a fmt rule (a format string).

For instance (see the chrono strftime documentation for a comprehensive list):

  • "%Y-%m-%d" for "2020-12-31"
  • "%Y/%B/%d" for "2020/December/31"
  • "%B %y" for "December 20"
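The three patterns above can be sanity-checked without Polars, since these particular specifiers (%Y, %y, %m, %B, %d) mean the same thing in Python's standard-library datetime.strptime. This is only an illustration of the specifiers, under the assumption of the default (English) locale for month names. Note that the standard library fills in day 1 when the pattern has no day component; how Polars treats such a pattern is not checked here.

```python
from datetime import datetime

# (format string, example input) pairs taken from the list above.
examples = [
    ("%Y-%m-%d", "2020-12-31"),
    ("%Y/%B/%d", "2020/December/31"),
    ("%B %y", "December 20"),
]

# Parse each example with its matching format string.
parsed = [datetime.strptime(text, fmt) for fmt, text in examples]

for (fmt, text), value in zip(examples, parsed):
    print(fmt, "|", text, "->", value.date())
```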

Below is a quick example:

import polars as pl

dataset = pl.DataFrame({"date": ["2020-01-02", "2020-01-03", "2020-01-04"], "index": [1, 2, 3]})

# Parse the string column into pl.Date using an explicit format string.
q = dataset.lazy().with_columns(pl.col("date").str.strptime(pl.Date, "%Y-%m-%d"))

df = q.collect()

returning:

shape: (3, 2)
┌────────────┬───────┐
│ date       ┆ index │
│ ---        ┆ ---   │
│ date       ┆ i64   │
╞════════════╪═══════╡
│ 2020-01-02 ┆ 1     │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 2020-01-03 ┆ 2     │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 2020-01-04 ┆ 3     │
└────────────┴───────┘

All datetime functionality is available in the dt namespace.