polars.DataFrame.upsample
- DataFrame.upsample(time_column: str, *, every: str | timedelta, offset: str | timedelta | None = None, by: str | Sequence[str] | None = None, maintain_order: bool = False) → Self
Upsample a DataFrame at a regular frequency.
- Parameters:
- time_column
Time column used to determine the date range. This column must be sorted for the output to be meaningful.
- every
Interval of the upsampled output; a new interval starts every `every` duration.
- offset
Shift the start of the date range by this offset.
- by
First group by these columns, then upsample each group.
- maintain_order
Keep the ordering predictable. This is slower.
- The `every` and `offset` arguments are created with the following string language:
- - 1ns (1 nanosecond)
- - 1us (1 microsecond)
- - 1ms (1 millisecond)
- - 1s (1 second)
- - 1m (1 minute)
- - 1h (1 hour)
- - 1d (1 day)
- - 1w (1 week)
- - 1mo (1 calendar month)
- - 1y (1 calendar year)
- - 1i (1 index count)
- Or combine them:
- "3d12h4m25s" # 3 days, 12 hours, 4 minutes, and 25 seconds
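To make the combined form concrete, here is a minimal sketch in plain Python that splits such a string into (count, unit) pairs. The helper `split_duration` is hypothetical and is not part of Polars; it only covers the unit suffixes listed above and ignores signs and any other suffixes.

```python
import re

# Unit suffixes from the list above. Two-letter suffixes come first so that
# "mo", "ms" and "ns" are not misread as "m" or "s".
_UNITS = ("ns", "us", "ms", "mo", "s", "m", "h", "d", "w", "y", "i")
_TOKEN = re.compile(r"(\d+)(" + "|".join(_UNITS) + ")")


def split_duration(text: str) -> list[tuple[int, str]]:
    """Split a string such as "3d12h4m25s" into (count, unit) pairs.

    Hypothetical helper for illustration only; not Polars' own parser.
    """
    parts = []
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unrecognised duration at position {pos}: {text!r}")
        parts.append((int(match.group(1)), match.group(2)))
        pos = match.end()
    if not parts:
        raise ValueError("empty duration string")
    return parts


print(split_duration("3d12h4m25s"))
# [(3, 'd'), (12, 'h'), (4, 'm'), (25, 's')]
```

The two-letter suffixes are tried before the one-letter ones, so "1mo" is read as one calendar month rather than one minute followed by a stray character.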
- Suffix with `"_saturating"` to indicate that dates too large for their month should saturate at the last valid date of that month (e.g. 2022-02-29 -> 2022-02-28) instead of erroring.
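The saturating behaviour can be pictured with a small sketch using only the Python standard library. The function `add_months_saturating` is a hypothetical name chosen for illustration; it shows the clamping idea and is not how Polars implements it.

```python
import calendar
from datetime import date


def add_months_saturating(d: date, months: int) -> date:
    """Shift `d` by `months` calendar months, clamping the day to month end.

    Illustrative only: a date that would not exist (such as 2022-02-29)
    is replaced by the last valid day of the target month.
    """
    # Count months from year 0 so that year rollover is handled by divmod.
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return d.replace(year=year, month=month, day=min(d.day, last_day))


print(add_months_saturating(date(2022, 1, 29), 1))
# 2022-02-28
```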
- Returns:
- DataFrame
The result is sorted by `time_column`. If `by` columns are passed, it is sorted only within each `by` group.
Examples
Upsample a DataFrame by a certain interval.
>>> from datetime import datetime
>>> import polars as pl
>>> df = pl.DataFrame(
...     {
...         "time": [
...             datetime(2021, 2, 1),
...             datetime(2021, 4, 1),
...             datetime(2021, 5, 1),
...             datetime(2021, 6, 1),
...         ],
...         "groups": ["A", "B", "A", "B"],
...         "values": [0, 1, 2, 3],
...     }
... ).set_sorted("time")
>>> df.upsample(
...     time_column="time", every="1mo", by="groups", maintain_order=True
... ).select(pl.all().forward_fill())
shape: (7, 3)
┌─────────────────────┬────────┬────────┐
│ time                ┆ groups ┆ values │
│ ---                 ┆ ---    ┆ ---    │
│ datetime[μs]        ┆ str    ┆ i64    │
╞═════════════════════╪════════╪════════╡
│ 2021-02-01 00:00:00 ┆ A      ┆ 0      │
│ 2021-03-01 00:00:00 ┆ A      ┆ 0      │
│ 2021-04-01 00:00:00 ┆ A      ┆ 0      │
│ 2021-05-01 00:00:00 ┆ A      ┆ 2      │
│ 2021-04-01 00:00:00 ┆ B      ┆ 1      │
│ 2021-05-01 00:00:00 ┆ B      ┆ 1      │
│ 2021-06-01 00:00:00 ┆ B      ┆ 3      │
└─────────────────────┴────────┴────────┘
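As a rough mental model of per-group upsampling, the plain-Python sketch below builds a regular time grid between each group's first and last timestamp and keeps the original row wherever one exists. It is a conceptual illustration, not Polars' implementation. The function `upsample_rows` and its row-as-dict representation are hypothetical. The sketch assumes a fixed-length step (a `timedelta`, so no calendar months), unique timestamps within each group, and that inserted rows carry `None` in every column except the time column.

```python
from datetime import datetime, timedelta


def upsample_rows(rows, time_key, every, by=None):
    """Return rows with placeholder rows added at each `every` step.

    Hypothetical sketch: `rows` is a list of dicts, `every` is a timedelta,
    and `by` is an optional key used to split the rows into groups.
    """
    # Group rows by key; dicts keep insertion order, so groups appear in
    # first-seen order.
    groups = {}
    for row in rows:
        key = row[by] if by is not None else None
        groups.setdefault(key, []).append(row)

    out = []
    for group_rows in groups.values():
        by_time = {row[time_key]: row for row in group_rows}
        current, stop = min(by_time), max(by_time)
        while current <= stop:
            # Keep the original row if present, else emit an all-None row.
            placeholder = {name: None for name in group_rows[0]}
            placeholder[time_key] = current
            out.append(by_time.get(current, placeholder))
            current += every
    return out


rows = [
    {"time": datetime(2021, 2, 1), "groups": "A", "values": 0},
    {"time": datetime(2021, 2, 2), "groups": "B", "values": 1},
    {"time": datetime(2021, 2, 4), "groups": "A", "values": 2},
    {"time": datetime(2021, 2, 3), "groups": "B", "values": 3},
]
for row in upsample_rows(rows, "time", timedelta(days=1), by="groups"):
    print(row["time"].date(), row["groups"], row["values"])
```

In this sketch the rows of each group come out in time order, while the groups themselves follow the order in which they first appear in the input. The placeholder rows are then usually filled in a separate step, as the example above does with `forward_fill`.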