polars.LazyFrame.groupby_dynamic
- LazyFrame.groupby_dynamic(index_column: str, every: str, period: Optional[str] = None, offset: Optional[str] = None, truncate: bool = True, include_boundaries: bool = False, closed: str = 'right', by: Optional[Union[str, List[str], polars.internals.expr.Expr, List[polars.internals.expr.Expr]]] = None) -> polars.internals.lazy_frame.LazyGroupBy[polars.internals.lazy_frame.LDF]
Groups based on a time value (or an index value of type Int32 or Int64). Time windows are calculated and rows are assigned to windows. Unlike a normal groupby, a row can be a member of multiple groups. The time/index window can be seen as a rolling window whose size is determined by dates/times/values instead of slots in the DataFrame.
See also
groupby_rolling
A window is defined by:
every: interval of the window
period: length of the window
offset: offset of the window
The every, period and offset arguments are written in the following string language:
1ns (1 nanosecond)
1us (1 microsecond)
1ms (1 millisecond)
1s (1 second)
1m (1 minute)
1h (1 hour)
1d (1 day)
1w (1 week)
1mo (1 calendar month)
1y (1 calendar year)
1i (1 index count)
Or combine them: “3d12h4m25s” # 3 days, 12 hours, 4 minutes, and 25 seconds
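To make the string language above concrete, here is a minimal sketch of a tokenizer that splits a duration string into (count, unit) pairs. The helper name parse_duration is hypothetical and this is not how polars parses these strings internally; it only illustrates how the listed suffixes combine. Calendar units such as mo and y have no fixed length, so the sketch deliberately stops at tokenizing rather than converting to a single number.

```python
import re

# Unit suffixes listed in the documentation above. Longer suffixes come first
# so that "1mo" is not read as "1m" followed by a stray "o".
_UNITS = ("ns", "us", "ms", "mo", "s", "m", "h", "d", "w", "y", "i")
_TOKEN = re.compile(r"(\d+)(" + "|".join(_UNITS) + r")")


def parse_duration(text):
    """Split a duration string into (count, unit) pairs.

    Hypothetical helper for illustration only; not part of polars.
    """
    pairs = []
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse {text!r} at position {pos}")
        pairs.append((int(match.group(1)), match.group(2)))
        pos = match.end()
    if not pairs:
        raise ValueError("empty duration string")
    return pairs


print(parse_duration("3d12h4m25s"))
```

The call prints the four components of the combined example: 3 days, 12 hours, 4 minutes and 25 seconds.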
In case of a groupby_dynamic on an integer column, the windows are defined by:
“1i” # length 1
“10i” # length 10
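The way every and period interact on an integer index can be sketched in plain Python. This is a simplified model under stated assumptions, not the polars implementation: window k is taken to span from k * every + offset up to that value plus period, the lower bound is included and the upper bound excluded, and offset defaults to 0 here for readability. The function name integer_windows is hypothetical.

```python
def integer_windows(index, every, period=None, offset=0):
    """Assign sorted integer index values to windows.

    Simplified model (an assumption, not polars internals): window k covers
    [k * every + offset, k * every + offset + period).
    """
    if period is None:
        period = every  # documented default: period equals every
    if not index:
        return {}
    lo, hi = index[0], index[-1]
    # Smallest k whose window still reaches past the smallest index value.
    k = (lo - offset - period) // every + 1
    windows = {}
    while k * every + offset <= hi:
        start = k * every + offset
        members = [v for v in index if start <= v < start + period]
        if members:
            windows[(start, start + period)] = members
        k += 1
    return windows


# period > every makes windows overlap, so one row lands in several groups.
for bounds, members in integer_windows([0, 1, 2, 3, 4, 5], every=2, period=4).items():
    print(bounds, members)
```

With every=2 and period=4 the windows overlap, so the value 2 appears in both the window starting at 0 and the window starting at 2. With period left as None the windows tile the index without overlap.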
- Parameters
- index_column
Column used to group based on the time window. Often of type Date/Datetime. This column must be sorted in ascending order; if it is not, the output will not make sense.
In case of a dynamic groupby on indices, the dtype needs to be one of {Int32, Int64}. Note that Int32 gets temporarily cast to Int64, so if performance matters, use an Int64 column.
- every
interval of the window
- period
length of the window; if None, it is equal to ‘every’
- offset
offset of the window; if None and period is None, it is equal to negative every
- truncate
truncate the time value to the window lower bound
- include_boundaries
add the lower and upper bound of the window to the “_lower_bound” and “_upper_bound” columns; this will impact performance because it is harder to parallelize
- closed
Defines which sides of the window interval are closed. Any of {“left”, “right”, “both”, “none”}
- by
Also group by this column/these columns
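The four values accepted by closed can be illustrated with a small membership check. This sketch assumes the usual interval reading of those names (a closed side includes its bound); the helper in_window is hypothetical and is not a polars function.

```python
def in_window(value, lower, upper, closed="right"):
    """Return True if value falls in the window under the given closure.

    Hypothetical helper; assumes the conventional meaning of each option.
    """
    if closed == "left":
        return lower <= value < upper   # lower bound included
    if closed == "right":
        return lower < value <= upper   # upper bound included
    if closed == "both":
        return lower <= value <= upper  # both bounds included
    if closed == "none":
        return lower < value < upper    # both bounds excluded
    raise ValueError(f"unknown closed value: {closed!r}")


# A value sitting exactly on the lower bound of the window (0, 10):
for option in ("left", "right", "both", "none"):
    print(option, in_window(0, 0, 10, closed=option))
```

A value exactly on a boundary is the only case where the choice matters, which is why boundary rows may move between adjacent windows when closed is changed.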