polars.Series.ewm_mean_by

Series.ewm_mean_by(by: str | IntoExpr, *, half_life: str | timedelta) → Series

Calculate time-based exponentially weighted moving average.

Given observations \(x_0, x_1, \ldots, x_{n-1}\) at times \(t_0, t_1, \ldots, t_{n-1}\), the EWMA is calculated as

\[ \begin{aligned} y_0 &= x_0 \\ \alpha_i &= 1 - \exp(-\lambda (t_i - t_{i-1})) \\ y_i &= \alpha_i x_i + (1 - \alpha_i) y_{i-1}; \quad i > 0 \end{aligned} \]

where \(\lambda\) equals \(\ln(2) / \text{half_life}\).
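As an illustration, the recurrence can be sketched in plain Python, taking \(\alpha_i = 1 - \exp(-\lambda (t_i - t_{i-1}))\), which reproduces the values in the Examples section. The function name `ewm_mean_by_sketch` is hypothetical and this is not the Polars implementation. The null handling (a null yields a null and the next time difference is measured from the last non-null observation) is an assumption inferred from the example output.

```python
import math
from datetime import date, timedelta


def ewm_mean_by_sketch(values, times, half_life):
    """Hypothetical reference sketch of the time-based EWMA recurrence."""
    lam = math.log(2) / half_life.total_seconds()  # lambda = ln(2) / half_life
    out = []
    prev_y = None  # last non-null result
    prev_t = None  # time of the last non-null observation
    for x, t in zip(values, times):
        if x is None:
            # Assumption: a null produces a null and leaves the state untouched.
            out.append(None)
            continue
        if prev_y is None:
            y = float(x)  # y_0 = x_0
        else:
            dt = (t - prev_t).total_seconds()
            alpha = 1.0 - math.exp(-lam * dt)
            y = alpha * x + (1.0 - alpha) * prev_y
        out.append(y)
        prev_y, prev_t = y, t
    return out


values = [0, 1, 2, None, 4]
times = [
    date(2020, 1, 1),
    date(2020, 1, 3),
    date(2020, 1, 10),
    date(2020, 1, 15),
    date(2020, 1, 17),
]
result = ewm_mean_by_sketch(values, times, timedelta(days=4))
print([None if y is None else round(y, 6) for y in result])
# [0.0, 0.292893, 1.492474, None, 3.254508]
```

With a 4-day half-life, the 2-day gap between the first two observations gives \(\alpha_1 = 1 - 2^{-1/2} \approx 0.2929\), so the second value is weighted by roughly 29% and the previous average by roughly 71%.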

Parameters:

by

Times to calculate the average by. Should be of Datetime, Date, UInt64, UInt32, Int64, or Int32 data type.

half_life

Time span over which an observation's weight decays to half its value.

Can be created either from a timedelta, or by using the following string language:

1ns (1 nanosecond)
1us (1 microsecond)
1ms (1 millisecond)
1s (1 second)
1m (1 minute)
1h (1 hour)
1d (1 day)
1w (1 week)
1i (1 index count)

Or combine them: "3d12h4m25s" (3 days, 12 hours, 4 minutes, and 25 seconds)

Note that half_life is treated as a constant duration: calendar durations such as months (or even days in the time-zone-aware case) are not supported. Please express your duration in an approximately equivalent number of hours (e.g. '730h' instead of '1mo').

check_sorted

Check whether the by column is sorted. Setting this to False when the column is not actually sorted will lead to incorrect output.
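To make the half_life string language concrete, the sketch below converts such a string to a `timedelta` using only the constant-duration units listed above. The helper `parse_half_life` is hypothetical and is not how Polars parses durations; it only illustrates how combined strings add up and why a calendar unit such as "1mo" is rejected. The "1i" (index count) unit is left out because it is not a time span.

```python
import re
from datetime import timedelta

# Nanoseconds per constant-duration unit.
_NS_PER_UNIT = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3_600 * 1_000_000_000,
    "d": 86_400 * 1_000_000_000,
    "w": 7 * 86_400 * 1_000_000_000,
}
_TOKEN = re.compile(r"(\d+)(ns|us|ms|s|m|h|d|w)")
_WHOLE = re.compile(r"(?:\d+(?:ns|us|ms|s|m|h|d|w))+")


def parse_half_life(text):
    """Hypothetical helper: convert a duration string to a timedelta."""
    if not _WHOLE.fullmatch(text):
        # Calendar units such as "1mo" end up here.
        raise ValueError(f"unsupported duration string: {text!r}")
    total_ns = sum(int(n) * _NS_PER_UNIT[u] for n, u in _TOKEN.findall(text))
    # timedelta has microsecond resolution, so sub-microsecond parts are dropped.
    return timedelta(microseconds=total_ns // 1_000)


print(parse_half_life("3d12h4m25s"))  # 3 days, 12:04:25
print(parse_half_life("730h"))        # 30 days, 10:00:00
```

The second call shows the suggested workaround for months: 730 hours is a fixed span of about 30.4 days.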

Returns:

Series: Float32 if input is Float32, otherwise Float64.

Examples

>>> from datetime import date, timedelta
>>> import polars as pl
>>> df = pl.DataFrame(
...     {
...         "values": [0, 1, 2, None, 4],
...         "times": [
...             date(2020, 1, 1),
...             date(2020, 1, 3),
...             date(2020, 1, 10),
...             date(2020, 1, 15),
...             date(2020, 1, 17),
...         ],
...     }
... ).sort("times")
>>> df["values"].ewm_mean_by(df["times"], half_life="4d")
shape: (5,)
Series: 'values' [f64]
[
        0.0
        0.292893
        1.492474
        null
        3.254508
]