Datetime namespace
List namespace
String namespace
Struct namespace
Clip (limit) the values in an array to a minimum and maximum boundary. Each boundary can be any value that fits in the 64-bit floating point range. Only works for the following dtypes: {Int32, Int64, Float32, Float64, UInt32}. If you want to clip other dtypes, consider writing a when -> then -> otherwise expression.
Minimum value
Maximum value
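The clipping rule can be illustrated without the library. The sketch below is plain TypeScript over a number array; the helper name clipValues is hypothetical and is not part of the polars API.

```typescript
// Hypothetical helper, not part of polars: limits each value to [min, max].
function clipValues(values: number[], min: number, max: number): number[] {
  if (min > max) {
    throw new RangeError("min must not be greater than max");
  }
  // Values below min become min, values above max become max.
  return values.map((v) => Math.min(Math.max(v, min), max));
}

const clipped = clipValues([-50, 5, 50], 1, 10);
console.log(clipped); // [1, 5, 10]
```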
Get the group indexes of the group by operation. Should be used in aggregation context only.
>>> const df = pl.DataFrame(
... {
... "group": [
... "one",
... "one",
... "one",
... "two",
... "two",
... "two",
... ],
... "value": [94, 95, 96, 97, 97, 99],
... }
... )
>>> df.groupBy("group").agg(pl.col("value").aggGroups()).sort({by: "group"})
shape: (2, 2)
┌───────┬───────────┐
│ group ┆ value │
│ --- ┆ --- │
│ str ┆ list[u32] │
╞═══════╪═══════════╡
│ one ┆ [0, 1, 2] │
│ two ┆ [3, 4, 5] │
└───────┴───────────┘
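What "group indexes" means can be sketched in plain TypeScript: for each distinct key, collect the row positions where it occurs. The helper name groupIndexes is hypothetical; this models the result shown above and is not how polars computes it internally.

```typescript
// Hypothetical helper, not part of polars: maps each key to the row indexes where it occurs.
function groupIndexes(keys: string[]): Map<string, number[]> {
  const groups = new Map<string, number[]>();
  keys.forEach((key, row) => {
    const rows = groups.get(key);
    if (rows === undefined) {
      groups.set(key, [row]); // first time this key is seen
    } else {
      rows.push(row);
    }
  });
  return groups;
}

const indexes = groupIndexes(["one", "one", "one", "two", "two", "two"]);
console.log(indexes.get("one")); // [0, 1, 2]
console.log(indexes.get("two")); // [3, 4, 5]
```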
Rename the output of an expression.
new name
> const df = pl.DataFrame({
... "a": [1, 2, 3],
... "b": ["a", "b", null],
... });
> df
shape: (3, 2)
╭─────┬──────╮
│ a ┆ b │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪══════╡
│ 1 ┆ "a" │
├╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2 ┆ "b" │
├╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 3 ┆ null │
╰─────┴──────╯
> df.select([
... pl.col("a").alias("bar"),
... pl.col("b").alias("foo"),
... ])
shape: (3, 2)
╭─────┬──────╮
│ bar ┆ foo │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪══════╡
│ 1 ┆ "a" │
├╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2 ┆ "b" │
├╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 3 ┆ null │
╰─────┴──────╯
Exclude certain columns from a wildcard/regex selection.
You may also use regexes in the exclude list. They must start with ^ and end with $.
Column(s) to exclude from selection
> const df = pl.DataFrame({
... "a": [1, 2, 3],
... "b": ["a", "b", null],
... "c": [null, 2, 1],
... });
> df
shape: (3, 3)
╭─────┬──────┬──────╮
│ a ┆ b ┆ c │
│ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ i64 │
╞═════╪══════╪══════╡
│ 1 ┆ "a" ┆ null │
├╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2 ┆ "b" ┆ 2 │
├╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 3 ┆ null ┆ 1 │
╰─────┴──────┴──────╯
> df.select(
... pl.col("*").exclude("b"),
... );
shape: (3, 2)
╭─────┬──────╮
│ a ┆ c │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪══════╡
│ 1 ┆ null │
├╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2 ┆ 2 │
├╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 3 ┆ 1 │
╰─────┴──────╯
Check if elements of this Series are in the right Series, or List values of the right Series.
Series of primitive type or List type.
Expr that evaluates to a Boolean Series.
> const df = pl.DataFrame({
... "sets": [[1, 2, 3], [1, 2], [9, 10]],
... "optional_members": [1, 2, 3]
... });
> df.select(
... pl.col("optional_members").isIn("sets").alias("contains")
... );
shape: (3, 1)
┌──────────┐
│ contains │
│ --- │
│ bool │
╞══════════╡
│ true │
├╌╌╌╌╌╌╌╌╌╌┤
│ true │
├╌╌╌╌╌╌╌╌╌╌┤
│ false │
└──────────┘
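The example above tests each row's value against the list in the same row. A minimal plain-TypeScript model of that row-wise check follows; the helper name isInRowwise is hypothetical and the sketch does not use the polars API.

```typescript
// Hypothetical helper, not part of polars: checks members[i] against sets[i], row by row.
function isInRowwise(members: number[], sets: number[][]): boolean[] {
  if (members.length !== sets.length) {
    throw new RangeError("members and sets must have the same number of rows");
  }
  return members.map((member, row) =>
    (sets[row] ?? []).some((candidate) => candidate === member),
  );
}

const contains = isInRowwise([1, 2, 3], [[1, 2, 3], [1, 2], [9, 10]]);
console.log(contains); // [true, true, false]
```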
Keep the original root name of the expression.
A groupby aggregation often changes the name of a column. With keepName we can keep the original name of the column.
> const df = pl.DataFrame({
... "a": [1, 2, 3],
... "b": ["a", "b", null],
... });
> df
... .groupBy("a")
... .agg(pl.col("b").list())
... .sort({by:"a"});
shape: (3, 2)
╭─────┬────────────╮
│ a ┆ b_agg_list │
│ --- ┆ --- │
│ i64 ┆ list [str] │
╞═════╪════════════╡
│ 1 ┆ [a] │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2 ┆ [b] │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 3 ┆ [null] │
╰─────┴────────────╯
Keep the original column name:
> df
... .groupBy("a")
... .agg(pl.col("b").list().keepName())
... .sort({by:"a"})
shape: (3, 2)
╭─────┬────────────╮
│ a ┆ b │
│ --- ┆ --- │
│ i64 ┆ list [str] │
╞═════╪════════════╡
│ 1 ┆ [a] │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2 ┆ [b] │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 3 ┆ [null] │
╰─────┴────────────╯
Apply window function over a subgroup.
This is similar to a groupby + aggregation + self join, or to window functions in Postgres.
Column(s) to partition by.
> const df = pl.DataFrame({
... "groups": [1, 1, 2, 2, 1, 2, 3, 3, 1],
... "values": [1, 2, 3, 4, 5, 6, 7, 8, 8],
... });
> df.select(
... pl.col("groups"),
... pl.col("values").sum().over("groups"),
... );
╭────────┬────────╮
│ groups ┆ values │
│ --- ┆ --- │
│ i32 ┆ i32 │
╞════════╪════════╡
│ 1 ┆ 16 │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 1 ┆ 16 │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 2 ┆ 13 │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 2 ┆ 13 │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ ... ┆ ... │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 1 ┆ 16 │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 2 ┆ 13 │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 3 ┆ 15 │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 3 ┆ 15 │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 1 ┆ 16 │
╰────────┴────────╯
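The "groupby + aggregation + self join" description can be sketched in plain TypeScript: first total the values per group, then look each row's group total up again. The helper name sumOver is hypothetical and the code is only a model of the semantics, not the polars implementation.

```typescript
// Hypothetical helper, not part of polars: per-group sum, broadcast back to every row.
function sumOver(groups: number[], values: number[]): number[] {
  if (groups.length !== values.length) {
    throw new RangeError("groups and values must have the same length");
  }
  // Step 1: group by + aggregation.
  const totals = new Map<number, number>();
  groups.forEach((group, row) => {
    totals.set(group, (totals.get(group) ?? 0) + values[row]);
  });
  // Step 2: "self join" the per-group totals back onto the original rows.
  return groups.map((group) => totals.get(group) ?? 0);
}

const windowed = sumOver([1, 1, 2, 2, 1, 2, 3, 3, 1], [1, 2, 3, 4, 5, 6, 7, 8, 8]);
console.log(windowed); // [16, 16, 13, 13, 16, 13, 15, 15, 16]
```

Unlike a plain aggregation, the output keeps one entry per input row.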
Add a prefix to the root column name of the expression.
> const df = pl.DataFrame({
... "A": [1, 2, 3, 4, 5],
... "fruits": ["banana", "banana", "apple", "apple", "banana"],
... "B": [5, 4, 3, 2, 1],
... "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
... });
shape: (5, 4)
╭─────┬──────────┬─────┬──────────╮
│ A ┆ fruits ┆ B ┆ cars │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ i64 ┆ str │
╞═════╪══════════╪═════╪══════════╡
│ 1 ┆ "banana" ┆ 5 ┆ "beetle" │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 2 ┆ "banana" ┆ 4 ┆ "audi" │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 3 ┆ "apple" ┆ 3 ┆ "beetle" │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 4 ┆ "apple" ┆ 2 ┆ "beetle" │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 5 ┆ "banana" ┆ 1 ┆ "beetle" │
╰─────┴──────────┴─────┴──────────╯
> df.select(
... pl.col("*").reverse().prefix("reverse_"),
... )
shape: (5, 4)
╭───────────┬────────────────┬───────────┬──────────────╮
│ reverse_A ┆ reverse_fruits ┆ reverse_B ┆ reverse_cars │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ i64 ┆ str │
╞═══════════╪════════════════╪═══════════╪══════════════╡
│ 5 ┆ "banana" ┆ 1 ┆ "beetle" │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 4 ┆ "apple" ┆ 2 ┆ "beetle" │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 3 ┆ "apple" ┆ 3 ┆ "beetle" │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2 ┆ "banana" ┆ 4 ┆ "audi" │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 1 ┆ "banana" ┆ 5 ┆ "beetle" │
╰───────────┴────────────────┴───────────┴──────────────╯
Assign ranks to data, dealing with ties appropriately.
Optional method: RankMethod
Optional descending: boolean
Replace the given values by different values of the same data type.
Value or sequence of values to replace. Accepts expression input. Sequences are parsed as Series, other non-expression inputs are parsed as literals.
Value or sequence of values to replace by.
Accepts expression input. Sequences are parsed as Series, other non-expression inputs are parsed as literals.
Length must match the length of old or have length 1.
Replace a single value by another value. Values that were not replaced remain unchanged.
>>> const df = pl.DataFrame({"a": [1, 2, 2, 3]});
>>> df.withColumns(pl.col("a").replace(2, 100).alias("replaced"));
shape: (4, 2)
┌─────┬──────────┐
│ a ┆ replaced │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪══════════╡
│ 1 ┆ 1 │
│ 2 ┆ 100 │
│ 2 ┆ 100 │
│ 3 ┆ 3 │
└─────┴──────────┘
Replace multiple values by passing sequences to the old and new_ parameters.
>>> df.withColumns(pl.col("a").replace([2, 3], [100, 200]).alias("replaced"));
shape: (4, 2)
┌─────┬──────────┐
│ a ┆ replaced │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪══════════╡
│ 1 ┆ 1 │
│ 2 ┆ 100 │
│ 2 ┆ 100 │
│ 3 ┆ 200 │
└─────┴──────────┘
Passing a mapping with replacements is also supported as syntactic sugar.
>>> const mapping = {2: 100, 3: 200};
>>> df.withColumns(pl.col("a").replace({ old: mapping }).alias("replaced"));
shape: (4, 2)
┌─────┬──────────┐
│ a ┆ replaced │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪══════════╡
│ 1 ┆ 1 │
│ 2 ┆ 100 │
│ 2 ┆ 100 │
│ 3 ┆ 200 │
└─────┴──────────┘
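The old/new pairing rule (new must match the length of old, or have length 1) can be modelled in plain TypeScript. The helper name replaceValues and its array-only signature are hypothetical; the real method also accepts expressions and mappings.

```typescript
// Hypothetical helper, not part of polars: replaces each value found in oldValues
// with its counterpart in newValues; unmatched values are kept unchanged.
function replaceValues(values: number[], oldValues: number[], newValues: number[]): number[] {
  if (newValues.length !== oldValues.length && newValues.length !== 1) {
    throw new RangeError("newValues must match the length of oldValues or have length 1");
  }
  const lookup = new Map<number, number>();
  oldValues.forEach((oldValue, i) => {
    // A single replacement value is reused for every old value.
    lookup.set(oldValue, newValues.length === 1 ? newValues[0] : newValues[i]);
  });
  return values.map((v) => lookup.get(v) ?? v);
}

console.log(replaceValues([1, 2, 2, 3], [2], [100])); // [1, 100, 100, 3]
console.log(replaceValues([1, 2, 2, 3], [2, 3], [100, 200])); // [1, 100, 100, 200]
console.log(replaceValues([1, 2, 2, 3], [2, 3], [0])); // [1, 0, 0, 0]
```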
Replace values by different values.
Value or sequence of values to replace. Accepts expression input. Sequences are parsed as Series, other non-expression inputs are parsed as literals.
Value or sequence of values to replace by.
Accepts expression input. Sequences are parsed as Series, other non-expression inputs are parsed as literals.
Length must match the length of old or have length 1.
Optional default_: string | number | pl.Expr | (string | number)[]
Set values that were not replaced to this value. Defaults to keeping the original value. Accepts expression input. Non-expression inputs are parsed as literals.
Optional returnDtype: DataType
The data type of the resulting expression. If not set (default), the data type is determined automatically based on the other inputs.
Replace a single value by another value. Values that were not replaced remain unchanged.
>>> const df = pl.DataFrame({"a": [1, 2, 2, 3]});
>>> df.withColumns(pl.col("a").replace(2, 100).alias("replaced"));
shape: (4, 2)
┌─────┬──────────┐
│ a ┆ replaced │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪══════════╡
│ 1 ┆ 1 │
│ 2 ┆ 100 │
│ 2 ┆ 100 │
│ 3 ┆ 3 │
└─────┴──────────┘
Replace multiple values by passing sequences to the old and new_ parameters.
>>> df.withColumns(pl.col("a").replace([2, 3], [100, 200]).alias("replaced"));
shape: (4, 2)
┌─────┬──────────┐
│ a ┆ replaced │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪══════════╡
│ 1 ┆ 1 │
│ 2 ┆ 100 │
│ 2 ┆ 100 │
│ 3 ┆ 200 │
└─────┴──────────┘
Passing a mapping with replacements is also supported as syntactic sugar. Specify a default to set all values that were not matched.
>>> const mapping = {2: 100, 3: 200};
>>> df.withColumns(pl.col("a").replaceStrict({ old: mapping, default_: -1, returnDtype: pl.Int64 }).alias("replaced"));
shape: (4, 2)
┌─────┬──────────┐
│ a ┆ replaced │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪══════════╡
│ 1 ┆ -1 │
│ 2 ┆ 100 │
│ 2 ┆ 100 │
│ 3 ┆ 200 │
└─────┴──────────┘
Replacing by values of a different data type sets the return type based on a combination of the new_ data type and either the original data type or the default data type if it was set.
>>> const df = pl.DataFrame({"a": ["x", "y", "z"]});
>>> const mapping = {"x": 1, "y": 2, "z": 3};
>>> df.withColumns(pl.col("a").replaceStrict({ old: mapping }).alias("replaced"));
shape: (3, 2)
┌─────┬──────────┐
│ a ┆ replaced │
│ --- ┆ --- │
│ str ┆ str │
╞═════╪══════════╡
│ x ┆ 1 │
│ y ┆ 2 │
│ z ┆ 3 │
└─────┴──────────┘
>>> df.withColumns(pl.col("a").replaceStrict({ old: mapping, default_: null }).alias("replaced"));
shape: (3, 2)
┌─────┬──────────┐
│ a ┆ replaced │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪══════════╡
│ x ┆ 1 │
│ y ┆ 2 │
│ z ┆ 3 │
└─────┴──────────┘
Set the returnDtype parameter to control the resulting data type directly.
>>> df.withColumns(pl.col("a").replaceStrict({ old: mapping, returnDtype: pl.UInt8 }).alias("replaced"));
shape: (3, 2)
┌─────┬──────────┐
│ a ┆ replaced │
│ --- ┆ --- │
│ str ┆ u8 │
╞═════╪══════════╡
│ x ┆ 1 │
│ y ┆ 2 │
│ z ┆ 3 │
└─────┴──────────┘
Expression input is supported for all parameters.
>>> const df = pl.DataFrame({"a": [1, 2, 2, 3], "b": [1.5, 2.5, 5.0, 1.0]});
>>> df.withColumns(
... pl.col("a").replaceStrict({
... old: pl.col("a").max(),
... new_: pl.col("b").sum(),
... default_: pl.col("b"),
... }).alias("replaced")
... );
shape: (4, 3)
┌─────┬─────┬──────────┐
│ a ┆ b ┆ replaced │
│ --- ┆ --- ┆ --- │
│ i64 ┆ f64 ┆ f64 │
╞═════╪═════╪══════════╡
│ 1 ┆ 1.5 ┆ 1.5 │
│ 2 ┆ 2.5 ┆ 2.5 │
│ 2 ┆ 5.0 ┆ 5.0 │
│ 3 ┆ 1.0 ┆ 10.0 │
└─────┴─────┴──────────┘
Compute the sample skewness of a data set. For normally distributed data, the skewness should be about zero. For unimodal continuous distributions, a skewness value greater than zero means that there is more weight in the right tail of the distribution.
Optional bias: boolean
If false, then the calculations are corrected for statistical bias.
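A minimal sketch of the computation, assuming the Fisher-Pearson definition of sample skewness (third central moment divided by the second central moment raised to 3/2) and the usual sqrt(n(n-1))/(n-2) correction when bias is false. That choice of formula is an assumption, since this page does not state it, and the helper name sampleSkewness is hypothetical.

```typescript
// Hypothetical helper, not part of polars. Assumes the Fisher-Pearson coefficient:
//   g1 = m3 / m2^(3/2), where m_k is the k-th central moment (divided by n).
function sampleSkewness(values: number[], bias: boolean = true): number {
  const n = values.length;
  if (n < 3) {
    throw new RangeError("at least three values are required");
  }
  const mean = values.reduce((acc, v) => acc + v, 0) / n;
  const m2 = values.reduce((acc, v) => acc + Math.pow(v - mean, 2), 0) / n;
  const m3 = values.reduce((acc, v) => acc + Math.pow(v - mean, 3), 0) / n;
  const g1 = m3 / Math.pow(m2, 1.5);
  // With bias = false, apply the small-sample correction factor.
  return bias ? g1 : (Math.sqrt(n * (n - 1)) / (n - 2)) * g1;
}

console.log(sampleSkewness([1, 2, 3, 4, 5])); // 0 (symmetric data)
console.log(sampleSkewness([1, 1, 1, 10]) > 0); // true (more weight in the right tail)
```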
Sort this column by the ordering of another column, or multiple other columns. In projection/selection context the whole column is sorted. If used in a groupby context, the groups are sorted.
The column(s) used for sorting.
Optional descending: boolean | boolean[]
false -> order from small to large. true -> order from large to small.
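Sorting one column by the ordering of another amounts to an argsort of the key column followed by a gather. The plain-TypeScript sketch below handles a single numeric key; the helper name sortByKeys is hypothetical and not part of the polars API.

```typescript
// Hypothetical helper, not part of polars: reorders values by the ordering of keys.
function sortByKeys<T>(values: T[], keys: number[], descending: boolean = false): T[] {
  if (values.length !== keys.length) {
    throw new RangeError("values and keys must have the same length");
  }
  // Argsort: row indexes ordered by their key.
  const order = keys.map((_, row) => row);
  order.sort((a, b) => (descending ? keys[b] - keys[a] : keys[a] - keys[b]));
  // Gather: pick the values in that row order.
  return order.map((row) => values[row]);
}

console.log(sortByKeys(["a", "b", "c"], [3, 1, 2])); // ["b", "c", "a"]
console.log(sortByKeys(["a", "b", "c"], [3, 1, 2], true)); // ["a", "c", "b"]
```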
Compatibility with JSON.stringify.
Returns a string representation of an object.
Apply a rolling max (moving max) over the values in this Series.
A window of length windowSize will traverse the series. The values that fill this window will (optionally) be multiplied with the weights given by the weights vector.
The resulting values will be aggregated to their max.
Apply a rolling mean (moving mean) over the values in this Series.
A window of length windowSize will traverse the series. The values that fill this window will (optionally) be multiplied with the weights given by the weights vector.
The resulting values will be aggregated to their mean.
Apply a rolling min (moving min) over the values in this Series.
A window of length windowSize will traverse the series. The values that fill this window will (optionally) be multiplied with the weights given by the weights vector.
The resulting values will be aggregated to their min.
Optional interpolation: InterpolationMethod
Optional windowSize: number
Optional weights: number[]
Optional minPeriods: number[]
Optional center: boolean
Optional by: string
Optional closed: ClosedWindow
Compute a rolling skew
options for rolling mean operations
Compute a rolling standard deviation.
A window of length windowSize will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weights vector.
The resulting values will be aggregated to their standard deviation.
Apply a rolling sum (moving sum) over the values in this Series.
A window of length windowSize will traverse the series. The values that fill this window will (optionally) be multiplied with the weights given by the weights vector.
The resulting values will be aggregated to their sum.
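The window traversal described above can be sketched in plain TypeScript. The sketch assumes a right-aligned window (center not set) and emits null until a full window is available, which corresponds to minPeriods being equal to windowSize. The helper name rollingSum is hypothetical and this is not the polars implementation.

```typescript
// Hypothetical helper, not part of polars: rolling sum with optional per-position weights.
// Assumes a right-aligned window and null output until the window is full.
function rollingSum(values: number[], windowSize: number, weights?: number[]): (number | null)[] {
  if (windowSize < 1) {
    throw new RangeError("windowSize must be at least 1");
  }
  if (weights !== undefined && weights.length !== windowSize) {
    throw new RangeError("weights must have exactly windowSize entries");
  }
  return values.map((_, end) => {
    const start = end - windowSize + 1;
    if (start < 0) {
      return null; // window not yet full
    }
    let total = 0;
    for (let k = 0; k < windowSize; k++) {
      // Multiply each value in the window by its weight (1 when no weights are given).
      const weight = weights === undefined ? 1 : weights[k];
      total += values[start + k] * weight;
    }
    return total;
  });
}

console.log(rollingSum([1, 2, 3, 4, 5], 3)); // [null, null, 6, 9, 12]
console.log(rollingSum([1, 2, 3, 4, 5], 2, [0.5, 2])); // [null, 4.5, 7, 9.5, 12]
```

Swapping the accumulation step for a max, min or mean would give the other rolling aggregations under the same assumptions.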
Compute a rolling variance.
A window of length windowSize will traverse the series. The values that fill this window will (optionally) be multiplied with the weights given by the weights vector.
The resulting values will be aggregated to their variance.
Expressions that can be used in various contexts.