# Series

class polars.Series(name: str | ArrayLike | None = None, values: ArrayLike | None = None, dtype: PolarsDataType | None = None, strict: bool = True, nan_to_null: bool = False, dtype_if_empty: PolarsDataType | None = None)[source]

A Series represents a single column in a polars DataFrame.

Parameters:
name : str, default None

Name of the series. Will be used as a column name when used in a DataFrame. When not specified, name is set to an empty string.

values : ArrayLike, default None

One-dimensional data in various forms. Supported are: Sequence, Series, pyarrow Array, and numpy ndarray.

dtype : DataType, default None

Polars dtype of the Series data. If not specified, the dtype is inferred.

strict

Throw error on numeric overflow.

nan_to_null

In case a numpy array is used to create this Series, indicate how to deal with np.nan values. (This parameter is a no-op on non-numpy data).

dtype_if_empty : DataType, default None

If no dtype is specified and values contains None or an empty list, set the Polars dtype of the Series data. If not specified, Float32 is used.

Examples

Constructing a Series by specifying name and values positionally:

>>> s = pl.Series("a", [1, 2, 3])
>>> s
shape: (3,)
Series: 'a' [i64]
[
1
2
3
]


Notice that the dtype is automatically inferred as a polars Int64:

>>> s.dtype
Int64


Constructing a Series with a specific dtype:

>>> s2 = pl.Series("a", [1, 2, 3], dtype=pl.Float32)
>>> s2
shape: (3,)
Series: 'a' [f32]
[
1.0
2.0
3.0
]


It is possible to construct a Series with values as the first positional argument. This syntax is considered an anti-pattern, but it can be useful in certain scenarios. You must specify any other arguments through keywords.

>>> s3 = pl.Series([1, 2, 3])
>>> s3
shape: (3,)
Series: '' [i64]
[
1
2
3
]


Methods:

abs: Compute absolute values.
alias: Return a copy of the Series with a new alias/name.
all: Check if all boolean values in the column are True.
any: Check if any boolean value in the column is True.
append: Append a Series to this one.
apply: Apply a custom/user-defined function (UDF) over elements in this Series.
arccos: Compute the element-wise value for the inverse cosine.
arccosh: Compute the element-wise value for the inverse hyperbolic cosine.
arcsin: Compute the element-wise value for the inverse sine.
arcsinh: Compute the element-wise value for the inverse hyperbolic sine.
arctan: Compute the element-wise value for the inverse tangent.
arctanh: Compute the element-wise value for the inverse hyperbolic tangent.
arg_max: Get the index of the maximal value.
arg_min: Get the index of the minimal value.
arg_sort: Get the index values that would sort this Series.
arg_true: Get index values where a Boolean Series evaluates to True.
arg_unique: Get unique index as Series.
argsort: Get the index values that would sort this Series.
cast: Cast between data types.
ceil: Rounds up to the nearest integer value.
chunk_lengths: Get the length of each individual chunk.
clear: Create an empty copy of the current Series, with zero to 'n' elements.
clip: Clip (limit) the values in an array to a min and max boundary.
clip_max: Clip (limit) the values in an array to a max boundary.
clip_min: Clip (limit) the values in an array to a min boundary.
clone: Very cheap deepcopy/clone.
cos: Compute the element-wise value for the cosine.
cosh: Compute the element-wise value for the hyperbolic cosine.
cummax: Get an array with the cumulative max computed at every element.
cummin: Get an array with the cumulative min computed at every element.
cumprod: Get an array with the cumulative product computed at every element.
cumsum: Get an array with the cumulative sum computed at every element.
cumulative_eval: Run an expression over a sliding window that increases 1 slot every iteration.
cut: Bin values into discrete values.
describe: Quick summary statistics of a series.
diff: Calculate the n-th discrete difference.
dot: Compute the dot/inner product between two Series.
drop_nans: Drop NaN values.
drop_nulls: Drop all null values.
entropy: Computes the entropy.
eq: Method equivalent of operator expression series == other.
estimated_size: Return an estimation of the total (heap) allocated size of the Series.
ewm_mean: Exponentially-weighted moving average.
ewm_std: Exponentially-weighted moving standard deviation.
ewm_var: Exponentially-weighted moving variance.
exp: Compute the exponential, element-wise.
explode: Explode a list or utf8 Series.
extend_constant: Extremely fast method for extending the Series with 'n' copies of a value.
fill_nan: Fill floating point NaN value with a fill value.
fill_null: Fill null values using the specified value or strategy.
filter: Filter elements by a boolean mask.
floor: Rounds down to the nearest integer value.
ge: Method equivalent of operator expression series >= other.
get_chunks: Get the chunks of this Series as a list of Series.
gt: Method equivalent of operator expression series > other.
has_validity: Return True if the Series has a validity bitmask.
hash: Hash the Series.
head: Get the first n elements.
hist: Bin values into buckets and count their occurrences.
interpolate: Interpolate intermediate values.
is_between: Get a boolean mask of the values that fall between the given start/end values.
is_boolean: Check if this Series is a Boolean.
is_duplicated: Get mask of all duplicated values.
is_empty: Check if the Series is empty.
is_finite: Returns a boolean Series indicating which values are finite.
is_first: Get a mask of the first unique value.
is_float: Check if this Series has floating point numbers.
is_in: Check if elements of this Series are in the other Series.
is_infinite: Returns a boolean Series indicating which values are infinite.
is_nan: Returns a boolean Series indicating which values are NaN.
is_not_nan: Returns a boolean Series indicating which values are not NaN.
is_not_null: Returns a boolean Series indicating which values are not null.
is_null: Returns a boolean Series indicating which values are null.
is_numeric: Check if this Series datatype is numeric.
is_sorted: Check if the Series is sorted.
is_temporal: Check if this Series datatype is temporal.
is_unique: Get mask of all unique values.
is_utf8: Check if this Series datatype is a Utf8.
item: Return the series as a scalar.
kurtosis: Compute the kurtosis (Fisher or Pearson) of a dataset.
le: Method equivalent of operator expression series <= other.
len: Length of this Series.
limit: Get the first n elements.
log: Compute the logarithm to a given base.
log10: Compute the base 10 logarithm of the input array, element-wise.
lower_bound: Return the lower bound of this Series' dtype as a unit Series.
lt: Method equivalent of operator expression series < other.
map_dict: Replace values in the Series using a remapping dictionary.
max: Get the maximum value in this Series.
mean: Reduce this Series to the mean value.
median: Get the median of this Series.
min: Get the minimal value in this Series.
mode: Compute the most occurring value(s).
n_chunks: Get the number of chunks that this Series contains.
n_unique: Count the number of unique values in this Series.
nan_max: Get maximum value, but propagate/poison encountered NaN values.
nan_min: Get minimum value, but propagate/poison encountered NaN values.
ne: Method equivalent of operator expression series != other.
new_from_index: Create a new Series filled with values from the given index.
null_count: Count the null values in this Series.
pct_change: Computes percentage change between values.
peak_max: Get a boolean mask of the local maximum peaks.
peak_min: Get a boolean mask of the local minimum peaks.
product: Reduce this Series to the product value.
qcut: Bin values into discrete values based on their quantiles.
quantile: Get the quantile value of this Series.
rank: Assign ranks to data, dealing with ties appropriately.
rechunk: Create a single chunk of memory for this Series.
reinterpret: Reinterpret the underlying bits as a signed/unsigned integer.
rename: Rename this Series.
reshape: Reshape this Series to a flat Series or a Series of Lists.
reverse: Return Series in reverse order.
rolling_apply: Apply a custom rolling window function.
rolling_max: Apply a rolling max (moving max) over the values in this array.
rolling_mean: Apply a rolling mean (moving mean) over the values in this array.
rolling_median: Compute a rolling median.
rolling_min: Apply a rolling min (moving min) over the values in this array.
rolling_quantile: Compute a rolling quantile.
rolling_skew: Compute a rolling skew.
rolling_std: Compute a rolling std dev.
rolling_sum: Apply a rolling sum (moving sum) over the values in this array.
rolling_var: Compute a rolling variance.
round: Round underlying floating point data by decimals digits.
sample: Sample from this Series.
search_sorted: Find indices where elements should be inserted to maintain order.
series_equal: Check if series is equal with another Series.
set: Set masked values.
set_at_idx: Set values at the index locations.
set_sorted: Flags the Series as 'sorted'.
shift: Shift the values by a given period.
shift_and_fill: Shift the values by a given period and fill the resulting null values.
shrink_dtype: Shrink numeric columns to the minimal required datatype.
shrink_to_fit: Shrink Series memory usage.
shuffle: Shuffle the contents of this Series.
sign: Compute the element-wise indication of the sign.
sin: Compute the element-wise value for the sine.
sinh: Compute the element-wise value for the hyperbolic sine.
skew: Compute the sample skewness of a data set.
slice: Get a slice of this Series.
sort: Sort this Series.
sqrt: Compute the square root of the elements.
std: Get the standard deviation of this Series.
sum: Reduce this Series to the sum value.
tail: Get the last n elements.
take: Take values by index.
take_every: Take every nth value in the Series and return as new Series.
tan: Compute the element-wise value for the tangent.
tanh: Compute the element-wise value for the hyperbolic tangent.
to_arrow: Get the underlying Arrow Array.
to_dummies: Get dummy/indicator variables.
to_frame: Cast this Series to a DataFrame.
to_list: Convert this Series to a Python List.
to_numpy: Convert this Series to numpy.
to_pandas: Convert this Series to a pandas Series.
to_physical: Cast to physical representation of the logical dtype.
top_k: Return the k largest elements.
unique: Get unique elements in series.
unique_counts: Return a count of the unique values in the order of appearance.
upper_bound: Return the upper bound of this Series' dtype as a unit Series.
value_counts: Count the unique values in a Series.
var: Get variance of this Series.
view: Get a view into this Series data with a numpy array.
zip_with: Take values from self or other based on the given mask.

abs() Series[source]

Compute absolute values.

Same as abs(series).

alias(name: str) Series[source]

Return a copy of the Series with a new alias/name.

Parameters:
name

New name.

Examples

>>> srs = pl.Series("x", [1, 2, 3])
>>> new_aliased_srs = srs.alias("y")

all() bool[source]

Check if all boolean values in the column are True.

Returns:
Boolean literal
any() bool[source]

Check if any boolean value in the column is True.

Returns:
Boolean literal
append(other: Series, append_chunks: bool = True) Series[source]

Append a Series to this one.

Parameters:
other

Series to append.

append_chunks

If set to True the append operation will add the chunks from other to self. This is super cheap.

If set to False the append operation will do the same as DataFrame.extend which extends the memory backed by this Series with the values from other.

Unlike append_chunks, extend appends the data from other to the underlying memory locations and thus may cause a reallocation (which is expensive).

If this does not cause a reallocation, the resulting data structure will not have any extra chunks and thus will yield faster queries.

Prefer extend over append_chunks when you want to do a query after a single append. For instance during online operations where you add n rows and rerun a query.

Prefer append_chunks over extend when you want to append many times before doing a query. For instance when you read in multiple files and want to store them in a single Series. In the latter case, finish the sequence of append_chunks operations with a rechunk.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("b", [4, 5, 6])
>>> s.append(s2)
shape: (6,)
Series: 'a' [i64]
[
1
2
3
4
5
6
]
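
The trade-off between appending chunks and extending can be pictured with a toy model. This is a plain-Python sketch, not the polars implementation: `ToyChunked`, its methods, and the list-of-lists layout are hypothetical stand-ins for a chunked column.

```python
class ToyChunked:
    """Hypothetical stand-in for a chunked column: a list of chunks."""

    def __init__(self, values):
        self.chunks = [list(values)]

    def append_chunks(self, other):
        # Cheap: only references to the other column's chunks are added.
        self.chunks.extend(other.chunks)

    def extend(self, other):
        # Copies the other column's values into the last existing chunk,
        # so the number of chunks does not grow.
        for chunk in other.chunks:
            self.chunks[-1].extend(chunk)

    def rechunk(self):
        # Merge all chunks into one contiguous chunk.
        self.chunks = [[v for chunk in self.chunks for v in chunk]]

    def chunk_lengths(self):
        return [len(chunk) for chunk in self.chunks]


# Many appends, then a single rechunk before querying.
many = ToyChunked([1, 2, 3])
many.append_chunks(ToyChunked([4, 5, 6]))
many.append_chunks(ToyChunked([7, 8]))
lengths_before = many.chunk_lengths()
many.rechunk()
lengths_after = many.chunk_lengths()

# A single append followed by a query: extend keeps one chunk throughout.
single = ToyChunked([1, 2, 3])
single.extend(ToyChunked([4, 5, 6]))
```

In the toy model, every `append_chunks` call leaves one more chunk behind until `rechunk` merges them, while `extend` pays the copy up front.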

apply(function: Callable[[Any], Any], return_dtype: PolarsDataType | None = None, *, skip_nulls: bool = True) Self[source]

Apply a custom/user-defined function (UDF) over elements in this Series.

If the function returns a different datatype, the return_dtype arg should be set, otherwise the method will fail.

Implementing logic using a Python function is almost always _significantly_ slower and more memory intensive than implementing the same logic using the native expression API because:

• The native expression engine runs in Rust; UDFs run in Python.

• Use of Python UDFs forces the DataFrame to be materialized in memory.

• Polars-native expressions can be parallelised (UDFs typically cannot).

• Polars-native expressions can be logically optimised (UDFs cannot).

Wherever possible you should strongly prefer the native expression API to achieve the best performance.

Parameters:
function

Custom function or lambda.

return_dtype

Output datatype. If none is given, the same datatype as this Series will be used.

skip_nulls

Nulls will be skipped and not passed to the python function. This is faster because python can be skipped and because we call more specialized functions.

Returns:
Series

Notes

If your function is expensive and you don’t want it to be called more than once for a given input, consider applying an @lru_cache decorator to it. With suitable data you may achieve order-of-magnitude speedups (or more).

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.apply(lambda x: x + 10)
shape: (3,)
Series: 'a' [i64]
[
11
12
13
]
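
The caching advice in the Notes can be illustrated without polars. In this minimal sketch a plain list stands in for the Series, and `expensive` and `calls` are illustrative names; the same decorated function could presumably be passed to apply, provided the element values are hashable.

```python
from functools import lru_cache

calls = 0  # counts how often the function body actually runs


@lru_cache(maxsize=None)
def expensive(x):
    """Pretend this is costly; repeated inputs are served from the cache."""
    global calls
    calls += 1
    return x * x


values = [3, 1, 3, 3, 1, 2]
result = [expensive(v) for v in values]
```

With six inputs but only three distinct values, the function body runs three times; the speedup therefore depends on how often values repeat.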

arccos() Series[source]

Compute the element-wise value for the inverse cosine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arccos()
shape: (3,)
Series: 'a' [f64]
[
0.0
1.570796
3.141593
]

arccosh() Series[source]

Compute the element-wise value for the inverse hyperbolic cosine.

Examples

>>> s = pl.Series("a", [5.0, 1.0, 0.0, -1.0])
>>> s.arccosh()
shape: (4,)
Series: 'a' [f64]
[
2.292432
0.0
NaN
NaN
]

arcsin() Series[source]

Compute the element-wise value for the inverse sine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arcsin()
shape: (3,)
Series: 'a' [f64]
[
1.570796
0.0
-1.570796
]

arcsinh() Series[source]

Compute the element-wise value for the inverse hyperbolic sine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arcsinh()
shape: (3,)
Series: 'a' [f64]
[
0.881374
0.0
-0.881374
]

arctan() Series[source]

Compute the element-wise value for the inverse tangent.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arctan()
shape: (3,)
Series: 'a' [f64]
[
0.785398
0.0
-0.785398
]

arctanh() Series[source]

Compute the element-wise value for the inverse hyperbolic tangent.

Examples

>>> s = pl.Series("a", [2.0, 1.0, 0.5, 0.0, -0.5, -1.0, -1.1])
>>> s.arctanh()
shape: (7,)
Series: 'a' [f64]
[
NaN
inf
0.549306
0.0
-0.549306
-inf
NaN
]

arg_max() [source]

Get the index of the maximal value.

Returns:
Integer

Examples

>>> s = pl.Series("a", [3, 2, 1])
>>> s.arg_max()
0

arg_min() [source]

Get the index of the minimal value.

Returns:
Integer

Examples

>>> s = pl.Series("a", [3, 2, 1])
>>> s.arg_min()
2

arg_sort(*, descending: bool = False, nulls_last: bool = False) Series[source]

Get the index values that would sort this Series.

Parameters:
descending

Sort in descending order.

nulls_last

Place null values last instead of first.

Examples

>>> s = pl.Series("a", [5, 3, 4, 1, 2])
>>> s.arg_sort()
shape: (5,)
Series: 'a' [u32]
[
3
4
1
2
0
]
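
As a rough illustration of what the returned indices mean, here is a plain-Python sketch over a list, with None standing in for null. The helper is hypothetical, and how it orders ties is an assumption rather than documented polars behavior.

```python
def arg_sort(values, descending=False, nulls_last=False):
    """Return the positions that would sort `values` (None = null)."""
    nulls = [i for i, v in enumerate(values) if v is None]
    valid = [i for i, v in enumerate(values) if v is not None]
    valid.sort(key=lambda i: values[i], reverse=descending)
    # Nulls go first unless nulls_last is set.
    return valid + nulls if nulls_last else nulls + valid


idx = arg_sort([5, 3, 4, 1, 2])
```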

arg_true() Series[source]

Get index values where a Boolean Series evaluates to True.

Returns:
UInt32 Series

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> (s == 2).arg_true()
shape: (1,)
Series: 'a' [u32]
[
1
]

arg_unique() Series[source]

Get unique index as Series.

Returns:
Series

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.arg_unique()
shape: (3,)
Series: 'a' [u32]
[
0
1
3
]

argsort(descending: bool = False, nulls_last: bool = False) Series[source]

Get the index values that would sort this Series.

Alias for Series.arg_sort().

Deprecated since version 0.16.5: Series.argsort will be removed in favour of Series.arg_sort.

Parameters:
descending

Sort in descending order.

nulls_last

Place null values last instead of first.

cast(dtype: PolarsDataType | type[int] | type[float] | type[str] | type[bool], *, strict: bool = True) Self[source]

Cast between data types.

Parameters:
dtype

DataType to cast to.

strict

Throw an error if a cast could not be done, for instance due to an overflow.

Examples

>>> s = pl.Series("a", [True, False, True])
>>> s
shape: (3,)
Series: 'a' [bool]
[
true
false
true
]

>>> s.cast(pl.UInt32)
shape: (3,)
Series: 'a' [u32]
[
1
0
1
]

ceil() Series[source]

Rounds up to the nearest integer value.

Only works on floating point Series.

Examples

>>> s = pl.Series("a", [1.12345, 2.56789, 3.901234])
>>> s.ceil()
shape: (3,)
Series: 'a' [f64]
[
2.0
3.0
4.0
]

chunk_lengths() list[int][source]

Get the length of each individual chunk.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("a", [4, 5, 6])


Concatenate Series with rechunk = True

>>> pl.concat([s, s2]).chunk_lengths()
[6]


Concatenate Series with rechunk = False

>>> pl.concat([s, s2], rechunk=False).chunk_lengths()
[3, 3]

clear(n: int = 0) Series[source]

Create an empty copy of the current Series, with zero to ‘n’ elements.

The copy has an identical name/dtype, but no data.

Parameters:
n

Number of (empty) elements to return in the cleared frame.

See also

clone

Cheap deepcopy/clone.

Examples

>>> s = pl.Series("a", [None, True, False])
>>> s.clear()
shape: (0,)
Series: 'a' [bool]
[
]

>>> s.clear(n=2)
shape: (2,)
Series: 'a' [bool]
[
null
null
]

clip(min_val, max_val) Series[source]

Clip (limit) the values in an array to a min and max boundary.

Only works for numerical types.

If you want to clip other dtypes, consider writing a “when, then, otherwise” expression. See when() for more information.

Parameters:
min_val

Minimum value.

max_val

Maximum value.

Examples

>>> s = pl.Series("foo", [-50, 5, None, 50])
>>> s.clip(1, 10)
shape: (4,)
Series: 'foo' [i64]
[
1
5
null
10
]
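
The clipping rule, including the pass-through of nulls seen in the example above, can be written out in plain Python. This is a sketch over a list with None standing in for null; the helper and its optional bounds are hypothetical and also cover the one-sided clip_min and clip_max cases.

```python
def clip(values, min_val=None, max_val=None):
    """Limit each non-null value to [min_val, max_val]; None stays None."""
    out = []
    for v in values:
        if v is not None:
            if min_val is not None:
                v = max(v, min_val)
            if max_val is not None:
                v = min(v, max_val)
        out.append(v)
    return out


both = clip([-50, 5, None, 50], 1, 10)
only_max = clip([-50, 5, None, 50], max_val=10)
only_min = clip([-50, 5, None, 50], min_val=1)
```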

clip_max(max_val) Series[source]

Clip (limit) the values in an array to a max boundary.

Only works for numerical types.

If you want to clip other dtypes, consider writing a “when, then, otherwise” expression. See when() for more information.

Parameters:
max_val

Maximum value.

clip_min(min_val) Series[source]

Clip (limit) the values in an array to a min boundary.

Only works for numerical types.

If you want to clip other dtypes, consider writing a “when, then, otherwise” expression. See when() for more information.

Parameters:
min_val

Minimum value.

clone() Self[source]

Very cheap deepcopy/clone.

See also

clear

Create an empty copy of the current Series, with identical schema but no data.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.clone()
shape: (3,)
Series: 'a' [i64]
[
1
2
3
]

cos() Series[source]

Compute the element-wise value for the cosine.

Examples

>>> import math
>>> s = pl.Series("a", [0.0, math.pi / 2.0, math.pi])
>>> s.cos()
shape: (3,)
Series: 'a' [f64]
[
1.0
6.1232e-17
-1.0
]

cosh() Series[source]

Compute the element-wise value for the hyperbolic cosine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.cosh()
shape: (3,)
Series: 'a' [f64]
[
1.543081
1.0
1.543081
]

cummax(reverse: bool = False) Series[source]

Get an array with the cumulative max computed at every element.

Parameters:
reverse

Reverse the operation.

Examples

>>> s = pl.Series("s", [3, 5, 1])
>>> s.cummax()
shape: (3,)
Series: 's' [i64]
[
3
5
5
]

cummin(reverse: bool = False) Series[source]

Get an array with the cumulative min computed at every element.

Parameters:
reverse

Reverse the operation.

Examples

>>> s = pl.Series("s", [1, 2, 3])
>>> s.cummin()
shape: (3,)
Series: 's' [i64]
[
1
1
1
]

cumprod(reverse: bool = False) Series[source]

Get an array with the cumulative product computed at every element.

Parameters:
reverse

Reverse the operation.

Notes

Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before multiplying to prevent overflow issues.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.cumprod()
shape: (3,)
Series: 'a' [i64]
[
1
2
6
]

cumsum(reverse: bool = False) Series[source]

Get an array with the cumulative sum computed at every element.

Parameters:
reverse

Reverse the operation.

Notes

Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before summing to prevent overflow issues.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.cumsum()
shape: (3,)
Series: 'a' [i64]
[
1
3
6
]
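
The cumulative methods (cummax, cummin, cumprod, cumsum) share one pattern, sketched here in plain Python with `itertools.accumulate`. The helper name is hypothetical, and the sketch assumes that reverse=True accumulates from the last element toward the first while keeping each result at its original position.

```python
import operator
from itertools import accumulate


def cumulative(values, func, reverse=False):
    """Running reduction of `values` with the binary function `func`."""
    if reverse:
        # Accumulate from the end, then restore the original ordering.
        return list(accumulate(reversed(values), func))[::-1]
    return list(accumulate(values, func))


cumsum = cumulative([1, 2, 3], operator.add)
cumsum_rev = cumulative([1, 2, 3], operator.add, reverse=True)
cumprod = cumulative([1, 2, 3], operator.mul)
cummax = cumulative([3, 5, 1], max)
cummin = cumulative([1, 2, 3], min)
```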

cumulative_eval(expr: Expr, min_periods: int = 1, parallel: bool = False) Series[source]

Run an expression over a sliding window that increases 1 slot every iteration.

Parameters:
expr

Expression to evaluate

min_periods

Number of valid values there should be in the window before the expression is evaluated, where valid values = length - null_count.

parallel

Run in parallel. Don’t do this in a groupby or another operation that is already heavily parallelized.

Warning

This functionality is experimental and may change without it being considered a breaking change.

This can be really slow as it can have O(n^2) complexity. Don’t use this for operations that visit all elements.

Examples

>>> s = pl.Series("values", [1, 2, 3, 4, 5])
>>> s.cumulative_eval(pl.element().first() - pl.element().last() ** 2)
shape: (5,)
Series: 'values' [f64]
[
0.0
-3.0
-8.0
-15.0
-24.0
]
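
To make the growing window concrete, the following plain-Python sketch re-evaluates a function on every prefix of a list. The helper is hypothetical and takes an ordinary callable where polars takes an expression; it reproduces the values of the example above and also shows where the O(n^2) cost comes from, since each step looks at the whole prefix again.

```python
def cumulative_eval(values, func, min_periods=1):
    """Apply `func` to the windows values[:1], values[:2], ..., values[:n]."""
    out = []
    for end in range(1, len(values) + 1):
        window = values[:end]
        valid = [v for v in window if v is not None]
        # Emit None until the window holds enough non-null values.
        out.append(func(window) if len(valid) >= min_periods else None)
    return out


# first - last ** 2, evaluated on each growing window
result = cumulative_eval([1, 2, 3, 4, 5], lambda w: w[0] - w[-1] ** 2)
```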

cut(bins, labels: list[str] | None = None, break_point_label: str = 'break_point', category_label: str = 'category', maintain_order: bool = False) DataFrame[source]

Bin values into discrete values.

Parameters:
bins

Bins to create.

labels

Labels to assign to the bins. If given, the length of labels must be len(bins) + 1.

break_point_label

Name given to the breakpoint column.

category_label

Name given to the category column.

maintain_order

Keep the order of the original Series.

Returns:
DataFrame

Examples

>>> a = pl.Series("a", [v / 10 for v in range(-30, 30, 5)])
>>> a.cut(bins=[-1, 1])
shape: (12, 3)
┌──────┬─────────────┬──────────────┐
│ a    ┆ break_point ┆ category     │
│ ---  ┆ ---         ┆ ---          │
│ f64  ┆ f64         ┆ cat          │
╞══════╪═════════════╪══════════════╡
│ -3.0 ┆ -1.0        ┆ (-inf, -1.0] │
│ -2.5 ┆ -1.0        ┆ (-inf, -1.0] │
│ -2.0 ┆ -1.0        ┆ (-inf, -1.0] │
│ -1.5 ┆ -1.0        ┆ (-inf, -1.0] │
│ …    ┆ …           ┆ …            │
│ 1.0  ┆ 1.0         ┆ (-1.0, 1.0]  │
│ 1.5  ┆ inf         ┆ (1.0, inf]   │
│ 2.0  ┆ inf         ┆ (1.0, inf]   │
│ 2.5  ┆ inf         ┆ (1.0, inf]   │
└──────┴─────────────┴──────────────┘
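
The break points and right-closed intervals in the table above can be reproduced with a binary search. This is a plain-Python sketch using `bisect`, not the polars implementation; the helper name and the tuple layout of the rows are assumptions.

```python
import math
from bisect import bisect_left


def cut(values, bins):
    """Return (value, break_point, category) rows for right-closed bins."""
    edges = sorted(float(b) for b in bins)
    rows = []
    for v in values:
        # Index of the first edge that is >= v, i.e. the bin's upper bound.
        i = bisect_left(edges, v)
        lower = edges[i - 1] if i > 0 else -math.inf
        upper = edges[i] if i < len(edges) else math.inf
        rows.append((v, upper, f"({lower}, {upper}]"))
    return rows


rows = cut([v / 10 for v in range(-30, 30, 5)], bins=[-1, 1])
```

Two bins produce three categories, which is why labels, when given, need len(bins) + 1 entries.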

describe() DataFrame[source]

Quick summary statistics of a series.

Series with mixed datatypes will return summary statistics for the datatype of the first value.

Returns:
DataFrame with summary statistics of a Series.

Examples

>>> series_num = pl.Series([1, 2, 3, 4, 5])
>>> series_num.describe()
shape: (6, 2)
┌────────────┬──────────┐
│ statistic  ┆ value    │
│ ---        ┆ ---      │
│ str        ┆ f64      │
╞════════════╪══════════╡
│ min        ┆ 1.0      │
│ max        ┆ 5.0      │
│ null_count ┆ 0.0      │
│ mean       ┆ 3.0      │
│ std        ┆ 1.581139 │
│ count      ┆ 5.0      │
└────────────┴──────────┘

>>> series_str = pl.Series(["a", "a", None, "b", "c"])
>>> series_str.describe()
shape: (3, 2)
┌────────────┬───────┐
│ statistic  ┆ value │
│ ---        ┆ ---   │
│ str        ┆ i64   │
╞════════════╪═══════╡
│ unique     ┆ 4     │
│ null_count ┆ 1     │
│ count      ┆ 5     │
└────────────┴───────┘

diff(n: int = 1, null_behavior: NullBehavior = 'ignore') Series[source]

Calculate the n-th discrete difference.

Parameters:
n

Number of slots to shift.

null_behavior : {‘ignore’, ‘drop’}

How to handle null values.

Examples

>>> s = pl.Series("s", values=[20, 10, 30, 25, 35], dtype=pl.Int8)
>>> s.diff()
shape: (5,)
Series: 's' [i8]
[
null
-10
20
-5
10
]

>>> s.diff(n=2)
shape: (5,)
Series: 's' [i8]
[
null
null
10
15
5
]

>>> s.diff(n=2, null_behavior="drop")
shape: (3,)
Series: 's' [i8]
[
10
15
5
]
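
The three calls above follow a simple rule that can be written in plain Python. The helper is a hypothetical sketch over a list with None standing in for null, and it assumes n >= 1 and that 'drop' removes the n leading positions that have no predecessor.

```python
def diff(values, n=1, null_behavior="ignore"):
    """n-th discrete difference: values[i] - values[i - n]."""
    out = []
    for i, v in enumerate(values):
        prev = values[i - n] if i >= n else None
        out.append(None if v is None or prev is None else v - prev)
    # 'ignore' keeps the leading nulls, 'drop' removes them.
    return out[n:] if null_behavior == "drop" else out


data = [20, 10, 30, 25, 35]
d1 = diff(data)
d2 = diff(data, n=2)
d2_drop = diff(data, n=2, null_behavior="drop")
```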

dot(other: Union[Series, Sequence[Any], Array, ChunkedArray, ndarray, Series, DatetimeIndex]) [source]

Compute the dot/inner product between two Series.

Parameters:
other

Series (or array) to compute dot product with.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("b", [4.0, 5.0, 6.0])
>>> s.dot(s2)
32.0

drop_nans() Series[source]

Drop NaN values.

drop_nulls() Series[source]

Drop all null values.

Creates a new Series that copies data from this Series without null values.

entropy(base: float = 2.718281828459045, normalize: bool = False) [source]

Computes the entropy.

Uses the formula -sum(pk * log(pk)) where pk are discrete probabilities.

Parameters:
base

Given base, defaults to e

normalize

Normalize pk if it doesn’t sum to 1.

Examples

>>> a = pl.Series([0.99, 0.005, 0.005])
>>> a.entropy(normalize=True)
0.06293300616044681
>>> b = pl.Series([0.65, 0.10, 0.25])
>>> b.entropy(normalize=True)
0.8568409950394724
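
The formula can be checked by hand with the standard library. This is a minimal sketch, not the polars implementation; skipping zero probabilities (treating 0 * log(0) as 0) is an assumption of the sketch.

```python
import math


def entropy(pk, base=math.e, normalize=False):
    """Compute -sum(pk * log(pk)) in the given base."""
    if normalize:
        total = sum(pk)
        pk = [p / total for p in pk]
    # Assumption: zero probabilities contribute nothing to the sum.
    return -sum(p * math.log(p, base) for p in pk if p > 0)


h_a = entropy([0.99, 0.005, 0.005], normalize=True)
h_b = entropy([0.65, 0.10, 0.25], normalize=True)
```

Both values agree with the example outputs above up to floating-point rounding.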

eq(other: Any) Self[source]

Method equivalent of operator expression series == other.

estimated_size(unit: SizeUnit = 'b') [source]

Return an estimation of the total (heap) allocated size of the Series.

Estimated size is given in the specified unit (bytes by default).

This estimation is the sum of the size of its buffers and validity, including nested arrays. Multiple arrays may share buffers and bitmaps. Therefore, the size of 2 arrays is not the sum of the sizes computed from this function. In particular, a StructArray’s size is an upper bound.

When an array is sliced, its allocated size remains constant because the buffer is unchanged. However, this function will yield a smaller number. This is because this function returns the visible size of the buffer, not its total capacity.

FFI buffers are included in this estimation.

Parameters:
unit : {‘b’, ‘kb’, ‘mb’, ‘gb’, ‘tb’}

Scale the returned size to the given unit.

Examples

>>> s = pl.Series("values", list(range(1_000_000)), dtype=pl.UInt32)
>>> s.estimated_size()
4000000
>>> s.estimated_size("mb")
3.814697265625
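
The two numbers in the example are related by plain arithmetic: one million UInt32 values at 4 bytes each, then a division by 1024 squared. The sketch below assumes binary multiples for every unit, which is consistent with the example output; `UNIT_FACTORS` and `scale_size` are hypothetical names.

```python
# Assumption: each unit is a power of 1024 bytes.
UNIT_FACTORS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def scale_size(n_bytes, unit="b"):
    """Scale a byte count to the requested unit."""
    return n_bytes if unit == "b" else n_bytes / UNIT_FACTORS[unit]


n_bytes = 1_000_000 * 4  # one million UInt32 values, 4 bytes each
size_b = scale_size(n_bytes)
size_mb = scale_size(n_bytes, "mb")
```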

ewm_mean(com=None, span=None, half_life=None, alpha=None, adjust: bool = True, min_periods: int = 1, ignore_nulls: bool = True) Series[source]

Exponentially-weighted moving average.

Parameters:
com

Specify decay in terms of center of mass, $$\gamma$$, with

$\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0$
span

Specify decay in terms of span, $$\theta$$, with

$\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1$
half_life

Specify decay in terms of half-life, $$\lambda$$, with

$\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \lambda } \right\} \; \forall \; \lambda > 0$
alpha

Specify smoothing factor alpha directly, $$0 < \alpha \leq 1$$.

adjust

Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings.

• When adjust=True the EW function is calculated using weights $$w_i = (1 - \alpha)^i$$

• When adjust=False the EW function is calculated recursively by

$\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}$
min_periods

Minimum number of observations in window required to have a value (otherwise result is null).

ignore_nulls

Ignore missing values when calculating weights.

• When ignore_nulls=False, weights are based on absolute positions. For example, the weights of $$x_0$$ and $$x_2$$ used in calculating the final weighted average of [$$x_0$$, None, $$x_2$$] are $$(1-\alpha)^2$$ and $$1$$ if adjust=True, and $$(1-\alpha)^2$$ and $$\alpha$$ if adjust=False.

• When ignore_nulls=True (default), weights are based on relative positions. For example, the weights of $$x_0$$ and $$x_2$$ used in calculating the final weighted average of [$$x_0$$, None, $$x_2$$] are $$1-\alpha$$ and $$1$$ if adjust=True, and $$1-\alpha$$ and $$\alpha$$ if adjust=False.
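
The decay conversions and the two weighting schemes can be followed step by step in plain Python. This is a sketch under simplifying assumptions (no nulls, min_periods ignored), not the polars implementation; `alpha_from` and this `ewm_mean` are hypothetical helpers.

```python
import math


def alpha_from(com=None, span=None, half_life=None, alpha=None):
    """Convert one of the decay parameters to the smoothing factor alpha."""
    if com is not None:
        return 1.0 / (1.0 + com)
    if span is not None:
        return 2.0 / (span + 1.0)
    if half_life is not None:
        return 1.0 - math.exp(-math.log(2.0) / half_life)
    return alpha


def ewm_mean(values, alpha, adjust=True):
    """Exponentially-weighted mean of a null-free list."""
    out = []
    if adjust:
        # Weighted average with weights w_i = (1 - alpha)^i, built up
        # incrementally as a running numerator and denominator.
        num = den = 0.0
        for x in values:
            num = x + (1.0 - alpha) * num
            den = 1.0 + (1.0 - alpha) * den
            out.append(num / den)
    else:
        # Recursive form: y_0 = x_0, y_t = (1 - alpha) y_{t-1} + alpha x_t.
        y = None
        for x in values:
            y = x if y is None else (1.0 - alpha) * y + alpha * x
            out.append(y)
    return out


a = alpha_from(com=1)
adjusted = ewm_mean([1, 2, 3], a)
recursive = ewm_mean([1, 2, 3], a, adjust=False)
```

For [1, 2, 3] with com=1 the last adjusted value is (3 + 0.5 * 2 + 0.25 * 1) / 1.75, while the recursive form ends at 2.25.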

ewm_std(com=None, span=None, half_life=None, alpha=None, adjust: bool = True, bias: bool = False, min_periods: int = 1, ignore_nulls: bool = True) Series[source]

Exponentially-weighted moving standard deviation.

Parameters:
com

Specify decay in terms of center of mass, $$\gamma$$, with

$\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0$
span

Specify decay in terms of span, $$\theta$$, with

$\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1$
half_life

Specify decay in terms of half-life, $$\lambda$$, with

$\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \lambda } \right\} \; \forall \; \lambda > 0$
alpha

Specify smoothing factor alpha directly, $$0 < \alpha \leq 1$$.

adjust

Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings.

• When adjust=True the EW function is calculated using weights $$w_i = (1 - \alpha)^i$$

• When adjust=False the EW function is calculated recursively by

$\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}$
bias

When bias=False, apply a correction to make the estimate statistically unbiased.

min_periods

Minimum number of observations in window required to have a value (otherwise result is null).

ignore_nulls

Ignore missing values when calculating weights.

• When ignore_nulls=False, weights are based on absolute positions. For example, the weights of $$x_0$$ and $$x_2$$ used in calculating the final weighted average of [$$x_0$$, None, $$x_2$$] are $$(1-\alpha)^2$$ and $$1$$ if adjust=True, and $$(1-\alpha)^2$$ and $$\alpha$$ if adjust=False.

• When ignore_nulls=True (default), weights are based on relative positions. For example, the weights of $$x_0$$ and $$x_2$$ used in calculating the final weighted average of [$$x_0$$, None, $$x_2$$] are $$1-\alpha$$ and $$1$$ if adjust=True, and $$1-\alpha$$ and $$\alpha$$ if adjust=False.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.ewm_std(com=1)
shape: (3,)
Series: 'a' [f64]
[
0.0
0.707107
0.963624
]

ewm_var(com=None, span=None, half_life=None, alpha=None, adjust: bool = True, bias: bool = False, min_periods: int = 1, ignore_nulls: bool = True) Series[source]

Exponentially-weighted moving variance.

Parameters:
com

Specify decay in terms of center of mass, $$\gamma$$, with

$\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0$
span

Specify decay in terms of span, $$\theta$$, with

$\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1$
half_life

Specify decay in terms of half-life, $$\lambda$$, with

$\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \lambda } \right\} \; \forall \; \lambda > 0$
alpha

Specify smoothing factor alpha directly, $$0 < \alpha \leq 1$$.

adjust

Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings.

• When adjust=True the EW function is calculated using weights $$w_i = (1 - \alpha)^i$$

• When adjust=False the EW function is calculated recursively by

$\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}$
bias

When bias=False, apply a correction to make the estimate statistically unbiased.

min_periods

Minimum number of observations in window required to have a value (otherwise result is null).

ignore_nulls

Ignore missing values when calculating weights.

• When ignore_nulls=False, weights are based on absolute positions. For example, the weights of $$x_0$$ and $$x_2$$ used in calculating the final weighted average of [$$x_0$$, None, $$x_2$$] are $$(1-\alpha)^2$$ and $$1$$ if adjust=True, and $$(1-\alpha)^2$$ and $$\alpha$$ if adjust=False.

• When ignore_nulls=True (default), weights are based on relative positions. For example, the weights of $$x_0$$ and $$x_2$$ used in calculating the final weighted average of [$$x_0$$, None, $$x_2$$] are $$1-\alpha$$ and $$1$$ if adjust=True, and $$1-\alpha$$ and $$\alpha$$ if adjust=False.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.ewm_var(com=1)
shape: (3,)
Series: 'a' [f64]
[
0.0
0.5
0.928571
]
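
As a rough cross-check of the formulas above, the following plain-Python sketch converts a decay specification to alpha and computes an adjusted, bias-corrected exponentially-weighted variance for data without nulls. The helper names `alpha_from` and `ewm_var_sketch` are hypothetical, and the choice to return 0.0 when the bias correction is undefined is an assumption taken from the first value of the example output.

```python
import math


def alpha_from(com=None, span=None, half_life=None, alpha=None):
    """Convert one decay specification into the smoothing factor alpha."""
    if com is not None:
        return 1.0 / (1.0 + com)
    if span is not None:
        return 2.0 / (span + 1.0)
    if half_life is not None:
        return 1.0 - math.exp(-math.log(2.0) / half_life)
    return alpha


def ewm_var_sketch(xs, alpha):
    """Adjusted (adjust=True), bias-corrected (bias=False) EW variance."""
    out = []
    for t in range(len(xs)):
        # weights w_i = (1 - alpha)^i, with i = 0 for the most recent value
        w = [(1.0 - alpha) ** (t - i) for i in range(t + 1)]
        sw = sum(w)
        mean = sum(wi * x for wi, x in zip(w, xs)) / sw
        biased = sum(wi * (x - mean) ** 2 for wi, x in zip(w, xs)) / sw
        denom = sw * sw - sum(wi * wi for wi in w)
        # assumption: report 0.0 when the correction factor is undefined
        out.append(biased * sw * sw / denom if denom > 0 else 0.0)
    return out


result = ewm_var_sketch([1, 2, 3], alpha_from(com=1))
print([round(v, 6) for v in result])
```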

exp() Series[source]

Compute the exponential, element-wise.

explode() Series[source]

Explode a list or utf8 Series.

This means that every item is expanded to a new row.

Deprecated since version 0.15.16: Series.explode will be removed in favour of Series.arr.explode and Series.str.explode.

Returns:
Exploded Series of same dtype

See also

ListNameSpace.explode

Explode a list column.

StringNameSpace.explode

Explode a string column.

extend_constant(value: PythonLiteral | None, n: int) Series[source]

Extremely fast method for extending the Series with ‘n’ copies of a value.

Parameters:
value

A constant literal value (not an expression) with which to extend the Series; can pass None to extend with nulls.

n

Number of copies of value to append.

Examples

>>> s = pl.Series([1, 2, 3])
>>> s.extend_constant(99, n=2)
shape: (5,)
Series: '' [i64]
[
1
2
3
99
99
]

fill_nan(fill_value: int | float | Expr | None) Series[source]

Fill floating point NaN values with a fill value.

Parameters:
fill_value

Value used to fill nan values.

Examples

>>> s = pl.Series("a", [1, 2, 3, float("nan")])
>>> s.fill_nan(0)
shape: (4,)
Series: 'a' [f64]
[
1.0
2.0
3.0
0.0
]

fill_null(value: Any | None = None, strategy: FillNullStrategy | None = None, limit: = None) Series[source]

Fill null values using the specified value or strategy.

Parameters:
value

Value used to fill null values.

strategy{None, ‘forward’, ‘backward’, ‘min’, ‘max’, ‘mean’, ‘zero’, ‘one’}

Strategy used to fill null values.

limit

Number of consecutive null values to fill when using the ‘forward’ or ‘backward’ strategy.

Examples

>>> s = pl.Series("a", [1, 2, 3, None])
>>> s.fill_null(strategy="forward")
shape: (4,)
Series: 'a' [i64]
[
1
2
3
3
]
>>> s.fill_null(strategy="min")
shape: (4,)
Series: 'a' [i64]
[
1
2
3
1
]
>>> s = pl.Series("b", ["x", None, "z"])
>>> s.fill_null(pl.lit(""))
shape: (3,)
Series: 'b' [str]
[
"x"
""
"z"
]
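
The interaction between the 'forward' strategy and limit can be sketched in plain Python. This is an illustration under the assumption that limit caps the number of consecutive nulls filled after each valid value; `forward_fill` is a hypothetical helper, not a polars function.

```python
def forward_fill(values, limit=None):
    """Replace None with the last valid value, at most `limit` times in a row."""
    out = []
    last = None   # last non-null value seen so far
    run = 0       # length of the current run of nulls
    for v in values:
        if v is None:
            run += 1
            if last is not None and (limit is None or run <= limit):
                out.append(last)
            else:
                out.append(None)
        else:
            last, run = v, 0
            out.append(v)
    return out


print(forward_fill([1, 2, 3, None]))
print(forward_fill([1, None, None, 4], limit=1))
```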

filter(predicate: Series | list[bool]) Self[source]

Filter elements by a boolean mask.

Parameters:
predicate

Boolean mask.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> mask = pl.Series("", [True, False, True])
>>> s.filter(mask)
shape: (2,)
Series: 'a' [i64]
[
1
3
]

floor() Series[source]

Rounds down to the nearest integer value.

Only works on floating point Series.

Examples

>>> s = pl.Series("a", [1.12345, 2.56789, 3.901234])
>>> s.floor()
shape: (3,)
Series: 'a' [f64]
[
1.0
2.0
3.0
]

ge(other: Any) Self[source]

Method equivalent of operator expression series >= other.

get_chunks() list[polars.series.series.Series][source]

Get the chunks of this Series as a list of Series.

gt(other: Any) Self[source]

Method equivalent of operator expression series > other.

has_validity() bool[source]

Return True if the Series has a validity bitmask.

If there is none, it means that there are no null values. Use this to swiftly assert a Series does not have null values.

hash(seed: int = 0, seed_1: = None, seed_2: = None, seed_3: = None) Series[source]

Hash the Series.

The hash value is of type UInt64.

Parameters:
seed

Random seed parameter. Defaults to 0.

seed_1

Random seed parameter. Defaults to seed if not set.

seed_2

Random seed parameter. Defaults to seed if not set.

seed_3

Random seed parameter. Defaults to seed if not set.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.hash(seed=42)
shape: (3,)
Series: 'a' [u64]
[
10734580197236529959
3022416320763508302
13756996518000038261
]


head(n: int = 10) Series[source]

Get the first n elements.

Parameters:
n

Number of elements to return. If a negative value is passed, return all elements except the last abs(n).

Examples

>>> s = pl.Series("a", [1, 2, 3, 4, 5])
>>> s.head(3)
shape: (3,)
Series: 'a' [i64]
[
1
2
3
]


Pass a negative value to get all rows except the last abs(n).

>>> s.head(-3)
shape: (2,)
Series: 'a' [i64]
[
1
2
]

hist(bins: = None, bin_count: = None) DataFrame[source]

Bin values into buckets and count their occurrences.

Parameters:
bins

Discretizations to make. If None given, we determine the boundaries based on the data.

bin_count

If no bins are provided, this is used to determine the distance between the bins.

Returns:
DataFrame

Warning

This functionality is experimental and may change without it being considered a breaking change.

Examples

>>> a = pl.Series("a", [1, 3, 8, 8, 2, 1, 3])
>>> a.hist(bin_count=4)
shape: (5, 3)
┌─────────────┬─────────────┬─────────┐
│ break_point ┆ category    ┆ a_count │
│ ---         ┆ ---         ┆ ---     │
│ f64         ┆ cat         ┆ u32     │
╞═════════════╪═════════════╪═════════╡
│ 0.0         ┆ (-inf, 0.0] ┆ 0       │
│ 2.25        ┆ (0.0, 2.25] ┆ 3       │
│ 4.5         ┆ (2.25, 4.5] ┆ 2       │
│ 6.75        ┆ (4.5, 6.75] ┆ 0       │
│ inf         ┆ (6.75, inf] ┆ 2       │
└─────────────┴─────────────┴─────────┘
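
The counts in the example can be reproduced with a short plain-Python sketch that assigns each value to a right-closed interval. The break points are copied from the output above rather than derived, and `count_bins` is a hypothetical helper.

```python
from bisect import bisect_left


def count_bins(values, breaks):
    """Count values per interval (-inf, b0], (b0, b1], ..., (b_last, inf]."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        # bisect_left returns the first break point >= v, i.e. the right-closed bin
        counts[bisect_left(breaks, v)] += 1
    return counts


counts = count_bins([1, 3, 8, 8, 2, 1, 3], [0.0, 2.25, 4.5, 6.75])
print(counts)
```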

interpolate(method: InterpolationMethod = 'linear') Series[source]

Interpolate intermediate values. The interpolation method is linear by default.

Parameters:
method{‘linear’, ‘nearest’}

Interpolation method

Examples

>>> s = pl.Series("a", [1, 2, None, None, 5])
>>> s.interpolate()
shape: (5,)
Series: 'a' [i64]
[
1
2
3
4
5
]
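
A minimal plain-Python sketch of linear interpolation over runs of nulls is shown below. It is illustrative only: `interpolate_linear` is a hypothetical helper, it leaves leading and trailing nulls untouched, and it returns floats for the filled positions.

```python
def interpolate_linear(values):
    """Fill None entries that lie between two known values on a straight line."""
    out = list(values)
    prev = None  # index of the last non-null value seen
    for i, v in enumerate(values):
        if v is None:
            continue
        if prev is not None and i - prev > 1:
            step = (v - values[prev]) / (i - prev)
            for k in range(prev + 1, i):
                out[k] = values[prev] + step * (k - prev)
        prev = i
    return out


print(interpolate_linear([1, 2, None, None, 5]))
```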

is_between(start: Expr | datetime | date | time | int | float | str, end: Expr | datetime | date | time | int | float | str, closed: ClosedInterval = 'both') Series[source]

Get a boolean mask of the values that fall between the given start/end values.

Parameters:
start

Lower bound value (can be an expression or literal).

end

Upper bound value (can be an expression or literal).

closed{‘both’, ‘left’, ‘right’, ‘none’}

Define which sides of the interval are closed (inclusive).

Examples

>>> s = pl.Series("num", [1, 2, 3, 4, 5])
>>> s.is_between(2, 4)
shape: (5,)
Series: 'num' [bool]
[
false
true
true
true
false
]


Use the closed argument to include or exclude the values at the bounds:

>>> s.is_between(2, 4, closed="left")
shape: (5,)
Series: 'num' [bool]
[
false
true
true
false
false
]


You can also use strings as well as numeric/temporal values:

>>> s = pl.Series("s", ["a", "b", "c", "d", "e"])
>>> s.is_between("b", "d", closed="both")
shape: (5,)
Series: 's' [bool]
[
false
true
true
true
false
]
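
The effect of the closed argument can be summarised in a few lines of plain Python. The function below is a hypothetical stand-in that works on lists; it only illustrates which comparison each option implies.

```python
def is_between(values, start, end, closed="both"):
    """Return a boolean mask; `closed` selects which bounds are inclusive."""
    include_lower = closed in ("both", "left")
    include_upper = closed in ("both", "right")
    mask = []
    for v in values:
        lower_ok = v >= start if include_lower else v > start
        upper_ok = v <= end if include_upper else v < end
        mask.append(lower_ok and upper_ok)
    return mask


print(is_between([1, 2, 3, 4, 5], 2, 4))
print(is_between([1, 2, 3, 4, 5], 2, 4, closed="left"))
```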

is_boolean() bool[source]

Check if this Series is a Boolean.

Examples

>>> s = pl.Series("a", [True, False, True])
>>> s.is_boolean()
True

is_duplicated() Series[source]

Get mask of all duplicated values.

Returns:
Boolean Series

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.is_duplicated()
shape: (4,)
Series: 'a' [bool]
[
false
true
true
false
]

is_empty() bool[source]

Check if the Series is empty.

Examples

>>> s = pl.Series("a", [], dtype=pl.Float32)
>>> s.is_empty()
True

is_finite() Series[source]

Returns a boolean Series indicating which values are finite.

Returns:
Boolean Series

Examples

>>> import numpy as np
>>> s = pl.Series("a", [1.0, 2.0, np.inf])
>>> s.is_finite()
shape: (3,)
Series: 'a' [bool]
[
true
true
false
]

is_first() Series[source]

Get a mask marking the first occurrence of each unique value.

Returns:
Boolean Series

is_float() bool[source]

Check if this Series has floating point numbers.

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0])
>>> s.is_float()
True

is_in(other: Union[Series, Collection[Any]]) Series[source]

Check if elements of this Series are in the other Series.

Returns:
Boolean Series

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("b", [2, 4])
>>> s2.is_in(s)
shape: (2,)
Series: 'b' [bool]
[
true
false
]

>>> # check if some values are a member of sublists
>>> sets = pl.Series("sets", [[1, 2, 3], [1, 2], [9, 10]])
>>> optional_members = pl.Series("optional_members", [1, 2, 3])
>>> print(sets)
shape: (3,)
Series: 'sets' [list[i64]]
[
[1, 2, 3]
[1, 2]
[9, 10]
]
>>> print(optional_members)
shape: (3,)
Series: 'optional_members' [i64]
[
1
2
3
]
>>> optional_members.is_in(sets)
shape: (3,)
Series: 'optional_members' [bool]
[
true
true
false
]

is_infinite() Series[source]

Returns a boolean Series indicating which values are infinite.

Returns:
Boolean Series

Examples

>>> import numpy as np
>>> s = pl.Series("a", [1.0, 2.0, np.inf])
>>> s.is_infinite()
shape: (3,)
Series: 'a' [bool]
[
false
false
true
]

is_nan() Series[source]

Returns a boolean Series indicating which values are NaN.

Returns:
Boolean Series

Examples

>>> import numpy as np
>>> s = pl.Series("a", [1.0, 2.0, 3.0, np.nan])
>>> s.is_nan()
shape: (4,)
Series: 'a' [bool]
[
false
false
false
true
]

is_not_nan() Series[source]

Returns a boolean Series indicating which values are not NaN.

Returns:
Boolean Series

Examples

>>> import numpy as np
>>> s = pl.Series("a", [1.0, 2.0, 3.0, np.nan])
>>> s.is_not_nan()
shape: (4,)
Series: 'a' [bool]
[
true
true
true
false
]

is_not_null() Series[source]

Returns a boolean Series indicating which values are not null.

Returns:
Boolean Series

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, None])
>>> s.is_not_null()
shape: (4,)
Series: 'a' [bool]
[
true
true
true
false
]

is_null() Series[source]

Returns a boolean Series indicating which values are null.

Returns:
Boolean Series

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, None])
>>> s.is_null()
shape: (4,)
Series: 'a' [bool]
[
false
false
false
true
]

is_numeric() bool[source]

Check if this Series datatype is numeric.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.is_numeric()
True

is_sorted(descending: bool = False) bool[source]

Check if the Series is sorted.

Parameters:
descending

Check if the Series is sorted in descending order.

is_temporal(excluding: OneOrMoreDataTypes | None = None) bool[source]

Check if this Series datatype is temporal.

Parameters:
excluding

Optionally exclude one or more temporal dtypes from matching.

Examples

>>> from datetime import date
>>> s = pl.Series([date(2021, 1, 1), date(2021, 1, 2), date(2021, 1, 3)])
>>> s.is_temporal()
True
>>> s.is_temporal(excluding=[pl.Date])
False

is_unique() Series[source]

Get mask of all unique values.

Returns:
Boolean Series

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.is_unique()
shape: (4,)
Series: 'a' [bool]
[
true
false
false
true
]
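
The relationship between is_unique, is_duplicated and is_first can be sketched with a counter in plain Python. The helper `unique_masks` is hypothetical and only mirrors the semantics described in this section.

```python
from collections import Counter


def unique_masks(values):
    """Return (is_unique, is_duplicated, is_first) masks for a list."""
    counts = Counter(values)
    seen = set()
    is_unique, is_duplicated, is_first = [], [], []
    for v in values:
        is_unique.append(counts[v] == 1)      # value occurs exactly once
        is_duplicated.append(counts[v] > 1)   # value occurs more than once
        is_first.append(v not in seen)        # first time this value appears
        seen.add(v)
    return is_unique, is_duplicated, is_first


uniq, dup, first = unique_masks([1, 2, 2, 3])
print(uniq, dup, first)
```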

is_utf8() bool[source]

Check if this Series datatype is a Utf8.

Examples

>>> s = pl.Series("x", ["a", "b", "c"])
>>> s.is_utf8()
True

item() Any[source]

Return the series as a scalar.

Equivalent to s[0], with a check that the shape is (1,).

Examples

>>> s = pl.Series("a", [1])
>>> s.item()
1

kurtosis(fisher: bool = True, bias: bool = True) [source]

Compute the kurtosis (Fisher or Pearson) of a dataset.

Kurtosis is the fourth central moment divided by the square of the variance. If Fisher’s definition is used, then 3.0 is subtracted from the result to give 0.0 for a normal distribution. If bias is False, then the kurtosis is calculated using k statistics to eliminate bias coming from biased moment estimators.

Parameters:
fisherbool, optional

If True, Fisher’s definition is used (normal ==> 0.0). If False, Pearson’s definition is used (normal ==> 3.0).

biasbool, optional

If False, the calculations are corrected for statistical bias.
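
For the default bias=True case, the definition above can be written out directly. The following plain-Python sketch is illustrative only; the function is a hypothetical stand-in and does not implement the bias=False correction.

```python
def kurtosis(values, fisher=True):
    """Biased kurtosis: fourth central moment over the squared variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # biased variance
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    k = m4 / (m2 * m2)
    # Fisher's definition subtracts 3.0 so that a normal distribution gives 0.0
    return k - 3.0 if fisher else k


print(kurtosis([1, 2, 3, 4, 5]))
print(kurtosis([1, 2, 3, 4, 5], fisher=False))
```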

le(other: Any) Self[source]

Method equivalent of operator expression series <= other.

len() int[source]

Length of this Series.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.len()
3

limit(n: int = 10) Series[source]

Get the first n elements.

Alias for Series.head().

Parameters:
n

Number of elements to return. If a negative value is passed, return all elements except the last abs(n).

log(base: float = 2.718281828459045) Series[source]

Compute the logarithm to a given base.

log10() Series[source]

Compute the base 10 logarithm of the input array, element-wise.

lower_bound() Self[source]

Return the lower bound of this Series’ dtype as a unit Series.

See also

upper_bound

Return the upper bound of the given Series’ dtype.

Examples

>>> s = pl.Series("s", [-1, 0, 1], dtype=pl.Int32)
>>> s.lower_bound()
shape: (1,)
Series: 's' [i32]
[
-2147483648
]

>>> s = pl.Series("s", [1.0, 2.5, 3.0], dtype=pl.Float32)
>>> s.lower_bound()
shape: (1,)
Series: 's' [f32]
[
-inf
]

lt(other: Any) Self[source]

Method equivalent of operator expression series < other.

map_dict(remapping: dict[Any, Any], *, default: Any = None) Self[source]

Replace values in the Series using a remapping dictionary.

Parameters:
remapping

Dictionary containing the before/after values to map.

default

Value to use when the remapping dict does not contain the lookup value. Use pl.first() to keep the original value.

Examples

>>> s = pl.Series("iso3166", ["TUR", "???", "JPN", "NLD"])
>>> country_lookup = {
...     "JPN": "Japan",
...     "TUR": "Türkiye",
...     "NLD": "Netherlands",
... }


Remap, setting a default for unrecognised values…

>>> s.map_dict(country_lookup, default="Unspecified").rename("country_name")
shape: (4,)
Series: 'country_name' [str]
[
"Türkiye"
"Unspecified"
"Japan"
"Netherlands"
]


…or keep the original value, by making use of pl.first():

>>> s.map_dict(country_lookup, default=pl.first()).rename("country_name")
shape: (4,)
Series: 'country_name' [str]
[
"Türkiye"
"???"
"Japan"
"Netherlands"
]


…or keep the original value, by assigning the input series:

>>> s.map_dict(country_lookup, default=s).rename("country_name")
shape: (4,)
Series: 'country_name' [str]
[
"Türkiye"
"???"
"Japan"
"Netherlands"
]
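
The lookup semantics correspond closely to dict.get in plain Python. The sketch below is illustrative; `map_values` and its keep_original flag are hypothetical names standing in for the default=pl.first() pattern.

```python
def map_values(values, remapping, default=None, keep_original=False):
    """Replace each value via `remapping`, falling back to a default."""
    if keep_original:
        # unmatched values map to themselves
        return [remapping.get(v, v) for v in values]
    return [remapping.get(v, default) for v in values]


codes = ["TUR", "???", "JPN", "NLD"]
lookup = {"JPN": "Japan", "TUR": "Türkiye", "NLD": "Netherlands"}
print(map_values(codes, lookup, default="Unspecified"))
print(map_values(codes, lookup, keep_original=True))
```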

max() PythonLiteral | None[source]

Get the maximum value in this Series.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.max()
3

mean() [source]

Reduce this Series to the mean value.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.mean()
2.0

median() [source]

Get the median of this Series.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.median()
2.0

min() PythonLiteral | None[source]

Get the minimal value in this Series.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.min()
1

mode() Series[source]

Compute the most occurring value(s).

Can return multiple values.

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.mode()
shape: (1,)
Series: 'a' [i64]
[
2
]

n_chunks() int[source]

Get the number of chunks that this Series contains.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.n_chunks()
1
>>> s2 = pl.Series("a", [4, 5, 6])


Concatenate Series with rechunk = True

>>> pl.concat([s, s2]).n_chunks()
1


Concatenate Series with rechunk = False

>>> pl.concat([s, s2], rechunk=False).n_chunks()
2

n_unique() int[source]

Count the number of unique values in this Series.

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.n_unique()
3

nan_max() int | float | date | datetime | timedelta | str[source]

Get maximum value, but propagate/poison encountered NaN values.

This differs from numpy’s nanmax, which ignores NaN values: numpy’s max propagates NaN values by default, whereas polars’ max ignores them.

nan_min() int | float | date | datetime | timedelta | str[source]

Get minimum value, but propagate/poison encountered NaN values.

This differs from numpy’s nanmin, which ignores NaN values: numpy’s min propagates NaN values by default, whereas polars’ min ignores them.

ne(other: Any) Self[source]

Method equivalent of operator expression series != other.

new_from_index(index: int, length: int) Self[source]

Create a new Series filled with values from the given index.

null_count() int[source]

Count the null values in this Series.

pct_change(n: int = 1) Series[source]

Computes percentage change between values.

Percentage change (as fraction) between current element and most-recent non-null element at least n period(s) before the current element.

Computes the change from the previous row by default.

Parameters:
n

Periods to shift for forming percent change.

Examples

>>> pl.Series(range(10)).pct_change()
shape: (10,)
Series: '' [f64]
[
null
inf
1.0
0.5
0.333333
0.25
0.2
0.166667
0.142857
0.125
]

>>> pl.Series([1, 2, 4, 8, 16, 32, 64, 128, 256, 512]).pct_change(2)
shape: (10,)
Series: '' [f64]
[
null
null
3.0
3.0
3.0
3.0
3.0
3.0
3.0
3.0
]
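
The computation behind both examples is (x_t - x_{t-n}) / x_{t-n}. A plain-Python sketch, assuming no nulls in the input and using a hypothetical list-based `pct_change`, could look like this:

```python
import math


def pct_change(values, n=1):
    """Fractional change relative to the element n positions earlier."""
    out = []
    for i, cur in enumerate(values):
        if i < n:
            out.append(None)  # no earlier element to compare with
            continue
        prev = values[i - n]
        diff = cur - prev
        if prev != 0:
            out.append(diff / prev)
        elif diff == 0:
            out.append(math.nan)  # 0 / 0 is undefined
        else:
            out.append(math.copysign(math.inf, diff))  # division by zero
    return out


print(pct_change(list(range(4))))
print(pct_change([1, 2, 4, 8], n=2))
```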

peak_max() Self[source]

Get a boolean mask of the local maximum peaks.

Examples

>>> s = pl.Series("a", [1, 2, 3, 4, 5])
>>> s.peak_max()
shape: (5,)
Series: '' [bool]
[
false
false
false
false
true
]

peak_min() Self[source]

Get a boolean mask of the local minimum peaks.

Examples

>>> s = pl.Series("a", [4, 1, 3, 2, 5])
>>> s.peak_min()
shape: (5,)
Series: '' [bool]
[
false
true
false
true
false
]
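
A local peak compares each element with its immediate neighbours. The plain-Python sketch below reproduces both example outputs under the assumption, inferred from those outputs, that boundary elements are compared with their single neighbour; `peak_mask` is a hypothetical helper.

```python
def peak_mask(values, kind="max"):
    """Mark elements that are strictly above (or below) all direct neighbours."""
    if kind == "max":
        beats = lambda a, b: a > b
    else:
        beats = lambda a, b: a < b
    mask = []
    for i, v in enumerate(values):
        neighbours = []
        if i > 0:
            neighbours.append(values[i - 1])
        if i + 1 < len(values):
            neighbours.append(values[i + 1])
        # an element without neighbours is not reported as a peak in this sketch
        mask.append(bool(neighbours) and all(beats(v, n) for n in neighbours))
    return mask


print(peak_mask([1, 2, 3, 4, 5], kind="max"))
print(peak_mask([4, 1, 3, 2, 5], kind="min"))
```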

product() [source]

Reduce this Series to the product value.

qcut(quantiles: , labels: list[str] | None = None, break_point_label: str = 'break_point', category_label: str = 'category', maintain_order: bool = False) DataFrame[source]

Bin values into discrete values based on their quantiles.

Parameters:
quantiles

Quantiles to create. Each quantile must satisfy 0.0 <= quantile <= 1.0.

labels

Labels to assign to the quantiles. If given, the length of labels must be len(quantiles) + 1.

break_point_label

Name given to the breakpoint column.

category_label

Name given to the category column.

maintain_order

Keep the order of the original Series.

Returns:
DataFrame

Warning

This functionality is experimental and may change without it being considered a breaking change.

Examples

>>> a = pl.Series("a", range(-5, 3))
>>> a.qcut([0.0, 0.25, 0.75])
shape: (8, 3)
┌──────┬─────────────┬───────────────┐
│ a    ┆ break_point ┆ category      │
│ ---  ┆ ---         ┆ ---           │
│ f64  ┆ f64         ┆ cat           │
╞══════╪═════════════╪═══════════════╡
│ -5.0 ┆ -5.0        ┆ (-inf, -5.0]  │
│ -4.0 ┆ -3.25       ┆ (-5.0, -3.25] │
│ -3.0 ┆ 0.25        ┆ (-3.25, 0.25] │
│ -2.0 ┆ 0.25        ┆ (-3.25, 0.25] │
│ -1.0 ┆ 0.25        ┆ (-3.25, 0.25] │
│ 0.0  ┆ 0.25        ┆ (-3.25, 0.25] │
│ 1.0  ┆ inf         ┆ (0.25, inf]   │
│ 2.0  ┆ inf         ┆ (0.25, inf]   │
└──────┴─────────────┴───────────────┘
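
The break points in this example are consistent with linearly interpolated quantiles, followed by assignment to right-closed intervals. The plain-Python sketch below makes that assumption explicit; `quantile_linear` is a hypothetical helper and the interpolation choice is inferred from the output, not taken from the polars source.

```python
from bisect import bisect_left


def quantile_linear(sorted_vals, q):
    """Quantile by linear interpolation between the two closest ranks."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


values = [float(v) for v in range(-5, 3)]
breaks = [quantile_linear(sorted(values), q) for q in (0.0, 0.25, 0.75)]
# index of the right-closed interval each value falls into
bins = [bisect_left(breaks, v) for v in values]
print(breaks)
print(bins)
```

Three quantiles give four intervals, which is why any labels would need one more entry than the list of quantiles.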

quantile(quantile: float, interpolation: RollingInterpolationMethod = 'nearest') [source]

Get the quantile value of this Series.

Parameters:
quantile

Quantile between 0.0 and 1.0.

interpolation{‘nearest’, ‘higher’, ‘lower’, ‘midpoint’, ‘linear’}

Interpolation method.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.quantile(0.5)
2.0

rank(method: RankMethod = 'average', *, descending: bool = False) Series[source]

Assign ranks to data, dealing with ties appropriately.

Parameters:
method{‘average’, ‘min’, ‘max’, ‘dense’, ‘ordinal’, ‘random’}

The method used to assign ranks to tied elements. The following methods are available (default is ‘average’):

• ‘average’ : The average of the ranks that would have been assigned to all the tied values is assigned to each value.

• ‘min’ : The minimum of the ranks that would have been assigned to all the tied values is assigned to each value. (This is also referred to as “competition” ranking.)

• ‘max’ : The maximum of the ranks that would have been assigned to all the tied values is assigned to each value.

• ‘dense’ : Like ‘min’, but the rank of the next highest element is assigned the rank immediately after those assigned to the tied elements.

• ‘ordinal’ : All values are given a distinct rank, corresponding to the order that the values occur in the Series.

• ‘random’ : Like ‘ordinal’, but the rank for ties is not dependent on the order that the values occur in the Series.

descending

Rank in descending order.

Examples

The ‘average’ method:

>>> s = pl.Series("a", [3, 6, 1, 1, 6])
>>> s.rank()
shape: (5,)
Series: 'a' [f32]
[
3.0
4.5
1.5
1.5
4.5
]


The ‘ordinal’ method:

>>> s = pl.Series("a", [3, 6, 1, 1, 6])
>>> s.rank("ordinal")
shape: (5,)
Series: 'a' [u32]
[
3
4
1
2
5
]
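
The tie-handling rules listed above can be compared side by side with a plain-Python sketch. It is illustrative only: the `rank` function below is a hypothetical list-based stand-in and omits the 'random' method.

```python
def rank(values, method="average"):
    """Rank a list in ascending order using one of the tie-breaking methods."""
    # stable sort: ties keep their original order
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    if method == "ordinal":
        for r, i in enumerate(order, start=1):
            ranks[i] = r
        return ranks
    pos = 0
    dense = 0
    while pos < len(order):
        end = pos
        # extend the group while the next value is tied with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        dense += 1
        lo, hi = pos + 1, end + 1  # ordinal ranks covered by this tie group
        r = {"average": (lo + hi) / 2, "min": lo, "max": hi, "dense": dense}[method]
        for k in range(pos, end + 1):
            ranks[order[k]] = r
        pos = end + 1
    return ranks


data = [3, 6, 1, 1, 6]
for m in ("average", "min", "max", "dense", "ordinal"):
    print(m, rank(data, m))
```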

rechunk(*, in_place: bool = False) Self[source]

Create a single chunk of memory for this Series.

Parameters:
in_place

In place or not.

reinterpret(signed: bool = True) Series[source]

Reinterpret the underlying bits as a signed/unsigned integer.

This operation is only allowed for 64-bit integers. For integers with fewer bits, you can safely use the cast operation.

Parameters:
signed

If True, reinterpret as pl.Int64. Otherwise, reinterpret as pl.UInt64.

rename(name: str, in_place: bool = False) Series[source]

Rename this Series.

Parameters:
name

New name.

in_place

Modify the Series in-place.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.rename("b")
shape: (3,)
Series: 'b' [i64]
[
1
2
3
]

reshape(dims: tuple[int, ...]) Series[source]

Reshape this Series to a flat Series or a Series of Lists.

Parameters:
dims

Tuple of the dimension sizes. If a -1 is used in any of the dimensions, that dimension is inferred.

Returns:
Series

If a single dimension is given, results in a flat Series of shape (len,). If multiple dimensions are given, results in a Series of Lists with shape (rows, cols).

See also

ListNameSpace.explode

Explode a list column.

Examples

>>> s = pl.Series("foo", [1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> s.reshape((3, 3))
shape: (3,)
Series: 'foo' [list[i64]]
[
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
]
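
How a -1 dimension is inferred can be sketched on plain lists. The function below is a hypothetical stand-in that handles only one or two dimensions.

```python
def reshape(values, dims):
    """Reshape a flat list into rows of equal length; -1 infers a dimension."""
    if len(dims) == 1:
        return list(values)
    rows, cols = dims
    if rows == -1:
        rows = len(values) // cols
    if cols == -1:
        cols = len(values) // rows
    if rows * cols != len(values):
        raise ValueError("cannot reshape: element count does not match dims")
    return [list(values[r * cols:(r + 1) * cols]) for r in range(rows)]


foo = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(reshape(foo, (3, 3)))
print(reshape(foo, (-1, 3)))
```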

reverse() Series[source]

Return Series in reverse order.

Examples

>>> s = pl.Series("a", [1, 2, 3], dtype=pl.Int8)
>>> s.reverse()
shape: (3,)
Series: 'a' [i8]
[
3
2
1
]

rolling_apply(function: Callable[[Series], Any], window_size: int, weights: = None, min_periods: = None, center: bool = False) Series[source]

Apply a custom rolling window function.

Prefer the specific rolling window functions over this one, as they are faster:

• rolling_min

• rolling_max

• rolling_mean

• rolling_sum

Parameters:
function

Aggregation function

window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> import numpy as np
>>> s = pl.Series("A", [11.0, 2.0, 9.0, float("nan"), 8.0])
>>> print(s.rolling_apply(function=np.nanstd, window_size=3))
shape: (5,)
Series: 'A' [f64]
[
null
null
3.858612
3.5
0.5
]

rolling_max(window_size: int, weights: = None, min_periods: = None, center: bool = False) Series[source]

Apply a rolling max (moving max) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their maximum.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [100, 200, 300, 400, 500])
>>> s.rolling_max(window_size=2)
shape: (5,)
Series: 'a' [i64]
[
null
200
300
400
500
]
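
The window_size and min_periods semantics shared by the rolling functions can be sketched generically in plain Python. The `rolling` helper below is hypothetical, ignores the weights and center options, and simply applies an aggregation to the non-null values in each trailing window.

```python
def rolling(values, window_size, agg, min_periods=None):
    """Apply `agg` to each trailing window; emit None until enough values exist."""
    if min_periods is None:
        min_periods = window_size  # default: require a full window
    out = []
    for end in range(1, len(values) + 1):
        window = values[max(0, end - window_size):end]
        valid = [v for v in window if v is not None]
        out.append(agg(valid) if len(valid) >= min_periods else None)
    return out


def mean(window):
    return sum(window) / len(window)


s = [100, 200, 300, 400, 500]
print(rolling(s, 2, max))
print(rolling(s, 2, mean))
print(rolling(s, 3, min))
print(rolling(s, 2, sum, min_periods=1))
```

Lowering min_periods lets the first positions produce a value from a partial window instead of null.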

rolling_mean(window_size: int, weights: = None, min_periods: = None, center: bool = False) Series[source]

Apply a rolling mean (moving mean) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their mean.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [100, 200, 300, 400, 500])
>>> s.rolling_mean(window_size=2)
shape: (5,)
Series: 'a' [f64]
[
null
150.0
250.0
350.0
450.0
]

rolling_median(window_size: int, weights: = None, min_periods: = None, center: bool = False) Series[source]

Compute a rolling median.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
>>> s.rolling_median(window_size=3)
shape: (6,)
Series: 'a' [f64]
[
null
null
2.0
3.0
4.0
6.0
]

rolling_min(window_size: int, weights: = None, min_periods: = None, center: bool = False) Series[source]

Apply a rolling min (moving min) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their minimum.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [100, 200, 300, 400, 500])
>>> s.rolling_min(window_size=3)
shape: (5,)
Series: 'a' [i64]
[
null
null
100
200
300
]

rolling_quantile(quantile: float, interpolation: RollingInterpolationMethod = 'nearest', window_size: int = 2, weights: = None, min_periods: = None, center: bool = False) Series[source]

Compute a rolling quantile.

Parameters:
quantile

Quantile between 0.0 and 1.0.

interpolation{‘nearest’, ‘higher’, ‘lower’, ‘midpoint’, ‘linear’}

Interpolation method.

window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
>>> s.rolling_quantile(quantile=0.33, window_size=3)
shape: (6,)
Series: 'a' [f64]
[
null
null
1.0
2.0
3.0
4.0
]
>>> s.rolling_quantile(quantile=0.33, interpolation="linear", window_size=3)
shape: (6,)
Series: 'a' [f64]
[
null
null
1.66
2.66
3.66
5.32
]

rolling_skew(window_size: int, bias: bool = True) Series[source]

Compute a rolling skew.

Parameters:
window_size

Integer size of the rolling window.

bias

If False, the calculations are corrected for statistical bias.

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
>>> s.rolling_skew(window_size=3)
shape: (6,)
Series: 'a' [f64]
[
null
null
0.0
0.0
0.381802
0.0
]
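
With bias=True, the skew of each window is the third central moment divided by the variance raised to the power 1.5. The plain-Python sketch below is illustrative; `skew` and `rolling_skew` are hypothetical list-based helpers that assume no nulls and no constant windows.

```python
def skew(window):
    """Biased sample skewness: m3 / m2 ** 1.5."""
    n = len(window)
    mean = sum(window) / n
    m2 = sum((v - mean) ** 2 for v in window) / n
    m3 = sum((v - mean) ** 3 for v in window) / n
    return m3 / m2 ** 1.5


def rolling_skew(values, window_size):
    """Skew of each full trailing window; None while the window is incomplete."""
    out = []
    for i in range(len(values)):
        if i + 1 < window_size:
            out.append(None)
        else:
            out.append(skew(values[i + 1 - window_size:i + 1]))
    return out


skews = rolling_skew([1.0, 2.0, 3.0, 4.0, 6.0, 8.0], 3)
print(skews)
```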

rolling_std(window_size: int, weights: = None, min_periods: = None, center: bool = False) Series[source]

Compute a rolling std dev.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their standard deviation.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
>>> s.rolling_std(window_size=3)
shape: (6,)
Series: 'a' [f64]
[
null
null
1.0
1.0
1.527525
2.0
]

rolling_sum(window_size: int, weights: = None, min_periods: = None, center: bool = False) Series[source]

Apply a rolling sum (moving sum) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their sum.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [1, 2, 3, 4, 5])
>>> s.rolling_sum(window_size=2)
shape: (5,)
Series: 'a' [i64]
[
null
3
5
7
9
]

rolling_var(window_size: int, weights: = None, min_periods: = None, center: bool = False) Series[source]

Compute a rolling variance.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their variance.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
>>> s.rolling_var(window_size=3)
shape: (6,)
Series: 'a' [f64]
[
null
null
1.0
1.0
2.333333
4.0
]

round(decimals: int) Series[source]

Round underlying floating point data by decimals digits.

Parameters:
decimals

Number of decimals to round to.

Examples

>>> s = pl.Series("a", [1.12345, 2.56789, 3.901234])
>>> s.round(2)
shape: (3,)
Series: 'a' [f64]
[
1.12
2.57
3.9
]

sample(n: = None, frac: = None, with_replacement: bool = False, shuffle: bool = False, seed: = None) Series[source]

Sample from this Series.

Parameters:
n

Number of items to return. Cannot be used with frac. Defaults to 1 if frac is None.

frac

Fraction of items to return. Cannot be used with n.

with_replacement

Allow values to be sampled more than once.

shuffle

Shuffle the order of sampled data points.

seed

Seed for the random number generator. If set to None (default), a random seed is generated using the random module.

Examples

>>> s = pl.Series("a", [1, 2, 3, 4, 5])
>>> s.sample(2, seed=0)
shape: (2,)
Series: 'a' [i64]
[
1
5
]

search_sorted(element: , side: SearchSortedSide = 'any') int[source]
search_sorted(element: polars.series.series.Series | numpy.ndarray[Any, Any] | list[int] | list[float], side: SearchSortedSide = 'any') Series

Find indices where elements should be inserted to maintain order.

$a[i-1] < v <= a[i]$
Parameters:
element

Expression or scalar value.

side{‘any’, ‘left’, ‘right’}

If ‘any’, the index of the first suitable location found is given. If ‘left’, the index of the leftmost suitable location found is given. If ‘right’, the index of the rightmost suitable location found is given.
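
The difference between the 'left' and 'right' sides matches the standard-library bisect functions. The sketch below is a plain-Python analogy, not the polars implementation; it treats 'any' like 'left', which is one valid choice since any suitable location is allowed.

```python
from bisect import bisect_left, bisect_right


def search_sorted(sorted_values, element, side="any"):
    """Insertion index that keeps `sorted_values` in order."""
    if side == "right":
        # rightmost position: after any existing equal elements
        return bisect_right(sorted_values, element)
    # leftmost position: a[i-1] < v <= a[i]
    return bisect_left(sorted_values, element)


a = [1, 2, 2, 3]
print(search_sorted(a, 2, side="left"))
print(search_sorted(a, 2, side="right"))
```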

series_equal(other: Series, null_equal: bool = True, strict: bool = False) bool[source]

Check whether this Series is equal to another Series.

Parameters:
other

Series to compare with.

null_equal

Consider null values as equal.

strict

Don’t allow different numerical dtypes, e.g. comparing pl.UInt32 with a pl.Int64 will return False.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("b", [4, 5, 6])
>>> s.series_equal(s)
True
>>> s.series_equal(s2)
False

set(filter: Series, value: ) Series[source]

Set masked values.

Parameters:
filter

Boolean mask.

value

Value with which to replace the masked values.

Notes

Use of this function is frequently an anti-pattern, as it can block optimisation (predicate pushdown, etc). Consider using pl.when(predicate).then(value).otherwise(self) instead.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.set(s == 2, 10)
shape: (3,)
Series: 'a' [i64]
[
1
10
3
]


It is better to implement this as follows:

>>> s.to_frame().select(
...     pl.when(pl.col("a") == 2).then(10).otherwise(pl.col("a"))
... )
shape: (3, 1)
┌─────────┐
│ literal │
│ ---     │
│ i64     │
╞═════════╡
│ 1       │
│ 10      │
│ 3       │
└─────────┘

set_at_idx(idx: Union[Series, ndarray[Any, Any], Sequence[int], int], value: Optional[Union[int, float, str, bool, Sequence[int], Sequence[float], Sequence[bool], Sequence[str], Sequence[date], Sequence[datetime], date, datetime, Series]]) Series[source]

Set values at the index locations.

Parameters:
idx

Integers representing the index locations.

value

Replacement values.

Returns:
The mutated Series.

Notes

Use of this function is frequently an anti-pattern, as it can block optimisation (predicate pushdown, etc). Consider using pl.when(predicate).then(value).otherwise(self) instead.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.set_at_idx(1, 10)
shape: (3,)
Series: 'a' [i64]
[
1
10
3
]


It is better to implement this as follows:

>>> s.to_frame().with_row_count("row_nr").select(
...     pl.when(pl.col("row_nr") == 1).then(10).otherwise(pl.col("a"))
... )
shape: (3, 1)
┌─────────┐
│ literal │
│ ---     │
│ i64     │
╞═════════╡
│ 1       │
│ 10      │
│ 3       │
└─────────┘

set_sorted(*, descending: bool = False) Self[source]

Flags the Series as ‘sorted’.

Enables downstream code to use fast paths for sorted arrays.

Parameters:
descending

If the Series order is descending.

Warning

This can lead to incorrect results if this Series is not actually sorted. Use with care!

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.set_sorted().max()
3
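
Why a wrong flag is dangerous can be shown with a toy fast path. The sketch below is a hypothetical illustration in plain Python, not how polars implements max: it assumes a function that trusts an ascending-sorted flag and reads only the last element.

```python
def max_with_sorted_flag(values, flagged_sorted_ascending):
    """Toy maximum that trusts a 'sorted' flag (hypothetical helper)."""
    if flagged_sorted_ascending:
        # Fast path: in ascending order the last element is the maximum.
        return values[-1]
    # Slow path: scan every element.
    return max(values)


unsorted = [1, 3, 2]
correct = max_with_sorted_flag(unsorted, flagged_sorted_ascending=False)  # 3
wrong = max_with_sorted_flag(unsorted, flagged_sorted_ascending=True)     # 2
```

The fast path is only valid when the flag describes the data truthfully.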

shift(periods: int = 1) Series[source]

Shift the values by a given period.

Parameters:
periods

Number of places to shift (may be negative).

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.shift(periods=1)
shape: (3,)
Series: 'a' [i64]
[
null
1
2
]
>>> s.shift(periods=-1)
shape: (3,)
Series: 'a' [i64]
[
2
3
null
]

shift_and_fill(periods: int, fill_value: int | Expr) Series[source]

Shift the values by a given period and fill the resulting null values.

Parameters:
periods

Number of places to shift (may be negative).

fill_value

Fill None values with the result of this expression.
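
The behaviour can be sketched on a plain Python list. shift_and_fill_sketch is a hypothetical helper that mirrors the shift examples above with a scalar fill value; it is not the polars implementation and does not cover expression fill values.

```python
def shift_and_fill_sketch(values, periods, fill_value):
    """Shift a list by `periods` places and fill the vacated slots."""
    n = len(values)
    if periods >= 0:
        k = min(periods, n)
        # Vacated slots appear at the front when shifting forward.
        return [fill_value] * k + list(values[: n - k])
    k = min(-periods, n)
    # Vacated slots appear at the end when shifting backward.
    return list(values[k:]) + [fill_value] * k


forward = shift_and_fill_sketch([1, 2, 3], 1, 0)    # [0, 1, 2]
backward = shift_and_fill_sketch([1, 2, 3], -1, 0)  # [2, 3, 0]
```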

shrink_dtype() Series[source]

Shrink numeric columns to the minimal required datatype.

Shrink to the dtype needed to fit the extrema of this Series. This can be used to reduce memory pressure.
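
The idea of fitting the extrema can be sketched in plain Python. The helper below is hypothetical and assumes that only signed integer types are candidates; the actual selection rules in polars may differ.

```python
# Candidate signed integer types with their inclusive bounds, narrowest first.
SIGNED_BOUNDS = [
    ("Int8", -(2**7), 2**7 - 1),
    ("Int16", -(2**15), 2**15 - 1),
    ("Int32", -(2**31), 2**31 - 1),
    ("Int64", -(2**63), 2**63 - 1),
]


def smallest_signed_dtype(values):
    """Return the name of the narrowest type that holds min and max."""
    lo, hi = min(values), max(values)
    for name, lower, upper in SIGNED_BOUNDS:
        if lower <= lo and hi <= upper:
            return name
    raise OverflowError("values do not fit in a 64-bit signed integer")


small = smallest_signed_dtype([1, 2, 3])        # "Int8"
medium = smallest_signed_dtype([-112, 2, 300])  # "Int16"
large = smallest_signed_dtype([1, 2, 2 << 32])  # "Int64"
```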

shrink_to_fit(in_place: bool = False) Series[source]

Shrink Series memory usage.

Shrinks the underlying array capacity to exactly fit the actual data. (Note that this function does not change the Series data type).

shuffle(seed: = None) Series[source]

Shuffle the contents of this Series.

Parameters:
seed

Seed for the random number generator. If set to None (default), a random seed is generated using the random module.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.shuffle(seed=1)
shape: (3,)
Series: 'a' [i64]
[
2
1
3
]

sign() Series[source]

Compute the element-wise indication of the sign.

The returned values can be -1, 0, or 1:

• -1 if x < 0.

• 0 if x == 0.

• 1 if x > 0.

(null values are preserved as-is).

Examples

>>> s = pl.Series("a", [-9.0, -0.0, 0.0, 4.0, None])
>>> s.sign()
shape: (5,)
Series: 'a' [i64]
[
-1
0
0
1
null
]

sin() Series[source]

Compute the element-wise value for the sine.

Examples

>>> import math
>>> s = pl.Series("a", [0.0, math.pi / 2.0, math.pi])
>>> s.sin()
shape: (3,)
Series: 'a' [f64]
[
0.0
1.0
1.2246e-16
]

sinh() Series[source]

Compute the element-wise value for the hyperbolic sine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.sinh()
shape: (3,)
Series: 'a' [f64]
[
1.175201
0.0
-1.175201
]

skew(bias: bool = True) [source]

Compute the sample skewness of a data set.

For normally distributed data, the skewness should be about zero. For unimodal continuous distributions, a skewness value greater than zero means that there is more weight in the right tail of the distribution. The function skewtest can be used to determine if the skewness value is close enough to zero, statistically speaking.

Parameters:
biasbool, optional

If False, the calculations are corrected for statistical bias.

Notes

The sample skewness is computed as the Fisher-Pearson coefficient of skewness, i.e.

$g_1=\frac{m_3}{m_2^{3/2}}$

where

$m_i=\frac{1}{N}\sum_{n=1}^N(x[n]-\bar{x})^i$

is the biased sample $i\texttt{th}$ central moment, and $\bar{x}$ is the sample mean. If bias is False, the calculations are corrected for bias and the value computed is the adjusted Fisher-Pearson standardized moment coefficient, i.e.

$G_1 = \frac{k_3}{k_2^{3/2}} = \frac{\sqrt{N(N-1)}}{N-2}\frac{m_3}{m_2^{3/2}}$

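
The two formulas can be evaluated directly in plain Python. skew_sketch is a hypothetical helper that follows the definitions of $m_2$, $m_3$, $g_1$ and $G_1$ given above; it ignores nulls and other edge cases (such as fewer than three values) that a real implementation has to handle.

```python
import math


def skew_sketch(x, bias=True):
    """Fisher-Pearson skewness computed from the central moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # biased 2nd central moment
    m3 = sum((v - mean) ** 3 for v in x) / n  # biased 3rd central moment
    g1 = m3 / m2**1.5
    if bias:
        return g1
    # Adjusted coefficient G_1 = sqrt(N(N-1)) / (N-2) * g_1
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


data = [1, 2, 3, 2, 1]
biased = skew_sketch(data)                 # approximately 0.343622
adjusted = skew_sketch(data, bias=False)   # larger in magnitude than `biased`
symmetric = skew_sketch([1, 2, 3])         # 0.0 for symmetric data
```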
slice(offset: int, length: = None) Series[source]

Get a slice of this Series.

Parameters:
offset

Start index. Negative indexing is supported.

length

Length of the slice. If set to None, all rows starting at the offset will be selected.

Examples

>>> s = pl.Series("a", [1, 2, 3, 4])
>>> s.slice(1, 2)
shape: (2,)
Series: 'a' [i64]
[
2
3
]
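
How offset and length interact, including a negative offset and length=None, can be sketched on a plain list. slice_sketch is a hypothetical helper; its handling of out-of-range offsets (clamping to the ends) is an assumption, not documented polars behaviour.

```python
def slice_sketch(values, offset, length=None):
    """Return `length` items starting at `offset` (negative counts from the end)."""
    n = len(values)
    start = offset if offset >= 0 else max(n + offset, 0)
    start = min(start, n)
    # length=None selects everything from the start index onward.
    stop = n if length is None else min(start + length, n)
    return list(values[start:stop])


values = [1, 2, 3, 4]
middle = slice_sketch(values, 1, 2)      # [2, 3]
from_end = slice_sketch(values, -2)      # [3, 4]
rest = slice_sketch(values, 1)           # [2, 3, 4]
```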

sort(*, descending: bool = False, in_place: bool = False) Self[source]

Sort this Series.

Parameters:
descending

Sort in descending order.

in_place

Sort in-place.

Examples

>>> s = pl.Series("a", [1, 3, 4, 2])
>>> s.sort()
shape: (4,)
Series: 'a' [i64]
[
1
2
3
4
]
>>> s.sort(descending=True)
shape: (4,)
Series: 'a' [i64]
[
4
3
2
1
]

sqrt() Series[source]

Compute the square root of the elements.

Syntactic sugar for

>>> pl.Series([1, 2]) ** 0.5
shape: (2,)
Series: '' [f64]
[
1.0
1.414214
]

std(ddof: int = 1) [source]

Get the standard deviation of this Series.

Parameters:
ddof

“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.std()
1.0
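
The role of ddof can be checked by hand. The sketch below is plain Python with hypothetical helper names; it applies the N - ddof divisor described above and ignores null handling.

```python
import math


def variance_sketch(values, ddof=1):
    """Sum of squared deviations divided by N - ddof."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


def std_sketch(values, ddof=1):
    """Standard deviation as the square root of the variance."""
    return math.sqrt(variance_sketch(values, ddof))


sample_std = std_sketch([1, 2, 3])              # 1.0 (divisor N - 1 = 2)
population_std = std_sketch([1, 2, 3], ddof=0)  # sqrt(2 / 3), divisor N = 3
```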

sum() [source]

Reduce this Series to the sum value.

Notes

Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before summing to prevent overflow issues.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.sum()
6
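
The reason for widening small integer types can be illustrated by simulating an 8-bit accumulator. This is a plain-Python sketch with a hypothetical helper; it models two's-complement wraparound and says nothing about polars internals.

```python
def wrapping_int8_sum(values):
    """Sum values as a signed 8-bit accumulator would, wrapping on overflow."""
    total = 0
    for v in values:
        # Map the running total back into the range [-128, 127].
        total = (total + v + 128) % 256 - 128
    return total


values = [100, 100, 27]
wrapped = wrapping_int8_sum(values)  # -29: the true sum does not fit in 8 bits
widened = sum(values)                # 227: summing in a wider type is exact
```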

tail(n: int = 10) Series[source]

Get the last n elements.

Parameters:
n

Number of elements to return. If a negative value is passed, return all elements except the first abs(n).

Examples

>>> s = pl.Series("a", [1, 2, 3, 4, 5])
>>> s.tail(3)
shape: (3,)
Series: 'a' [i64]
[
3
4
5
]


Pass a negative value to get all rows except the first abs(n).

>>> s.tail(-3)
shape: (2,)
Series: 'a' [i64]
[
4
5
]

take(indices: int | list[int] | Expr | Series | np.ndarray[Any, Any]) Series[source]

Take values by index.

Parameters:
indices

Index location used for selection.

Examples

>>> s = pl.Series("a", [1, 2, 3, 4])
>>> s.take([1, 3])
shape: (2,)
Series: 'a' [i64]
[
2
4
]

take_every(n: int) Series[source]

Take every nth value in the Series and return it as a new Series.

Examples

>>> s = pl.Series("a", [1, 2, 3, 4])
>>> s.take_every(2)
shape: (2,)
Series: 'a' [i64]
[
1
3
]

tan() Series[source]

Compute the element-wise value for the tangent.

Examples

>>> import math
>>> s = pl.Series("a", [0.0, math.pi / 2.0, math.pi])
>>> s.tan()
shape: (3,)
Series: 'a' [f64]
[
0.0
1.6331e16
-1.2246e-16
]

tanh() Series[source]

Compute the element-wise value for the hyperbolic tangent.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.tanh()
shape: (3,)
Series: 'a' [f64]
[
0.761594
0.0
-0.761594
]

to_arrow() [source]

Get the underlying Arrow Array.

If the Series contains only a single chunk this operation is zero copy.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s = s.to_arrow()
>>> s
<pyarrow.lib.Int64Array object at ...>
[
1,
2,
3
]

to_dummies(separator: str = '_') DataFrame[source]

Get dummy/indicator variables.

Parameters:
separator

Separator/delimiter used when generating column names.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.to_dummies()
shape: (3, 3)
┌─────┬─────┬─────┐
│ a_1 ┆ a_2 ┆ a_3 │
│ --- ┆ --- ┆ --- │
│ u8  ┆ u8  ┆ u8  │
╞═════╪═════╪═════╡
│ 1   ┆ 0   ┆ 0   │
│ 0   ┆ 1   ┆ 0   │
│ 0   ┆ 0   ┆ 1   │
└─────┴─────┴─────┘

to_frame(name: = None) DataFrame[source]

Cast this Series to a DataFrame.

Parameters:
name

Optionally name/rename the Series column in the new DataFrame.

Examples

>>> s = pl.Series("a", [123, 456])
>>> df = s.to_frame()
>>> df
shape: (2, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 123 │
│ 456 │
└─────┘

>>> df = s.to_frame("xyz")
>>> df
shape: (2, 1)
┌─────┐
│ xyz │
│ --- │
│ i64 │
╞═════╡
│ 123 │
│ 456 │
└─────┘

to_list(use_pyarrow: bool = False) list[Any][source]

Convert this Series to a Python list. This operation clones data.

Parameters:
use_pyarrow

Use pyarrow for the conversion.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.to_list()
[1, 2, 3]
>>> type(s.to_list())
<class 'list'>

to_numpy(*args: Any, zero_copy_only: bool = False, writable: bool = False, use_pyarrow: bool = True) [source]

Convert this Series to numpy. This operation clones data but is completely safe.

If you want a zero-copy view and know what you are doing, use .view().

Parameters:
*args

args will be sent to pyarrow.Array.to_numpy.

zero_copy_only

If True, an exception will be raised if the conversion to a numpy array would require copying the underlying data (e.g. in presence of nulls, or for non-primitive types).

writable

For numpy arrays created with zero copy (view on the Arrow data), the resulting array is not writable (Arrow data is immutable). By setting this to True, a copy of the array is made to ensure it is writable.

use_pyarrow

Use pyarrow for the conversion to numpy.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> arr = s.to_numpy()
>>> arr
array([1, 2, 3], dtype=int64)
>>> type(arr)
<class 'numpy.ndarray'>

to_pandas(*args: Any, use_pyarrow_extension_array: bool = False, **kwargs: Any) [source]

Convert this Series to a pandas Series.

This requires that pandas and pyarrow are installed. This operation clones data, unless use_pyarrow_extension_array=True.

Parameters:
use_pyarrow_extension_array

Use a PyArrow-backed extension array instead of a NumPy array for the pandas Series. This allows zero-copy operations and preservation of null values. Further operations on this pandas Series might trigger conversion to NumPy arrays if the operation is not supported by pyarrow compute functions.

kwargs

Arguments will be sent to pyarrow.Table.to_pandas().

Examples

>>> s1 = pl.Series("a", [1, 2, 3])
>>> s1.to_pandas()
0    1
1    2
2    3
Name: a, dtype: int64
>>> s1.to_pandas(use_pyarrow_extension_array=True)
0    1
1    2
2    3
Name: a, dtype: int64[pyarrow]
>>> s2 = pl.Series("b", [1, 2, None, 4])
>>> s2.to_pandas()
0    1.0
1    2.0
2    NaN
3    4.0
Name: b, dtype: float64
>>> s2.to_pandas(use_pyarrow_extension_array=True)
0       1
1       2
2    <NA>
3       4
Name: b, dtype: int64[pyarrow]

to_physical() Series[source]

Cast to physical representation of the logical dtype.

• polars.datatypes.Date() -> polars.datatypes.Int32()

• polars.datatypes.Datetime() -> polars.datatypes.Int64()

• polars.datatypes.Time() -> polars.datatypes.Int64()

• polars.datatypes.Duration() -> polars.datatypes.Int64()

• polars.datatypes.Categorical() -> polars.datatypes.UInt32()

• Other data types will be left unchanged.

Examples

Replicating the pandas pd.Series.factorize method.

>>> s = pl.Series("values", ["a", None, "x", "a"])
>>> s.cast(pl.Categorical).to_physical()
shape: (4,)
Series: 'values' [u32]
[
0
null
1
0
]
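
The codes in this output follow the order of first appearance. A plain-Python sketch of that mapping is shown below; factorize_sketch is a hypothetical helper, and the assumption that codes are assigned in order of first appearance is taken from the example output above rather than from the polars internals.

```python
def factorize_sketch(values):
    """Map each distinct non-null value to an integer code."""
    codes = {}
    out = []
    for v in values:
        if v is None:
            # Nulls stay null instead of receiving a code.
            out.append(None)
        else:
            # A new value gets the next unused code.
            out.append(codes.setdefault(v, len(codes)))
    return out


physical = factorize_sketch(["a", None, "x", "a"])  # [0, None, 1, 0]
```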

top_k(*, k: int = 5, descending: bool = False) Series[source]

Return the k largest elements.

If descending=True, the smallest elements will be given.

This has time complexity:

$O(n + k \log n - \frac{k}{2})$

Parameters:
k

Number of elements to return.

descending

Return the smallest elements.
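
The selection semantics can be sketched with the standard heapq module. top_k_sketch is a hypothetical helper on plain lists; heapq uses a different algorithm, so its running time is not the complexity stated above.

```python
import heapq


def top_k_sketch(values, k=5, descending=False):
    """Return the k largest values, or the k smallest if descending=True."""
    if descending:
        return heapq.nsmallest(k, values)
    return heapq.nlargest(k, values)


values = [2, 5, 1, 4, 3]
largest = top_k_sketch(values, k=3)                    # [5, 4, 3]
smallest = top_k_sketch(values, k=3, descending=True)  # [1, 2, 3]
```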

unique(maintain_order: bool = False) Series[source]

Get unique elements in series.

Parameters:
maintain_order

Maintain order of data. This requires more work.

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.unique().sort()
shape: (3,)
Series: 'a' [i64]
[
1
2
3
]

unique_counts() Series[source]

Return a count of the unique values in the order of appearance.

Examples

>>> s = pl.Series("id", ["a", "b", "b", "c", "c", "c"])
>>> s.unique_counts()
shape: (3,)
Series: 'id' [u32]
[
1
2
3
]

upper_bound() Self[source]

Return the upper bound of this Series’ dtype as a unit Series.

See also

lower_bound

Return the lower bound of the given Series’ dtype.

Examples

>>> s = pl.Series("s", [-1, 0, 1], dtype=pl.Int8)
>>> s.upper_bound()
shape: (1,)
Series: 's' [i8]
[
127
]

>>> s = pl.Series("s", [1.0, 2.5, 3.0], dtype=pl.Float64)
>>> s.upper_bound()
shape: (1,)
Series: 's' [f64]
[
inf
]

value_counts(sort: bool = False) DataFrame[source]

Count the unique values in a Series.

Parameters:
sort

Ensure the output is sorted from most values to least.

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.value_counts().sort(by="a")
shape: (3, 2)
┌─────┬────────┐
│ a   ┆ counts │
│ --- ┆ ---    │
│ i64 ┆ u32    │
╞═════╪════════╡
│ 1   ┆ 1      │
│ 2   ┆ 2      │
│ 3   ┆ 1      │
└─────┴────────┘

var(ddof: int = 1) [source]

Get variance of this Series.

Parameters:
ddof

“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.var()
1.0

view(ignore_nulls: bool = False) SeriesView[source]

Get a view into this Series data with a numpy array.

This operation doesn’t clone data, but does not include missing values. Don’t use this unless you know what you are doing.

Parameters:
ignore_nulls

If True then nulls are converted to 0. If False then an Exception is raised if nulls are present.

Examples

>>> s = pl.Series("a", [1, None])
>>> s.view(ignore_nulls=True)
SeriesView([1, 0])


zip_with(mask: Series, other: Series) Series

Take values from self or other based on the given mask.

Where mask evaluates true, take values from self. Where mask evaluates false, take values from other.

Parameters:
mask

Boolean Series.

other

Series of same type.

Returns:
New Series

Examples

>>> s1 = pl.Series([1, 2, 3, 4, 5])
>>> s2 = pl.Series([5, 4, 3, 2, 1])
>>> s1.zip_with(s1 < s2, s2)
shape: (5,)
Series: '' [i64]
[
1
2
3
2
1
]
>>> mask = pl.Series([True, False, True, False, True])