polars.internals.expr.ExprStringNameSpace.extract

ExprStringNameSpace.extract(pattern: str, group_index: int = 1) polars.internals.expr.Expr

Extract the target capture group from provided patterns.

Parameters
pattern

A valid regex pattern

group_index

Index of the targeted capture group. Group 0 mean the whole pattern, first group begin at index 1 Default to the first capture group

Returns
Utf8 array. Contain null if original value is null or regex capture nothing.

Examples

>>> df = pl.DataFrame(
...     {
...         "a": [
...             "http://vote.com/ballon_dor?candidate=messi&ref=polars",
...             "http://vote.com/ballon_dor?candidat=jorginho&ref=polars",
...             "http://vote.com/ballon_dor?candidate=ronaldo&ref=polars",
...         ]
...     }
... )
>>> df.select(
...     [
...         pl.col("a").str.extract(r"candidate=(\w+)", 1),
...     ]
... )
shape: (3, 1)
┌─────────┐
│ a       │
│ ---     │
│ str     │
╞═════════╡
│ messi   │
├╌╌╌╌╌╌╌╌╌┤
│ null    │
├╌╌╌╌╌╌╌╌╌┤
│ ronaldo │
└─────────┘