polars.Expr.str.extract_all

Expr.str.extract_all(pattern: str) → Expr

Extracts all matches for the given regex pattern.

Each successive non-overlapping regex match within an individual string is collected into a list, giving one list per input string.

Parameters:
pattern

A valid regex pattern

Returns:
List[Utf8] array. Contains null if the original value is null or the regex
captures nothing.

Examples

>>> df = pl.DataFrame({"foo": ["123 bla 45 asd", "xyz 678 910t"]})
>>> df.select(
...     [
...         pl.col("foo").str.extract_all(r"(\d+)").alias("extracted_nrs"),
...     ]
... )
shape: (2, 1)
┌────────────────┐
│ extracted_nrs  │
│ ---            │
│ list[str]      │
╞════════════════╡
│ ["123", "45"]  │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ["678", "910"] │
└────────────────┘
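
The phrase "successive non-overlapping" can be illustrated outside Polars with a rough standard-library analogy. The sketch below is not how Polars implements `extract_all`; the helper name `extract_all_sketch` is hypothetical, and it assumes Python's `re` module behaves closely enough for simple patterns like the ones shown (Polars uses its own regex engine, so syntax support may differ). For a string with no match this sketch returns an empty list, which may differ from the null described under Returns.

```python
import re


def extract_all_sketch(value, pattern):
    """Hypothetical helper: collect successive non-overlapping matches.

    Returns None for a null (None) input, otherwise a list holding the
    full text of every match, scanned left to right.
    """
    if value is None:
        return None
    # re.finditer resumes scanning after the end of the previous match,
    # so the matches it yields never overlap.
    return [m.group(0) for m in re.finditer(pattern, value)]


rows = ["123 bla 45 asd", "xyz 678 910t", None]
extracted = [extract_all_sketch(v, r"\d+") for v in rows]
print(extracted)  # [['123', '45'], ['678', '910'], None]

# Non-overlapping: "aaaa" contains "aa" at offsets 0, 1 and 2, but only
# the matches at offsets 0 and 2 are reported.
print(extract_all_sketch("aaaa", "aa"))  # ['aa', 'aa']
```

The per-row lists mirror the `list[str]` column in the example above: each input string yields its own list, and matches are reported in the order they occur.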