polars.Series.str.extract
- Series.str.extract(pattern: str, group_index: int = 1) → Series
Extract the target capture group from the provided pattern.
- Parameters:
- pattern
A valid regex pattern.
- group_index
Index of the targeted capture group. Group 0 means the whole pattern; the first capture group begins at index 1. Defaults to the first capture group.
- Returns:
- Utf8 array. Contains null if the original value is null or the regex captures nothing.
Examples
>>> df = pl.DataFrame(
...     {
...         "a": [
...             "http://vote.com/ballon_dor?candidate=messi&ref=polars",
...             "http://vote.com/ballon_dor?candidat=jorginho&ref=polars",
...             "http://vote.com/ballon_dor?candidate=ronaldo&ref=polars",
...         ]
...     }
... )
>>> df.select([pl.col("a").str.extract(r"candidate=(\w+)", 1)])
shape: (3, 1)
┌─────────┐
│ a       │
│ ---     │
│ str     │
╞═════════╡
│ messi   │
│ null    │
│ ronaldo │
└─────────┘