Check if strings in Series contain a substring that matches a pattern.
A valid regular expression pattern, compatible with the regex crate.
Optional
literal: boolean
Treat `pattern` as a literal string, not as a regular expression.
Optional
strict: boolean
Raise an error if the underlying pattern is not a valid regex; otherwise mask out with a null value.
Boolean mask
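The interaction of `literal` and `strict` can be easier to follow with a small model. The sketch below is plain TypeScript and does not call nodejs-polars; `containsModel`, `compilePattern` and `escapeRegex` are hypothetical helpers, and it assumes JavaScript `RegExp` syntax as a stand-in for the regex crate, which differs in places.

```typescript
// Hypothetical model of the documented behavior; not the library implementation.
// Assumption: JavaScript RegExp stands in for the Rust regex crate.

// Escape regex metacharacters so the pattern is matched literally.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Compile the pattern. An invalid regex throws when strict is true,
// otherwise null is returned so the caller can mask every value out.
function compilePattern(pattern: string, literal: boolean, strict: boolean): RegExp | null {
  try {
    return new RegExp(literal ? escapeRegex(pattern) : pattern);
  } catch (err) {
    if (strict) throw err;
    return null;
  }
}

function containsModel(
  values: (string | null)[],
  pattern: string,
  literal = false,
  strict = false
): (boolean | null)[] {
  const regex = compilePattern(pattern, literal, strict);
  if (regex === null) return values.map(() => null);
  // Null inputs stay null; everything else becomes true or false.
  return values.map((v) => (v === null ? null : regex.test(v)));
}

const sample = ["a.b", "axb", null];
console.log(containsModel(sample, "a.b"));       // "." matches any character
console.log(containsModel(sample, "a.b", true)); // "." matches only a dot
```

With `literal` unset, the dot acts as a wildcard and both strings match; with `literal` set to true, only the string containing an actual dot does.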
Extract the target capture group from provided patterns.
A valid regex pattern
Index of the targeted capture group. Group 0 means the whole pattern; the first group begins at index 1. Defaults to the first capture group.
Utf8 array. Contains null if the original value is null or the regex captures nothing.
> df = pl.DataFrame({
... 'a': [
... 'http://vote.com/ballon_dor?candidate=messi&ref=polars',
... 'http://vote.com/ballon_dor?candidat=jorginho&ref=polars',
... 'http://vote.com/ballon_dor?candidate=ronaldo&ref=polars'
... ]})
> df.getColumn("a").str.extract(/candidate=(\w+)/, 1)
shape: (3, 1)
┌─────────┐
│ a       │
│ ---     │
│ str     │
╞═════════╡
│ messi   │
├╌╌╌╌╌╌╌╌╌┤
│ null    │
├╌╌╌╌╌╌╌╌╌┤
│ ronaldo │
└─────────┘
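The group indexing rule (0 for the whole match, 1 for the first capture group) can be illustrated without the library. The sketch below is a plain TypeScript model using the same URLs as the example above; `extractModel` is a hypothetical helper, and JavaScript `RegExp` is assumed as a stand-in for the regex engine the library actually uses.

```typescript
// Hypothetical model of capture-group extraction; not the library implementation.
function extractModel(
  values: (string | null)[],
  pattern: string | RegExp,
  groupIndex = 1
): (string | null)[] {
  // Rebuild the regex from its source so a global flag cannot carry state between calls.
  const regex = new RegExp(pattern instanceof RegExp ? pattern.source : pattern);
  return values.map((v) => {
    if (v === null) return null;
    const match = regex.exec(v);
    const captured = match ? match[groupIndex] : undefined;
    // No match, or a group that did not participate, yields null.
    return captured === undefined ? null : captured;
  });
}

const urls = [
  "http://vote.com/ballon_dor?candidate=messi&ref=polars",
  "http://vote.com/ballon_dor?candidat=jorginho&ref=polars",
  "http://vote.com/ballon_dor?candidate=ronaldo&ref=polars",
];

console.log(extractModel(urls, /candidate=(\w+)/, 1)); // first capture group
console.log(extractModel(urls, /candidate=(\w+)/, 0)); // whole match
```

The second URL is spelled `candidat=`, so the pattern does not match and the model returns null for it, mirroring the null row in the table above.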
Extract the first match of a JSON string in the Series with the provided JSONPath expression. Throws an error if an invalid JSON string is encountered. All return values are cast to Utf8 regardless of the original value.
A valid JSON path query string
Utf8 array. Contains null if the original value is null or the JSONPath query returns nothing.
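The three rules above (throw on invalid JSON, cast every result to a string, null when nothing matches) can be modeled in a few lines. This is only a sketch: `jsonPathMatchModel` is a hypothetical helper, it supports just simple dotted paths such as `$.a.b` and not full JSONPath, and its exact string formatting of non-string values is an assumption and may differ from the library.

```typescript
// Hypothetical model; supports only dotted paths like "$.a.b", not full JSONPath.
function jsonPathMatchModel(values: (string | null)[], jsonPath: string): (string | null)[] {
  if (!/^\$(\.[A-Za-z_][A-Za-z0-9_]*)*$/.test(jsonPath)) {
    throw new Error("unsupported path in this sketch: " + jsonPath);
  }
  const keys = jsonPath.split(".").slice(1);
  return values.map((v) => {
    if (v === null) return null;
    // JSON.parse throws a SyntaxError on invalid JSON, mirroring the documented error.
    let node: unknown = JSON.parse(v);
    for (const key of keys) {
      if (node === null || typeof node !== "object" || Array.isArray(node) || !(key in node)) {
        return null; // the path matched nothing
      }
      node = (node as Record<string, unknown>)[key];
    }
    if (node === null || node === undefined) return null;
    // Assumption: strings are returned as-is, other values via JSON.stringify.
    return typeof node === "string" ? node : JSON.stringify(node);
  });
}

console.log(jsonPathMatchModel(['{"a":"1"}', '{"a":2}', '{"b":3}', null], "$.a"));
```

In this model a numeric `2` comes back as the string `"2"`, and a document without the key `a` yields null.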
Parse a Series of dtype Utf8 to a Date/Datetime Series.
Date or Datetime.
Calendar date and time type
Optional
timeUnit: TimeUnit | "ms" | "ns" | "us"
Any of 'ms' | 'ns' | 'us'.
Time zone string as defined by Intl.DateTimeFormat, for example America/New_York.
Optional
fmt: string
Formatting syntax.
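Because the time zone is described as a string accepted by Intl.DateTimeFormat, a candidate value can be checked with the standard `Intl` API before it is passed on. The sketch below is plain TypeScript; `isValidTimeZone` is a hypothetical helper and not part of the library.

```typescript
// Hypothetical helper: reports whether the runtime accepts a time zone name.
// Intl.DateTimeFormat throws a RangeError for an unknown timeZone value.
function isValidTimeZone(timeZone: string): boolean {
  try {
    new Intl.DateTimeFormat("en-US", { timeZone: timeZone });
    return true;
  } catch (_err) {
    return false;
  }
}

console.log(isValidTimeZone("America/New_York"));
console.log(isValidTimeZone("Mars/Olympus_Mons"));
```

The result depends on the time zone data shipped with the JavaScript runtime, so very recent zone names may be rejected by older runtimes.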
String functions for Series