Check if strings in Series contain a substring that matches a pattern.
A valid regular expression pattern, compatible with the regex crate.
Optional
literal: boolean
Treat `pattern` as a literal string, not as a regular expression.
Optional
strict: boolean
Raise an error if the underlying pattern is not a valid regex; otherwise mask out with a null value.
Boolean mask
> const df = pl.DataFrame({"txt": ["Crab", "cat and dog", "rab$bit", null]})
> df.select(
... pl.col("txt"),
... pl.col("txt").str.contains("cat|bit").alias("regex"),
... pl.col("txt").str.contains("rab$", true).alias("literal"),
... )
shape: (4, 3)
┌─────────────┬───────┬─────────┐
│ txt         ┆ regex ┆ literal │
│ ---         ┆ ---   ┆ ---     │
│ str         ┆ bool  ┆ bool    │
╞═════════════╪═══════╪═════════╡
│ Crab        ┆ false ┆ false   │
│ cat and dog ┆ true  ┆ false   │
│ rab$bit     ┆ true  ┆ true    │
│ null        ┆ null  ┆ null    │
└─────────────┴───────┴─────────┘
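The difference between regex and literal matching in the table above can be reproduced without the library. The following is a minimal sketch in plain TypeScript; `containsSketch` is a hypothetical helper, not part of nodejs-polars, and it uses JavaScript's `RegExp`, whose syntax differs in places from the Rust regex crate.

```typescript
// Plain-TypeScript emulation of the semantics shown above; it does not call nodejs-polars.
// `containsSketch` is a hypothetical name used only for illustration.
function containsSketch(
  values: (string | null)[],
  pattern: string,
  literal: boolean = false,
): (boolean | null)[] {
  // In regex mode the pattern is compiled once; in literal mode it is never compiled.
  const re = literal ? null : new RegExp(pattern);
  return values.map((v) => {
    if (v === null) return null; // null in, null out
    return re === null ? v.includes(pattern) : re.test(v);
  });
}

const txt: (string | null)[] = ["Crab", "cat and dog", "rab$bit", null];

const regexMask = containsSketch(txt, "cat|bit");
const literalMask = containsSketch(txt, "rab$", true);
// The same pattern read as a regex: `$` anchors to the end of the string.
const anchoredMask = containsSketch(txt, "rab$");

console.log(regexMask); // [ false, true, true, null ]
console.log(literalMask); // [ false, false, true, null ]
console.log(anchoredMask); // [ true, false, false, null ]
```

The last call shows why the `literal` flag matters: read as a regex, `rab$` matches "Crab" and not "rab$bit", which is the opposite of the literal result.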
Extract the target capture group from provided patterns.
Index of the targeted capture group. Group 0 means the whole pattern; the first group begins at index 1. Defaults to the first capture group.
Utf8 array. Contains null if the original value is null or the regex captures nothing.
> df = pl.DataFrame({
... 'a': [
... 'http://vote.com/ballon_dor?candidate=messi&ref=polars',
... 'http://vote.com/ballon_dor?candidat=jorginho&ref=polars',
... 'http://vote.com/ballon_dor?candidate=ronaldo&ref=polars'
... ]})
> df.select(pl.col('a').str.extract(/candidate=(\w+)/, 1))
shape: (3, 1)
┌─────────┐
│ a       │
│ ---     │
│ str     │
╞═════════╡
│ messi   │
├╌╌╌╌╌╌╌╌╌┤
│ null    │
├╌╌╌╌╌╌╌╌╌┤
│ ronaldo │
└─────────┘
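To make the group-index convention concrete, here is a minimal sketch in plain TypeScript that mirrors the example above. `extractSketch` is a hypothetical helper, not the library's implementation, and it assumes a non-global `RegExp` so that no match state is carried between rows.

```typescript
// Plain-TypeScript emulation of capture-group extraction; it does not call nodejs-polars.
// `extractSketch` is a hypothetical name used only for illustration.
function extractSketch(
  values: (string | null)[],
  pattern: RegExp, // assumed to be created without the `g` flag
  groupIndex: number,
): (string | null)[] {
  return values.map((v) => {
    if (v === null) return null;
    const m = pattern.exec(v);
    // No match, or the requested group did not take part in the match: null.
    return m?.[groupIndex] ?? null;
  });
}

const urls: (string | null)[] = [
  "http://vote.com/ballon_dor?candidate=messi&ref=polars",
  "http://vote.com/ballon_dor?candidat=jorginho&ref=polars",
  "http://vote.com/ballon_dor?candidate=ronaldo&ref=polars",
];

// Group 1 is the first parenthesised group.
console.log(extractSketch(urls, /candidate=(\w+)/, 1)); // [ 'messi', null, 'ronaldo' ]
// Group 0 is the whole match.
console.log(extractSketch(urls, /candidate=(\w+)/, 0)); // [ 'candidate=messi', null, 'candidate=ronaldo' ]
```

The second URL yields null because it spells the key `candidat`, so the pattern never matches.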
Parse string values as JSON. Throws an error if it encounters an invalid JSON string.
Optional
dtype: DataType
Optional
inferSchemaLength: number
DF with struct
>>> df = pl.DataFrame( {json: ['{"a":1, "b": true}', null, '{"a":2, "b": false}']} )
>>> df.select(pl.col("json").str.jsonDecode())
shape: (3, 1)
┌─────────────┐
│ json        │
│ ---         │
│ struct[2]   │
╞═════════════╡
│ {1,true}    │
│ {null,null} │
│ {2,false}   │
└─────────────┘
See Also
----------
jsonPathMatch : Extract the first match of json string with provided JSONPath expression.
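The decode-or-throw behaviour described for `jsonDecode` can be illustrated with the built-in `JSON.parse`. This is a rough sketch in plain TypeScript, not the library's implementation: `jsonDecodeSketch` and `JsonValue` are hypothetical names, no schema inference is attempted, and a null row stays a plain null rather than becoming a struct of nulls as in the output above.

```typescript
// Plain-TypeScript emulation of row-wise JSON decoding; it does not call nodejs-polars.
// `JsonValue` and `jsonDecodeSketch` are hypothetical names used only for illustration.
type JsonValue =
  | null
  | boolean
  | number
  | string
  | JsonValue[]
  | { [key: string]: JsonValue };

function jsonDecodeSketch(values: (string | null)[]): JsonValue[] {
  // JSON.parse throws a SyntaxError on invalid input, so one bad row aborts the whole call.
  return values.map((v) => (v === null ? null : (JSON.parse(v) as JsonValue)));
}

const decoded = jsonDecodeSketch(['{"a":1, "b": true}', null, '{"a":2, "b": false}']);
console.log(decoded); // [ { a: 1, b: true }, null, { a: 2, b: false } ]
```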
Extract the first match of a JSON string with the provided JSONPath expression. Throws an error if it encounters an invalid JSON string. All return values are cast to Utf8 regardless of the original value.
Utf8 array. Contains null if the original value is null or the jsonPath returns nothing.
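The "first match, cast to a string, null when nothing is found" contract can be sketched for the simplest kind of path. The helper below is a hypothetical illustration in plain TypeScript, not the library's implementation: it only handles dotted paths such as `$.a.b`, and rendering non-string values with `JSON.stringify` is an assumption about what "cast to Utf8" looks like.

```typescript
// Plain-TypeScript emulation for dotted paths only; it does not call nodejs-polars.
// `jsonPathMatchSketch` is a hypothetical name used only for illustration.
function jsonPathMatchSketch(
  values: (string | null)[],
  path: string,
): (string | null)[] {
  // "$.a.b" -> ["a", "b"]. Real JSONPath (wildcards, filters, slices) is not handled.
  const keys = path.replace(/^\$\.?/, "").split(".").filter((k) => k.length > 0);
  return values.map((v) => {
    if (v === null) return null;
    let current: unknown = JSON.parse(v); // throws on invalid JSON
    for (const key of keys) {
      if (typeof current !== "object" || current === null || !(key in current)) {
        return null; // the path selects nothing in this row
      }
      current = (current as Record<string, unknown>)[key];
    }
    if (current === null || current === undefined) return null;
    // Assumption: strings are returned as-is, everything else as its JSON text.
    return typeof current === "string" ? current : JSON.stringify(current);
  });
}

const rows: (string | null)[] = ['{"a":1, "b": true}', null, '{"a":"x"}', '{"b":2}'];
console.log(jsonPathMatchSketch(rows, "$.a")); // [ '1', null, 'x', null ]
```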
Parse a Series of dtype Utf8 to a Date/Datetime Series.
Date or Datetime.
Calendar date and time type
Optional
timeUnit: TimeUnit | "ms" | "ns" | "us"
Any of 'ms' | 'ns' | 'us'.
Time zone string as defined by Intl.DateTimeFormat, for example America/New_York.
Optional
fmt: string
Formatting syntax.
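Since the time zone is described as a string accepted by `Intl.DateTimeFormat`, a candidate value can be checked with that API before it is used. This is a small sketch; `isValidTimeZone` is a hypothetical helper, and whether the library performs the same check is not stated here.

```typescript
// `isValidTimeZone` is a hypothetical helper; it relies only on the built-in Intl API.
function isValidTimeZone(timeZone: string): boolean {
  try {
    // The constructor throws a RangeError for an unknown time zone identifier.
    new Intl.DateTimeFormat("en-US", { timeZone });
    return true;
  } catch (err) {
    if (err instanceof RangeError) return false;
    throw err; // anything else is unexpected, so let it surface
  }
}

console.log(isValidTimeZone("America/New_York")); // true
console.log(isValidTimeZone("Mars/Olympus_Mons")); // false
```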
namespace containing expr string functions