Replace all matching regex/literal substrings with a new string value
Description
Replace all matching regex/literal substrings with a new string value
Usage
<Expr>$str$replace_all(pattern, value, ..., literal = FALSE)
Arguments
pattern
A character value, or something that can be coerced to a string Expr, containing a valid regex pattern compatible with the regex crate.
value
A character value or a string Expr that will replace the matched substring.
...
Ignored.
literal
Logical. If TRUE, treat pattern as a literal string, not as a regular expression. Defaults to FALSE.
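As a rough illustration of the literal argument, the sketch below replaces "." in both modes. The column name and values are made up for this example. In regex mode "." matches any character, so every character is replaced; with literal = TRUE only the actual dots are.

```r
library("polars")

# Hypothetical data, for illustration only.
df = pl$DataFrame(version = c("1.2.3", "a.b"))

dots = df$with_columns(
  # Regex mode (the default): "." matches any character.
  as_regex = pl$col("version")$str$replace_all(".", "-"),
  # Literal mode: only actual "." characters are matched.
  as_literal = pl$col("version")$str$replace_all(".", "-", literal = TRUE)
)
dots
```

Note that literal sits after ... in the signature, so it has to be passed by name.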
Details
To modify regular expression behaviour (such as case-sensitivity) with flags, use the inline (?iLmsuxU) syntax. See the regex crate’s section on grouping and flags for additional information about the use of inline expression modifiers.
Value
Expr of String type
Capture groups
The dollar sign ($) is a special character related to capture groups. To refer to a literal dollar sign, use $$ instead or set literal to TRUE.
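Both ways of producing a literal dollar sign can be sketched as follows. The data and column names are hypothetical, and the second call assumes, as the paragraph above states, that literal = TRUE also causes the value to be inserted as-is.

```r
library("polars")

# Hypothetical data, for illustration only.
df = pl$DataFrame(price = c("10 USD", "25 USD"))

out = df$with_columns(
  # `$$` emits one literal dollar sign; `${1}` is the captured amount.
  escaped = pl$col("price")$str$replace_all("(\\d+) USD", "$$${1}"),
  # With literal = TRUE, "$" in the value is not treated as a group reference.
  plain = pl$col("price")$str$replace_all("USD", "$", literal = TRUE)
)
out
```

The `$$` form is the one to use when the replacement needs both a capture group and a literal dollar sign, since literal = TRUE disables capture groups altogether.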
See Also
- $str$replace()
Examples
library("polars")
df = pl$DataFrame(id = 1L:2L, text = c("abcabc", "123a123"))
df$with_columns(pl$col("text")$str$replace_all("a", "-"))
#> shape: (2, 2)
#> ┌─────┬─────────┐
#> │ id  ┆ text    │
#> │ --- ┆ ---     │
#> │ i32 ┆ str     │
#> ╞═════╪═════════╡
#> │ 1   ┆ -bc-bc  │
#> │ 2   ┆ 123-123 │
#> └─────┴─────────┘
# Capture groups are supported.
# Use `${1}` in the value string to refer to the first capture group in the pattern,
# `${2}` to refer to the second capture group, and so on.
# You can also use named capture groups.
df = pl$DataFrame(word = c("hat", "hut"))
df$with_columns(
  positional = pl$col("word")$str$replace_all("h(.)t", "b${1}d"),
  named = pl$col("word")$str$replace_all("h(?<vowel>.)t", "b${vowel}d")
)
#> shape: (2, 3)
#> ┌──────┬────────────┬───────┐
#> │ word ┆ positional ┆ named │
#> │ ---  ┆ ---        ┆ ---   │
#> │ str  ┆ str        ┆ str   │
#> ╞══════╪════════════╪═══════╡
#> │ hat  ┆ bad        ┆ bad   │
#> │ hut  ┆ bud        ┆ bud   │
#> └──────┴────────────┴───────┘
# Apply case-insensitive string replacement using the `(?i)` flag.
df = pl$DataFrame(
  city = "Philadelphia",
  season = c("Spring", "Summer", "Autumn", "Winter"),
  weather = c("Rainy", "Sunny", "Cloudy", "Snowy")
)
df$with_columns(
  pl$col("weather")$str$replace_all(
    "(?i)foggy|rainy|cloudy|snowy", "Sunny"
  )
)
#> shape: (4, 3)
#> ┌──────────────┬────────┬─────────┐
#> │ city         ┆ season ┆ weather │
#> │ ---          ┆ ---    ┆ ---     │
#> │ str          ┆ str    ┆ str     │
#> ╞══════════════╪════════╪═════════╡
#> │ Philadelphia ┆ Spring ┆ Sunny   │
#> │ Philadelphia ┆ Summer ┆ Sunny   │
#> │ Philadelphia ┆ Autumn ┆ Sunny   │
#> │ Philadelphia ┆ Winter ┆ Sunny   │
#> └──────────────┴────────┴─────────┘