Use the Aho-Corasick algorithm to replace many matches
Description
This function replaces the matches of several patterns at once.
Usage
<Expr>$str$replace_many(
  patterns,
  replace_with,
  ...,
  ascii_case_insensitive = FALSE,
  leftmost = FALSE
)
Arguments
patterns
String patterns to search for. Accepts expression input. To use the same character vector for all rows, use list(c(…)) instead of c(…) (see Examples).
replace_with
A vector of strings used as replacements. If it has length 1, it is applied to all matches. Otherwise, it must have the same length as the patterns argument.
...
These dots are for future extensions and must be empty.
ascii_case_insensitive
Enable ASCII-aware case-insensitive matching. When this option is enabled, matching ignores case for ASCII letters (a-z and A-Z) only.
leftmost
Whether to guarantee that the leftmost match is used when matches overlap. If there are multiple candidates for the leftmost match, the pattern that comes first in patterns is used.
Value
A polars expression
Examples
library("polars")
df <- pl$DataFrame(
  lyrics = c(
    "Everybody wants to rule the world",
    "Tell me what you want, what you really really want",
    "Can you feel the love tonight"
  )
)
# a replacement of length 1 is applied to all matches
df$with_columns(
  remove_pronouns = pl$col("lyrics")$str$replace_many(list(c("you", "me")), "")
)
#> shape: (3, 2)
#> ┌─────────────────────────────────┬─────────────────────────────────┐
#> │ lyrics                          ┆ remove_pronouns                 │
#> │ ---                             ┆ ---                             │
#> │ str                             ┆ str                             │
#> ╞═════════════════════════════════╪═════════════════════════════════╡
#> │ Everybody wants to rule the wo… ┆ Everybody wants to rule the wo… │
#> │ Tell me what you want, what yo… ┆ Tell  what  want, what  really… │
#> │ Can you feel the love tonight   ┆ Can  feel the love tonight      │
#> └─────────────────────────────────┴─────────────────────────────────┘
# if there is more than one replacement, patterns and replacements are
# matched by position
df$with_columns(
  fake_pronouns = pl$col("lyrics")$str$replace_many(list(c("you", "me")), c("foo", "bar"))
)
#> shape: (3, 2)
#> ┌─────────────────────────────────┬─────────────────────────────────┐
#> │ lyrics                          ┆ fake_pronouns                   │
#> │ ---                             ┆ ---                             │
#> │ str                             ┆ str                             │
#> ╞═════════════════════════════════╪═════════════════════════════════╡
#> │ Everybody wants to rule the wo… ┆ Everybody wants to rule the wo… │
#> │ Tell me what you want, what yo… ┆ Tell bar what foo want, what f… │
#> │ Can you feel the love tonight   ┆ Can foo feel the love tonight   │
#> └─────────────────────────────────┴─────────────────────────────────┘
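The examples above leave ascii_case_insensitive and leftmost at their defaults. The following is a minimal sketch of how the two options might be used, based only on the Usage and Arguments sections. The column names no_verbs and tagged, the chosen patterns, and the replacement strings are illustrative assumptions, and no output is shown because it has not been verified here.

```r
library("polars")

df <- pl$DataFrame(
  lyrics = c(
    "Everybody wants to rule the world",
    "Tell me what you want, what you really really want",
    "Can you feel the love tonight"
  )
)

# With ascii_case_insensitive = TRUE, the lowercase patterns "tell" and "can"
# are expected to also match the capitalized "Tell" and "Can".
out_ci <- df$with_columns(
  no_verbs = pl$col("lyrics")$str$replace_many(
    list(c("tell", "can")), "",
    ascii_case_insensitive = TRUE
  )
)

# "wants" and "want" overlap in the first row. With leftmost = TRUE, the
# Arguments section says the pattern listed first in patterns is used among
# the candidates for the leftmost match, so "wants" should take priority here.
out_lm <- df$with_columns(
  tagged = pl$col("lyrics")$str$replace_many(
    list(c("wants", "want")), c("<wants>", "<want>"),
    leftmost = TRUE
  )
)
```

Printing out_ci and out_lm shows the original column next to the new one, which makes it easy to compare the result with a run that uses the default settings.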