Interface StringFunctions<T>

interface StringFunctions<T> {
    concat(delimiter: string, ignoreNulls?: boolean): T;
    contains(
        pat: string | RegExp | pl.Expr,
        literal: boolean,
        strict: boolean,
    ): T;
    decode(encoding: "base64" | "hex", strict?: boolean): T;
    decode(options: { encoding: "base64" | "hex"; strict?: boolean }): T;
    encode(encoding: "base64" | "hex"): T;
    extract(pat: string | RegExp, groupIndex: number): T;
    jsonPathMatch(pat: string): T;
    lengths(): T;
    lstrip(): T;
    replace(pat: string | RegExp, val: string): T;
    replaceAll(pat: string | RegExp, val: string): T;
    rstrip(): T;
    slice(start: number, length?: number): T;
    split(by: string, options?: boolean | { inclusive?: boolean }): T;
    strip(): T;
    strptime(
        datatype:
            | Date
            | Datetime
            | (
                (
                    timeUnit?: TimeUnit | "ms" | "ns" | "us",
                    timeZone?: string | null | undefined,
                ) => Datetime
            ),
        fmt?: string,
    ): T;
    toLowerCase(): T;
    toUpperCase(): T;
}

Type Parameters

- T

Hierarchy

StringFunctions
- StringNamespace
- StringSeries

Index

Methods

concat contains decode encode extract jsonPathMatch lengths lstrip replace replaceAll rstrip slice split strip strptime toLowerCase toUpperCase

Methods

concat

concat(delimiter: string, ignoreNulls?: boolean): T

Vertically concatenate the values in the Series into a single string value.

Parameters

delimiter: string
Optional ignoreNulls: boolean

Returns T

Example

> df = pl.DataFrame({"foo": [1, null, 2]})
> df = df.select(pl.col("foo").str.concat("-"))
> df
shape: (1, 1)
┌──────────┐
│ foo      │
│ ---      │
│ str      │
╞══════════╡
│ 1-null-2 │
└──────────┘

contains

contains(pat: string | RegExp | pl.Expr, literal: boolean, strict: boolean): T

Check if strings in Series contain a substring that matches a pattern.

Parameters

pat: string | RegExp | pl.Expr
A valid regular expression pattern, compatible with the regex crate.
literal: boolean
Treat `pattern` as a literal string, not as a regular expression.
strict: boolean
Raise an error if the underlying pattern is not a valid regex; otherwise mask out with a null value.

Returns T

Boolean mask

Example

const df = pl.DataFrame({"txt": ["Crab", "cat and dog", "rab$bit", null]})
df.select(
...     pl.col("txt"),
...     pl.col("txt").str.contains("cat|bit").alias("regex"),
...     pl.col("txt").str.contains("rab$", true).alias("literal"),
... )
shape: (4, 3)
┌─────────────┬───────┬─────────┐
│ txt         ┆ regex ┆ literal │
│ ---         ┆ ---   ┆ ---     │
│ str         ┆ bool  ┆ bool    │
╞═════════════╪═══════╪═════════╡
│ Crab        ┆ false ┆ false   │
│ cat and dog ┆ true  ┆ false   │
│ rab$bit     ┆ true  ┆ true    │
│ null        ┆ null  ┆ null    │
└─────────────┴───────┴─────────┘
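
The difference between regex and literal matching in the example above can be modelled row by row in plain TypeScript. This is only a sketch under stated assumptions: `containsModel` is a hypothetical helper, not part of the library; it uses JavaScript's `RegExp` rather than the Rust regex crate (the two dialects differ for some patterns); and it leaves out the `strict` flag.

```typescript
// Hypothetical row-wise model of `contains`; not the library's implementation.
function containsModel(
  value: string | null,
  pat: string,
  literal = false,
): boolean | null {
  // Null inputs stay null, as in the output table above.
  if (value === null) return null;
  // Literal mode is a plain substring test; otherwise `pat` is compiled as a regex.
  return literal ? value.indexOf(pat) !== -1 : new RegExp(pat).test(value);
}

const txt = ["Crab", "cat and dog", "rab$bit", null];
const regexCol = txt.map((v) => containsModel(v, "cat|bit"));
const literalCol = txt.map((v) => containsModel(v, "rab$", true));
console.log(regexCol);   // → [ false, true, true, null ]
console.log(literalCol); // → [ false, false, true, null ]
```

Read as a regex, `rab$` means "rab at the end of the string" and would match "Crab"; with `literal` set, only the text "rab$" itself matches.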

decode

decode(encoding: "base64" | "hex", strict?: boolean): T

Decodes a value using the provided encoding

Parameters

encoding: "base64" | "hex"
hex | base64

Optional strict: boolean

How to handle invalid inputs:

- true: the method throws an error if it is unable to decode a value
- false: values that cannot be decoded are replaced with `null`

Returns T

Example

> df = pl.DataFrame({"strings": ["666f6f", "626172", null]})
> df.select(pl.col("strings").str.decode("hex"))
shape: (3, 1)
┌─────────┐
│ strings │
│ ---     │
│ str     │
╞═════════╡
│ foo     │
├╌╌╌╌╌╌╌╌╌┤
│ bar     │
├╌╌╌╌╌╌╌╌╌┤
│ null    │
└─────────┘

decode(options: { encoding: "base64" | "hex"; strict?: boolean }): T
Parameters
- options: { encoding: "base64" | "hex"; strict?: boolean }
Returns T
- Defined in polars/shared_traits.ts:1168
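
The effect of `strict` described above can be modelled per value in plain TypeScript. This is a minimal sketch, not the library's code: `decodeHexModel` is a hypothetical helper, it covers only the "hex" encoding, and it assumes ASCII payloads (one byte per character).

```typescript
// Hypothetical row-wise model of `decode("hex", strict)`; ASCII-only.
function decodeHexModel(value: string | null, strict = false): string | null {
  if (value === null) return null;
  // Valid hex has an even number of digits, all in [0-9a-fA-F].
  const valid = value.length % 2 === 0 && /^[0-9a-fA-F]*$/.test(value);
  if (!valid) {
    // strict: fail loudly; non-strict: replace the value with null.
    if (strict) throw new Error(`unable to decode "${value}" as hex`);
    return null;
  }
  let out = "";
  for (let i = 0; i < value.length; i += 2) {
    out += String.fromCharCode(parseInt(value.slice(i, i + 2), 16));
  }
  return out;
}

const decoded = ["666f6f", "626172", null, "xyz"].map((v) => decodeHexModel(v));
console.log(decoded); // → [ 'foo', 'bar', null, null ]
```

With `strict` set to `true`, the same call on "xyz" throws instead of yielding `null`.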

encode

encode(encoding: "base64" | "hex"): T

Encodes a value using the provided encoding

Parameters

encoding: "base64" | "hex"
hex | base64

Returns T

Example

> df = pl.DataFrame({"strings": ["foo", "bar", null]})
> df.select(pl.col("strings").str.encode("hex"))
shape: (3, 1)
┌─────────┐
│ strings │
│ ---     │
│ str     │
╞═════════╡
│ 666f6f  │
├╌╌╌╌╌╌╌╌╌┤
│ 626172  │
├╌╌╌╌╌╌╌╌╌┤
│ null    │
└─────────┘
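
The hex column in the output above can be reproduced value by value in plain TypeScript. This is a sketch only: `encodeHexModel` is a hypothetical helper, not the library's implementation, and it assumes ASCII input so that each character maps to exactly one byte.

```typescript
// Hypothetical row-wise model of `encode("hex")`; ASCII-only.
function encodeHexModel(value: string | null): string | null {
  if (value === null) return null;
  let out = "";
  for (let i = 0; i < value.length; i++) {
    const code = value.charCodeAt(i);
    // Each byte becomes two lowercase hex digits, zero-padded.
    out += (code < 16 ? "0" : "") + code.toString(16);
  }
  return out;
}

const encoded = ["foo", "bar", null].map((v) => encodeHexModel(v));
console.log(encoded); // → [ '666f6f', '626172', null ]
```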

extract

extract(pat: string | RegExp, groupIndex: number): T

Extract the target capture group from provided patterns.

Parameters

pat: string | RegExp
A valid regex pattern
groupIndex: number
Index of the targeted capture group. Group 0 means the whole pattern; the first group begins at index 1. Defaults to the first capture group.

Returns T

Utf8 array. Contains null if the original value is null or the regex captures nothing.

Example

>  df = pl.DataFrame({
...   'a': [
...       'http://vote.com/ballon_dor?candidate=messi&ref=polars',
...       'http://vote.com/ballon_dor?candidat=jorginho&ref=polars',
...       'http://vote.com/ballon_dor?candidate=ronaldo&ref=polars'
...   ]})
>  df.select(pl.col('a').str.extract(/candidate=(\w+)/, 1))
shape: (3, 1)
┌─────────┐
│ a       │
│ ---     │
│ str     │
╞═════════╡
│ messi   │
├╌╌╌╌╌╌╌╌╌┤
│ null    │
├╌╌╌╌╌╌╌╌╌┤
│ ronaldo │
└─────────┘
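
The capture-group indexing described above can be modelled per value in plain TypeScript. This is a minimal sketch, not the library's implementation: `extractModel` is a hypothetical helper and it relies on JavaScript's `RegExp`, whose syntax is not identical to the Rust regex crate.

```typescript
// Hypothetical row-wise model of `extract`.
function extractModel(
  value: string | null,
  pat: string | RegExp,
  groupIndex: number,
): string | null {
  if (value === null) return null;
  // Copying the pattern resets any `lastIndex` state on a reused RegExp.
  const match = new RegExp(pat).exec(value);
  // No match, or a group that did not participate, both yield null.
  return match?.[groupIndex] ?? null;
}

const urls = [
  "http://vote.com/ballon_dor?candidate=messi&ref=polars",
  "http://vote.com/ballon_dor?candidat=jorginho&ref=polars",
  "http://vote.com/ballon_dor?candidate=ronaldo&ref=polars",
];
const candidates = urls.map((u) => extractModel(u, /candidate=(\w+)/, 1));
console.log(candidates); // → [ 'messi', null, 'ronaldo' ]
console.log(extractModel(urls[0], /candidate=(\w+)/, 0)); // → candidate=messi
```

The second URL yields `null` because "candidat=" does not match the pattern, which is the same result the table above shows.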

jsonPathMatch

jsonPathMatch(pat: string): T
Extract the first match of a JSON string with the provided JSONPath expression. Throws an error if it encounters an invalid JSON string. All return values are cast to Utf8 regardless of the original value.
Parameters
- pat: string
  A valid JSON path query string
Returns T
Utf8 array. Contains null if the original value is null or the JSONPath returns nothing.
See
https://goessner.net/articles/JsonPath/
Example
```
> df = pl.DataFrame({
...   'json_val': [
...     '{"a":"1"}',
...     null,
...     '{"a":2}',
...     '{"a":2.1}',
...     '{"a":true}'
...   ]
... })
> df.select(pl.col('json_val').str.jsonPathMatch('$.a'))
shape: (5,)
Series: 'json_val' [str]
[
    "1"
    null
    "2"
    "2.1"
    "true"
]
```
- Defined in polars/shared_traits.ts:1252
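
The cast-to-string behaviour in the example above can be modelled in plain TypeScript for the simplest kind of path. This is a deliberately restricted sketch, not the library's implementation: `jsonPathMatchModel` is a hypothetical helper that accepts only paths of the form `$.key`, and its handling of a JSON `null` value is an assumption.

```typescript
// Hypothetical row-wise model of `jsonPathMatch`; supports only `$.key` paths.
function jsonPathMatchModel(value: string | null, pat: string): string | null {
  if (value === null) return null;
  const key = /^\$\.(\w+)$/.exec(pat)?.[1];
  if (key === undefined) throw new Error(`path not supported by this sketch: ${pat}`);
  // JSON.parse throws on invalid JSON, mirroring the documented error case.
  const parsed: unknown = JSON.parse(value);
  if (typeof parsed !== "object" || parsed === null) return null;
  const found = (parsed as Record<string, unknown>)[key];
  // Missing key yields null; treating JSON null the same way is an assumption.
  if (found === undefined || found === null) return null;
  // Strings pass through; every other JSON value is rendered as text.
  return typeof found === "string" ? found : JSON.stringify(found);
}

const jsonVal = ['{"a":"1"}', null, '{"a":2}', '{"a":2.1}', '{"a":true}'];
const matched = jsonVal.map((v) => jsonPathMatchModel(v, "$.a"));
console.log(matched); // → [ '1', null, '2', '2.1', 'true' ]
```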

lengths

lengths(): T
Get length of the string values in the Series.

Returns T
- Defined in polars/shared_traits.ts:1254

lstrip

lstrip(): T
Remove leading whitespace.

Returns T
- Defined in polars/shared_traits.ts:1256

replace

replace(pat: string | RegExp, val: string): T
Replace first regex match with a string value.
Parameters
- pat: string | RegExp
- val: string
Returns T
- Defined in polars/shared_traits.ts:1258

replaceAll

replaceAll(pat: string | RegExp, val: string): T
Replace all regex matches with a string value.
Parameters
- pat: string | RegExp
- val: string
Returns T
- Defined in polars/shared_traits.ts:1260
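
The only difference between `replace` and `replaceAll` is how many matches are rewritten, which a plain-TypeScript model makes concrete. This is a sketch under assumptions: `replaceModel` and `replaceAllModel` are hypothetical helpers, not the library's implementation; they use JavaScript's `RegExp` and ignore any flags set on a `RegExp` argument.

```typescript
// Hypothetical row-wise models of `replace` and `replaceAll`.
function patternSource(pat: string | RegExp): string {
  return typeof pat === "string" ? pat : pat.source;
}

function replaceModel(value: string | null, pat: string | RegExp, val: string): string | null {
  if (value === null) return null;
  // Without the global flag, only the first match is replaced.
  return value.replace(new RegExp(patternSource(pat)), val);
}

function replaceAllModel(value: string | null, pat: string | RegExp, val: string): string | null {
  if (value === null) return null;
  // The global flag rewrites every non-overlapping match.
  return value.replace(new RegExp(patternSource(pat), "g"), val);
}

console.log(replaceModel("a1b22", /\d/, "X"));    // → aXb22
console.log(replaceAllModel("a1b22", /\d/, "X")); // → aXbXX
```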

rstrip

rstrip(): T
Remove trailing whitespace.

Returns T
- Defined in polars/shared_traits.ts:1266

slice

slice(start: number, length?: number): T
Create subslices of the string values of a Utf8 Series.
Parameters
- start: number
  Start of the slice (negative indexing may be used).
- Optional length: number
  Optional length of the slice.
Returns T
- Defined in polars/shared_traits.ts:1272
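
How a negative `start` combines with an optional `length` can be modelled per value in plain TypeScript. This is a minimal sketch, not the library's implementation: `sliceModel` is a hypothetical helper, it assumes ASCII input (JavaScript indexes UTF-16 code units), and clamping out-of-range offsets to the string bounds is an assumption.

```typescript
// Hypothetical row-wise model of `slice(start, length?)`.
function sliceModel(value: string | null, start: number, length?: number): string | null {
  if (value === null) return null;
  // A negative start counts back from the end of the string.
  const begin = start < 0 ? Math.max(value.length + start, 0) : Math.min(start, value.length);
  // Without a length, the slice runs to the end of the string.
  const end = length === undefined ? value.length : Math.min(begin + length, value.length);
  return value.slice(begin, end);
}

console.log(sliceModel("pear", -3));    // → ear
console.log(sliceModel("pear", 1, 2));  // → ea
console.log(sliceModel("pear", -3, 2)); // → ea
```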

split

split(by: string, options?: boolean | { inclusive?: boolean }): T
Split a string into substrings using the specified separator and return them as a Series.
Parameters
- by: string
  A string that identifies the character or characters to use in separating the string.
- Optional options: boolean | { inclusive?: boolean }
  - boolean
  - { inclusive?: boolean }
    Optional inclusive?: boolean
    Include the split character/string in the results
Returns T
- Defined in polars/shared_traits.ts:1278
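
The effect of `inclusive` can be modelled per value in plain TypeScript. This is a sketch under assumptions: `splitModel` is a hypothetical helper, not the library's implementation; it takes the flag as a plain boolean; it expects a non-empty separator; and dropping a trailing empty piece in inclusive mode follows Rust's `split_inclusive`, which is assumed rather than confirmed here.

```typescript
// Hypothetical row-wise model of `split(by, { inclusive })`.
function splitModel(value: string | null, by: string, inclusive = false): string[] | null {
  if (value === null) return null;
  const parts = value.split(by);
  if (!inclusive) return parts;
  // Re-attach the separator to every piece except the last one.
  const kept = parts.map((part, i) => (i < parts.length - 1 ? part + by : part));
  // Assumption: a trailing empty piece is dropped, as Rust's split_inclusive does.
  if (kept[kept.length - 1] === "") kept.pop();
  return kept;
}

console.log(splitModel("a_b_c", "_"));       // → [ 'a', 'b', 'c' ]
console.log(splitModel("a_b_c", "_", true)); // → [ 'a_', 'b_', 'c' ]
```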

strip

strip(): T
Remove leading and trailing whitespace.

Returns T
- Defined in polars/shared_traits.ts:1280

strptime

strptime(
    datatype:
        | Date
        | Datetime
        | (
            (
                timeUnit?: TimeUnit | "ms" | "ns" | "us",
                timeZone?: string | null | undefined,
            ) => Datetime
        ),
    fmt?: string,
): T
Parse a Series of dtype Utf8 to a Date/Datetime Series.
Parameters
- datatype: Date | Datetime | ((timeUnit?: TimeUnit | "ms" | "ns" | "us", timeZone?: string | null | undefined) => Datetime)
  Date or Datetime.
  - Date
  - Datetime
  - (timeUnit?: TimeUnit | "ms" | "ns" | "us", timeZone?: string | null | undefined) => Datetime
    Calendar date and time type.
    Parameters
    - Optional timeUnit: TimeUnit | "ms" | "ns" | "us"
      Any of 'ms' | 'ns' | 'us'.
    - timeZone: string | null | undefined = null
      Time zone string as defined by Intl.DateTimeFormat, for example America/New_York.
    Returns Datetime
- Optional fmt: string
  Formatting syntax.
Returns T
- Defined in polars/shared_traits.ts:1286

toLowerCase

toLowerCase(): T
Modify the strings to their lowercase equivalent.

Returns T
- Defined in polars/shared_traits.ts:1262

toUpperCase

toUpperCase(): T
Modify the strings to their uppercase equivalent.

Returns T
- Defined in polars/shared_traits.ts:1264
