Interface StringSeries

String functions for Series

interface StringSeries {
    concat(delimiter: string, ignoreNulls?: boolean): pl.Series;
    contains(
        pat: string | RegExp,
        literal?: boolean,
        strict?: boolean,
    ): pl.Series;
    decode(encoding: "base64" | "hex", strict?: boolean): pl.Series;
    decode(
        options: { encoding: "base64" | "hex"; strict?: boolean },
    ): pl.Series;
    encode(encoding: "base64" | "hex"): pl.Series;
    extract(pattern: string | RegExp, groupIndex: number): pl.Series;
    jsonDecode(dtype?: DataType<any>, inferSchemaLength?: number): pl.Series;
    jsonPathMatch(jsonPath: string): pl.Series;
    lengths(): pl.Series;
    lstrip(): pl.Series;
    padEnd(length: number, fillChar: string): pl.Series;
    padStart(length: number, fillChar: string): pl.Series;
    replace(pattern: string | RegExp, value: string): pl.Series;
    replaceAll(pattern: string | RegExp, value: string): pl.Series;
    rstrip(): pl.Series;
    slice(start: number | pl.Expr, length?: number | pl.Expr): pl.Series;
    split(
        separator: string,
        options?: boolean | { inclusive?: boolean },
    ): pl.Series;
    strip(): pl.Series;
    strptime(datatype: Date, fmt?: string): pl.Series<Date>;
    strptime(datatype: Datetime, fmt?: string): pl.Series<Datetime>;
    strptime(
        datatype: (
            timeUnit?: TimeUnit | "ms" | "ns" | "us",
            timeZone?: string | null | undefined,
        ) => Datetime,
        fmt?: string,
    ): pl.Series;
    toLowerCase(): pl.Series;
    toUpperCase(): pl.Series;
    zFill(length: number | pl.Expr): pl.Series;
}

Hierarchy

StringFunctions<pl.Series>
- StringSeries

Index

Methods

concat contains decode encode extract jsonDecode jsonPathMatch lengths lstrip padEnd padStart replace replaceAll rstrip slice split strip strptime toLowerCase toUpperCase zFill

Methods

concat

concat(delimiter: string, ignoreNulls?: boolean): pl.Series
Vertically concatenate the values in the Series into a single string value.
Parameters
- delimiter: string
- OptionalignoreNulls: boolean
Returns pl.Series
Example
```
> pl.Series([1, null, 2]).str.concat("-")[0]
'1-null-2'
```
Overrides StringFunctions.concat
- Defined in polars/series/string.ts:20

contains

contains(pat: string | RegExp, literal?: boolean, strict?: boolean): pl.Series
Check if strings in Series contain a substring that matches a pattern.
Parameters
- pat: string | RegExp
  A valid regular expression pattern, compatible with the regex crate.
- Optionalliteral: boolean
  Treat `pat` as a literal string, not as a regular expression.
- Optionalstrict: boolean
  Raise an error if the underlying pattern is not a valid regex; otherwise mask out with a null value.
Returns pl.Series
Boolean mask
Example
```
> pl.Series(["Crab", "cat and dog", "rab$bit", null]).str.contains("cat|bit")
shape: (4,)
Series: '' [bool]
[
     false
     true
     true
     null
]
```
Overrides StringFunctions.contains
- Defined in polars/series/string.ts:40
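To make the effect of `literal` concrete, here is a small plain-TypeScript model of the behaviour described above. It does not call the library: `containsModel` is a hypothetical helper, and it uses JavaScript's `RegExp` rather than the Rust regex crate, so pattern syntax may differ in edge cases.

```typescript
// Hypothetical model of `contains` semantics; not the library implementation.
function containsModel(
  values: (string | null)[],
  pat: string,
  literal = false,
): (boolean | null)[] {
  return values.map((v) => {
    if (v === null) return null; // nulls propagate to the mask
    // literal: plain substring search; otherwise: regular expression search
    return literal ? v.includes(pat) : new RegExp(pat).test(v);
  });
}

const animals = ["Crab", "cat and dog", "rab$bit", null];
const byRegex = containsModel(animals, "cat|bit"); // [false, true, true, null]
const byLiteral = containsModel(animals, "$b", true); // [false, false, true, null]
```

With `literal` set, `$` is matched as an ordinary character instead of an end-of-string anchor.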

decode

decode(encoding: "base64" | "hex", strict?: boolean): pl.Series
Decodes a value in Series using the provided encoding
Parameters
- encoding: "base64" | "hex"
  hex | base64
- Optionalstrict: boolean
  How to handle invalid inputs:
  - true: the method throws an error if it is unable to decode a value
  - false: values that cannot be decoded are replaced with `null`
Returns pl.Series
Example
```
s = pl.Series("strings", ["666f6f", "626172", null])
s.str.decode("hex")
shape: (3,)
Series: 'strings' [str]
[
    "foo",
    "bar",
    null
]
```
Overrides StringFunctions.decode
- Defined in polars/series/string.ts:61
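The two `strict` modes can be sketched in plain TypeScript. This is an illustrative model only: `decodeHexModel` is a hypothetical helper, it handles ASCII payloads only, and the default of `strict = false` is an assumption, not something the signature above states.

```typescript
// Hypothetical model of hex `decode` with the `strict` flag; ASCII-only for brevity.
function decodeHexModel(
  values: (string | null)[],
  strict = false, // assumed default
): (string | null)[] {
  return values.map((v) => {
    if (v === null) return null;
    const valid = v.length % 2 === 0 && /^[0-9a-fA-F]*$/.test(v);
    if (!valid) {
      if (strict) throw new Error(`cannot decode "${v}" as hex`);
      return null; // non-strict: invalid input becomes null
    }
    let out = "";
    for (let i = 0; i < v.length; i += 2) {
      // Each pair of hex digits is one byte.
      out += String.fromCharCode(parseInt(v.slice(i, i + 2), 16));
    }
    return out;
  });
}

const decoded = decodeHexModel(["666f6f", "626172", "zz", null]);
// ["foo", "bar", null, null]
```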

decode(options: { encoding: "base64" | "hex"; strict?: boolean }): pl.Series

Decodes a value using the provided encoding

Parameters

options: { encoding: "base64" | "hex"; strict?: boolean }

Returns pl.Series

Example

> df = pl.DataFrame({"strings": ["666f6f", "626172", null]})
> df.select(pl.col("strings").str.decode({encoding: "hex"}))
shape: (3, 1)
┌─────────┐
│ strings │
│ ---     │
│ str     │
╞═════════╡
│ foo     │
├╌╌╌╌╌╌╌╌╌┤
│ bar     │
├╌╌╌╌╌╌╌╌╌┤
│ null    │
└─────────┘

encode

encode(encoding: "base64" | "hex"): pl.Series
Encodes a value in Series using the provided encoding
Parameters
- encoding: "base64" | "hex"
  hex | base64
Returns pl.Series
Example
```
s = pl.Series("strings", ["foo", "bar", null])
s.str.encode("hex")
shape: (3,)
Series: 'strings' [str]
[
    "666f6f",
    "626172",
    null
]
```
Overrides StringFunctions.encode
- Defined in polars/series/string.ts:79

extract

extract(pattern: string | RegExp, groupIndex: number): pl.Series

Extract the target capture group from provided patterns.

Parameters

pattern: string | RegExp
A valid regex pattern
groupIndex: number
Index of the targeted capture group. Group 0 means the whole pattern; the first capture group begins at index 1. Defaults to the first capture group.

Returns pl.Series

Utf8 array. Contains null if the original value is null or the regex captures nothing.

Example

>  df = pl.DataFrame({
...   'a': [
...       'http://vote.com/ballon_dor?candidate=messi&ref=polars',
...       'http://vote.com/ballon_dor?candidat=jorginho&ref=polars',
...       'http://vote.com/ballon_dor?candidate=ronaldo&ref=polars'
...   ]})
>  df.getColumn("a").str.extract(/candidate=(\w+)/, 1)
shape: (3, 1)
┌─────────┐
│ a       │
│ ---     │
│ str     │
╞═════════╡
│ messi   │
├╌╌╌╌╌╌╌╌╌┤
│ null    │
├╌╌╌╌╌╌╌╌╌┤
│ ronaldo │
└─────────┘
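The role of `groupIndex` can be shown with a plain-TypeScript model. `extractModel` is a hypothetical helper that relies on JavaScript's `RegExp`, not the library's regex engine, so treat it as a sketch of the semantics rather than the implementation.

```typescript
// Hypothetical model of `extract`; uses JavaScript RegExp, not the Rust regex crate.
function extractModel(
  values: (string | null)[],
  pattern: RegExp,
  groupIndex: number,
): (string | null)[] {
  return values.map((v) => {
    if (v === null) return null;
    const m = pattern.exec(v);
    // No match, or the requested group did not participate: null
    return m && m[groupIndex] !== undefined ? m[groupIndex] : null;
  });
}

const urls = [
  "http://vote.com/ballon_dor?candidate=messi&ref=polars",
  "http://vote.com/ballon_dor?candidat=jorginho&ref=polars",
  null,
];
const names = extractModel(urls, /candidate=(\w+)/, 1); // ["messi", null, null]
const whole = extractModel(urls, /candidate=(\w+)/, 0); // ["candidate=messi", null, null]
```

Group 0 returns the entire matched text, while group 1 returns only the part inside the first pair of parentheses.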

jsonDecode

jsonDecode(dtype?: DataType<any>, inferSchemaLength?: number): pl.Series
Parse string values in Series as JSON.
Parameters
- Optionaldtype: DataType<any>
- OptionalinferSchemaLength: number
Returns pl.Series
Series of parsed values; the dtype is inferred unless `dtype` is given. Null inputs yield null values.
Example
```
s = pl.Series("json", ['{"a":1, "b": true}', null, '{"a":2, "b": false}']);
s.str.jsonDecode().as("json");
shape: (3,)
Series: 'json' [struct[2]]
[
    {1,true}
    {null,null}
    {2,false}
]
```
- Defined in polars/series/string.ts:125

jsonPathMatch

jsonPathMatch(jsonPath: string): pl.Series
Extract the first match of the JSON string in the Series with the provided JSONPath expression. Throws an error if it encounters invalid JSON strings. All return values are cast to Utf8 regardless of the original value.
Parameters
- jsonPath: string
  A valid JSON path query string
Returns pl.Series
Utf8 array. Contains null if the original value is null or the JSONPath returns nothing.
See
https://goessner.net/articles/JsonPath/
Example
```
> s = pl.Series('json_val', [
...   '{"a":"1"}',
...   null,
...   '{"a":2}',
...   '{"a":2.1}',
...   '{"a":true}'
... ])
> s.str.jsonPathMatch('$.a')
shape: (5,)
Series: 'json_val' [str]
[
    "1"
    null
    "2"
    "2.1"
    "true"
]
```
Overrides StringFunctions.jsonPathMatch
- Defined in polars/series/string.ts:154

lengths

lengths(): pl.Series
Get the number of chars of the string values in Series.

Returns pl.Series
Example
df = pl.Series(["Café", "345", "東京", null])
    .str.lengths().alias("n_chars")
shape: (4,)
Series: 'n_chars' [u32]
[
    4
    3
    2
    null
]
Overrides StringFunctions.lengths
- Defined in polars/series/string.ts:167
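The distinction between characters and storage units matters for non-ASCII text. The sketch below assumes that "chars" means Unicode code points, which is what the example output suggests; `lengthsModel` is a hypothetical helper and does not call the library.

```typescript
// Hypothetical model: count Unicode code points, not UTF-16 code units or bytes.
function lengthsModel(values: (string | null)[]): (number | null)[] {
  // Array.from iterates a string by code point.
  return values.map((v) => (v === null ? null : Array.from(v).length));
}

// "Caf\u00e9" is "Café" with a precomposed é (a single code point).
const nChars = lengthsModel(["Caf\u00e9", "345", "東京", null]); // [4, 3, 2, null]
```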

lstrip

lstrip(): pl.Series
Remove leading whitespace of the string values in Series.

Returns pl.Series
Overrides StringFunctions.lstrip
- Defined in polars/series/string.ts:169

padEnd

padEnd(length: number, fillChar: string): pl.Series
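Add a trailing fillChar to a string in Series until the string length is reached. If the string is longer than or equal to the given length, no modifications are made.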
Parameters
- length: number
- fillChar: string
Returns pl.Series
- Defined in polars/series/string.ts:235

padStart

padStart(length: number, fillChar: string): pl.Series

Add a leading fillChar to a string in Series until the string length is reached. If the string is longer than or equal to the given length, no modifications are made.

Parameters

length: number
Length of the final string
fillChar: string
Character that will fill the string. If a string longer than 1 character is provided, only the first character is used

Returns pl.Series

Example

> df = pl.DataFrame({
...   'foo': [
...       "a",
...       "b",
...       "LONG_WORD",
...       "cow"
...   ]})
> df.select(pl.col('foo').str.padStart(3, "_"))
shape: (4, 1)
┌──────────┐
│ foo      │
│ -------- │
│ str      │
╞══════════╡
│ __a      │
├╌╌╌╌╌╌╌╌╌╌┤
│ __b      │
├╌╌╌╌╌╌╌╌╌╌┤
│ LONG_WORD│
├╌╌╌╌╌╌╌╌╌╌┤
│ cow      │
└──────────┘
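The padding rules (argument order, first-character-only fill, no truncation) can be modelled in plain TypeScript. `padModel` is a hypothetical helper; the `"end"` branch assumes that `padEnd` mirrors `padStart` on the other side of the string.

```typescript
// Hypothetical model of padStart/padEnd semantics; not the library implementation.
function padModel(
  value: string,
  length: number,
  fillChar: string,
  side: "start" | "end",
): string {
  const fill = fillChar.charAt(0); // only the first character is used
  if (fill === "" || value.length >= length) return value; // already long enough
  const padding = fill.repeat(length - value.length);
  return side === "start" ? padding + value : value + padding;
}

const padded = ["a", "LONG_WORD", "cow"].map((s) => padModel(s, 3, "_*", "start"));
// ["__a", "LONG_WORD", "cow"]
const trailing = padModel("a", 3, "_", "end"); // "a__"
```

Note that `length` comes first and `fillChar` second, matching the signature `padStart(length, fillChar)`.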

replace

replace(pattern: string | RegExp, value: string): pl.Series
Replace first regex match with a string value in Series.
Parameters
- pattern: string | RegExp
  A valid regex pattern or string
- value: string
  Replacement string.
Returns pl.Series
Example
```
df = pl.Series(["#12.34", "#56.78"]).str.replace(/#(\d+)/, "$$$1")
shape: (2,)
Series: '' [str]
[
       "$12.34"
       "$56.78"
]
```
Overrides StringFunctions.replace
- Defined in polars/series/string.ts:251

replaceAll

replaceAll(pattern: string | RegExp, value: string): pl.Series
Replace all regex matches with a string value in Series.
Parameters
- pattern: string | RegExp
  A valid regex pattern or string
- value: string
  Replacement string.
Returns pl.Series
Example
```
df = pl.Series(["abcabc", "123a123"]).str.replaceAll("a", "-");
shape: (2,)
Series: '' [str]
[
        "-bc-bc"
        "123-123"
]
```
Overrides StringFunctions.replaceAll
- Defined in polars/series/string.ts:267
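The difference between `replace` (first match only) and `replaceAll` (every match) can be sketched with JavaScript's own `String.prototype.replace`. `replaceModel` is a hypothetical helper, and JavaScript regex and replacement syntax is assumed here to behave like the library's for these simple cases.

```typescript
// Hypothetical model contrasting first-match and all-match replacement.
function replaceModel(
  values: (string | null)[],
  pattern: RegExp,
  value: string,
  all: boolean,
): (string | null)[] {
  // Rebuild the pattern with or without the global flag.
  const flags = pattern.flags.replace("g", "") + (all ? "g" : "");
  const re = new RegExp(pattern.source, flags);
  return values.map((v) => (v === null ? null : v.replace(re, value)));
}

const first = replaceModel(["abcabc", "123a123"], /a/, "-", false); // ["-bcabc", "123-123"]
const every = replaceModel(["abcabc", "123a123"], /a/, "-", true); // ["-bc-bc", "123-123"]
// "$$" is a literal dollar sign and "$1" refers to the first capture group.
const dollars = replaceModel(["#12.34"], /#(\d+)/, "$$$1", false); // ["$12.34"]
```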

rstrip

rstrip(): pl.Series
Remove trailing whitespace.

Returns pl.Series
Overrides StringFunctions.rstrip
- Defined in polars/series/string.ts:273

slice

slice(start: number | pl.Expr, length?: number | pl.Expr): pl.Series
Create subslices of the string values of a Utf8 Series.
Parameters
- start: number | pl.Expr
  Start of the slice (negative indexing may be used).
- Optionallength: number | pl.Expr
  Optional length of the slice.
Returns pl.Series
Overrides StringFunctions.slice
- Defined in polars/series/string.ts:281
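Negative `start` values and the optional `length` can be illustrated with a plain-TypeScript model. `sliceModel` is a hypothetical helper that works on a single string and counts by code point; how the library treats out-of-range offsets is not covered here.

```typescript
// Hypothetical model of `slice` on one value; not the library implementation.
function sliceModel(value: string, start: number, length?: number): string {
  const chars = Array.from(value);
  // A negative start counts back from the end of the string.
  const begin = start < 0 ? Math.max(chars.length + start, 0) : start;
  const end = length === undefined ? chars.length : begin + length;
  return chars.slice(begin, end).join("");
}

const tail = sliceModel("pear", -3); // "ear"
const middle = sliceModel("papaya", 1, 3); // "apa"
```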

split

split(separator: string, options?: boolean | { inclusive?: boolean }): pl.Series
Split a string into substrings using the specified separator. The return type will be of type List.
Parameters
- separator: string
  A string that identifies the character or characters to use in separating the string.
- Optionaloptions: boolean | { inclusive?: boolean }
  - boolean
  - { inclusive?: boolean }
    - Optionalinclusive: boolean
      Include the split character/string in the results
Returns pl.Series
Overrides StringFunctions.split
- Defined in polars/series/string.ts:288
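The effect of `inclusive` can be sketched on a single string. `splitModel` is a hypothetical helper; it assumes that an inclusive split keeps the separator at the end of each piece (as Rust's `split_inclusive` does), which the description above does not spell out.

```typescript
// Hypothetical model of `split` with and without `inclusive`.
function splitModel(value: string, separator: string, inclusive = false): string[] {
  const parts = value.split(separator);
  if (!inclusive) return parts;
  // Re-attach the separator to every piece except the last one.
  const out = parts.map((p, i) => (i < parts.length - 1 ? p + separator : p));
  // A trailing separator leaves an empty final piece; drop it.
  if (out.length > 1 && out[out.length - 1] === "") out.pop();
  return out;
}

const plain = splitModel("a,b,c", ","); // ["a", "b", "c"]
const kept = splitModel("a,b,c", ",", true); // ["a,", "b,", "c"]
```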

strip

strip(): pl.Series
Remove leading and trailing whitespace.

Returns pl.Series
Overrides StringFunctions.strip
- Defined in polars/series/string.ts:275

strptime

strptime(datatype: Date, fmt?: string): pl.Series<Date>
Parse a Series of dtype Utf8 to a Date/Datetime Series.
Parameters
- datatype: Date
  Date or Datetime.
- Optionalfmt: string
  Formatting syntax.
Returns pl.Series<Date>
Overrides StringFunctions.strptime
- Defined in polars/series/string.ts:294
strptime(datatype: Datetime, fmt?: string): pl.Series<Datetime>
Parse a Series of dtype Utf8 to a Date/Datetime Series.
Parameters
- datatype: Datetime
  Date or Datetime.
- Optionalfmt: string
  Formatting syntax.
Returns pl.Series<Datetime>
Overrides StringFunctions.strptime
- Defined in polars/series/string.ts:295
strptime(
    datatype: (
        timeUnit?: TimeUnit | "ms" | "ns" | "us",
        timeZone?: string | null | undefined,
    ) => Datetime,
    fmt?: string,
): pl.Series
Parse a Series of dtype Utf8 to a Date/Datetime Series.
Parameters
- datatype: (
  timeUnit?: TimeUnit | "ms" | "ns" | "us",
  timeZone?: string | null | undefined,
  ) => Datetime
  Date or Datetime.
  - (
    timeUnit?: TimeUnit | "ms" | "ns" | "us",
    timeZone?: string | null | undefined,
    ): Datetime
    
    Calendar date and time type
    
    Parameters
    
    OptionaltimeUnit: TimeUnit | "ms" | "ns" | "us"
    any of 'ms' | 'ns' | 'us'
    
    timeZone: string | null | undefined = null
    Timezone string as defined by Intl.DateTimeFormat, for example America/New_York.
    
    Returns Datetime
- Optionalfmt: string
  Formatting syntax.
Returns pl.Series
Overrides StringFunctions.strptime
- Defined in polars/series/string.ts:299

toLowerCase

toLowerCase(): pl.Series
Modify the strings to their lowercase equivalent.

Returns pl.Series
Overrides StringFunctions.toLowerCase
- Defined in polars/series/string.ts:269

toUpperCase

toUpperCase(): pl.Series
Modify the strings to their uppercase equivalent.

Returns pl.Series
Overrides StringFunctions.toUpperCase
- Defined in polars/series/string.ts:271

zFill

zFill(length: number | pl.Expr): pl.Series

Add leading '0' characters to a string until the string length is reached. If the string is longer than or equal to the given length, no modifications are made.

Parameters

length: number | pl.Expr
Length of the final string

Returns pl.Series

Example

> df = pl.DataFrame({
...   'foo': [
...       "a",
...       "b",
...       "LONG_WORD",
...       "cow"
...   ]})
> df.select(pl.col('foo').str.zFill(3))
shape: (4, 1)
┌──────────┐
│ foo      │
│ -------- │
│ str      │
╞══════════╡
│ 00a      │
├╌╌╌╌╌╌╌╌╌╌┤
│ 00b      │
├╌╌╌╌╌╌╌╌╌╌┤
│ LONG_WORD│
├╌╌╌╌╌╌╌╌╌╌┤
│ cow      │
└──────────┘
