Vertically concatenate the values in the Series into a single string value.
Optional
ignoreNulls: boolean
>>> df = pl.DataFrame({"foo": [1, null, 2]})
>>> df = df.select(pl.col("foo").str.concat("-"))
>>> df
shape: (1, 1)
┌──────────┐
│ foo │
│ --- │
│ str │
╞══════════╡
│ 1-null-2 │
└──────────┘
Decodes a value using the provided encoding.
hex | base64
Optional
strict: boolean
How to handle invalid inputs:
- true: the method throws an error if it is unable to decode a value
- false: values that cannot be decoded are replaced with `null`
>>> df = pl.DataFrame({"strings": ["666f6f", "626172", null]})
>>> df.select(pl.col("strings").str.decode("hex"))
shape: (3, 1)
┌─────────┐
│ strings │
│ --- │
│ str │
╞═════════╡
│ foo │
├╌╌╌╌╌╌╌╌╌┤
│ bar │
├╌╌╌╌╌╌╌╌╌┤
│ null │
└─────────┘
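The strict and non-strict behaviour described above can be illustrated per value in plain TypeScript. This is a minimal sketch, not the polars implementation: `decodeHex` is a hypothetical helper, and it assumes ASCII input only.

```typescript
// Hypothetical helper (not part of the polars API): decodes one hex string
// and mirrors the strict / non-strict handling described above. ASCII only.
function decodeHex(value: string | null, strict: boolean): string | null {
  if (value === null) return null; // nulls pass through unchanged
  const valid = value.length % 2 === 0 && /^[0-9a-fA-F]*$/.test(value);
  if (!valid) {
    // strict: raise an error; non-strict: replace the value with null
    if (strict) throw new Error(`invalid hex input: ${value}`);
    return null;
  }
  let out = "";
  for (let i = 0; i < value.length; i += 2) {
    out += String.fromCharCode(parseInt(value.slice(i, i + 2), 16));
  }
  return out;
}

const decoded = ["666f6f", "626172", null].map((v) => decodeHex(v, true));
const lenient = decodeHex("zz", false);
```

Here `decoded` holds the same three values as the example output above, and `lenient` is `null` because `"zz"` is not valid hex and `strict` is false.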
Optional
strict?: boolean
Encodes a value using the provided encoding.
hex | base64
>>> df = pl.DataFrame({"strings": ["foo", "bar", null]})
>>> df.select(pl.col("strings").str.encode("hex"))
shape: (3, 1)
┌─────────┐
│ strings │
│ --- │
│ str │
╞═════════╡
│ 666f6f │
├╌╌╌╌╌╌╌╌╌┤
│ 626172 │
├╌╌╌╌╌╌╌╌╌┤
│ null │
└─────────┘
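For intuition, hex encoding maps each character to its two-digit hexadecimal code. The following is a rough per-value sketch in plain TypeScript; `encodeHex` is a hypothetical helper that assumes ASCII input and is not how polars implements the expression.

```typescript
// Hypothetical helper (not part of the polars API): hex-encodes one ASCII
// string and leaves null untouched, as in the example output above.
function encodeHex(value: string | null): string | null {
  if (value === null) return null;
  let out = "";
  for (let i = 0; i < value.length; i++) {
    // Left-pad each code to two hex digits so "\n" becomes "0a", not "a".
    out += ("0" + value.charCodeAt(i).toString(16)).slice(-2);
  }
  return out;
}

const encoded = ["foo", "bar", null].map((v) => encodeHex(v));
```

The resulting array matches the three rows of the table above.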
Extract the target capture group from provided patterns.
Index of the targeted capture group. Group 0 means the whole pattern; the first group begins at index 1. Defaults to the first capture group.
Utf8 array. Contains null if the original value is null or the regex captures nothing.
> df = pl.DataFrame({
... 'a': [
... 'http://vote.com/ballon_dor?candidate=messi&ref=polars',
... 'http://vote.com/ballon_dor?candidat=jorginho&ref=polars',
... 'http://vote.com/ballon_dor?candidate=ronaldo&ref=polars'
... ]})
> df.select(pl.col('a').str.extract(/candidate=(\w+)/, 1))
shape: (3, 1)
┌─────────┐
│ a │
│ --- │
│ str │
╞═════════╡
│ messi │
├╌╌╌╌╌╌╌╌╌┤
│ null │
├╌╌╌╌╌╌╌╌╌┤
│ ronaldo │
└─────────┘
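The group-index rule can be checked on single values with a plain `RegExp`. This is only a sketch of the semantics described above; `extractGroup` is a hypothetical helper name, not a polars function.

```typescript
// Hypothetical helper (not part of the polars API): returns the requested
// capture group, or null when the input is null or nothing is captured.
function extractGroup(
  value: string | null,
  pattern: RegExp,
  groupIndex: number = 1,
): string | null {
  if (value === null) return null;
  // Assumes a pattern without the global flag, so exec() keeps no state.
  const match = pattern.exec(value);
  if (match === null) return null;
  const group = match[groupIndex];
  return group === undefined ? null : group;
}

const urls = [
  "http://vote.com/ballon_dor?candidate=messi&ref=polars",
  "http://vote.com/ballon_dor?candidat=jorginho&ref=polars",
  "http://vote.com/ballon_dor?candidate=ronaldo&ref=polars",
];
const candidates = urls.map((u) => extractGroup(u, /candidate=(\w+)/, 1));
const wholeMatch = extractGroup(urls[0], /candidate=(\w+)/, 0);
```

The second URL yields `null` because it spells the key `candidat`, so the pattern never matches. With group index 0, `wholeMatch` is the full matched text `candidate=messi`.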
Parse string values as JSON. Throws an error if it encounters an invalid JSON string.
Optional
dtype: DataType
Optional
inferSchemaLength: number
DF with struct
Not implemented ATM
>>> df = pl.DataFrame( {json: ['{"a":1, "b": true}', null, '{"a":2, "b": false}']} )
>>> df.select(pl.col("json").str.jsonDecode())
shape: (3, 1)
┌─────────────┐
│ json │
│ --- │
│ struct[2] │
╞═════════════╡
│ {1,true} │
│ {null,null} │
│ {2,false} │
└─────────────┘
See Also
----------
jsonPathMatch : Extract the first match of json string with provided JSONPath expression.
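As a rough per-row illustration of the output shown above, each JSON string becomes a struct-like record and a null input becomes a record whose fields are all null. The sketch below uses plain `JSON.parse`; `decodeJsonRows` and the fixed fields `a` and `b` are assumptions taken from the example data, not part of the polars API.

```typescript
// Struct shape assumed from the example data above.
type DecodedRow = { a: number | null; b: boolean | null };

// Hypothetical helper (not part of the polars API).
function decodeJsonRows(values: Array<string | null>): DecodedRow[] {
  return values.map((value) => {
    // A null input yields a struct whose fields are all null.
    if (value === null) return { a: null, b: null };
    // JSON.parse throws a SyntaxError on invalid JSON.
    const parsed = JSON.parse(value);
    return { a: parsed.a ?? null, b: parsed.b ?? null };
  });
}

const rows = decodeJsonRows(['{"a":1, "b": true}', null, '{"a":2, "b": false}']);
```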
Parse string values as JSON. Throws an error if it encounters an invalid JSON string.
Optional
dtype: DataType
Optional
inferSchemaLength: number
DF with struct
Not implemented ATM
0.8.4
>>> df = pl.DataFrame( {json: ['{"a":1, "b": true}', null, '{"a":2, "b": false}']} )
>>> df.select(pl.col("json").str.jsonExtract())
shape: (3, 1)
┌─────────────┐
│ json │
│ --- │
│ struct[2] │
╞═════════════╡
│ {1,true} │
│ {null,null} │
│ {2,false} │
└─────────────┘
See Also
----------
jsonPathMatch : Extract the first match of json string with provided JSONPath expression.
Extract the first match of a JSON string with the provided JSONPath expression. Throws an error if it encounters an invalid JSON string. All return values are cast to Utf8 regardless of the original value.
Utf8 array. Contains null if the original value is null or the jsonPath returns nothing.
https://goessner.net/articles/JsonPath/
>>> df = pl.DataFrame({
... 'json_val': [
... '{"a":"1"}',
... null,
... '{"a":2}',
... '{"a":2.1}',
... '{"a":true}'
... ]
... })
>>> df.select(pl.col('json_val').str.jsonPathMatch('$.a'))
shape: (5,)
Series: 'json_val' [str]
[
"1"
null
"2"
"2.1"
"true"
]
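The cast-to-string behaviour can be seen on single values. The sketch below handles only the simplest path form, `$.key`, by reading one top-level property. `matchTopLevelKey` is a hypothetical helper, not a JSONPath engine and not the polars implementation.

```typescript
// Hypothetical helper (not part of the polars API): looks up one top-level
// key, standing in for a JSONPath of the form `$.key`.
function matchTopLevelKey(value: string | null, key: string): string | null {
  if (value === null) return null;
  const parsed = JSON.parse(value); // throws on invalid JSON
  if (parsed === null || typeof parsed !== "object") return null;
  const hit = parsed[key];
  if (hit === undefined || hit === null) return null;
  // Every match is returned as a string, whatever its JSON type was.
  return typeof hit === "string" ? hit : JSON.stringify(hit);
}

const inputs = ['{"a":"1"}', null, '{"a":2}', '{"a":2.1}', '{"a":true}'];
const matches = inputs.map((v) => matchTopLevelKey(v, "a"));
```

The `matches` array lines up with the Series printed above: numbers and booleans come back as their string forms.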
Add a trailing fillChar to a string until the given string length is reached. If the string is longer than or equal to the given length, no modification is made.
length: of the final string
fillChar: that will fill the string.
If a string longer than 1 character is provided, only the first character will be used.
> df = pl.DataFrame({
... 'foo': [
... "a",
... "b",
... "LONG_WORD",
... "cow"
... ]})
> df.select(pl.col('foo').str.padEnd(3, "_"))
shape: (4, 1)
┌──────────┐
│ foo │
│ -------- │
│ str │
╞══════════╡
│ a__ │
├╌╌╌╌╌╌╌╌╌╌┤
│ b__ │
├╌╌╌╌╌╌╌╌╌╌┤
│ LONG_WORD│
├╌╌╌╌╌╌╌╌╌╌┤
│ cow │
└──────────┘
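The padding rule, including the use of only the first fill character, can be sketched per value with the built-in `String.prototype.padEnd`. `padEndSketch` is a hypothetical helper for illustration. It measures length in UTF-16 code units, which may differ from how polars counts characters.

```typescript
// Hypothetical helper (not part of the polars API).
function padEndSketch(value: string, length: number, fillChar: string): string {
  // Only the first character of fillChar is used, as described above.
  const fill = fillChar.charAt(0);
  // Strings already at or beyond the target length are returned unchanged.
  return value.padEnd(length, fill);
}

const paddedEnd = ["a", "b", "LONG_WORD", "cow"].map((s) => padEndSketch(s, 3, "_"));
const firstCharOnly = padEndSketch("a", 3, "_-");
```

Both `paddedEnd[0]` and `firstCharOnly` come out as `a__`; in the second call the extra `-` in the fill string is ignored.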
Add a leading fillChar to a string until the given string length is reached. If the string is longer than or equal to the given length, no modification is made.
length: of the final string
fillChar: that will fill the string.
If a string longer than 1 character is provided, only the first character will be used.
> df = pl.DataFrame({
... 'foo': [
... "a",
... "b",
... "LONG_WORD",
... "cow"
... ]})
> df.select(pl.col('foo').str.padStart(3, "_"))
shape: (4, 1)
┌──────────┐
│ foo │
│ -------- │
│ str │
╞══════════╡
│ __a │
├╌╌╌╌╌╌╌╌╌╌┤
│ __b │
├╌╌╌╌╌╌╌╌╌╌┤
│ LONG_WORD│
├╌╌╌╌╌╌╌╌╌╌┤
│ cow │
└──────────┘
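Add leading "0" characters to a string until the given string length is reached. If the string is longer than or equal to the given length, no modification is made.
See Also
----------
padStart : Add a leading fillChar to a string until string length is reached.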
> df = pl.DataFrame({
... 'foo': [
... "a",
... "b",
... "LONG_WORD",
... "cow"
... ]})
> df.select(pl.col('foo').str.justify(3))
shape: (4, 1)
┌──────────┐
│ foo │
│ -------- │
│ str │
╞══════════╡
│ 00a │
├╌╌╌╌╌╌╌╌╌╌┤
│ 00b │
├╌╌╌╌╌╌╌╌╌╌┤
│ LONG_WORD│
├╌╌╌╌╌╌╌╌╌╌┤
│ cow │
└──────────┘
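Zero-filling behaves like leading padding with the fill character fixed to "0". A minimal per-value sketch using the built-in `String.prototype.padStart` follows. Both helper names are hypothetical, and length is measured in UTF-16 code units, which may differ from how polars counts characters.

```typescript
// Hypothetical helpers (not part of the polars API).
function padStartSketch(value: string, length: number, fillChar: string): string {
  // Only the first character of fillChar is used.
  return value.padStart(length, fillChar.charAt(0));
}

function zeroFillSketch(value: string, length: number): string {
  // Zero-fill is leading padding with "0" as the fill character.
  return padStartSketch(value, length, "0");
}

const words = ["a", "b", "LONG_WORD", "cow"];
const paddedStart = words.map((s) => padStartSketch(s, 3, "_"));
const zeroFilled = words.map((s) => zeroFillSketch(s, 3));
```

`zeroFilled` matches the table above, and `paddedStart` differs only in the fill character.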
namespace containing expr string functions