polars.StringCache

class polars.StringCache

Context manager for enabling and disabling the global string cache.

Categorical columns created under the same global string cache map equal string values to the same underlying physical value. This allows such columns to be concatenated or used together in a join operation, for example.

Notes

Enabling the global string cache introduces some overhead. The amount of overhead depends on the number of categories in your data. It is advised to enable the global string cache only when strictly necessary.

If StringCache calls are nested, the global string cache will only be disabled and cleared when the outermost context exits.

Examples

Construct two Series using the same global string cache.

>>> with pl.StringCache():
...     s1 = pl.Series("color", ["red", "green", "red"], dtype=pl.Categorical)
...     s2 = pl.Series("color", ["blue", "red", "green"], dtype=pl.Categorical)

As both Series are constructed under the same global string cache, they can be concatenated.

>>> pl.concat([s1, s2])
shape: (6,)
Series: 'color' [cat]
[
        "red"
        "green"
        "red"
        "blue"
        "red"
        "green"
]

The class can also be used as a function decorator, in which case the string cache is enabled during function execution, and disabled afterwards.

>>> @pl.StringCache()
... def construct_categoricals() -> pl.Series:
...     s1 = pl.Series("color", ["red", "green", "red"], dtype=pl.Categorical)
...     s2 = pl.Series("color", ["blue", "red", "green"], dtype=pl.Categorical)
...     return pl.concat([s1, s2])

Methods

__init__(*args, **kwargs)