polars.DataFrame.sample#

DataFrame.sample(n: int | None = None, frac: float | None = None, with_replacement: bool = False, shuffle: bool = False, seed: int | None = None) DF[source]#

Sample from this DataFrame.

Parameters:
n

Number of items to return. Cannot be used with frac. Defaults to 1 if frac is None.

frac

Fraction of items to return. Cannot be used with n.

with_replacement

Allow values to be sampled more than once.

shuffle

Shuffle the order of sampled data points.

seed

Seed for the random number generator. If set to None (default), a random seed is generated using the random module.

Examples

>>> df = pl.DataFrame(
...     {
...         "foo": [1, 2, 3],
...         "bar": [6, 7, 8],
...         "ham": ["a", "b", "c"],
...     }
... )
>>> df.sample(n=2, seed=0)  
shape: (2, 3)
┌─────┬─────┬─────┐
│ foo ┆ bar ┆ ham │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ str │
╞═════╪═════╪═════╡
│ 3   ┆ 8   ┆ c   │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┤
│ 2   ┆ 7   ┆ b   │
└─────┴─────┴─────┘