polars.DataFrame.to_numpy

DataFrame.to_numpy(
*,
structured: bool = False,
order: IndexOrder = 'fortran',
allow_copy: bool = True,
writable: bool = False,
use_pyarrow: bool = True,
) -> np.ndarray[Any, Any]

Convert this DataFrame to a NumPy ndarray.

Parameters:
structured

Return a structured array with a data type that corresponds to the DataFrame schema. If set to False (default), a 2D ndarray is returned instead.

order

The index order of the returned NumPy array, either C-like or Fortran-like. In general, using the Fortran-like index order is faster. However, the C-like order might be more appropriate to use for downstream applications to prevent cloning data, e.g. when reshaping into a one-dimensional array. Note that this option only takes effect if structured is set to False and the DataFrame dtypes allow for a global dtype for all columns.

allow_copy

Allow memory to be copied to perform the conversion. If set to False, causes conversions that are not zero-copy to fail.

writable

Ensure the resulting array is writable. This will force a copy of the data if the array was created without copy, as the underlying Arrow data is immutable.

use_pyarrow

Use the pyarrow.Array.to_numpy function for the conversion to NumPy if necessary.

Examples

>>> df = pl.DataFrame(
...     {
...         "foo": [1, 2, 3],
...         "bar": [6.5, 7.0, 8.5],
...         "ham": ["a", "b", "c"],
...     },
...     schema_overrides={"foo": pl.UInt8, "bar": pl.Float32},
... )

Export to a standard 2D numpy array.

>>> df.to_numpy()
array([[1, 6.5, 'a'],
       [2, 7.0, 'b'],
       [3, 8.5, 'c']], dtype=object)

Export to a structured array, which can better preserve individual column data, such as name and dtype…

>>> df.to_numpy(structured=True)
array([(1, 6.5, 'a'), (2, 7. , 'b'), (3, 8.5, 'c')],
      dtype=[('foo', 'u1'), ('bar', '<f4'), ('ham', '<U1')])

…optionally going on to view as a record array:

>>> import numpy as np
>>> df.to_numpy(structured=True).view(np.recarray)
rec.array([(1, 6.5, 'a'), (2, 7. , 'b'), (3, 8.5, 'c')],
          dtype=[('foo', 'u1'), ('bar', '<f4'), ('ham', '<U1')])