Databases
Read from a database
We can read from a database with Polars using the pl.read_database
function. To use this function you need an SQL query string and a connection string called a connection_uri
.
For example, the following snippet shows the general patterns for reading all columns from the foo
table in a Postgres database:
read_database
ยท Available on feature connectorx
import polars as pl
connection_uri = "postgres://username:password@server:port/database"
query = "SELECT * FROM foo"
pl.read_database(query=query, connection_uri=connection_uri)
Engines
Polars doesn't manage connections and data transfer from databases by itself. Instead external libraries (known as engines) handle this. At present Polars can use two engines to read from databases:
- ConnectorX and
- ADBC
ConnectorX
ConnectorX is the default engine and supports numerous databases including Postgres, Mysql, SQL Server and Redshift. ConnectorX is written in Rust and stores data in Arrow format to allow for zero-copy to Polars.
To read from one of the supported databases with ConnectorX
you need to activate the additional dependency ConnectorX
when installing Polars or install it manually with
$ pip install connectorx
ADBC
ADBC (Arrow Database Connectivity) is an engine supported by the Apache Arrow project. ADBC aims to be both an API standard for connecting to databases and libraries implementing this standard in a range of languages.
It is still early days for ADBC so support for different databases is still limited. At present drivers for ADBC are only available for Postgres and SQLite. To install ADBC you need to install the driver for your database. For example to install the driver for SQLite you run
$ pip install adbc-driver-sqlite
As ADBC is not the default engine you must specify the engine as an argument to pl.read_database
connection_uri = "postgres://username:password@server:port/database"
query = "SELECT * FROM foo"
pl.read_database(query=query, connection_uri=connection_uri, engine="adbc")
Write to a database
We can write to a database with Polars using the pl.write_database
function.
Engines
As with reading from a database above Polars uses an engine to write to a database. The currently supported engines are:
- SQLAlchemy and
- Arrow Database Connectivity (ADBC)
SQLAlchemy
With the default engine SQLAlchemy you can write to any database supported by SQLAlchemy. To use this engine you need to install SQLAlchemy and Pandas
$ pip install SQLAlchemy pandas
In this example, we write the DataFrame
to a table called records
in the database
connection_uri = "postgres://username:password@server:port/database"
df = pl.DataFrame({"foo": [1, 2, 3]})
df.write_database(table_name="records", connection_uri=connection_uri)
In the SQLAlchemy approach Polars converts the DataFrame
to a Pandas DataFrame
backed by PyArrow and then uses SQLAlchemy methods on a Pandas DataFrame
to write to the database.
ADBC
As with reading from a database you can also use ADBC to write to a SQLite or Posgres database. As shown above you need to install the appropriate ADBC driver for your database.
connection_uri = "postgres://username:password@server:port/database"
df = pl.DataFrame({"foo": [1, 2, 3]})
df.write_database(table_name="records", connection_uri=connection_uri, engine="adbc")