Combining DataFrames
There are two ways DataFrames can be combined depending on the use case: join and concat.
Join
Polars supports all types of join (e.g. left, right, inner, outer). Let's have a closer look at how to join two DataFrames into a single DataFrame. Our two DataFrames both have an 'id'-like column: a and x. We can use those columns to join the DataFrames in this example.
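# Python
import numpy as np
import polars as pl

# Left frame: "a" is the id-like key column
df = pl.DataFrame(
    {
        "a": range(8),
        "b": np.random.rand(8),
        "d": [1, 2.0, float("nan"), float("nan"), 0, -5, -42, None],
    }
)

# Right frame: "x" is the id-like key column
df2 = pl.DataFrame(
    {
        "x": range(8),
        "y": ["A", "A", "A", "B", "B", "C", "X", "X"],
    }
)

# Match rows where df["a"] equals df2["x"] (join() defaults to an inner join)
joined = df.join(df2, left_on="a", right_on="x")
print(joined)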
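// Rust
use polars::prelude::*;
use rand::Rng;

let mut rng = rand::thread_rng();

// Left frame: "a" is the id-like key column
let df: DataFrame = df!(
    "a" => 0..8,
    "b" => (0..8).map(|_| rng.gen::<f64>()).collect::<Vec<f64>>(),
    // NaN in rows 2 and 3 matches the Python frame and the output shown below
    "d" => [Some(1.0), Some(2.0), Some(f64::NAN), Some(f64::NAN), Some(0.0), Some(-5.0), Some(-42.), None]
)
.unwrap();

// Right frame: "x" is the id-like key column
let df2: DataFrame = df!(
    "x" => 0..8,
    "y" => &["A", "A", "A", "B", "B", "C", "X", "X"],
)
.unwrap();

// Left join on df["a"] == df2["x"]; every key matches here, so no row is lost
let joined = df.join(&df2, ["a"], ["x"], JoinType::Left.into())?;
println!("{}", joined);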
shape: (8, 4)
┌─────┬──────────┬───────┬─────┐
│ a   ┆ b        ┆ d     ┆ y   │
│ --- ┆ ---      ┆ ---   ┆ --- │
│ i64 ┆ f64      ┆ f64   ┆ str │
╞═════╪══════════╪═══════╪═════╡
│ 0   ┆ 0.900468 ┆ 1.0   ┆ A   │
│ 1   ┆ 0.965213 ┆ 2.0   ┆ A   │
│ 2   ┆ 0.883848 ┆ NaN   ┆ A   │
│ 3   ┆ 0.623242 ┆ NaN   ┆ B   │
│ 4   ┆ 0.788115 ┆ 0.0   ┆ B   │
│ 5   ┆ 0.185598 ┆ -5.0  ┆ C   │
│ 6   ┆ 0.14701  ┆ -42.0 ┆ X   │
│ 7   ┆ 0.552827 ┆ null  ┆ X   │
└─────┴──────────┴───────┴─────┘
To see more examples with other types of joins, go to the User Guide.
Concat
We can also concatenate two DataFrames. Vertical concatenation will make the DataFrame longer. Horizontal concatenation will make the DataFrame wider. Below you can see the result of a horizontal concatenation of our two DataFrames.
shape: (8, 5)
┌─────┬──────────┬───────┬─────┬─────┐
│ a   ┆ b        ┆ d     ┆ x   ┆ y   │
│ --- ┆ ---      ┆ ---   ┆ --- ┆ --- │
│ i64 ┆ f64      ┆ f64   ┆ i64 ┆ str │
╞═════╪══════════╪═══════╪═════╪═════╡
│ 0   ┆ 0.900468 ┆ 1.0   ┆ 0   ┆ A   │
│ 1   ┆ 0.965213 ┆ 2.0   ┆ 1   ┆ A   │
│ 2   ┆ 0.883848 ┆ NaN   ┆ 2   ┆ A   │
│ 3   ┆ 0.623242 ┆ NaN   ┆ 3   ┆ B   │
│ 4   ┆ 0.788115 ┆ 0.0   ┆ 4   ┆ B   │
│ 5   ┆ 0.185598 ┆ -5.0  ┆ 5   ┆ C   │
│ 6   ┆ 0.14701  ┆ -42.0 ┆ 6   ┆ X   │
│ 7   ┆ 0.552827 ┆ null  ┆ 7   ┆ X   │
└─────┴──────────┴───────┴─────┴─────┘