Combining DataFrames
There are two ways DataFrames can be combined, depending on the use case: join and concat.
Join
Polars supports all types of join (e.g. left, right, inner, outer). Let's take a closer look at how to join two DataFrames into a single DataFrame. Our two DataFrames both have an 'id'-like column: a in df and x in df2. We can use those columns to join the DataFrames in this example.
import numpy as np
import polars as pl

df = pl.DataFrame(
    {
        "a": np.arange(0, 8),
        "b": np.random.rand(8),
        # np.nan is a float NaN; None becomes a Polars null
        "d": [1, 2.0, np.nan, np.nan, 0, -5, -42, None],
    }
)
df2 = pl.DataFrame(
    {
        "x": np.arange(0, 8),
        "y": ["A", "A", "A", "B", "B", "C", "X", "X"],
    }
)
# Match rows where df["a"] equals df2["x"]
joined = df.join(df2, left_on="a", right_on="x")
print(joined)
use polars::prelude::*;
use rand::Rng;

let mut rng = rand::thread_rng();
let df: DataFrame = df!(
    "a" => 0..8,
    "b" => (0..8).map(|_| rng.gen::<f64>()).collect::<Vec<f64>>(),
    "d" => [Some(1.0), Some(2.0), None, None, Some(0.0), Some(-5.0), Some(-42.), None]
).expect("should not fail");
let df2: DataFrame = df!(
    "x" => 0..8,
    "y" => &["A", "A", "A", "B", "B", "C", "X", "X"],
).expect("should not fail");
// Match rows where column "a" of df equals column "x" of df2
let joined = df.join(&df2, ["a"], ["x"], JoinType::Left, None)?;
println!("{}", joined);
shape: (8, 4)
┌─────┬──────────┬───────┬─────┐
│ a ┆ b ┆ d ┆ y │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ f64 ┆ f64 ┆ str │
╞═════╪══════════╪═══════╪═════╡
│ 0 ┆ 0.419112 ┆ 1.0 ┆ A │
│ 1 ┆ 0.248841 ┆ 2.0 ┆ A │
│ 2 ┆ 0.468882 ┆ NaN ┆ A │
│ 3 ┆ 0.507387 ┆ NaN ┆ B │
│ 4 ┆ 0.909377 ┆ 0.0 ┆ B │
│ 5 ┆ 0.40115 ┆ -5.0 ┆ C │
│ 6 ┆ 0.912623 ┆ -42.0 ┆ X │
│ 7 ┆ 0.71882 ┆ null ┆ X │
└─────┴──────────┴───────┴─────┘
To see more examples with other types of joins, go to the User Guide.
Concat
We can also concatenate two DataFrames. Vertical concatenation makes the DataFrame longer; horizontal concatenation makes it wider. Below you can see the result of a horizontal concatenation of our two DataFrames.
shape: (8, 5)
┌─────┬──────────┬───────┬─────┬─────┐
│ a ┆ b ┆ d ┆ x ┆ y │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ f64 ┆ f64 ┆ i64 ┆ str │
╞═════╪══════════╪═══════╪═════╪═════╡
│ 0 ┆ 0.419112 ┆ 1.0 ┆ 0 ┆ A │
│ 1 ┆ 0.248841 ┆ 2.0 ┆ 1 ┆ A │
│ 2 ┆ 0.468882 ┆ NaN ┆ 2 ┆ A │
│ 3 ┆ 0.507387 ┆ NaN ┆ 3 ┆ B │
│ 4 ┆ 0.909377 ┆ 0.0 ┆ 4 ┆ B │
│ 5 ┆ 0.40115 ┆ -5.0 ┆ 5 ┆ C │
│ 6 ┆ 0.912623 ┆ -42.0 ┆ 6 ┆ X │
│ 7 ┆ 0.71882 ┆ null ┆ 7 ┆ X │
└─────┴──────────┴───────┴─────┴─────┘