Collect and profile a lazy query
Description
Runs the query and returns a list containing the materialized DataFrame and a second DataFrame with profiling information for each node that was executed.
Usage
<LazyFrame>$profile(
  ...,
  type_coercion = TRUE,
  `_type_check` = TRUE,
  predicate_pushdown = TRUE,
  projection_pushdown = TRUE,
  simplify_expression = TRUE,
  slice_pushdown = TRUE,
  comm_subplan_elim = TRUE,
  comm_subexpr_elim = TRUE,
  cluster_with_columns = TRUE,
  collapse_joins = TRUE,
  no_optimization = FALSE,
  `_check_order` = TRUE,
  show_plot = FALSE,
  truncate_nodes = 0
)
Arguments
| Argument | Description |
|---|---|
| ... | These dots are for future extensions and must be empty. |
| type_coercion | A logical. If TRUE, apply the type coercion optimization. |
| predicate_pushdown | A logical. If TRUE, apply the predicate pushdown optimization. |
| projection_pushdown | A logical. If TRUE, apply the projection pushdown optimization. |
| simplify_expression | A logical. If TRUE, apply the simplify expression optimization. |
| slice_pushdown | A logical. If TRUE, apply the slice pushdown optimization. |
| comm_subplan_elim | A logical. If TRUE, try to cache branching subplans that occur on self-joins or unions. |
| comm_subexpr_elim | A logical. If TRUE, try to cache common subexpressions. |
| cluster_with_columns | A logical. If TRUE, combine sequential independent calls to with_columns. |
| collapse_joins | A logical. If TRUE, collapse a join and filters into a faster join. |
| no_optimization | A logical. If TRUE, turn off (certain) optimizations. |
| \_check_order, \_type_check | For internal use only. |
| show_plot | A logical. If TRUE, show a Gantt chart of the profiling result. |
| truncate_nodes | Truncate the label lengths in the Gantt chart to this number of characters. If 0 (default), do not truncate. |
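The optimization flags can be toggled per call to see how they affect the executed plan. The following is a minimal sketch, assuming a small illustrative query (the `query` object and its filter are made up for this example) and assuming the returned list is indexed positionally, as the printed output in the Examples suggests. It profiles the same query with the defaults and with `no_optimization = TRUE`.

```r
library(polars)

# Illustrative data; not part of the original documentation.
lf <- pl$LazyFrame(
  a = c("a", "b", "a", "b", "b", "c"),
  b = 1:6
)

query <- lf$filter(pl$col("b") > 2)$select(pl$col("a", "b"))

# Default: all optimizations enabled.
optimized <- query$profile()

# Same query with (certain) optimizations turned off.
unoptimized <- query$profile(no_optimization = TRUE)

# The collected data should be the same; the timing tables may differ.
optimized[[2]]
unoptimized[[2]]
```

Timings vary from run to run, so a single pair of profiles is only a rough indication of what an optimization buys.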
Details
The units of the timings are microseconds.
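Because `start` and `end` are offsets in microseconds, the time spent in each node is their difference. A minimal sketch, assuming the timings DataFrame is the second element of the returned list and has the `node`, `start`, and `end` columns shown in the Examples; the `duration_ms` column name is chosen here for illustration.

```r
library(polars)

# Illustrative data; not part of the original documentation.
lf <- pl$LazyFrame(
  a = c("a", "b", "a", "b", "b", "c"),
  b = 1:6
)

res <- lf$group_by("a")$agg(pl$col("b")$sum())$profile()

# Per-node duration, converted from microseconds to milliseconds.
timings <- res[[2]]$with_columns(
  duration_ms = (pl$col("end") - pl$col("start")) / 1000
)
timings
```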
Value
List of two DataFrames: one with the collected result, the other with the timings of each step. If show_plot = TRUE, then the plot is also stored in the list.
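The two elements can be pulled out of the list by position. The sketch below assumes the list is unnamed (the printed output in the Examples shows `[[1]]` and `[[2]]`) and uses a made-up query; it also checks that the first element matches what `$collect()` returns for the same query.

```r
library(polars)

# Illustrative data; not part of the original documentation.
lf <- pl$LazyFrame(b = 1:6, c = 6:1)
query <- lf$filter(pl$col("b") > 2)

res <- query$profile()
result <- res[[1]]   # materialized DataFrame
timings <- res[[2]]  # one row per executed node

# The profiled result should equal a regular collect of the same query.
identical(as.data.frame(result), as.data.frame(query$collect()))
```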
See Also
- $collect(): regular collect.
- $sink_parquet(): streams query to a Parquet file.
- $sink_ipc(): streams query to an Arrow file.
Examples
library("polars")

lf <- pl$LazyFrame(
  a = c("a", "b", "a", "b", "b", "c"),
  b = 1:6,
  c = 6:1
)

lf$group_by("a", .maintain_order = TRUE)$agg(
  pl$all()$sum()
)$sort("a")$profile()
#> [[1]]
#> shape: (3, 3)
#> ┌─────┬─────┬─────┐
#> │ a   ┆ b   ┆ c   │
#> │ --- ┆ --- ┆ --- │
#> │ str ┆ i32 ┆ i32 │
#> ╞═════╪═════╪═════╡
#> │ a   ┆ 4   ┆ 10  │
#> │ b   ┆ 11  ┆ 10  │
#> │ c   ┆ 6   ┆ 1   │
#> └─────┴─────┴─────┘
#>
#> [[2]]
#> shape: (3, 3)
#> ┌─────────────────────────┬───────┬──────┐
#> │ node                    ┆ start ┆ end  │
#> │ ---                     ┆ ---   ┆ ---  │
#> │ str                     ┆ u64   ┆ u64  │
#> ╞═════════════════════════╪═══════╪══════╡
#> │ optimization            ┆ 0     ┆ 1962 │
#> │ group_by_partitioned(a) ┆ 1962  ┆ 7333 │
#> │ sort(a)                 ┆ 7423  ┆ 8258 │
#> └─────────────────────────┴───────┴──────┘