Get and reset polars options
Description
polars_options() returns a list of options for polars. Options can be set with options(). Note that options must be prefixed with "polars.", e.g. to modify the option strictly_immutable you need to pass options(polars.strictly_immutable =). See below for a description of all options.
polars_options_reset() brings all polars options back to their default values.
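As a minimal sketch of the prefix rule described above, the snippet below uses only base R's options() and getOption(), so it runs even when polars is not attached. The option and value chosen are purely illustrative.

```r
# Set a polars option; note the mandatory "polars." prefix.
# options() returns the previous values, which we keep for later.
old <- options(polars.no_messages = TRUE)

# Read the value back with base R.
prefixed <- getOption("polars.no_messages")

# Without the prefix this is a different (here: unset) option.
unprefixed <- getOption("no_messages")

# Undo the change by passing the saved values back to options().
options(old)
```

Because these are ordinary R options, polars only sees the value the next time it reads the option.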
Usage
polars_options()
polars_options_reset()
Details
The following options are available (in alphabetical order, with the default value in parentheses):
- debug_polars (FALSE): Print additional information to debug Polars.
- do_not_repeat_call (FALSE): Do not print the call causing the error in error messages. The default is to show them.
- int64_conversion ("double"): How should Int64 values be handled when converting a polars object to R?
  - "double" converts the integer values to double.
  - "bit64" uses bit64::as.integer64() to do the conversion (requires the package bit64 to be attached).
  - "string" converts Int64 values to character.
- limit_max_threads (!polars_info()$features$disable_limit_max_threads): See ?pl_thread_pool_size for details. This option should be set before the package is loaded.
- maintain_order (FALSE): Default for the maintain_order argument in <LazyFrame>$group_by() and <DataFrame>$group_by().
- no_messages (FALSE): Hide messages.
- rpool_cap: The maximum number of R sessions that can be used to process R code in the background. See the section "About pool options" below.
- strictly_immutable (TRUE): Keep polars strictly immutable. Polars/arrow is in general pro "immutable objects". Immutability is also classic in R. To mimic the Python-polars API, set this to FALSE.
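Since these are regular R options, the usual base R save-and-restore pattern should apply when an option is only needed temporarily. The sketch below assumes nothing beyond base R. The helper with_polars_options() is hypothetical and not part of polars.

```r
# Hypothetical helper (not part of polars): evaluate `expr` with some
# polars options changed, then restore the previous values.
with_polars_options <- function(new, expr) {
  stopifnot(is.list(new), all(startsWith(names(new), "polars.")))
  old <- options(new)                # options() returns the previous values
  on.exit(options(old), add = TRUE)  # restore even if `expr` fails
  force(expr)                        # `expr` is lazy, so it runs with `new` set
}

before <- getOption("polars.maintain_order")
inside <- with_polars_options(
  list(polars.maintain_order = TRUE),
  getOption("polars.maintain_order")
)
after <- getOption("polars.maintain_order")
```

Here inside is TRUE, while after equals before, whatever the option held (or did not hold) at the start.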
Value
polars_options() returns a named list where the names are option names and values are option values.
polars_options_reset() doesn’t return anything.
About pool options
polars_options()$rpool_active indicates the number of R sessions already spawned in the pool. polars_options()$rpool_cap indicates the maximum number of new R sessions that can be spawned. Whenever a polars thread worker needs a background R session to run R code embedded in a query via $map_batches(…, in_background = TRUE) or $map_elements(…, in_background = TRUE), it obtains any R session idling in the rpool, or spawns a new R session (process) and adds it to the rpool if rpool_cap has not been reached. If rpool_cap has been reached, the thread worker sleeps until an R session is idle.
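The acquisition rule above can be illustrated with a toy model. This is only a sketch of the described behaviour, not the polars implementation: the pool is a pair of plain counters, no R sessions are actually spawned, and the names new_pool(), acquire() and release() are hypothetical.

```r
# Toy model only: `active` and `idle` are counters, not real R sessions.
new_pool <- function(cap) list(cap = cap, active = 0L, idle = 0L)

# A worker asks for a session. Returns the updated pool and what happened.
acquire <- function(pool) {
  if (pool$idle > 0L) {
    pool$idle <- pool$idle - 1L      # take a session that is idling
    action <- "reuse"
  } else if (pool$active < pool$cap) {
    pool$active <- pool$active + 1L  # cap not reached: spawn a new session
    action <- "spawn"
  } else {
    action <- "wait"                 # cap reached: the worker has to wait
  }
  list(pool = pool, action = action)
}

# A worker hands its session back, which makes it idle again.
release <- function(pool) {
  pool$idle <- pool$idle + 1L
  pool
}

pool <- new_pool(cap = 2L)
actions <- character()
for (i in 1:3) {                     # three workers ask in a row
  res <- acquire(pool)
  pool <- res$pool
  actions <- c(actions, res$action)
}
pool <- release(pool)                # one session becomes idle
res <- acquire(pool)                 # the next request can reuse it
actions <- c(actions, res$action)
```

With a cap of 2, the recorded actions are "spawn", "spawn", "wait", and then "reuse" once a session has been released.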
Background R sessions communicate via polars arrow IPC (series/vectors) or R serialize + shared memory buffers via the Rust crate ipc-channel. Multi-process communication has overhead because all data must be serialized/de-serialized and sent via buffers. Using multiple R sessions will likely only give a speed-up in a low-I/O, high-CPU scenario. Native polars query syntax runs in threads and has no such overhead.
Examples
library("polars")
options(polars.maintain_order = TRUE, polars.strictly_immutable = FALSE)
polars_options()
#> Options:
#> ========
#> debug_polars FALSE
#> df_knitr_print auto
#> do_not_repeat_call FALSE
#> int64_conversion double
#> limit_max_threads FALSE
#> maintain_order TRUE
#> no_messages FALSE
#> rpool_active 0
#> rpool_cap 4
#> strictly_immutable FALSE
#>
#> See `?polars_options` for the definition of all options.
# option checks are run when calling polars_options(), not when setting
# options
options(polars.maintain_order = 42, polars.int64_conversion = "foobar")
tryCatch(
  polars_options(),
  error = function(e) print(e)
)
#> <simpleError: Some polars options have an unexpected value:
#> - maintain_order: input must be TRUE or FALSE.
#> - int64_conversion: input must be one of "float", "string", "bit64".
#>
#> More info at `?polars::polars_options`.>