NEWS.md
Add %dofuture%
operator, which can be used like %dopar%
, but without the need for registerDoFuture()
, e.g. y <- foreach(x = 1:3) %dofuture% { slow_fcn(x) }
. One advantage, contrary to using %dopar%
, is that a developer then has full control on foreach()
; with %dopar%
the exact behavior depends also on what %dopar%
adapter the user has registered. Another advantage is that %dofuture%
can hand over random number generation, identification of globals, and error handling to the future ecosystem, which result in a more predictable and concise behavior, similar to that provided by future.apply and furrr.
Now future operators such as %stdout%
and %conditions%
can be used to control the corresponding options.future
arguments, e.g. y <- foreach(i = 1:3) %dopar% { my_fun(i) } %stdout% FALSE
is the same as y <- foreach(i = 1:3, .options.future = list(stdout = FALSE)) %dopar% { my_fun(i) }
.
Add withDoRNG()
for evaluating a foreach %dopar%
expression with doRNG::registerDoRNG()
temporarily set.
registerDoFuture()
with popular packages (e.g. BiocParallel, foreach, plyr, caret, glmnet, NSP, and TSP) relying on foreach to a separate doFuture.tests.extra package. This reduced the size of the package tarball from 140 kB to 40 kB.Now registerDoFuture()
returns the previously set foreach backend, making it possible to reset the the foreach backend to the previous settings.
Now doFuture recognizes when it is called via the BiocParallel package in which case it skips the check whether or not RNG was used by mistake.
Add option doFuture.rng.onMisuse
which can be used to temporarily override option future.rng.onMisuse
when the doFuture adaptor is running.
Add option doFuture.workarounds
, which can be set by environment variable R_DOFUTURE_WORKAROUNDS
when the package is loaded.
Adding "BiocParallel.DoParam.errors"
to option doFuture.workarounds
will prefix RngFutureError
messages with "task <index> failed - "
in order for such errors to be recognized by the BiocParallel DoParam backend.
foreach()
argument .options.snow = list(preschedule = <logical>)
is now acknowledged as a fallback to analogous argument .options.multicore
- two arguments defined by the doMC and the doSNOW packages and also used by the doParallel package. As before, argument .options.future
will always take precedence if list(scheduling = <logical>/<numeric>)
or list(chunk.size = <integer>)
is given.
Warnings and errors produced when using the RNG without using %dorng%
of the doRNG package are now tailored to the doFuture package.
foreach()
argument .noexport
was completely ignored by doFuture.Now doFuture sets a label on each future that reflects its name and the index of the chunk, e.g. "doFuture-3"
.
doFuture will now detect when doRNG is in use allowing underlying futures to skip the test of incorrectly generated random numbers - an optional validation of parallel RNG that will be added to future (>= 1.16.0).
.options.future = list(chunk.size = <count>)
to foreach()
.The foreach()
argument .options.future
(a named list) can already be used to control whether “chunking” should take place or not, and if so, how much, e.g. .options.future = list(scheduling = 2.0))
. As an alternative to scheduling
, this can now be specified by chunk.size
- the number of elements processed per future (“chunk”). In R 3.5.0, the parallel package introduced argument chunk.size
.
Elements can be processed in random order by setting attribute ordering
of .options.future
elements chunk.size
or scheduling
, e.g. .options.future = list(chunk.size = structure(TRUE, ordering = "random"))
. This improve load balancing in cases where there is a correlation between processing time and ordering of the elements. Note that the order of the returned values is not affected when randomizing the processing order.
Passing argument .options.future = list(stdout = ...)
can be used to to control how standard output should be relayed. See ?future::Future
for further details. Analogously, .options.future = list(conditions = ...)
can be used to control how messages and warnings are relayed, if at all.
Debug messages are now prepended with a timestamp.
foreach()
iterates over, e.g. foreach(f = F, g = G) %dopar% { f() + g() }
.The maximum total size of globals allowed (option future.globals.maxSize
) per future (“chunk”) is now scaled up by the number of elements processed by the future (“chunk”) making the protection approximately invariant to the amount of chunking done by foreach.
Added support for option doFuture.foreach.export = "manual"
, which will strictly follow argument .export
of foreach()
for identifying global variables. None of the future + globals framework for identifying global variables will be used. This is useful for asserting that the .export
argument is correct.
help("doFuture")
and updated example("doFuture")
with a best-practices RNG example.TESTS: The opt-in tests for third-party packages now run their examples with example(..., run.dontrun = TRUE)
to cover even more use cases.
TESTS: Added future.callr to the backend opt-in tested with plyr.
If the doFuture package is missing on a worker, an error on “length(results) == nbr_of_elements is not TRUE” would be produced. Now a more informative error is produced.
foreach(..., .export)
with .export
containing "..."
would produce an error when using globals (<= 0.11.0).
foreach()
would not relay captured conditions as provided by future (>= 0.11.0).
doFuture now respects option future.globals.resolve
instead of being hardcoded to always resolve globals (future.globals.resolve = TRUE
). This makes doFuture consistent with other future frontends.
Added option doFuture.foreach.export
making it possible ignore a faulty .export
argument to foreach()
and instead rely on the future framework to identify globals. For instance, all examples of caret 6.0-77 works with doFuture and any backend when setting this option to "automatic"
(or ".export-and-automatic"
) whereas they will only work on forked backends if using ".export"
or the default "automatic-unless-.export"
. If using ".export-and-automatic-with-warning"
, a warning that lists globals potentially missing from the .export
argument is produced
foreach()
code.help("doFuture.options")
.TESTS: The doFuture package gained more opt-in tests for third-party packages across all known future backends. These tests are not performed on the CRAN servers; instead they are performed on Travis CI. Third-party packages that are currently tested are: caret, foreach, glmnet, NMF, plyr, and TSP.
TESTS: Testing global functions that call themselves recursively.
foreach()
option to control whether scheduling (“chunking”) should take place or not, and if so, how granular it should be. This is specified as foreach(..., .options.future = list(scheduling = <value>))
. With scheduling = 1.0
(or equivalently scheduling = TRUE
), the the elements (iterations) will be split up in equally sized chunks such that each backend worker will process exactly one chunk. With scheduling = Inf
(or equivalently scheduling = FALSE
), chunking is disabled, i.e. each worker process exactly one element at the time. If scheduling = 0.0
, then a single workers processes all elements (and the other workers will not be used). If 2.0
, then each worker will process two chunks, and so on. If above option is not set, then .options.multicore = list(preschedule = <logical>))
which is defined by doParallel, is used to mean scheduling = preschedule. If that is not set, then scheduling = 1.0
is used by default..parallel = TRUE
. The default is to test against all of the future strategies that comes with the future package, but it is possible to also test against future.BatchJobs, future.batchtools, and so on. These tests are performed on all possible future backends before each release (as well as via continuous integration).Using a nested foreach()
call would incorrectly produce an error on not being able to locate the iterator variable of the inner-most foreach()
as a global variable.
If a foreach()
call would result in an error, the error thrown would report on “object ‘expr’ not found” and not the actually error message.
%dopar%
backend now processes all elements in chunks such that each backend worker will process a subset of data at once (and only once). This significantly speeds up processing time when iterating over a large number of elements that each has short a processing time.Now the package tests future.batchtools with foreach by itself, in combination with plyr (parallel = TRUE
) as well as with BiocParallel::bplapply()
and friends. Similar tests are already done using future.BatchJobs.
Added test for foreach::times() %dopar% { ... }
. Especially, it is now tested that global variables are properly identified. Note that times()
does not allow you to specify neither .export
nor .packages
so it is not really designed for processing in external R process. Having said this, times()
does indeed work also in those cases when used with doFuture because it internally handles this automatically.
ROBUSTNESS: The package redundancy tests (not run by R CMD check
; needed to be run manually for now) that run 89 plyr examples with the doFuture foreach adapter, now forces testing of .parallel = TRUE
for all plyr functions that support that argument. Each example is run across various future strategies, including ‘sequential’, ‘multicore’, ‘multisession’, and ‘cluster’, as well as ‘batchjobs_local’ and ‘batchtools_local’, if installed. See doFuture 0.2.0 notes below for how to run these tests.
.export
of foreach()
is acknowledged such that if a character vector of variables names to be exported is specified, then those variables and nothing else are exported to the future. If NULL, then automatic lookup of global variables is used.ROBUSTNESS: Added package redundancy tests that runs all examples of the foreach and the plyr packages using doFuture and all known types of futures. These tests are not package tests and need to be run manually. The test scripts are available in package directory path <- system.file("tests2", package="doFuture")
and can be run as source(file.path(path, "plyr", "examples.R"))
.
ROBUSTNESS: Added package tests validating foreach()
on regular as well as future.BatchJobs futures. Same for plyr and BiocParallel apply functions.
help("doFuture")
.foreach::getDoParWorkers()
gives useful information with registerDoFuture()
in most cases. In cases where the number of workers cannot be inferred easily from future::plan()
it will default to returning a large number (= 99).foreach::getDoParName()
and foreach::getDoParVersion()
gives useful information with registerDoFuture()
.