vignettes/doFuture-1-overview.md.rsp
doFuture-1-overview.md.rsp
The doFuture package
provides mechanisms for using the foreach package
together with the future package such that
foreach()
and times()
parallelizes via
any future backend.
The future package provides a generic API for using futures in R. A future is a simple yet powerful mechanism to evaluate an R expression and retrieve its value at some point in time. Futures can be resolved in many different ways depending on which strategy is used. There are various types of synchronous and asynchronous futures to choose from in the future package. Additional future backends are implemented in other packages. For instance, the future.batchtools package provides futures for any type of backend that the batchtools package supports. For an introduction to futures in R, please consult the vignettes of the future package.
The foreach
package implements a map-reduce API with functions
foreach()
and times()
that provides us with
powerful methods for iterating over one or more sets of elements with
the option to do it in parallel.
The doFuture package provides two alternatives for using futures with foreach:
y <- foreach(...) %dofuture% { ... }
registerDoFuture()
+
y <- foreach(...) %dopar% { ... }
.
%dofuture%
The first alternative (recommended), which uses
%dofuture%
, avoids having to use
registerDoFuture()
. The %dofuture%
operator
provides a more consistent behavior than %dopar%
,
e.g. there is a unique set of foreach arguments instead of one per
possible adapter. Identification of globals, random number generation
(RNG), and error handling is handled by the future ecosystem, just like
with other map-reduce solutions such as future.apply
and furrr.
An example is:
library(doFuture)
plan(multisession)
y <- foreach(x = 1:4, y = 1:10) %dofuture% {
z <- x + y
slow_sqrt(z)
}
This alternative is the recommended way to let foreach()
parallelize via the future framework, especially if you start out from
scratch.
See help("%dofuture%", package = "doFuture")
for more
details and examples on this approach.
registerDoFuture()
+
%dopar%
The second alternative is based on the traditional
foreach approach where one registers a foreach adapter
to be used by %dopar%
. A popular adapter is
doParallel::registerDoParallel()
, which parallelizes on the
local machine using the parallel package. This package
provides registerDoFuture()
, which parallelizes using the
future package, meaning any future-compliant parallel
backend can be used.
An example is:
library(doFuture)
registerDoFuture()
plan(multisession)
y <- foreach(x = 1:4, y = 1:10) %dopar% {
z <- x + y
slow_sqrt(z)
}
This alternative is useful if you already have a lot of R code that
uses %dopar%
and you just want to switch to using the
future framework for parallelization. Using
registerDoFuture()
is also useful when you wish to use the
future framework with packages and functions that uses
foreach()
and %dopar%
internally,
e.g. caret,
plyr,
NMF, and
glmnet. It
can also be used to configure the Bioconductor BiocParallel
package, and any package that rely on it, to parallelize via the future
framework.
See help("registerDoFuture", package = "doFuture")
for
more details and examples on this approach.