An easy-to-use form of lapply that parallelizes execution by submitting jobs to a SLURM cluster.
superApply(x, FUN, ..., tasks = 1, workingDir = getwd(), packages = NULL, sources = NULL, extraBashLines = NULL, extraScriptLines = "", clean = T, partition = NULL, time = NULL, mem = NULL, proc = NULL, totalProc = NULL, nodes = NULL, email = NULL)
Argument | Description
---|---
x | vector/list - FUN will be applied to the elements of this
FUN | function - function to be applied to each element of x
... | further arguments passed to FUN
tasks | integer - number of individual parallel jobs to execute
workingDir | string - path to a folder that will contain all the temporary files needed for submission, execution, and compilation of individual jobs
packages | character vector - package names to be loaded in individual tasks
sources | character vector - paths to R code to be sourced in individual tasks
extraBashLines | character vector - each element is added as a line to the individual task's bash execution script before R is executed. For instance, here you may want to load R if it is not in your system by default
extraScriptLines | character vector - each element is added as a line to the individual task's R script before lapply starts
clean | logical - if TRUE, all files created in workingDir are deleted
partition | character - partition to use; equivalent to `sbatch`'s `--partition`
time | character - time requested for job execution, one accepted format is "HH:MM:SS"; equivalent to `sbatch`'s `--time`
mem | character - memory requested for job execution, one accepted format is "xG" or "xMB"; equivalent to `sbatch`'s `--mem`
proc | integer - number of processors requested per task; equivalent to `sbatch`'s `--cpus-per-task`
totalProc | integer - number of tasks requested for the job; equivalent to `sbatch`'s `--ntasks`
nodes | integer - number of nodes requested for the job; equivalent to `sbatch`'s `--nodes`
list - results of FUN applied to each element of x
Mimics the functionality of lapply, but implemented so that iterations can be submitted as one or more individual jobs to a SLURM cluster. Each job's batch, err, out, and script files are stored in a temporary folder. Once all jobs have been submitted, the function waits for them to finish; when they are done executing, the results of the individual jobs are compiled into a single list.
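Because superApply is meant to be a drop-in replacement for lapply, its compiled result should match what plain lapply returns for the same x and FUN. The sketch below (using the same myFun as the first example; runs locally, no SLURM cluster required) shows the reference behavior the compiled list is expected to reproduce:

```r
# Plain-lapply reference for the first example below: superApply(1:100, myFun, ...)
# is intended to return the same list, in the same order, as this local call.
myFun <- function(x) rep(x, 3)
localRes <- lapply(1:100, myFun)
# localRes[[5]] is the integer vector 5 5 5, i.e. identical to rep(5L, 3)
```

A successful superApply call with tasks = 4 splits the 100 iterations across 4 SLURM jobs but still compiles the 100 results, in order, into one list like localRes.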
# NOT RUN {
#------------------------
# Parallel execution of 100 function calls using 4 parallel tasks
myFun <- function(x) {
    #Sys.sleep(10)
    return(rep(x, 3))
}
dir.create("~/testSap")
sapOut <- superApply(1:100, FUN = myFun, tasks = 4, workingDir = "~/testSap", time = "60", mem = "1G")

#------------------------
# Parallel execution of 100 function calls using 100 parallel tasks
sapOut <- superApply(1:100, FUN = myFun, tasks = 100, workingDir = "~/testSap", time = "60", mem = "1G")

#------------------------
# Parallel execution where a package is required in function calls
myFun <- function(x) {
    return(ggplot(data.frame(x = 1:100, y = (1:100) * x), aes(x = x, y = y)) + geom_point() + ylim(0, 1e4))
}
dir.create("~/testSap")
sapOut <- superApply(1:100, FUN = myFun, tasks = 4, workingDir = "~/testSap", packages = "ggplot2", time = "60", mem = "1G")

#------------------------
# Parallel execution where R has to be loaded in the system (e.g. in bash `module load R`)
sapOut <- superApply(1:100, FUN = myFun, tasks = 4, workingDir = "~/testSap", time = "60", mem = "1G", extraBashLines = "module load R")

#------------------------
# Parallel execution where a source is required in function calls

# Content of ./customRep.R
customRep <- function(x) {
    return(paste("customFunction", rep(x, 3)))
}

# superApply execution
myFun <- function(x) {
    return(customRep(x))
}
dir.create("~/testSap")
sapOut <- superApply(1:100, FUN = myFun, tasks = 4, sources = "./customRep.R", workingDir = "~/testSap", time = "60", mem = "1G")
# }