Monte Carlo power evaluation for experimental designs with user-supplied libraries
Source:R/eval_design_custom_mc.R
eval_design_custom_mc.Rd
Evaluates the power of an experimental design, given its run matrix and the statistical model to be fit to the data, using monte carlo simulation. Simulated data is fit using a user-supplied fitting library and power is estimated by the fraction of times a parameter is significant. Returns a data frame of parameter powers.
Usage
eval_design_custom_mc(
design,
model = NULL,
alpha = 0.05,
nsim,
rfunction,
fitfunction,
pvalfunction,
anticoef,
effectsize = 2,
contrasts = contr.sum,
coef_function = coef,
calceffect = FALSE,
detailedoutput = FALSE,
parameternames = NULL,
advancedoptions = NULL,
progress = TRUE,
parallel = FALSE,
parallel_pkgs = NULL,
...
)
Arguments
- design
The experimental design. Internally,
eval_design_custom_mc
rescales each numeric column to the range [-1, 1].- model
The model used in evaluating the design. If this is missing and the design was generated with skpr, the generating model will be used. It can be a subset of the model used to generate the design, or include higher order effects not in the original design generation. It cannot include factors that are not present in the experimental design.
- alpha
Default `0.05`. The type-I error. p-values less than this will be counted as significant.
- nsim
The number of simulations.
- rfunction
Random number generator function. Should be a function of the form f(X, b), where X is the model matrix and b are the anticipated coefficients.
- fitfunction
Function used to fit the data. Should be of the form f(formula, X, contrasts) where X is the model matrix. If contrasts do not need to be specified for the user supplied library, that argument can be ignored.
- pvalfunction
Function that returns a vector of p-values from the object returned from the fitfunction.
- anticoef
The anticipated coefficients for calculating the power. If missing, coefficients will be automatically generated based on
effectsize
.- effectsize
The signal-to-noise ratio. Default 2. For a gaussian model, and for continuous factors, this specifies the difference in response between the highest and lowest levels of a factor (which are +1 and -1 after normalization). More precisely: If you do not specify
anticoef
, the anticipated coefficients will be half ofeffectsize
. If you do specifyanticoef
,effectsize
will be ignored.- contrasts
Default
contr.sum
. Function used to generate the contrasts encoding for categorical variables. If the user has specified their own contrasts for a categorical factor using the contrasts function, those will be used. Otherwise, skpr will use contr.sum.- coef_function
Function that, when applied to a fitfunction return object, returns the estimated coefficients.
- calceffect
Default `FALSE`. Calculates effect power for a Type-III Anova (using the car package) using a Wald test. this ratio can be a vector specifying the variance ratio for each subplot. Otherwise, it will use a single value for all strata. To work, the fit returned by `fitfunction` must have a method compatable with the car package.
- detailedoutput
Default `FALSE`. If `TRUE`, return additional information about evaluation in results.
- parameternames
Vector of parameter names if the coefficients do not correspond simply to the columns in the model matrix (e.g. coefficients from an MLE fit).
- advancedoptions
Default `NULL`. Named list of advanced options. `advancedoptions$anovatype` specifies the Anova type in the car package (default type `III`), user can change to type `II`). `advancedoptions$anovatest` specifies the test statistic if the user does not want a `Wald` test–other options are likelyhood-ratio `LR` and F-test `F`. `advancedoptions$progressBarUpdater` is a function called in non-parallel simulations that can be used to update external progress bar.`advancedoptions$GUI` turns off some warning messages when in the GUI. If `advancedoptions$save_simulated_responses = TRUE`, the dataframe will have an attribute `simulated_responses` that contains the simulated responses from the power evaluation. `advancedoptions$ci_error_conf` will set the confidence level for power intervals, which are printed when `detailedoutput = TRUE`.
- progress
Default `TRUE`. Whether to include a progress bar.
- parallel
Default `FALSE`. If `TRUE`, the power simulation will use all but one of the available cores. If the user wants to set the number of cores manually, they can do this by setting `options("cores")` to the desired number (e.g. `options("cores" = parallel::detectCores())`). NOTE: If you have installed BLAS libraries that include multicore support (e.g. Intel MKL that comes with Microsoft R Open), turning on parallel could result in reduced performance.
- parallel_pkgs
A vector of strings listing the external packages to be included in each parallel worker.
- ...
Additional arguments.
Value
A data frame consisting of the parameters and their powers. The parameter estimates from the simulations are stored in the 'estimates' attribute.
Examples
#To demonstrate how a user can use their own libraries for Monte Carlo power generation,
#We will recreate eval_design_survival_mc using the eval_design_custom_mc framework.
#To begin, first let us generate the same design and random generation function shown in the
#eval_design_survival_mc examples:
basicdesign = expand.grid(a = c(-1, 1), b = c("a","b","c"))
design = gen_design(candidateset = basicdesign, model = ~a + b + a:b, trials = 100,
optimality = "D", repeats = 100)
#Random number generating function
rsurvival = function(X, b) {
Y = rexp(n = nrow(X), rate = exp(-(X %*% b)))
censored = Y > 1
Y[censored] = 1
return(survival::Surv(time = Y, event = !censored, type = "right"))
}
#We now need to tell the package how we want to fit our data,
#given the formula and the model matrix X (and, if needed, the list of contrasts).
#If the contrasts aren't required, "contrastslist" should be set to NULL.
#This should return some type of fit object.
fitsurv = function(formula, X, contrastslist = NULL) {
return(survival::survreg(formula, data = X, dist = "exponential"))
}
#We now need to tell the package how to extract the p-values from the fit object returned
#from the fit function. This is how to extract the p-values from the survreg fit object:
pvalsurv = function(fit) {
return(summary(fit)$table[, 4])
}
#And now we evaluate the design, passing the fitting function and p-value extracting function
#in along with the standard inputs for eval_design_mc.
#This has the exact same behavior as eval_design_survival_mc for the exponential distribution.
eval_design_custom_mc(design = design, model = ~a + b + a:b,
alpha = 0.05, nsim = 100,
fitfunction = fitsurv, pvalfunction = pvalsurv,
rfunction = rsurvival, effectsize = 1)
#> parameter type power
#> 1 (Intercept) custom.parameter.power.mc 0.83
#> 2 a custom.parameter.power.mc 0.77
#> 3 b1 custom.parameter.power.mc 0.27
#> 4 b2 custom.parameter.power.mc 0.64
#> 5 a:b1 custom.parameter.power.mc 0.26
#> 6 a:b2 custom.parameter.power.mc 0.54
#> ===============Evaluation Info===============
#> • Alpha = 0.05 • Trials = 100 • Blocked = FALSE
#> • Evaluating Model = ~a + b + a:b
#> • Anticipated Coefficients = c(0.500, 0.500, 0.500, -0.500, 0.500, -0.500)
#We can also use skpr's framework for parallel computation to automatically parallelize this
#to speed up computation
if (FALSE) eval_design_custom_mc(design = design, model = ~a + b + a:b,
alpha = 0.05, nsim = 1000,
fitfunction = fitsurv, pvalfunction = pvalsurv,
rfunction = rsurvival, effectsize = 1,
parallel = TRUE, parallel_pkgs = "survival")
# \dontrun{}