Skip to contents

Estimate Bootstrap Standard Errors using EM

Usage

COMMA_EM_bootstrap_SE(
  parameter_estimates,
  sigma_estimate = 1,
  n_bootstrap,
  n_parallel,
  outcome_distribution,
  interaction_indicator,
  x_matrix,
  z_matrix,
  c_matrix,
  tolerance = 1e-07,
  max_em_iterations = 1500,
  em_method = "squarem"
)

Arguments

parameter_estimates

A column matrix of \(\beta\), \(\gamma\), and \(\theta\) parameter values obtained from a COMMA analysis function. Parameter estimates should be supplied in the following order: 1) \(\beta\) (intercept, slope), 2) \(\gamma\) (intercept and slope from the M = 1 mechanism, intercept and slope from the M = 2 mechanism), and 3) \(\theta\) (intercept, slope, coefficient for x, slope coefficient for m, slope coefficient for c, and, optionally, slope coefficient for xm if using).

sigma_estimate

A numeric value specifying the estimated standard deviation. This value is only required if outcome_distribution is "Normal". Default is 1. For non-Normal outcome distributions, the value should be NULL.

n_bootstrap

A numeric value specifying the number of bootstrap samples to draw.

n_parallel

A numeric value specifying the number of parallel cores to run the computation on.

outcome_distribution

A character string specifying the distribution of the outcome variable. Options are "Bernoulli", "Normal", or "Poisson".

interaction_indicator

A logical value indicating if an interaction between x and m should be used to generate the outcome variable, y.

x_matrix

A numeric matrix of predictors in the true mediator and outcome mechanisms. x_matrix should not contain an intercept and no values should be NA.

z_matrix

A numeric matrix of covariates in the observation mechanism. z_matrix should not contain an intercept and no values should be NA.

c_matrix

A numeric matrix of covariates in the true mediator and outcome mechanisms. c_matrix should not contain an intercept and no values should be NA.

tolerance

A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is 1e-7.

max_em_iterations

A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is 1e-7.

em_method

A character string specifying which EM algorithm will be applied. Options are "em", "squarem", or "pem". The default and recommended option is "squarem".

Value

COMMA_EM_bootstrap_SE returns a list with two elements: 1) bootstrap_df and 2) bootstrap_SE. bootstrap_df is a data frame containing COMMA_EM output for each bootstrap sample. bootstrap_SE is a data frame containing bootstrap standard error estimates for each parameter.

Examples

set.seed(20240709)
sample_size <- 2000

n_cat <- 2 # Number of categories in the binary mediator

# Data generation settings
x_mu <- 0
x_sigma <- 1
z_shape <- 1
c_shape <- 1

# True parameter values (gamma terms set the misclassification rate)
true_beta <- matrix(c(1, -2, .5), ncol = 1)
true_gamma <- matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE)
true_theta <- matrix(c(1, 1.5, -2, -.2), ncol = 1)

example_data <- COMMA_data(sample_size, x_mu, x_sigma, z_shape, c_shape,
                           interaction_indicator = FALSE,
                           outcome_distribution = "Bernoulli",
                           true_beta, true_gamma, true_theta)
                           
beta_start <- matrix(rep(1, 3), ncol = 1)
gamma_start <- matrix(rep(1, 4), nrow = 2, ncol = 2)
theta_start <- matrix(rep(1, 4), ncol = 1)

Mstar = example_data[["obs_mediator"]]
outcome = example_data[["outcome"]]
x_matrix = example_data[["x"]]
z_matrix = example_data[["z"]]
c_matrix = example_data[["c"]]
                           
EM_results <- COMMA_EM(Mstar, outcome, "Bernoulli", FALSE,
                       x_matrix, z_matrix, c_matrix,
                       beta_start, gamma_start, theta_start)
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!
#> Warning: non-integer #successes in a binomial glm!

EM_results
#>    Parameter  Estimates Convergence
#> 1     beta_0  0.8355459        TRUE
#> 2     beta_1 -1.6157072        TRUE
#> 3     beta_2  0.3864456        TRUE
#> 4    gamma11  1.0924272        TRUE
#> 5    gamma21  1.4723205        TRUE
#> 6    gamma12 -0.2875945        TRUE
#> 7    gamma22 -1.7222641        TRUE
#> 8    theta_0  0.8914868        TRUE
#> 9   theta_x1  1.7143072        TRUE
#> 10   theta_m -1.9727559        TRUE
#> 11  theta_c1 -0.1759259        TRUE

EM_SEs <- COMMA_EM_bootstrap_SE(EM_results$Estimates, sigma_estimate = NULL,
                                n_bootstrap = 3,
                                n_parallel = 1,
                                outcome_distribution = "Bernoulli",
                                interaction_indicator = FALSE,
                                x_matrix, z_matrix, c_matrix)
                                  
EM_SEs$bootstrap_SE
#> # A tibble: 11 × 3
#>    Parameter   Mean     SE
#>    <chr>      <dbl>  <dbl>
#>  1 beta_0     0.819 0.166 
#>  2 beta_1    -1.78  0.212 
#>  3 beta_2     0.373 0.119 
#>  4 gamma11    1.32  0.345 
#>  5 gamma12   -0.353 0.137 
#>  6 gamma21    1.19  0.682 
#>  7 gamma22   -1.47  0.542 
#>  8 theta_0    1.02  0.109 
#>  9 theta_c1  -0.175 0.0400
#> 10 theta_m   -2.18  0.139 
#> 11 theta_x1   1.72  0.282