Ordinary Least Squares Estimation of the Binary Mediator Misclassification Model
Estimate \(\beta\), \(\gamma\), and \(\theta\) parameters from the true mediator, observed mediator, and outcome mechanisms, respectively, in a binary mediator misclassification model using an ordinary least squares correction.
Usage
COMMA_OLS(
  Mstar,
  outcome,
  x_matrix,
  z_matrix,
  c_matrix,
  beta_start,
  gamma_start,
  theta_start,
  tolerance = 1e-07,
  max_em_iterations = 1500,
  em_method = "squarem"
)
Arguments
- Mstar
A numeric vector of indicator variables (1, 2) for the observed mediator M*. There should be no NA terms. The reference category is 2.
- outcome
A vector containing the outcome variables of interest. There should be no NA terms.
- x_matrix
A numeric matrix of predictors in the true mediator and outcome mechanisms. x_matrix should not contain an intercept and no values should be NA.
- z_matrix
A numeric matrix of covariates in the observation mechanism. z_matrix should not contain an intercept and no values should be NA.
- c_matrix
A numeric matrix of covariates in the true mediator and outcome mechanisms. c_matrix should not contain an intercept and no values should be NA.
- beta_start
A numeric vector or column matrix of starting values for the \(\beta\) parameters in the true mediator mechanism. The number of elements in beta_start should be equal to the number of columns of x_matrix and c_matrix plus 1. Starting values should be provided in the following order: intercept, slope coefficient for the x_matrix term, slope coefficient for first column of the c_matrix, ..., slope coefficient for the final column of the c_matrix.
- gamma_start
A numeric vector or matrix of starting values for the \(\gamma\) parameters in the observation mechanism. In matrix form, the gamma_start matrix rows correspond to parameters for the M* = 1 observed mediator, with the dimensions of z_matrix plus 1, and the gamma parameter matrix columns correspond to the true mediator categories \(M \in \{1, 2\}\). A numeric vector for gamma_start is obtained by concatenating the gamma matrix, i.e. gamma_start <- c(gamma_matrix). Starting values should be provided in the following order within each column: intercept, slope coefficient for first column of the z_matrix, ..., slope coefficient for the final column of the z_matrix.
- theta_start
A numeric vector or column matrix of starting values for the \(\theta\) parameters in the outcome mechanism. The number of elements in theta_start should be equal to the number of columns of x_matrix and c_matrix plus 2. Starting values should be provided in the following order: intercept, slope coefficient for the x_matrix term, slope coefficient for the mediator m term, slope coefficient for first column of the c_matrix, ..., slope coefficient for the final column of the c_matrix.
- tolerance
A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is 1e-7.
- max_em_iterations
A numeric value specifying the maximum number of EM iterations to run before estimation stops. The default is 1500.
- em_method
A character string specifying which EM algorithm will be applied. Options are "em", "squarem", or "pem". The default and recommended option is "squarem".
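The length requirements for the starting values follow from the column counts of the covariate matrices. A minimal sketch in base R, assuming small hypothetical covariate matrices (their sizes and the name gamma_matrix are illustrative only, not package defaults):

```r
# Hypothetical covariate matrices, used only to illustrate dimensions
set.seed(1)
n <- 10
x_matrix <- matrix(rnorm(n), ncol = 1)      # one predictor
z_matrix <- matrix(rnorm(n * 2), ncol = 2)  # two observation-mechanism covariates
c_matrix <- matrix(rnorm(n), ncol = 1)      # one covariate

# beta: intercept + x + each column of c
n_beta <- ncol(x_matrix) + ncol(c_matrix) + 1
beta_start <- matrix(rep(1, n_beta), ncol = 1)

# gamma: one row per (intercept + each column of z),
# one column per true mediator category (M = 1, M = 2)
gamma_matrix <- matrix(rep(1, (ncol(z_matrix) + 1) * 2), ncol = 2)
gamma_start <- c(gamma_matrix)  # column-wise concatenation into a vector

# theta: intercept + x + m + each column of c
n_theta <- ncol(x_matrix) + ncol(c_matrix) + 2
theta_start <- matrix(rep(1, n_theta), ncol = 1)

c(beta = length(beta_start),
  gamma = length(gamma_start),
  theta = length(theta_start))
```

With one x column, two z columns, and one c column, this gives 3, 6, and 4 starting values, respectively. Because c() concatenates a matrix column by column, the vector form lists all parameters for M = 1 first, then all parameters for M = 2.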
Value
COMMA_OLS returns a data frame containing four columns. The first
column, Parameter, represents a unique parameter value for each row.
The second column, Estimates, contains the parameter estimates. The third
column, Convergence, reports whether or not the algorithm converged for a
given parameter estimate. The final column, Method, reports
that the estimates are obtained from the "OLS" procedure.
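Because the return value is an ordinary data frame, it can be inspected with base R tools. The sketch below builds a small stand-in data frame with the four documented columns and shows one way to check convergence and pull out a block of parameters. The object name results and its numeric values are placeholders, not output from the function.

```r
# Stand-in for the returned data frame; the values are placeholders
results <- data.frame(
  Parameter = c("beta1", "beta2", "gamma11", "gamma21", "theta0"),
  Estimates = c(0.5, -1.0, 1.2, 0.3, 2.0),
  Convergence = TRUE,
  Method = "OLS",
  stringsAsFactors = FALSE
)

# Did the algorithm report convergence for every parameter?
all_converged <- all(results$Convergence)

# Extract only the beta (true mediator mechanism) rows as a named vector
beta_rows <- grepl("^beta", results$Parameter)
beta_hat <- setNames(results$Estimates[beta_rows],
                     results$Parameter[beta_rows])
beta_hat
```

The same pattern with "^gamma" or "^theta" selects the observation or outcome mechanism parameters.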
Details
Note that this method can only be used for Normal outcome models, and interaction
terms (between x and m) are not supported.
Examples
set.seed(20240709)
sample_size <- 2000
n_cat <- 2 # Number of categories in the binary mediator
# Data generation settings
x_mu <- 0
x_sigma <- 1
z_shape <- 1
c_shape <- 1
# True parameter values (gamma terms set the misclassification rate)
true_beta <- matrix(c(1, -2, .5), ncol = 1)
true_gamma <- matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE)
true_theta <- matrix(c(1, 1.5, -2, 2), ncol = 1)
example_data <- COMMA_data(sample_size, x_mu, x_sigma, z_shape, c_shape,
                           interaction_indicator = FALSE,
                           outcome_distribution = "Normal",
                           true_beta, true_gamma, true_theta)
beta_start <- matrix(rep(1, 3), ncol = 1)
gamma_start <- matrix(rep(1, 4), nrow = 2, ncol = 2)
theta_start <- matrix(rep(1, 4), ncol = 1)
Mstar <- example_data[["obs_mediator"]]
outcome <- example_data[["outcome"]]
x_matrix <- example_data[["x"]]
z_matrix <- example_data[["z"]]
c_matrix <- example_data[["c"]]
OLS_results <- COMMA_OLS(Mstar, outcome,
                         x_matrix, z_matrix, c_matrix,
                         beta_start, gamma_start, theta_start)
OLS_results
#>    Parameter  Estimates Convergence Method
#> 1      beta1  0.8272721        TRUE    OLS
#> 2      beta2 -1.6154039        TRUE    OLS
#> 3      beta3  0.3586729        TRUE    OLS
#> 4    gamma11  1.2279060        TRUE    OLS
#> 5    gamma21  1.3535571        TRUE    OLS
#> 6    gamma12 -0.4846708        TRUE    OLS
#> 7    gamma22 -1.4126826        TRUE    OLS
#> 8     theta0  0.8900722        TRUE    OLS
#> 9    theta_m -1.9041094        TRUE    OLS
#> 10   theta_x  1.5529579        TRUE    OLS
#> 11  theta_c1  2.0215448        TRUE    OLS
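One way to read this output is to line the estimates up against the data-generating values. The sketch below copies the printed estimates by hand and matches them to true_beta, true_gamma, and true_theta by name. Two mappings are assumptions rather than documented facts: that true_theta follows the same order described for theta_start (intercept, x, m, c), and that the gamma labels follow the column-wise order of true_gamma. The object names estimates, truth, and comparison are illustrative.

```r
# Estimates copied from the printed output above
estimates <- c(
  beta1 = 0.8272721, beta2 = -1.6154039, beta3 = 0.3586729,
  gamma11 = 1.2279060, gamma21 = 1.3535571,
  gamma12 = -0.4846708, gamma22 = -1.4126826,
  theta0 = 0.8900722, theta_m = -1.9041094,
  theta_x = 1.5529579, theta_c1 = 2.0215448
)

# True values from the simulation settings, keyed by the same names.
# Assumed order of true_theta: intercept, x, m, c (as for theta_start).
# Assumed gamma labels: column-wise order of true_gamma.
truth <- c(
  beta1 = 1, beta2 = -2, beta3 = 0.5,
  gamma11 = 1, gamma21 = 1, gamma12 = -0.5, gamma22 = -1.5,
  theta0 = 1, theta_x = 1.5, theta_m = -2, theta_c1 = 2
)

# Match by name so the differing order of theta_m and theta_x does not matter
comparison <- data.frame(
  Parameter = names(estimates),
  Estimate = unname(estimates),
  Truth = unname(truth[names(estimates)])
)
comparison$Difference <- comparison$Estimate - comparison$Truth
comparison
```

Matching by name rather than by position matters here because the output lists theta_m before theta_x, while the starting values list the x term before the mediator term.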