EM-Algorithm Estimation of the Binary Outcome Misclassification Model
Jointly estimate \(\beta\) and \(\gamma\) parameters from the true outcome and observation mechanisms, respectively, in a binary outcome misclassification model.
Usage
COMBO_EM_algorithm(
  Ystar,
  x_matrix,
  z_matrix,
  beta_start,
  gamma_start,
  tolerance = 1e-07,
  max_em_iterations = 1500,
  em_method = "squarem"
)
Arguments
- Ystar
A numeric vector of indicator variables (1, 2) for the observed outcome Y*. There should be no NA terms. The reference category is 2.
- x_matrix
A numeric matrix of covariates in the true outcome mechanism. x_matrix should not contain an intercept and no values should be NA.
- z_matrix
A numeric matrix of covariates in the observation mechanism. z_matrix should not contain an intercept and no values should be NA.
- beta_start
A numeric vector or column matrix of starting values for the \(\beta\) parameters in the true outcome mechanism. The number of elements in beta_start should be equal to the number of columns of x_matrix plus 1.
- gamma_start
A numeric vector or matrix of starting values for the \(\gamma\) parameters in the observation mechanism. In matrix form, the rows of gamma_start correspond to parameters for the Y* = 1 observed outcome, with the number of rows equal to the number of columns of z_matrix plus 1, and the columns correspond to the true outcome categories \(M \in \{1, 2\}\). A numeric vector for gamma_start is obtained by concatenating the gamma matrix, i.e. gamma_start <- c(gamma_matrix).
- tolerance
A numeric value specifying when to stop estimation, based on the difference between successive log-likelihood estimates. The default is 1e-7.
- max_em_iterations
An integer specifying the maximum number of iterations of the EM algorithm. The default is 1500.
- em_method
A character string specifying which EM algorithm will be applied. Options are "em", "squarem", or "pem". The default and recommended option is "squarem".
Value
COMBO_EM_algorithm returns a data frame containing four columns. The first
column, Parameter, identifies the parameter reported in each row.
The next column, Estimates, contains the parameter estimates, followed by the standard
error estimates, SE. The final column, Convergence, reports
whether or not the algorithm converged for a given parameter estimate.
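
One way to work with the returned data frame is sketched below. The helper check_em_output is hypothetical and not part of the package. It assumes the four columns are named exactly Parameter, Estimates, SE, and Convergence, that Convergence can be coerced to logical, and that Wald-type intervals are an acceptable approximation for the model at hand.

```r
# Hypothetical helper (not part of the package): validate the output columns,
# warn on non-convergence, and attach approximate 95% Wald intervals.
check_em_output <- function(results) {
  expected <- c("Parameter", "Estimates", "SE", "Convergence")
  missing_cols <- setdiff(expected, names(results))
  if (length(missing_cols) > 0) {
    stop("Missing expected columns: ", paste(missing_cols, collapse = ", "))
  }

  # Assumption: Convergence coerces to logical, with TRUE meaning converged.
  converged <- all(as.logical(results$Convergence))
  if (!isTRUE(converged)) {
    warning("At least one parameter is not flagged as converged.")
  }

  # Wald-type interval: estimate +/- z * SE (an approximation).
  z <- qnorm(0.975)
  data.frame(
    Parameter = results$Parameter,
    Estimates = results$Estimates,
    Lower = results$Estimates - z * results$SE,
    Upper = results$Estimates + z * results$SE,
    stringsAsFactors = FALSE
  )
}
```

Given a fitted object, for example results <- COMBO_EM_algorithm(...) with the arguments listed under Usage, calling check_em_output(results) returns one row per parameter with the interval bounds added.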