EM-Algorithm Estimation of the Binary Outcome Misclassification Model
COMBO_EM_algorithm.Rd
Jointly estimate \(\beta\) and \(\gamma\) parameters from the true outcome and observation mechanisms, respectively, in a binary outcome misclassification model.
Usage
COMBO_EM_algorithm(
  Ystar,
  x_matrix,
  z_matrix,
  beta_start,
  gamma_start,
  tolerance = 1e-07,
  max_em_iterations = 1500,
  em_method = "squarem"
)
Arguments
- Ystar
A numeric vector of indicator variables (1, 2) for the observed outcome Y*. There should be no NA terms. The reference category is 2.
- x_matrix
A numeric matrix of covariates in the true outcome mechanism. x_matrix should not contain an intercept and no values should be NA.
- z_matrix
A numeric matrix of covariates in the observation mechanism. z_matrix should not contain an intercept and no values should be NA.
- beta_start
A numeric vector or column matrix of starting values for the \(\beta\) parameters in the true outcome mechanism. The number of elements in beta_start should be equal to the number of columns of x_matrix plus 1.
- gamma_start
A numeric vector or matrix of starting values for the \(\gamma\) parameters in the observation mechanism. In matrix form, the gamma_start matrix rows correspond to parameters for the Y* = 1 observed outcome, with the dimensions of z_matrix plus 1, and the gamma parameter matrix columns correspond to the true outcome categories \(M \in \{1, 2\}\). A numeric vector for gamma_start is obtained by concatenating the gamma matrix, i.e. gamma_start <- c(gamma_matrix).
- tolerance
A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is 1e-7.
- max_em_iterations
An integer specifying the maximum number of iterations of the EM algorithm. The default is 1500.
- em_method
A character string specifying which EM algorithm will be applied. Options are "em", "squarem", or "pem". The default and recommended option is "squarem".
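The shape requirements above can be sketched in base R. This is only an illustration under stated assumptions: the data are simulated, the starting values are arbitrary, and the row count of gamma_matrix is taken to be the number of columns of z_matrix plus 1 (one reading of "the dimensions of z_matrix plus 1"). The name gamma_matrix follows the argument description; everything else is hypothetical.

```r
set.seed(123)
n <- 500

# Hypothetical covariates: one column each, with no intercept column.
x_matrix <- matrix(rnorm(n), ncol = 1)
z_matrix <- matrix(rnorm(n), ncol = 1)

# Observed outcome coded 1/2 (2 is the reference category); values are illustrative.
Ystar <- sample(c(1, 2), size = n, replace = TRUE)

# beta_start: number of columns of x_matrix plus 1 (arbitrary starting values).
beta_start <- rep(0.5, ncol(x_matrix) + 1)

# gamma_start in matrix form: assumed to have ncol(z_matrix) + 1 rows and
# one column per true outcome category M in {1, 2}.
gamma_matrix <- matrix(c(1, 0.5, -1, -0.5),
                       nrow = ncol(z_matrix) + 1, ncol = 2)

# Vector form: c() concatenates the matrix column by column.
gamma_start <- c(gamma_matrix)

# Basic checks mirroring the argument requirements.
stopifnot(!anyNA(Ystar), !anyNA(x_matrix), !anyNA(z_matrix))
stopifnot(length(beta_start) == ncol(x_matrix) + 1)
```

Because c() reads a matrix in column-major order, the first ncol(z_matrix) + 1 elements of gamma_start belong to the M = 1 column and the remaining elements to the M = 2 column.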
Value
COMBO_EM_algorithm returns a data frame containing four columns. The first column, Parameter, names the parameter reported in each row. The next column, Estimates, contains the parameter estimates, followed by the standard error estimates, SE. The final column, Convergence, reports whether or not the algorithm converged for a given parameter estimate.
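A call might look like the sketch below. It assumes the function is available from an installed package named COMBO (an assumption, so the call is guarded), and it uses simulated data with hypothetical parameter values. The column names come from the description above; the fitted values are not shown because they depend on the installed version and the simulated data.

```r
set.seed(123)
n <- 500

# Simulated covariates (hypothetical), without intercept columns.
x_matrix <- matrix(rnorm(n), ncol = 1)
z_matrix <- matrix(rnorm(n), ncol = 1)

# Simulate a true outcome Y in {1, 2} from a logistic model (illustrative values).
p_true1 <- plogis(1 + 2 * x_matrix[, 1])
Y <- ifelse(runif(n) < p_true1, 1, 2)

# Simulate the observed outcome Y* given Y, with misclassification depending on z.
p_obs1 <- ifelse(Y == 1,
                 plogis(2 + 0.5 * z_matrix[, 1]),
                 plogis(-2 - 0.5 * z_matrix[, 1]))
Ystar <- ifelse(runif(n) < p_obs1, 1, 2)

# Arbitrary starting values with the lengths the arguments call for.
beta_start <- rep(0.5, ncol(x_matrix) + 1)
gamma_start <- c(matrix(c(1, 0.5, -1, -0.5), nrow = ncol(z_matrix) + 1))

fit <- NULL
# Assumption: the function is exported by a package called "COMBO".
if (requireNamespace("COMBO", quietly = TRUE)) {
  fit <- COMBO::COMBO_EM_algorithm(
    Ystar, x_matrix, z_matrix,
    beta_start, gamma_start,
    tolerance = 1e-7,
    max_em_iterations = 1500,
    em_method = "squarem"
  )
  # Four columns are expected: Parameter, Estimates, SE, Convergence.
  print(fit)
  # Inspect the convergence flag reported alongside each estimate.
  print(fit[["Convergence"]])
}
```

Checking the Convergence column before interpreting Estimates and SE is a reasonable habit, since the algorithm stops at max_em_iterations whether or not the tolerance has been met.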