Compute Conditional Probability of Each Second-Stage Observed Outcome Given Each True Outcome and First-Stage Observed Outcome, for Every Subject
misclassification_prob2.Rd
Compute the conditional probability of observing second-stage outcome \(Y^{*(2)} \in \{1, 2 \}\) given the latent true outcome \(Y \in \{1, 2 \}\) and the first-stage outcome \(Y^{*(1)} \in \{1, 2\}\) as \(\frac{\text{exp}\{\gamma^{(2)}_{\ell kj0} + \gamma^{(2)}_{\ell kjZ^{(2)}} Z^{(2)}_i\}}{1 + \text{exp}\{\gamma^{(2)}_{\ell kj0} + \gamma^{(2)}_{\ell kjZ^{(2)}} Z^{(2)}_i\}}\) for each of the \(i = 1, \dots, n\) subjects.
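The logistic expression above can be sketched in base R for a single pair of first-stage category \(k\) and true category \(j\). This is a minimal illustration, not the package's implementation: `second_stage_prob` is a hypothetical helper, and it assumes that the first row of each array slice holds the intercept, that the remaining rows hold the slopes for the columns of z2_matrix, and that \(Y^{*(2)} = 2\) is the reference category whose probability is the complement.

```r
# Hypothetical helper, not part of COMBO: evaluates the logistic expression
# for one first-stage category k and one true category j.
second_stage_prob <- function(gamma2_array, z2_matrix, k, j) {
  # Assumption: row 1 of the slice is the intercept, later rows are slopes.
  gamma_kj <- gamma2_array[, k, j]
  # Prepend an intercept column, since z2_matrix itself has none.
  design <- cbind(1, z2_matrix)
  # Linear predictor for every subject.
  eta <- as.vector(design %*% gamma_kj)
  # plogis(eta) equals exp(eta) / (1 + exp(eta)).
  p1 <- plogis(eta)
  # Assumption: category 2 is the reference, so its probability is 1 - p1.
  cbind(Ystar2_1 = p1, Ystar2_2 = 1 - p1)
}
```

Under these assumptions, each row of the result gives the two second-stage probabilities for one subject, and the two entries in a row sum to one.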
Arguments
- gamma2_array
A numeric array of estimated regression parameters for the observation mechanism, \(Y^{*(2)} | Y^{*(1)}, Y\) (second-stage observed outcome, given the first-stage observed outcome and the true outcome) ~ \(Z^{(2)}\) (second-stage misclassification predictor matrix). Rows of the array correspond to parameters for the \(Y^{*(2)} = 1\) observed outcome, with the dimensions of z2_matrix. Columns of the array correspond to the first-stage outcome categories \(k = 1, \dots,\) n_cat. The third dimension of the array corresponds to the true outcome categories \(j = 1, \dots,\) n_cat. The array should be obtained by COMBO_EM or COMBO_MCMC.
- z2_matrix
A numeric matrix of covariates in the second-stage observation mechanism. z2_matrix should not contain an intercept.
Value
misclassification_prob2 returns a data frame containing five columns.
The first column, Subject, represents the subject ID, from \(1\) to n, where n is the sample size, or equivalently, the number of rows in z2_matrix.
The second column, Y, represents a true, latent outcome category \(Y \in \{1, 2 \}\).
The third column, Ystar1, represents a first-stage observed outcome category \(Y^{*(1)} \in \{1, 2 \}\).
The fourth column, Ystar2, represents a second-stage observed outcome category \(Y^{*(2)} \in \{1, 2 \}\).
The last column, Probability, is the value of the expression \(\frac{\text{exp}\{\gamma^{(2)}_{\ell kj0} + \gamma^{(2)}_{\ell kjZ^{(2)}} Z^{(2)}_i\}}{1 + \text{exp}\{\gamma^{(2)}_{\ell kj0} + \gamma^{(2)}_{\ell kjZ^{(2)}} Z^{(2)}_i\}}\) computed for each subject, first-stage observed outcome category, second-stage observed outcome category, and true, latent outcome category.
Examples
set.seed(123)
sample_size <- 1000
cov1 <- rnorm(sample_size)
cov2 <- rnorm(sample_size, 1, 2)
z2_matrix <- matrix(c(cov1, cov2), nrow = sample_size, byrow = FALSE)
estimated_gamma2 <- array(c(1, -1, .5, .2, -.6, 1.5,
-1, .5, -1, -.5, -1, -.5), dim = c(3,2,2))
P_Ystar2_Ystar1_Y <- misclassification_prob2(estimated_gamma2, z2_matrix)
head(P_Ystar2_Ystar1_Y)
#>   Subject Y Ystar1 Ystar2 Probability
#> 1       1 1      1      1   0.7435833
#> 2       2 1      1      1   0.6660164
#> 3       3 1      1      1   0.4808373
#> 4       4 1      1      1   0.7853830
#> 5       5 1      1      1   0.2352985
#> 6       6 1      1      1   0.6954044
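One possible sanity check on a data frame with the five columns described under Value is to sum Probability over the second-stage categories within each Subject, Y, and Ystar1 combination. The sketch below uses only base R; `check_prob_sums` is a hypothetical helper, not a COMBO function. It assumes the data frame holds one row per Ystar2 category for every combination, in which case each sum should be close to 1.

```r
# Hypothetical helper, not part of COMBO: sums Probability over Ystar2
# within each Subject / Y / Ystar1 cell.
check_prob_sums <- function(prob_df) {
  aggregate(Probability ~ Subject + Y + Ystar1, data = prob_df, FUN = sum)
}

# Usage with the object created in the example above (requires COMBO):
# sums <- check_prob_sums(P_Ystar2_Ystar1_Y)
# range(sums$Probability)
```

A range that departs noticeably from 1 would suggest that the array dimensions or the covariate matrix were not laid out as the function expects.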