Compute the probability of the latent true outcome \(Y \in \{1, 2\}\) for each of the \(i = 1, \dots, n\) subjects. For category 1, \(P(Y_i = 1 | X_i) = \frac{\exp(X_i \beta)}{1 + \exp(X_i \beta)}\); for category 2, \(P(Y_i = 2 | X_i) = 1 - P(Y_i = 1 | X_i)\).

Usage

true_classification_prob(beta_matrix, x_matrix)

Arguments

beta_matrix

A numeric column matrix of estimated regression parameters for the true outcome mechanism, Y (true outcome) ~ X (predictor matrix of interest), obtained from COMBO_EM or COMBO_MCMC.

x_matrix

A numeric matrix of covariates in the true outcome mechanism. x_matrix should not contain an intercept column, so beta_matrix has one more row (the intercept) than x_matrix has columns.

Value

true_classification_prob returns a data frame containing three columns. The first column, Subject, is the subject ID, from \(1\) to \(n\), where \(n\) is the sample size, or equivalently, the number of rows in x_matrix. The second column, Y, is the true, latent outcome category \(Y \in \{1, 2\}\). The last column, Probability, is \(P(Y_i = j | X_i)\) for each subject and each true, latent outcome category \(j\), where \(P(Y_i = 1 | X_i) = \frac{\exp(X_i \beta)}{1 + \exp(X_i \beta)}\) and \(P(Y_i = 2 | X_i) = 1 - P(Y_i = 1 | X_i)\).

Examples

set.seed(123)
sample_size <- 1000
cov1 <- rnorm(sample_size)
cov2 <- rnorm(sample_size, 1, 2)
x_matrix <- matrix(c(cov1, cov2), nrow = sample_size, byrow = FALSE)
estimated_betas <- matrix(c(1, -1, .5), ncol = 1)
P_Y <- true_classification_prob(estimated_betas, x_matrix)
head(P_Y)
#>   Subject Y Probability
#> 1       1 1   0.7435833
#> 2       2 1   0.6660164
#> 3       3 1   0.4808373
#> 4       4 1   0.7853830
#> 5       5 1   0.2352985
#> 6       6 1   0.6954044
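
To show how the Probability column relates to the logistic formula, here is a minimal base-R sketch of the same computation. It is not the package's source code. The helper name manual_true_prob is hypothetical, and the sketch assumes that an intercept column of ones is prepended to x_matrix, that category 2 is the reference category, and that rows are ordered by category and then by subject.

```r
# Hypothetical helper for illustration only; not part of the package.
manual_true_prob <- function(beta_matrix, x_matrix) {
  n <- nrow(x_matrix)
  # Assumption: the intercept is added here, so x_matrix carries covariates only.
  design <- cbind(1, x_matrix)
  eta <- as.vector(design %*% beta_matrix)
  # plogis(eta) equals exp(eta) / (1 + exp(eta)), i.e. P(Y = 1 | X).
  p1 <- plogis(eta)
  # Assumption: category 2 is the reference, so P(Y = 2 | X) = 1 - P(Y = 1 | X).
  data.frame(Subject = rep(seq_len(n), times = 2),
             Y = rep(1:2, each = n),
             Probability = c(p1, 1 - p1))
}

# Two subjects, two covariates: rows are (0, 0) and (1, 2).
x <- matrix(c(0, 1, 0, 2), nrow = 2)
b <- matrix(c(1, -1, 0.5), ncol = 1)
res <- manual_true_prob(b, x)
print(res)
```

Under these assumptions, each subject's two probabilities sum to one, which makes a quick sanity check on any output of this shape.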