Generate Data to use in COMMA Functions
COMMA_data.Rd
Usage
COMMA_data(
  sample_size,
  x_mu,
  x_sigma,
  z_shape,
  c_shape,
  interaction_indicator,
  outcome_distribution,
  true_beta,
  true_gamma,
  true_theta
)
Arguments
- sample_size
An integer specifying the sample size of the generated data set.
- x_mu
A numeric value specifying the mean of x predictors generated from a Normal distribution.
- x_sigma
A positive numeric value specifying the standard deviation of x predictors generated from a Normal distribution.
- z_shape
A positive numeric value specifying the shape parameter of z predictors generated from a Gamma distribution.
- c_shape
A positive numeric value specifying the shape parameter of c covariates generated from a Gamma distribution.
- interaction_indicator
A logical value indicating if an interaction between x and m should be used to generate the outcome variable, y.
- outcome_distribution
A character string specifying the distribution of the outcome variable. Options are "Bernoulli", "Normal", or "Poisson".
- true_beta
A column matrix of \(\beta\) parameter values (intercept, slope) used to generate data in the true mediator mechanism.
- true_gamma
A numeric matrix of \(\gamma\) parameters used to generate data in the observed mediator mechanism. The rows of the gamma matrix correspond to intercept (row 1) and slope (row 2) terms; the columns correspond to the true mediator categories \(M \in \{1, 2\}\).
- true_theta
A column matrix of \(\theta\) parameter values (intercept, slope coefficient for x, slope coefficient for m, slope coefficient for c, and, optionally, slope coefficient for xm if the interaction is used) used to generate data in the outcome mechanism.
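The constraints stated in the argument descriptions can be checked before calling COMMA_data. The following is a minimal sketch only: check_comma_args is a hypothetical helper, not part of the COMMA package, and it assumes that the sample size must be positive and that true_theta has four rows without the interaction term and five rows with it.

```r
# Hypothetical helper (not part of COMMA): validates inputs against the
# documented argument descriptions before they are passed to COMMA_data().
check_comma_args <- function(sample_size, x_sigma, z_shape, c_shape,
                             interaction_indicator, outcome_distribution,
                             true_gamma, true_theta) {
  stopifnot(is.logical(interaction_indicator),
            length(interaction_indicator) == 1)

  # Assumption: intercept + x + m + c slopes, plus xm when interacting
  expected_theta_rows <- if (interaction_indicator) 5L else 4L

  stopifnot(
    is.numeric(sample_size), length(sample_size) == 1,
    sample_size == round(sample_size), sample_size > 0,
    x_sigma > 0, z_shape > 0, c_shape > 0,
    outcome_distribution %in% c("Bernoulli", "Normal", "Poisson"),
    # Rows: intercept and slope; columns: true mediator categories 1 and 2
    is.matrix(true_gamma), nrow(true_gamma) == 2, ncol(true_gamma) == 2,
    is.matrix(true_theta), ncol(true_theta) == 1,
    nrow(true_theta) == expected_theta_rows
  )
  invisible(TRUE)
}

# Passes silently for settings shaped like those in the Examples section
check_comma_args(
  sample_size = 10000, x_sigma = 1, z_shape = 1, c_shape = 1,
  interaction_indicator = FALSE, outcome_distribution = "Bernoulli",
  true_gamma = matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE),
  true_theta = matrix(c(1, 1.5, -2, -.2), ncol = 1)
)
```

The check on true_beta is deliberately omitted, since its required length is not spelled out beyond "(intercept, slope)".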
Value
COMMA_data returns a list of generated data elements:
- obs_mediator
A vector of observed mediator values.
- true_mediator
A vector of true mediator values.
- outcome
A vector of outcome values.
- x
A vector of generated predictor values in the true mediator mechanism, from the Normal distribution.
- z
A vector of generated predictor values in the observed mediator mechanism from the Gamma distribution.
- c
A vector of generated covariates.
- x_design_matrix
The design matrix for the x predictor.
- z_design_matrix
The design matrix for the z predictor.
- c_design_matrix
The design matrix for the c predictor.
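To give a rough picture of what the predictor and design-matrix elements might look like, here is a base-R sketch that draws x, z, and c from the distributions named in the Arguments section. It does not reproduce COMMA_data's internals: the Gamma rate (left at rgamma's default of 1) and the intercept-plus-predictor layout of the design matrix are assumptions, and the variable names are illustrative.

```r
set.seed(1)
n <- 5                      # small illustrative sample size
x_mu <- 0
x_sigma <- 1
z_shape <- 1
c_shape <- 1

# Predictors drawn from the documented distributions
x <- rnorm(n, mean = x_mu, sd = x_sigma)
z <- rgamma(n, shape = z_shape)      # assumption: rate = 1 (rgamma default)
c_cov <- rgamma(n, shape = c_shape)  # named c_cov to avoid masking base::c

# Assumed design-matrix layout: intercept column, then the predictor
x_design <- cbind(intercept = 1, x = x)
dim(x_design)
#> [1] 5 2
```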
Examples
set.seed(20240709)
sample_size <- 10000
n_cat <- 2 # Number of categories in the binary mediator
# Data generation settings
x_mu <- 0
x_sigma <- 1
z_shape <- 1
c_shape <- 1
# True parameter values (gamma terms set the misclassification rate)
true_beta <- matrix(c(1, -2, .5), ncol = 1)
true_gamma <- matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE)
true_theta <- matrix(c(1, 1.5, -2, -.2), ncol = 1)
example_data <- COMMA_data(sample_size, x_mu, x_sigma, z_shape, c_shape,
                           interaction_indicator = FALSE,
                           outcome_distribution = "Bernoulli",
                           true_beta, true_gamma, true_theta)
head(example_data$obs_mediator)
#> [1] 1 2 2 1 2 2
head(example_data$true_mediator)
#> [1] 1 1 2 1 1 2
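Comparing the observed and true mediator vectors is one simple way to see how much misclassification the chosen gamma values produced. The sketch below uses only the six values printed above as stand-in data, so it runs without the package; with real output, example_data$obs_mediator and example_data$true_mediator would be substituted.

```r
# Stand-in data: the first six values printed by head() above
obs_mediator  <- c(1, 2, 2, 1, 2, 2)
true_mediator <- c(1, 1, 2, 1, 1, 2)

# Share of observations whose observed category differs from the true one
misclass_rate <- mean(obs_mediator != true_mediator)
misclass_rate
#> [1] 0.3333333

# Cross-tabulation of true versus observed categories
confusion <- table(true = true_mediator, observed = obs_mediator)
confusion
```

Six observations are far too few to estimate a rate; the full vectors from the generated data set should be used in practice.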