This vignette demonstrates the simulation capabilities included in the cramR package. The simulation code is primarily intended for reproducing experimental results from the associated theoretical papers and for validating the performance of the Cram method under controlled data-generating processes. While not intended for direct use in practical applications, these simulations allow users to benchmark and understand the empirical behavior of the method in synthetic environments.
🎯 What is cram_simulation()
?
The cram_simulation()
function performs
simultaneous policy learning and evaluation under a
known data-generating process (DGP). It is useful
for:
- Benchmarking the performance of the Cram method on controlled simulated datasets
- Measuring empirical bias, variance, and confidence interval coverage of the estimates
- Supporting both synthetic covariate generation from a known DGP
provided by the user (
dgp_X
), and empirical covariate generation based on an input dataset (X
) using row-wise bootstrapping, which approximates the empirical distribution of the observed covariates.
📦 Inputs Overview
You must supply either:
X
: a dataset to bootstrap from (empirical DGP)
ordgp_X
: a function that simulates covariates
You must also define:
-
dgp_D(X)
: treatment assignment function givenX
-
dgp_Y(D, X)
: outcome generation function givenD
andX
📘 Example: Cram Policy Simulation
set.seed(123)
# dgp_X <- function(n) {
# data.table::data.table(
# binary = rbinom(n, 1, 0.5),
# discrete = sample(1:5, n, replace = TRUE),
# continuous = rnorm(n)
# )
# }
n <- 100
X_data <- data.table::data.table(
binary = rbinom(n, 1, 0.5),
discrete = sample(1:5, n, replace = TRUE),
continuous = rnorm(n)
)
dgp_D <- function(X) rbinom(nrow(X), 1, 0.5)
dgp_Y <- function(D, X) {
theta <- ifelse(
X[, binary] == 1 & X[, discrete] <= 2, # Group 1: High benefit
1,
ifelse(X[, binary] == 0 & X[, discrete] >= 4, # Group 3: Negative benefit
-1,
0.1) # Group 2: Neutral effect
)
Y <- D * (theta + rnorm(length(D), mean = 0, sd = 1)) +
(1 - D) * rnorm(length(D)) # Outcome for untreated
return(Y)
}
# Parameters
nb_simulations <- 100
nb_simulations_truth <- 200
batch <- 5
# Perform CRAM simulation
result <- cram_simulation(
X = X_data,
dgp_D = dgp_D,
dgp_Y = dgp_Y,
batch = batch,
nb_simulations = nb_simulations,
nb_simulations_truth = nb_simulations_truth,
sample_size = 500
)
📊 Output Summary
result$raw_results
#> Metric Value
#> 1 Average Proportion Treated 0.52724
#> 2 Average Delta Estimate 0.22597
#> 3 Average Delta Standard Error 0.10424
#> 4 Delta Empirical Bias 0.01224
#> 5 Delta Empirical Coverage 0.96000
#> 6 Variance Delta Empirical Bias 0.00220
#> 7 Average Policy Value Estimate 0.22580
#> 8 Average Policy Value Standard Error 0.10080
#> 9 Policy Value Empirical Bias 0.01210
#> 10 Policy Value Empirical Coverage 0.92000
#> 11 Variance Policy Value Empirical Bias 0.00071
result$interactive_table
Returns a list containing:
-
raw_results
: A summary of key averaged metrics
-
interactive_table
: An interactive HTML widget for quick exploration
Metric | Meaning |
---|---|
Average Proportion Treated | Share of samples treated by learned policy |
Average Delta Estimate | Mean treatment effect (Δ) estimate |
Delta Empirical Bias | Bias of Δ estimate against truth |
Delta Empirical Coverage | CI coverage of Δ estimate |
Average Policy Value Estimate | Mean value of final policy |
Policy Value Empirical Bias | Bias against true policy value |
Policy Value Empirical Coverage | CI coverage of policy value |