
This vignette demonstrates the simulation capabilities included in the cramR package. The simulation code is primarily intended for reproducing experimental results from the associated theoretical papers and for validating the performance of the Cram method under controlled data-generating processes. While not intended for direct use in practical applications, these simulations allow users to benchmark and understand the empirical behavior of the method in synthetic environments.

🎯 What is cram_simulation()?

The cram_simulation() function performs simultaneous policy learning and evaluation under a known data-generating process (DGP). It is useful for:

  • Benchmarking the performance of the Cram method on controlled simulated datasets
  • Measuring empirical bias, variance, and confidence interval coverage of the estimates
  • Generating covariates in either of two ways: synthetically, from a known DGP function supplied by the user (dgp_X), or empirically, by row-wise bootstrapping of an input dataset (X), which approximates the empirical distribution of the observed covariates
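
To make the three metrics above concrete, here is a minimal base-R sketch of how empirical bias, variance, and confidence interval coverage are typically computed across simulation replications. It illustrates the generic definitions and is not the internal code of cram_simulation(); the toy estimator (a sample mean), true_value, n_rep, and n_obs are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical setup: a known true value and repeated simulated datasets.
true_value <- 0.5
n_rep      <- 1000   # number of simulation replications (assumed)
n_obs      <- 200    # observations per replication (assumed)

estimates  <- numeric(n_rep)
std_errors <- numeric(n_rep)
for (r in seq_len(n_rep)) {
  y <- rnorm(n_obs, mean = true_value, sd = 1)
  estimates[r]  <- mean(y)                # toy estimator: the sample mean
  std_errors[r] <- sd(y) / sqrt(n_obs)    # its estimated standard error
}

# Empirical bias: average estimate minus the known truth.
empirical_bias <- mean(estimates) - true_value

# Empirical variance: spread of the estimates across replications.
empirical_variance <- var(estimates)

# Empirical coverage: share of 95% Wald intervals that contain the truth.
z <- qnorm(0.975)
covered <- (estimates - z * std_errors <= true_value) &
  (true_value <= estimates + z * std_errors)
empirical_coverage <- mean(covered)

print(c(bias = empirical_bias,
        variance = empirical_variance,
        coverage = empirical_coverage))
```

A well-calibrated estimator should show bias near zero and coverage near the nominal level of the interval.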

📦 Inputs Overview

You must supply one of:

  • X: a dataset to bootstrap from (empirical DGP)
  • dgp_X: a function that simulates covariates

You must also define:

  • dgp_D(X): treatment assignment function given X
  • dgp_Y(D, X): outcome generation function given D and X
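
The sketch below shows how these inputs could fit together to produce one simulated dataset in the empirical case: resample rows of X, then draw treatments and outcomes. It is an illustration in base R under stated assumptions, not the package's internal mechanism. The helper bootstrap_X() and the toy dgp_D and dgp_Y are hypothetical, and a plain data.frame is used so the block runs without extra packages.

```r
set.seed(42)

# Hypothetical observed covariates (base data.frame for self-containment).
X_obs <- data.frame(
  binary     = rbinom(50, 1, 0.5),
  continuous = rnorm(50)
)

# Hypothetical helper: row-wise bootstrap, i.e. resample whole rows
# with replacement so each row keeps its joint covariate values.
bootstrap_X <- function(X, sample_size) {
  idx <- sample(nrow(X), size = sample_size, replace = TRUE)
  X[idx, , drop = FALSE]
}

# Toy treatment assignment and outcome functions with the signatures
# described above: dgp_D(X) and dgp_Y(D, X).
dgp_D <- function(X) rbinom(nrow(X), 1, 0.5)
dgp_Y <- function(D, X) D * X$binary + rnorm(nrow(X))

# One simulated dataset: covariates, then treatment, then outcome.
X_sim <- bootstrap_X(X_obs, sample_size = 200)
D_sim <- dgp_D(X_sim)
Y_sim <- dgp_Y(D_sim, X_sim)

str(list(X = dim(X_sim), D = length(D_sim), Y = length(Y_sim)))
```

Because rows are drawn with replacement, the simulated sample can be larger than the input dataset, as in the example that follows (100 input rows, sample_size = 500).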

📘 Example: Cram Policy Simulation

set.seed(123)

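# Alternative: instead of passing a dataset as X, supply a covariate-generating
# function as dgp_X.
# dgp_X <- function(n) {
#   data.table::data.table(
#     binary     = rbinom(n, 1, 0.5),
#     discrete   = sample(1:5, n, replace = TRUE),
#     continuous = rnorm(n)
#   )
# }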

n <- 100

X_data <- data.table::data.table(
  binary     = rbinom(n, 1, 0.5),
  discrete   = sample(1:5, n, replace = TRUE),
  continuous = rnorm(n)
)

dgp_D <- function(X) rbinom(nrow(X), 1, 0.5)

dgp_Y <- function(D, X) {
  # Conditional treatment effect by covariate group
  theta <- ifelse(
    X[, binary] == 1 & X[, discrete] <= 2,  # Group 1: high benefit (+1)
    1,
    ifelse(X[, binary] == 0 & X[, discrete] >= 4,  # Group 3: negative benefit (-1)
           -1,
           0.1)  # Group 2: small positive effect (+0.1)
  )
  Y <- D * (theta + rnorm(length(D), mean = 0, sd = 1)) +  # Outcome for treated
    (1 - D) * rnorm(length(D))  # Outcome for untreated
  return(Y)
}

# Parameters
nb_simulations <- 100
nb_simulations_truth <- 200
batch <- 5

# Run the Cram simulation
result <- cram_simulation(
  X = X_data,
  dgp_D = dgp_D,
  dgp_Y = dgp_Y,
  batch = batch,
  nb_simulations = nb_simulations,
  nb_simulations_truth = nb_simulations_truth,
  sample_size = 500
)
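
As a sanity check on the outcome DGP used above, the following base-R sketch tabulates the share of units in each effect group and the gain from treating exactly the units with a positive effect. Since the treated outcome has mean theta and the untreated outcome has mean 0, theta is the conditional treatment effect. The helper theta_of() is hypothetical and mirrors the theta logic inside dgp_Y. The covariates are drawn from the same distributions as X_data but stored in a plain data.frame, and the baseline of treating no one is an assumption made for illustration, not necessarily the baseline used by cram_simulation().

```r
set.seed(123)
n_check <- 10000  # large sample so shares are close to population values

# Covariates drawn as in the example, stored in a base data.frame.
X_check <- data.frame(
  binary   = rbinom(n_check, 1, 0.5),
  discrete = sample(1:5, n_check, replace = TRUE)
)

# Hypothetical helper mirroring the theta definition inside dgp_Y.
theta_of <- function(X) {
  ifelse(X$binary == 1 & X$discrete <= 2, 1,
         ifelse(X$binary == 0 & X$discrete >= 4, -1, 0.1))
}

theta <- theta_of(X_check)

# Share of units in each effect group (theta = -1, 0.1, 1).
group_shares <- prop.table(table(theta))
print(round(group_shares, 3))

# Average gain of treating exactly the positive-theta units,
# relative to treating no one (assumed baseline).
oracle_gain <- mean(pmax(theta, 0))
print(oracle_gain)
```

In the population, the high-benefit and negative-benefit groups each have probability 0.5 × 0.4 = 0.2 and the remaining group has probability 0.6, so this gain is 0.2 × 1 + 0.6 × 0.1 = 0.26 under the stated assumptions.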

📊 Output Summary

result$raw_results
#>                                  Metric   Value
#> 1            Average Proportion Treated 0.52724
#> 2                Average Delta Estimate 0.22597
#> 3          Average Delta Standard Error 0.10424
#> 4                  Delta Empirical Bias 0.01224
#> 5              Delta Empirical Coverage 0.96000
#> 6         Variance Delta Empirical Bias 0.00220
#> 7         Average Policy Value Estimate 0.22580
#> 8   Average Policy Value Standard Error 0.10080
#> 9           Policy Value Empirical Bias 0.01210
#> 10      Policy Value Empirical Coverage 0.92000
#> 11 Variance Policy Value Empirical Bias 0.00071
result$interactive_table

cram_simulation() returns a list containing:

  • raw_results: A summary of key averaged metrics
  • interactive_table: An interactive HTML widget for quick exploration

Metric                             Meaning
Average Proportion Treated         Share of samples treated by learned policy
Average Delta Estimate             Mean treatment effect (Δ) estimate
Delta Empirical Bias               Bias of Δ estimate against truth
Delta Empirical Coverage           CI coverage of Δ estimate
Average Policy Value Estimate      Mean value of final policy
Policy Value Empirical Bias        Bias against true policy value
Policy Value Empirical Coverage    CI coverage of policy value
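
When reading the coverage rows, keep in mind that coverage is itself estimated from a finite number of simulations. The sketch below computes the Monte Carlo standard error of a coverage estimate and compares it with the two coverage values reported above. The nominal level of 0.95 is an assumption (the output above does not state it), and the binomial approximation treats the simulations as independent.

```r
nb_simulations <- 100   # number of simulations used in the example
nominal        <- 0.95  # assumed nominal confidence level

# Monte Carlo standard error of a coverage proportion (binomial formula).
mc_se <- sqrt(nominal * (1 - nominal) / nb_simulations)

# Coverage values reported in result$raw_results above.
observed <- c(delta = 0.96, policy_value = 0.92)

# Distance from the nominal level, in Monte Carlo standard errors.
z_scores <- (observed - nominal) / mc_se

print(round(mc_se, 4))
print(round(z_scores, 2))
```

With 100 simulations the standard error is about 0.022, so both reported coverages lie within two Monte Carlo standard errors of 0.95. Increasing nb_simulations narrows this band.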