Contextual Linear Bandit Environment
Details
An R6 class for simulating a contextual linear bandit environment with normally distributed rewards.
Methods
- `initialize(k, d, list_betas, sigma = 0.1, binary_rewards = FALSE)`: Constructor.
- `post_initialization()`: Loads correct coefficients based on `sim_id`.
- `get_context(t)`: Returns context and sets internal reward vector.
- `get_reward(t, context_common, action)`: Returns observed reward for an action.
Public fields
rewards
A vector of rewards for each arm in the current round.
betas
Coefficient matrix of the linear reward model (one column per arm).
sigma
Standard deviation of the Gaussian noise added to rewards.
binary
Logical, indicating whether to convert rewards into binary outcomes.
weights
The latent reward scores before noise and/or binarization.
list_betas
A list of coefficient matrices, one per simulation.
sim_id
Index for selecting which simulation's coefficients to use.
class_name
Name of the class for internal tracking.
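To make the relationship between `betas`, `weights`, `sigma`, `rewards`, and `binary` concrete, here is a minimal base-R sketch of one round of a linear bandit. It is not the class's actual implementation: the context distribution, the zero threshold used for binarization, and the variable names `x`, `binary_rewards`, and `observed` are assumptions made for illustration.

```r
# Illustrative sketch only; not the package's actual implementation.
set.seed(1)
k <- 3        # number of arms
d <- 4        # context dimension
sigma <- 0.1  # standard deviation of the Gaussian noise

# Coefficient matrix: d rows, one column per arm (as described for `betas`).
betas <- matrix(rnorm(d * k), nrow = d, ncol = k)

# Hypothetical context draw; the real class may sample contexts differently.
x <- rnorm(d)

# Latent scores before noise, one per arm (analogous to `weights`).
weights <- as.vector(x %*% betas)

# Noisy rewards, one per arm (analogous to `rewards`).
rewards <- weights + rnorm(k, mean = 0, sd = sigma)

# One possible binarization (assumption): threshold the noisy reward at zero.
binary_rewards <- as.integer(rewards > 0)

# Observed reward for a chosen arm, in the spirit of `get_reward()`.
action <- 2
observed <- rewards[action]
```

Only the reward of the chosen arm is revealed to the learner; the full `rewards` vector stays inside the environment.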
Methods
Method new()
Usage
ContextualLinearBandit$new(
  k,
  d,
  list_betas,
  sigma = 0.1,
  binary_rewards = FALSE
)
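A hedged sketch of how the `list_betas` argument might be prepared before calling the constructor. It assumes each list element is a `d` x `k` matrix (one column per arm, one matrix per simulation), which matches the field descriptions but is not confirmed by the signature alone. The variable `n_sims` and the uniform draw are illustrative choices, and the constructor call runs only if the class is already loaded in the session.

```r
set.seed(42)
k <- 3       # number of arms
d <- 5       # context dimension
n_sims <- 2  # hypothetical number of simulations

# One d x k coefficient matrix per simulation (assumed layout).
list_betas <- lapply(seq_len(n_sims), function(i) {
  matrix(runif(d * k, min = -1, max = 1), nrow = d, ncol = k)
})

# Construct the environment only if the class is available in this session.
if (exists("ContextualLinearBandit")) {
  bandit <- ContextualLinearBandit$new(
    k = k,
    d = d,
    list_betas = list_betas,
    sigma = 0.1,
    binary_rewards = FALSE
  )
}
```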