Contextual Linear Bandit Environment
Details
An R6 class for simulating a contextual linear bandit environment with normally distributed rewards.
Methods
- `initialize(k, d, list_betas, sigma = 0.1, binary_rewards = FALSE)`: Constructor.
- `post_initialization()`: Loads the correct coefficients based on `sim_id`.
- `get_context(t)`: Returns the context and sets the internal reward vector.
- `get_reward(t, context_common, action)`: Returns the observed reward for an action.
Public fields
- `rewards`: A vector of rewards for each arm in the current round.
- `betas`: Coefficient matrix of the linear reward model (one column per arm).
- `sigma`: Standard deviation of the Gaussian noise added to rewards.
- `binary`: Logical, indicating whether to convert rewards into binary outcomes.
- `weights`: The latent reward scores before noise and/or binarization.
- `list_betas`: A list of coefficient matrices, one per simulation.
- `sim_id`: Index for selecting which simulation's coefficients to use.
- `class_name`: Name of the class for internal tracking.
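The fields above describe a linear reward model: latent scores (`weights`) come from the context and `betas`, Gaussian noise with standard deviation `sigma` is added, and the result is optionally binarized. The base R sketch below illustrates one round under stated assumptions. The helper `simulate_round()` is hypothetical and not part of the class, the standard normal context is an assumption, and so is the binarization rule (reward 1 for the arm with the highest noisy score, 0 otherwise); the class may implement these steps differently.

```r
# Hypothetical helper illustrating the reward model; not the class's own code.
simulate_round <- function(betas, sigma = 0.1, binary_rewards = FALSE) {
  d <- nrow(betas)  # number of context features
  k <- ncol(betas)  # number of arms (one column of coefficients per arm)

  # Assumption: the context is drawn from a standard normal distribution.
  context <- rnorm(d)

  # Latent reward scores before noise: one linear score per arm.
  weights <- as.vector(context %*% betas)

  # Add Gaussian noise with standard deviation sigma.
  noisy <- weights + rnorm(k, mean = 0, sd = sigma)

  # Assumption: binarization gives reward 1 to the best noisy arm, 0 to the rest.
  rewards <- if (binary_rewards) {
    as.numeric(seq_len(k) == which.max(noisy))
  } else {
    noisy
  }

  list(context = context, weights = weights, rewards = rewards)
}

set.seed(1)
betas <- matrix(runif(3 * 2, min = -1, max = 1), nrow = 3, ncol = 2)  # d = 3, k = 2
out <- simulate_round(betas, sigma = 0.1)
out_binary <- simulate_round(betas, sigma = 0.1, binary_rewards = TRUE)
```

Here `out$rewards` holds one noisy reward per arm, while `out_binary$rewards` contains a single 1 and otherwise 0 under the assumed binarization rule.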
Methods
Inherited methods
Method new()
Usage
ContextualLinearBandit$new(
  k,
  d,
  list_betas,
  sigma = 0.1,
  binary_rewards = FALSE
)
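As a rough illustration of how the constructor arguments might be prepared, the sketch below builds a `list_betas` with one coefficient matrix per simulation. It assumes `k` is the number of arms, `d` is the number of context features, and each matrix has `d` rows and `k` columns (consistent with "one column per arm"). The variable `n_sims` and the uniform draw of coefficients are illustrative choices, not taken from the package. The constructor call is guarded so the snippet also runs when the class is not loaded.

```r
k <- 3       # assumed: number of arms
d <- 5       # assumed: number of context features
n_sims <- 2  # hypothetical number of simulations

set.seed(42)
# One d x k coefficient matrix per simulation (illustrative random values).
list_betas <- lapply(seq_len(n_sims), function(i) {
  matrix(runif(d * k, min = -1, max = 1), nrow = d, ncol = k)
})

# Only call the constructor if the class is available in the session.
if (exists("ContextualLinearBandit")) {
  bandit <- ContextualLinearBandit$new(
    k = k,
    d = d,
    list_betas = list_betas,
    sigma = 0.1,
    binary_rewards = FALSE
  )
}
```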