LinUCB Disjoint Policy with Epsilon-Greedy Exploration
Source: R/armed_bandit_helpers.R
LinUCBDisjointPolicyEpsilon.Rd
Details
Implements the disjoint LinUCB algorithm with upper confidence bounds and epsilon-greedy exploration.
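As an illustration of how epsilon-greedy exploration can be layered on top of disjoint LinUCB scores, the following base R sketch may help. It is not taken from the package source: `select_arm` and its arguments `A`, `b`, and `x` are hypothetical names, and it assumes the textbook disjoint LinUCB formulation in which each arm keeps its own matrix `A` and vector `b`. Ties are broken by taking the first maximum, which may differ from what the class does.

```r
# Hypothetical helper (not the package's internals): choose an arm for one context.
# A: list of d x d matrices, b: list of length-d vectors, one pair per arm.
select_arm <- function(A, b, x, alpha = 1.0, epsilon = 0.1) {
  k <- length(A)
  # Explore: with probability epsilon, pick an arm uniformly at random.
  if (runif(1) < epsilon) {
    return(sample.int(k, 1))
  }
  # Exploit: score each arm by its upper confidence bound.
  ucb <- vapply(seq_len(k), function(a) {
    A_inv <- solve(A[[a]])
    theta <- A_inv %*% b[[a]]                          # per-arm ridge estimate
    pred  <- drop(crossprod(theta, x))                 # predicted reward
    width <- alpha * sqrt(drop(t(x) %*% A_inv %*% x))  # confidence width
    pred + width
  }, numeric(1))
  which.max(ucb)  # first maximum wins on ties (an assumption of this sketch)
}

d <- 2; k <- 3
A <- replicate(k, diag(d), simplify = FALSE)
b <- replicate(k, numeric(d), simplify = FALSE)
b[[2]] <- c(1, 0)  # pretend arm 2 has accumulated reward along feature 1
chosen <- select_arm(A, b, x = c(1, 0), alpha = 1, epsilon = 0)
```

With `epsilon = 0` the choice is purely UCB-driven; raising `epsilon` mixes in uniformly random arms regardless of the scores.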
Methods
- `initialize(alpha = 1.0, epsilon = 0.1)`: Create a new LinUCBDisjointPolicyEpsilon object.
- `set_parameters(context_params)`: Initialize arm-level parameters.
- `get_action(t, context)`: Selects an arm using epsilon-greedy UCB.
- `set_reward(t, context, action, reward)`: Updates internal statistics based on observed reward.
Public fields
alpha
Numeric, exploration parameter controlling the width of the confidence bound.
epsilon
Numeric, probability of selecting a random action (exploration).
class_name
Internal class name.
Methods
Method new()
Initializes the policy with UCB parameter `alpha` and exploration rate `epsilon`.
Usage
LinUCBDisjointPolicyEpsilon$new(alpha = 1, epsilon = 0.1)
Method set_reward()
Updates internal statistics using the observed reward for the selected arm.
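For intuition, the per-arm update in the textbook disjoint LinUCB formulation can be sketched as below. This is an assumption about what "internal statistics" means here, not a copy of the method body: `update_arm`, `A`, `b`, and `x` are hypothetical names, and only the selected arm's statistics change.

```r
# Hypothetical helper (not the package's internals): update one arm after a reward.
# A: list of d x d matrices, b: list of length-d vectors, one pair per arm.
update_arm <- function(A, b, x, action, reward) {
  A[[action]] <- A[[action]] + tcrossprod(x)  # add the outer product x %*% t(x)
  b[[action]] <- b[[action]] + reward * x     # add the reward-weighted context
  list(A = A, b = b)
}

d <- 2
A <- list(diag(d), diag(d))
b <- list(numeric(d), numeric(d))
state <- update_arm(A, b, x = c(1, 2), action = 1, reward = 0.5)
```

The function returns a new state instead of modifying its inputs, which keeps the sketch easy to test; an R6 class would typically store these statistics in place.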