LinUCB Disjoint Policy with Epsilon-Greedy Exploration

Details

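Implements the disjoint LinUCB algorithm, which fits a separate linear reward model for each arm and scores arms by an upper confidence bound, combined with epsilon-greedy exploration: with probability epsilon a random arm is selected instead of the arm with the highest bound.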

Methods

- `initialize(alpha = 1.0, epsilon = 0.1)`: Create a new LinUCBDisjointPolicyEpsilon object.
- `set_parameters(context_params)`: Initialize arm-level parameters.
- `get_action(t, context)`: Select an arm using epsilon-greedy UCB.
- `set_reward(t, context, action, reward)`: Update internal statistics based on the observed reward.

Super class

cramR::NA

Public fields

alpha

Numeric, exploration parameter controlling the width of the confidence bound.

epsilon

Numeric, probability of selecting a random action (exploration).

class_name

Internal class name.

Methods


Method new()

Initializes the policy with UCB parameter alpha and exploration rate epsilon.

Usage

LinUCBDisjointPolicyEpsilon$new(alpha = 1, epsilon = 0.1)

Arguments

alpha

Numeric. Controls width of the UCB bonus.

epsilon

Numeric between 0 and 1. Probability of random action selection.


Method set_parameters()

Set arm-specific parameter structures.

Usage

LinUCBDisjointPolicyEpsilon$set_parameters(context_params)

Arguments

context_params

A list with context information, typically including the number of unique features.
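As a rough illustration of what "arm-specific parameter structures" usually means in disjoint LinUCB, the sketch below builds one design matrix `A` (identity) and one response vector `b` (zeros) per arm. The helper `init_arm_params` and its arguments `k` (number of arms) and `d` (number of features) are hypothetical names used only for this example; the package's internal layout may differ.

```r
# Hypothetical helper, not part of cramR: per-arm state for disjoint LinUCB.
init_arm_params <- function(k, d) {
  lapply(seq_len(k), function(arm) {
    list(
      A = diag(1, d),  # d x d design matrix, starts as the identity (ridge prior)
      b = rep(0, d)    # length-d sum of reward-weighted features, starts at zero
    )
  })
}

arms <- init_arm_params(k = 3, d = 2)
length(arms)      # one entry per arm
dim(arms[[1]]$A)  # d x d
```

Keeping a separate `A` and `b` for every arm is what makes the model "disjoint": no coefficients are shared across arms.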


Method get_action()

Selects an arm using epsilon-greedy Upper Confidence Bound (UCB).

Usage

LinUCBDisjointPolicyEpsilon$get_action(t, context)

Arguments

t

Integer time step.

context

A list with contextual features and number of arms.

Returns

A list containing the selected action.
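The selection rule can be sketched in base R as follows. This is a minimal illustration of epsilon-greedy LinUCB under stated assumptions, not the package's code: the function `choose_arm`, the returned field names `choice` and `explored`, and the use of a single feature vector `x` shared by all arms are all hypothetical.

```r
# Hypothetical helper: epsilon-greedy choice over disjoint LinUCB scores.
# arms: list of per-arm lists with A (d x d) and b (length d).
# x:    numeric feature vector of length d.
choose_arm <- function(arms, x, alpha = 1.0, epsilon = 0.1) {
  k <- length(arms)
  # With probability epsilon, ignore the scores and pick uniformly at random.
  if (runif(1) < epsilon) {
    return(list(choice = sample.int(k, 1), explored = TRUE))
  }
  scores <- vapply(arms, function(arm) {
    A_inv <- solve(arm$A)
    theta <- A_inv %*% arm$b                           # ridge estimate for this arm
    mean_reward <- sum(theta * x)                      # predicted reward
    bonus <- alpha * sqrt(drop(t(x) %*% A_inv %*% x))  # confidence width
    mean_reward + bonus
  }, numeric(1))
  list(choice = which.max(scores), explored = FALSE)
}

# Two arms, two features; arm 2 has already accumulated positive reward.
arms <- list(
  list(A = diag(1, 2), b = c(0, 0)),
  list(A = diag(1, 2), b = c(1, 0))
)
greedy <- choose_arm(arms, x = c(1, 0), alpha = 1, epsilon = 0)
random <- choose_arm(arms, x = c(1, 0), alpha = 1, epsilon = 1)
```

With `epsilon = 0` the rule reduces to plain LinUCB; with `epsilon = 1` every choice is uniformly random. A larger `alpha` widens the bonus term and favors arms with little data.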


Method set_reward()

Updates internal statistics using the observed reward for the selected arm.

Usage

LinUCBDisjointPolicyEpsilon$set_reward(t, context, action, reward)

Arguments

t

Integer time step.

context

Contextual features for all arms at time t.

action

A list containing the chosen arm.

reward

A list containing the observed reward for the selected arm.

Returns

Updated internal parameters.
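For orientation, the standard disjoint LinUCB update touches only the chosen arm: it adds the outer product of the feature vector to `A` and the reward-weighted features to `b`. The sketch below assumes that textbook update; `update_arm` and its argument names are hypothetical, and the package's internals may be organized differently.

```r
# Hypothetical helper: textbook disjoint LinUCB update for the chosen arm.
update_arm <- function(arm, x, reward) {
  arm$A <- arm$A + outer(x, x)  # accumulate x x^T
  arm$b <- arm$b + reward * x   # accumulate reward-weighted features
  arm
}

arm <- list(A = diag(1, 2), b = c(0, 0))
arm <- update_arm(arm, x = c(1, 2), reward = 0.5)
arm$A  # identity plus outer(c(1, 2), c(1, 2))
arm$b  # 0.5 * c(1, 2)
```

Arms that were not selected keep their previous `A` and `b`, so their confidence bonus stays comparatively wide until they are tried.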


Method clone()

The objects of this class are cloneable with this method.

Usage

LinUCBDisjointPolicyEpsilon$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.