Batch Contextual Epsilon-Greedy Policy

Details

Implements an epsilon-greedy exploration strategy for contextual bandits with batched updates.

Super class

cramR::NA

Public fields

epsilon

Probability of selecting a random arm (exploration rate).

batch_size

Number of rounds per batch before updating model parameters.

A_cc

List of Gram matrices (one per arm), used to accumulate sufficient statistics across batches.

b_cc

List of reward-weighted context sums (one per arm), updated batch-wise.

class_name

Internal class name identifier.
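
As a rough illustration of what the `A_cc` and `b_cc` entries hold for a single arm, the base R sketch below accumulates a Gram matrix and a reward-weighted context sum, then solves for a ridge-style coefficient estimate. The identity starting value for the Gram matrix, the toy data, and the variable names are assumptions made for this example; they are not taken from the package source.

```r
# Illustrative only: the identity prior and the toy data are assumptions.
d <- 2                       # number of context features
A <- diag(1, d, d)           # Gram matrix for one arm, started at the identity
b <- rep(0, d)               # reward-weighted context sum for the same arm

# Three (context, reward) observations for this arm.
X <- rbind(c(1, 0), c(0, 1), c(1, 1))
r <- c(1, 2, 3)

for (i in seq_len(nrow(X))) {
  x <- X[i, ]
  A <- A + tcrossprod(x)     # A <- A + x x^T
  b <- b + r[i] * x          # b <- b + r * x
}

# Ridge-style estimate: solve A theta = b.
theta_hat <- solve(A, b)
```

With these inputs the accumulated Gram matrix has 3 on the diagonal and 1 off the diagonal, and the reward-weighted sum is `c(4, 5)`.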

Methods


Method new()

Creates a new Batch Contextual Epsilon-Greedy policy object.

Usage

BatchContextualEpsilonGreedyPolicy$new(epsilon = 0.1, batch_size = 1)

Arguments

epsilon

Numeric between 0 and 1. Probability of random arm selection.

batch_size

Integer. Number of observations between parameter updates.


Method set_parameters()

Initializes the parameter structures for each arm.

Usage

BatchContextualEpsilonGreedyPolicy$set_parameters(context_params)

Arguments

context_params

A list with at least `d` (number of features) and `k` (number of arms).
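
One plausible shape for the per-arm structures is a list of `k` matrices of size `d` by `d` and a list of `k` vectors of length `d`. The helper `init_arm_stats()` below is hypothetical and only mirrors the field names documented above; the identity and zero starting values are assumptions.

```r
# Hypothetical helper, not part of the package API.
init_arm_stats <- function(context_params) {
  d <- context_params$d      # number of features
  k <- context_params$k      # number of arms
  list(
    # One d x d Gram matrix per arm (identity start is an assumption).
    A_cc = replicate(k, diag(1, d, d), simplify = FALSE),
    # One length-d reward-weighted context sum per arm, started at zero.
    b_cc = replicate(k, rep(0, d), simplify = FALSE)
  )
}

stats <- init_arm_stats(list(d = 3, k = 2))
```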


Method get_action()

Chooses an arm based on epsilon-greedy logic and the current estimates.

Usage

BatchContextualEpsilonGreedyPolicy$get_action(t, context)

Arguments

t

Integer time step.

context

A list with contextual features and arm count.

Returns

A list with the selected action.
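
The epsilon-greedy rule itself can be sketched in a few lines of base R: with probability `epsilon` pick an arm uniformly at random, otherwise pick the arm with the highest estimated reward. The function `choose_arm()`, the `choice` element name, and the use of a single feature vector shared by all arms are assumptions for illustration, not the package's implementation.

```r
# Hypothetical sketch of epsilon-greedy arm selection.
choose_arm <- function(epsilon, A_list, b_list, x) {
  k <- length(A_list)
  if (runif(1) < epsilon) {
    # Explore: uniform random arm.
    return(list(choice = sample.int(k, 1)))
  }
  # Exploit: estimated reward x' theta_hat for each arm.
  est <- vapply(seq_len(k), function(a) {
    theta_hat <- solve(A_list[[a]], b_list[[a]])
    sum(x * theta_hat)
  }, numeric(1))
  # which.max() breaks ties in favour of the lowest arm index.
  list(choice = which.max(est))
}

A_list <- list(diag(2), diag(2))
b_list <- list(c(0, 0), c(1, 1))

# epsilon = 0 never explores, so the arm with the larger estimate is chosen.
greedy <- choose_arm(0, A_list, b_list, x = c(1, 1))
# epsilon = 1 always explores, so any valid arm index may be returned.
explore <- choose_arm(1, A_list, b_list, x = c(1, 1))
```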


Method set_reward()

Updates the model statistics with the observed reward. Parameter estimates are refreshed once per batch, that is, every `batch_size` rounds.

Usage

BatchContextualEpsilonGreedyPolicy$set_reward(t, context, action, reward)

Arguments

t

Integer time step.

context

List of contextual features used for the action.

action

A list with the chosen arm.

reward

A list with the observed reward.

Returns

Updated parameter estimates.
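
A minimal sketch of a batched update, under the assumption that statistics are accumulated every round while the estimates used for action selection are only synchronised when `t` is a multiple of `batch_size`. The function `batch_update()` and the `state` list (including its `A` and `b` elements) are hypothetical names for this example.

```r
# Hypothetical sketch: accumulate every round, synchronise once per batch.
batch_update <- function(state, t, x, arm, reward, batch_size) {
  # Accumulate sufficient statistics for the chosen arm.
  state$A_cc[[arm]] <- state$A_cc[[arm]] + tcrossprod(x)
  state$b_cc[[arm]] <- state$b_cc[[arm]] + reward * x
  # Copy the accumulators into the live estimates at the end of a batch.
  if (t %% batch_size == 0) {
    state$A <- state$A_cc
    state$b <- state$b_cc
  }
  state
}

d <- 2
k <- 2
state <- list(
  A_cc = replicate(k, diag(1, d, d), simplify = FALSE),
  b_cc = replicate(k, rep(0, d), simplify = FALSE)
)
state$A <- state$A_cc
state$b <- state$b_cc

# Round 1 of a batch of size 2: accumulators change, live estimates do not.
state <- batch_update(state, t = 1, x = c(1, 0), arm = 1, reward = 1, batch_size = 2)
after_round_1 <- state$b[[1]]

# Round 2 closes the batch, so the live estimates are refreshed.
state <- batch_update(state, t = 2, x = c(0, 1), arm = 1, reward = 2, batch_size = 2)
after_round_2 <- state$b[[1]]
```

With `batch_size = 1` the synchronisation happens every round, which reduces to an ordinary online update.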


Method clone()

Objects of this class can be cloned with this method.

Usage

BatchContextualEpsilonGreedyPolicy$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.