boax.policies.believes.continuous

boax.policies.believes.continuous(num_variants)

The continuous belief.

Stores the number of tries and average reward for each variant.

Example

>>> belief = continuous(10)
>>> params = belief.init()
>>> updated_params = belief.update(params, variant, reward)
>>> best_variant = belief.best(updated_params)

Parameters:

num_variants (int) – The number of variants.

Return type:

Belief[ActionValues, float]

Returns:

The corresponding Belief.
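
Illustration

The bookkeeping described above (a try count and an average reward per variant) can be sketched in plain Python with an incremental mean. This is a minimal sketch for intuition only, not boax's implementation: the `RunningStats` container and the standalone `init`, `update`, and `best` functions are hypothetical names chosen to mirror the example, and selecting the variant with the highest average reward in `best` is an assumption.

```python
# Illustrative sketch only: NOT boax's implementation.
# RunningStats, init, update and best are hypothetical names.
from typing import NamedTuple, Tuple


class RunningStats(NamedTuple):
    tries: Tuple[int, ...]    # number of times each variant was tried
    means: Tuple[float, ...]  # average reward observed for each variant


def init(num_variants: int) -> RunningStats:
    # Start with zero tries and a zero average for every variant.
    return RunningStats((0,) * num_variants, (0.0,) * num_variants)


def update(stats: RunningStats, variant: int, reward: float) -> RunningStats:
    # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
    n = stats.tries[variant] + 1
    mean = stats.means[variant] + (reward - stats.means[variant]) / n
    tries = stats.tries[:variant] + (n,) + stats.tries[variant + 1:]
    means = stats.means[:variant] + (mean,) + stats.means[variant + 1:]
    return RunningStats(tries, means)


def best(stats: RunningStats) -> int:
    # Assumption: "best" means the variant with the highest average reward.
    return max(range(len(stats.means)), key=stats.means.__getitem__)
```

The incremental form avoids storing every reward: each update needs only the previous average and the new try count for that variant.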