boax.policies.believes.continuous
- boax.policies.believes.continuous(num_variants)
The continuous belief.
Stores the number of tries and average reward for each variant.
Example
>>> belief = continuous(10)
>>> params = belief.init()
>>> updated_params = belief.update(params, variant, reward)
>>> best_variant = belief.best(updated_params)
- Parameters:
num_variants (int) – The number of variants.
- Return type:
Belief[ActionValues, float]
- Returns:
The corresponding Belief.
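To illustrate what "stores the number of tries and average reward for each variant" can mean in practice, here is a minimal pure-Python sketch of a running-average belief. It is not boax's implementation: the `ActionValuesSketch` type, its field names `n` and `q`, and the rule that `best` picks the variant with the highest average reward are all assumptions made for illustration.

```python
from typing import NamedTuple, Tuple


class ActionValuesSketch(NamedTuple):
    """Hypothetical stand-in for the belief parameters (not boax's type)."""

    n: Tuple[int, ...]  # number of tries per variant
    q: Tuple[float, ...]  # average reward per variant


def init(num_variants: int) -> ActionValuesSketch:
    # Start with zero tries and a zero average for every variant.
    return ActionValuesSketch(n=(0,) * num_variants, q=(0.0,) * num_variants)


def update(
    params: ActionValuesSketch, variant: int, reward: float
) -> ActionValuesSketch:
    # Incremental mean: q_new = q_old + (reward - q_old) / n_new.
    n = list(params.n)
    q = list(params.q)
    n[variant] += 1
    q[variant] += (reward - q[variant]) / n[variant]
    return ActionValuesSketch(n=tuple(n), q=tuple(q))


def best(params: ActionValuesSketch) -> int:
    # Assumption: the best variant is the one with the highest average reward.
    return max(range(len(params.q)), key=lambda i: params.q[i])


params = init(3)
for variant, reward in [(0, 1.0), (1, 0.5), (0, 0.0), (2, 2.0)]:
    params = update(params, variant, reward)

best_variant = best(params)
```

The incremental form avoids storing every observed reward: only the count and the current average per variant are kept, which matches the description above.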