FriSem
Scott Linderman, Assistant Professor of Statistics, Stanford University
Title: Reconciling empirical, algorithmic, and Bayesian models of mouse behavior in two-armed bandit tasks
Abstract: In probabilistic and nonstationary environments, individuals must combine internal state and external cues to make decisions that lead to desirable outcomes. The two-armed bandit task (2ABT) and its generalizations offer a fruitful testbed for experimentally and theoretically investigating such decision-making behavior. We trained mice in a 2ABT with reward probabilities that switched stochastically between two states. Previous work has used empirical models, like logistic regression; algorithmic models, like Q-learning; and theoretically optimal models, like the ideal Bayesian observer model, to explain mouse behavior in such tasks. Though these models differ in their motivation, formulation, and interpretation, we find that all make surprisingly similar predictions about mouse behavior. To understand this phenomenon, we showed how the three types of models can be mathematically reconciled and made equivalent. Moreover, we showed that empirical deviations from the theoretically optimal model can be attributed to a tendency to repeat actions despite incoming evidence, but that this "stickiness" incurs minimal regret. These results indicate that mouse behavior reaches near-optimal performance with reduced action switching and can be described by a set of equivalent models, each offering a unique lens on decision-making behavior.
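For readers unfamiliar with the setup, the following is a minimal illustrative sketch (not code from the talk) of the kind of algorithmic model the abstract describes: a Q-learning agent with a "stickiness" bonus choosing between two arms whose reward probabilities switch stochastically between two hidden states. All parameter names and values (alpha, beta, kappa, p_switch, etc.) are assumptions chosen for illustration.

```python
import math
import random

def simulate(n_trials=10000, alpha=0.2, beta=5.0, kappa=1.0,
             p_high=0.8, p_low=0.2, p_switch=0.02, seed=0):
    """Sticky Q-learning agent on a switching two-armed bandit.

    Returns the agent's average reward rate. Parameters are
    illustrative, not fit to the mouse data described in the talk.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]        # learned action values for the two arms
    high_arm = 0          # hidden environment state: which arm pays p_high
    prev_action = None
    total_reward = 0.0

    for _ in range(n_trials):
        # Environment: hidden state switches with probability p_switch
        if rng.random() < p_switch:
            high_arm = 1 - high_arm

        # Policy: softmax over Q-values, plus a stickiness bonus kappa
        # for repeating the previous action
        logits = [beta * q[a] + (kappa if a == prev_action else 0.0)
                  for a in (0, 1)]
        m = max(logits)
        p0 = math.exp(logits[0] - m)
        p1 = math.exp(logits[1] - m)
        action = 0 if rng.random() < p0 / (p0 + p1) else 1

        # Reward drawn from the current hidden state
        p_reward = p_high if action == high_arm else p_low
        reward = 1.0 if rng.random() < p_reward else 0.0

        # Standard Q-learning update on the chosen arm only
        q[action] += alpha * (reward - q[action])
        prev_action = action
        total_reward += reward

    return total_reward / n_trials
```

The stickiness term kappa biases the softmax toward repeating the previous choice regardless of incoming evidence, mirroring the behavioral tendency the abstract says incurs only minimal regret relative to the ideal observer.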