Why We Want Apples but Choose Chips: The Behavioral Overlap of Policy- and Preference-Based Learning

Submission 225

MixedTopicTalk-04

Presented by: Anoushiravan Zahedi

Anoushiravan Zahedi ¹,², Mandy Hütter ³, Thorsten Pachur ⁴, Ricarda I. Schubotz ¹,²

¹ Institute of Psychology, University of Muenster, Germany

² Otto Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Muenster, Germany

³ Department of Psychology, University of Tübingen, Germany

⁴ School of Management, Technical University of Munich, Germany

Choices between options can be driven both by preferences, which result from outcome evaluation, and by strategic policy considerations, which arise from action selection. Traditional decision-making theories assume a strict separation between these two processes. In contrast, the predictive coding framework provides a unified formal model that integrates valuation and action selection. It assumes two kinds of hierarchical, interacting predictions and prediction errors: one encodes “policies” and the other “preferences”. Here, we pitted these two views against each other, asking whether behavioral manipulations that primarily target action selection or outcome valuation produce distinct or overlapping effects on subsequent choices and preferences. To address this, we compared how approach-avoidance training (AAT), a non-reinforced learning method that targets action policies, and evaluative conditioning (EC), a reinforced learning approach that targets outcome preferences, influenced participants' choices and preferences. In Experiment 1, using facial stimuli, only EC affected choices and preferences. In Experiment 2, using doodles, both approach and avoidance pairings in AAT positively modulated choices and preferences, whereas EC effects were confined to negative pairings. Experiment 3 employed joystick responses, providing a more ecologically valid measure; here, only approach pairings in AAT had positive effects on choices and preferences. Together, these findings indicate partial overlap and cross-influence between policy- and preference-based learning mechanisms rather than a strict separation. Hence, our results challenge the strict functional separation between policies and preferences assumed in traditional theories and further show that their interaction depends on stimulus type and response modality.