10:30 - 12:00
Tue-H9-Talk 5--52
Tue-Talk 5
Room: H9
Chair/s:
Sebastian Olschewski
Similarity and Consistency in Algorithm-Guided Exploration
Tue-H9-Talk 5-5204
Presented by: Ludwig Danwitz
Yongping Bao 1, Ludwig Danwitz 1, Fabian Dvorak 3, Sebastian Fehrler 1, 3, Lars Hornuf 1, 2, Hsuan Yu Lin 1, Bettina von Helversen 1
1 Universität Bremen, 2 Technische Universität Dresden, 3 Universität Konstanz
Human decisions in exploration scenarios can be enhanced by providing algorithmic advice. However, such advice can only be beneficial if humans are willing to follow it. In an online experiment, we studied to what extent participants' willingness to follow the advice of a (real) reinforcement learning algorithm depends on the relation between their own exploration preferences and the advising algorithm's exploration preference. Participants played a ten-armed bandit task with and without algorithm advice. We varied the weight that the algorithm places on exploration rather than exploitation, and modeled the participants' decision-making processes using a learning model comparable to the algorithm's. We find that the algorithm's advice affects and improves participants' choices in all treatments. However, participants are much more likely to follow an algorithm that is more exploitative than they are themselves, and we do not find evidence that similarity between human choices and the algorithm's recommendations increases humans' willingness to follow the recommendations. Possible explanations for the observed preference for exploitative algorithms are that participants either perceive the exploitative algorithm's consistency over time as a signal of expertise, or that they myopically perceive the small rewards obtained when exploring new options as errors.
Keywords: Decision Making, Algorithm, Artificial Intelligence, Exploration Exploitation, Modelling, Similarity
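The abstract does not specify which learning algorithm generated the advice or how the exploration weight was implemented. As a purely illustrative sketch under those assumptions, the snippet below shows a common formalization in this bandit literature: a Bayesian mean tracker (Kalman filter) combined with an upper-confidence-bound choice rule, where a hypothetical parameter beta plays the role of the weight placed on exploration rather than exploitation. All names, priors, and parameter values are assumptions for illustration, not the authors' model.

```python
import numpy as np

# Hypothetical sketch: Bayesian mean tracker with a UCB choice rule.
# `beta` weights exploration (uncertainty bonus) against exploitation.

rng = np.random.default_rng(0)

n_arms = 10                                # ten-armed bandit task, as in the abstract
n_trials = 100
true_means = rng.normal(50, 10, n_arms)    # assumed reward structure
reward_sd = 5.0

mu = np.full(n_arms, 50.0)                 # prior mean reward per arm (assumed)
var = np.full(n_arms, 100.0)               # prior variance (uncertainty) per arm
error_var = reward_sd ** 2

beta = 2.0                                 # exploration weight: larger -> more exploratory choices

for t in range(n_trials):
    # UCB value: estimated mean plus an exploration bonus scaled by uncertainty
    ucb = mu + beta * np.sqrt(var)
    choice = int(np.argmax(ucb))

    reward = rng.normal(true_means[choice], reward_sd)

    # Kalman-filter update of beliefs about the chosen arm
    gain = var[choice] / (var[choice] + error_var)
    mu[choice] += gain * (reward - mu[choice])
    var[choice] *= (1 - gain)

print("Estimated means:", np.round(mu, 1))
print("Best arm (true):", int(np.argmax(true_means)),
      "| best arm (estimated):", int(np.argmax(mu)))
```

In this framing, a more exploitative advisor corresponds to a smaller beta (choices track the estimated means closely and are therefore more consistent over time), while a more exploratory advisor with a larger beta samples uncertain arms more often.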