Does letter perception follow principles of predictive coding?

Wed—HZ_8—Talks8—7704

Presented by: Janos Pauli

Janos Pauli*, Benjamin Gagl

University of Cologne, Department of Special Education and Rehabilitation, Self Learning Systems Lab

Applying the principles of predictive coding to reading enhances our understanding of the
neuro-cognitive processes involved in extracting meaning from text. Central to this framework is
the integration of prior knowledge with sensory input, which highlights the informative parts of
a visual percept and thereby increases the efficiency of representation and processing in the
visual pathway.
The predominant computer vision models of higher vision are deep learning models. Although adept
at word and letter prediction, they lack transparency and offer little insight into their
decisions, which underscores the need for models that respect neuro-cognitive principles and can
explain how we read.
The first visual word recognition model to consider neuro-cognitive principles has been
promising but still lacks an explicit account of how letters, the central units of words, are
recognized. Here, we introduce a transparent, image-computable letter identification model based
on representations that respect the principles of predictive coding.
We use a dataset of 520 uppercase Latin alphabet letters presented in four font styles and sizes.
We generate prediction error images and pixel-based letter representations from this dataset
and train logistic regression models on both prediction error representations and letter images.
We evaluate the models on out-of-distribution samples with varying noise levels and against
behavioral letter recognition data from humans who viewed the same images.
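
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state. The prediction is taken to be the pixel-wise mean of all training images, and the prediction error is the input image minus that prediction. Random binary templates stand in for letter images. The logistic regression is a plain softmax model fit by gradient descent. All function names and parameters are hypothetical.

```python
import numpy as np

def make_toy_letters(rng, n_classes=26, n_variants=20, side=8):
    """Toy stand-in for letter images: random binary templates plus small jitter.
    26 classes x 20 variants = 520 samples; this matches only the dataset size
    given in the abstract, not its actual font/size structure."""
    templates = (rng.random((n_classes, side * side)) < 0.3).astype(float)
    X = np.repeat(templates, n_variants, axis=0)
    X += rng.normal(0.0, 0.1, X.shape)  # variation standing in for fonts and sizes
    y = np.repeat(np.arange(n_classes), n_variants)
    return X, y

def prediction_error(images, prediction):
    """Assumed form of the prediction error: sensory input minus prediction."""
    return images - prediction

def fit_softmax_regression(X, y, n_classes, lr=0.5, n_steps=300):
    """Multinomial logistic regression fit by batch gradient descent."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(n_steps):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        grad = (P - Y) / n
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

def predict(X, W, b):
    return np.argmax(X @ W + b, axis=1)

def evaluate(noise_levels=(0.0, 0.5, 1.0), seed=0):
    """Train on pixel and prediction-error representations, then test on
    noise-corrupted copies of the images at each noise level."""
    rng = np.random.default_rng(seed)
    X, y = make_toy_letters(rng)
    n_classes = int(y.max()) + 1
    prediction = X.mean(axis=0)  # assumed prior: mean of all known letter images
    train_sets = {"pixel": X, "prediction_error": prediction_error(X, prediction)}
    results = {}
    for name, X_train in train_sets.items():
        W, b = fit_softmax_regression(X_train, y, n_classes)
        accuracies = []
        for sigma in noise_levels:
            X_test = X + rng.normal(0.0, sigma, X.shape)
            if name == "prediction_error":
                X_test = prediction_error(X_test, prediction)
            accuracies.append(float(np.mean(predict(X_test, W, b) == y)))
        results[name] = accuracies
    return results

if __name__ == "__main__":
    for name, accs in evaluate().items():
        print(name, accs)
```

In this toy setup, subtracting a fixed mean image only shifts the inputs of a linear classifier, so the sketch is not expected to reproduce the reported advantage of prediction error representations. It shows only the order of the steps: build the prediction, compute the errors, fit a classifier on each representation, and test under increasing noise.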
Preliminary results indicate that prediction error-based classification surpasses other methods,
closely aligning with human performance.
Additionally, recognition accuracy was highest in conditions that favored predictive coding
mechanisms, supporting its role in letter perception. These results indicate that letter
perception relies on optimized prediction-error representations.

Keywords: Reading, Predictive Coding, Computational Cognitive Neuroscience, Image Computation, Transparent AI, Vision