Robust Letter Recognition Through Predictive Coding: Insights from a Computational Model

16:30 - 18:00

Symbolic Representations in the Visual Pathway: From Numbers and Letters to Words and Semantics

Room: HSZ - N3

Chairs: Jack E Taylor, Janos Pauli

Written script, including numbers, letters, and letter-strings, is a recent cultural invention central to letter- and number-literate societies. In such societies, humans learn early on to recognise glyphs and map them fluently onto specific sounds and concepts. This symposium explores how the brain achieves this objective, using multiple complementary lenses to understand the processing of linguistic and mathematical symbols, the degree to which these representations are distinct, and interactions between visual symbol recognition and abstract processes of language and numerosity.

The first talk introduces a computational model of letter recognition motivated by predictive coding, showing how these principles might explain the recognition of letters in noisy environments. This work suggests that predictive coding accounts of word recognition may also apply to isolated letters. The second talk uses an optimal transport framework to model the space of early visual representations of letter symbols revealed by EEG, exploring how such representations may be altered in dyslexia. This work tests whether this reading-related learning disorder also results in weaker neural alignment with computational models of letter representations. The third talk presents an analysis of human fMRI and macaque electrocorticography responses to naturalistic images, suggesting a shared prominent representation of stimuli related to both orthography and numerosity. This finding is discussed in relation to the notion of a proto-architecture for mathematical cognition in the higher-level visual cortex of non-human primates. The fourth talk examines interactions between the processing of Arabic digits and language. This study exploits the discrepancy between the base-10 system of Arabic numbers and the base-20 system of French number words, finding that native French speakers utilise language during a numerical task, even when language is redundant. The fifth talk explores how the brain processes words with varying degrees of misspelling. Using MEG data, it examines connectivity between lower visual areas and the lvOT, suggesting that lvOT processes real words in a feedforward manner but engages feedback mechanisms for misspelled words and pseudowords.

Combining experimental and computational approaches, this symposium advances our understanding of how the brain maps arbitrary visual forms into meaningful symbolic representations, and how these processes interact with language and numerosity.

Submission 462

Robust Letter Recognition Through Predictive Coding: Insights from a Computational Model

SymposiumTalk-01

Presented by: Janos Pauli

Janos Pauli, Benjamin Gagl

Self-Learning Systems Lab, Department of Special Education and Rehabilitation, University of Cologne, Germany

Reading is a human invention that has become essential for societal participation. Expert readers can perceive around 200-300 words per minute, indicating that they process words and letters with high efficiency. Despite the significant role of the letter level, most models of visual word recognition neglect or oversimplify the processes underlying letter recognition. This gap is detrimental, since reading a word becomes impossible without separately identifying its constituent letters. Here, we develop a transparent neuro-cognitive model of letter recognition based on the principles of predictive coding. We assume that humans decompose visual letter information into features that describe distinct letter characteristics and shapes. These features then activate font-invariant letter-prototype representations to identify the letter. To evaluate the computational implementation of the model, we conducted a letter identification task with increasing levels of noise. We collected behavioral and EEG data to compare two variants of the model: (i) one that applies knowledge-based top-down influences at all processing levels, as assumed in predictive coding, and (ii) one that processes information in a strictly feed-forward way. We demonstrate that the predictive-coding-based variant simulates behavioral response patterns better than the feed-forward variant. Furthermore, the model's transparency allows us to trace its success to the feature-level predictions. Additionally, the internal model representations correlate significantly with reaction-time data and with EEG activation around 200 ms. Thus, we provide a letter recognition model that demonstrates how efficient neuronal processing, through predictive coding, results in the letter identification behavior inherent to optimal reading capacities.
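
The contrast between the two model variants can be illustrated with a toy sketch. This is not the authors' model: the four letters, the six binary features, the noise model, and all function names and parameters (`n_steps`, `lr`) are assumptions made up for illustration. The feed-forward variant maps noisy features to the nearest letter prototype in a single pass, while the predictive variant repeatedly generates a top-down feature prediction from its current letter beliefs and updates those beliefs to reduce the feature-level prediction error.

```python
import numpy as np

# Hypothetical letter prototypes over six made-up binary features.
# Neither the letters nor the features come from the abstract.
LETTERS = ["L", "T", "O", "X"]
PROTOTYPES = np.array([
    [1, 1, 0, 0, 0, 0],  # L
    [1, 0, 1, 0, 0, 0],  # T
    [0, 0, 0, 1, 0, 0],  # O
    [0, 0, 0, 0, 1, 1],  # X
], dtype=float)


def add_noise(features, noise_sd, rng):
    """Corrupt a feature vector with Gaussian noise (assumed noise model)."""
    return features + rng.normal(0.0, noise_sd, size=features.shape)


def feedforward_identify(features):
    """Single bottom-up pass: choose the prototype closest to the input."""
    distances = np.linalg.norm(PROTOTYPES - features, axis=1)
    return LETTERS[int(np.argmin(distances))]


def infer_beliefs(features, n_steps=50, lr=0.1):
    """Iteratively reduce the feature-level prediction error.

    Each step is gradient descent on 0.5 * ||features - beliefs @ PROTOTYPES||^2.
    """
    beliefs = np.zeros(len(LETTERS))
    for _ in range(n_steps):
        prediction = beliefs @ PROTOTYPES   # top-down predicted features
        error = features - prediction       # bottom-up prediction error
        beliefs += lr * (PROTOTYPES @ error)
    return beliefs


def predictive_identify(features):
    """Choose the letter with the strongest belief after error minimisation."""
    return LETTERS[int(np.argmax(infer_beliefs(features)))]


def accuracy(identify, noise_sd, n_trials=200, seed=0):
    """Proportion of correctly identified letters at a given noise level."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        k = rng.integers(len(LETTERS))
        noisy = add_noise(PROTOTYPES[k], noise_sd, rng)
        hits += identify(noisy) == LETTERS[k]
    return hits / n_trials
```

Calling `accuracy(feedforward_identify, sd)` and `accuracy(predictive_identify, sd)` over a range of `sd` values mimics the structure of an identification task with increasing noise. The sketch makes no claim about which variant performs better; it only shows how the two processing schemes differ in form.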
