Submission 107
Modelling Affix Learning from Reading: Insights from Compositional Distributional Semantics
SymposiumTalk-05
Presented by: Maria Korochkina
Most words in English and other languages are built by combining smaller units of meaning called morphemes (e.g., teach + -er → teacher). Understanding a language’s morphology is essential for developing reading expertise because it enables readers to compute the meanings of both familiar and unfamiliar words (e.g., tweeter). Our work investigated what English-speaking children can learn about English morphology through text experience. We trained a compositional distributional semantic model on words from 1,200 books popular with British children aged 7–16 and examined what it learned about affix meanings. We then assessed whether the model’s knowledge of affix meanings aligned with that of 120 adults performing a lexical decision task. The model’s knowledge of individual affixes accounted for patterns in readers’ lexical processing: affixes estimated by the model to have richer, more coherent meanings were better known by readers, and this was associated with both the number of distinct words containing each affix and the level of noise in their usage. Our work shows that, despite high levels of noise, natural text contains enough structure to support the extraction of core affix semantics, and that readers’ knowledge of affixes reflects the patterns found in text. This study contributes to a more principled and psychologically grounded understanding of morpheme learning.