Meaning Modulations and Stability in Large Language Models: An Analysis of BERT Embeddings for Psycholinguistic Research
Mon—HZ_9—Talks2—1201
Presented by: Fritz Günther
Computational models of semantic representations have long assumed and produced a single static representation for each word type, ignoring the influence of linguistic context on semantic representations. Recent Large Language Models (LLMs) introduced in Natural Language Processing, however, learn token-level contextualized representations, holding promise for studying how semantic representations change across contexts. In this study, we probe type- and token-level representations learned by a prominent example of such models, Bidirectional Encoder Representations from Transformers (BERT), for their ability to i) explain semantic effects found for isolated words (semantic relatedness and similarity ratings, lexical decision, and semantic priming), but critically also to ii) exhibit systematic interactions between lexical semantics and context, and iii) explain meaning modulations in context. Across a wide range of empirical studies on each of these topics, we show that BERT representations satisfy two desiderata for psychologically valid semantic representations: i) they have a stable semantic core that allows people to interpret words in isolation and prevents words from being used arbitrarily, and ii) they interact with sentence context in systematic ways, with representations shifting as a function of their semantic core and the context. This demonstrates that a single, comprehensive model that simultaneously learns abstract, type-level prototype representations as well as mechanisms for how these interact with context can explain both isolated-word effects and context-dependent variations. Notably, these variations are not limited to discrete word senses, challenging a strict dichotomy between exemplar and prototype models and reframing traditional notions of polysemy in semantic memory.
Keywords: Distributional semantics, Large Language Models, Computational Psycholinguistics, Mental lexicon, Context-dependent semantic representations, Ambiguity, Polysemy
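To illustrate the type- vs. token-level distinction the abstract refers to, the following is a minimal sketch (not the authors' exact pipeline) of how token-level contextualized embeddings for a target word, and a type-level embedding as their average over contexts, might be extracted from BERT using the HuggingFace transformers library. The model name, the example sentences, and the averaging over sub-tokens and contexts are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the authors' pipeline): extract token-level
# contextualized BERT embeddings for a word in different sentence contexts,
# and a type-level embedding as their context average.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def token_embedding(sentence: str, target: str) -> torch.Tensor:
    """Contextualized embedding of `target` in `sentence`:
    mean over its WordPiece sub-tokens, last hidden layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, 768)
    target_ids = tokenizer.encode(target, add_special_tokens=False)
    ids = enc["input_ids"][0].tolist()
    # locate the target's sub-token span within the encoded sentence
    for i in range(len(ids) - len(target_ids) + 1):
        if ids[i:i + len(target_ids)] == target_ids:
            return hidden[i:i + len(target_ids)].mean(dim=0)
    raise ValueError(f"'{target}' not found in '{sentence}'")

# Token-level: the same word type in two contexts (illustrative sentences)
contexts = ["The bank raised its interest rates.",
            "They walked along the river bank."]
token_vecs = [token_embedding(s, "bank") for s in contexts]

# Type-level: a context-averaged prototype representation for the word type
type_vec = torch.stack(token_vecs).mean(dim=0)

# Meaning modulation: how far the two token embeddings drift apart
cos = torch.nn.functional.cosine_similarity(token_vecs[0], token_vecs[1], dim=0)
print(f"cosine(bank_finance, bank_river) = {cos.item():.3f}")
```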