Concreteness is inherent to human and language model word meaning
Wed—Casino_1.811—Poster3—8903
Presented by: Emily Wiebers
Understanding how concreteness shapes semantic representations and processing is a central question in cognitive science, neuroscience, and computational linguistics. However, it remains unclear how concrete and abstract words are represented in the semantic system. To address this, we implemented a semantic task that does not explicitly direct attention to our variable of interest (concreteness): an odd-one-out task. A sample of native German speakers (N = 40) was presented with 9880 word triplets and instructed to identify the word least similar to the other two in each triplet. Orthogonally to concreteness, we varied word frequency and controlled for word length and Orthographic Levenshtein Distance 20 (OLD20). We derived a Representational Dissimilarity Matrix (RDM) from participants' choices, reflecting subjective pairwise semantic relationships, and compared it to an RDM derived from pairwise cosine distances of fastText word vectors. We hypothesized that concrete and abstract words would elicit different selection patterns, reflecting distinct underlying semantic representations. Prediction of the human data by the distributional semantic model (DSM) improved significantly when concreteness was included as a predictor, over and above control variables such as word frequency, word length, and OLD20. Our findings suggest that concreteness, unlike word frequency, is implicitly employed by both humans and language models in representing the similarity structure of word meaning. This supports the view that concreteness is an intrinsic property of word meaning, deeply embedded in semantic representations, both biological and artificial.
Keywords: concreteness, semantic space, language model, odd-one-out
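The model-based RDM described above can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the random vectors stand in for fastText embeddings, and the word list and dimensionality are hypothetical placeholders.

```python
import numpy as np

def cosine_distance_rdm(vectors):
    """Pairwise cosine-distance RDM (n_words x n_words) from word vectors."""
    # Normalize each row vector to unit length
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    # Cosine similarity matrix; cosine distance = 1 - similarity
    return 1.0 - unit @ unit.T

# Toy stand-in for fastText embeddings (5 hypothetical words, 300 dims)
rng = np.random.default_rng(0)
vecs = rng.standard_normal((5, 300))

rdm = cosine_distance_rdm(vecs)
# An RDM is symmetric with a zero diagonal (each word is identical to itself)
assert np.allclose(rdm, rdm.T)
assert np.allclose(np.diag(rdm), 0.0)
```

In practice, a behavioral RDM (from odd-one-out choices) and a model RDM like this one are typically compared on their off-diagonal entries, e.g., by correlating the lower triangles of the two matrices.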