Shared and distinct representational dynamics of phonemes and prosody along dorsal and ventral speech streams
Tue—HZ_10—Talks5—4601
Presented by: Seung-Cheol Baek
Phonemes and prosody are fundamental linguistic units for speech comprehension, each contributing to different linguistic meanings. Perceiving these linguistic units is thought to involve a serial transformation from acoustic to categorical representations through a cortical hierarchy organized into dorsal and ventral speech streams. However, the neural implementation of this perceptual process remains poorly understood, as does how it compares between phonemes and prosody, which emerge from distinct physical cues. To address these questions, we recorded magnetoencephalography (MEG) from 29 participants during a psychophysical experiment with single words varying orthogonally in word-initial phonemes (“Bar” vs. “Paar”) and prosody (statement vs. question). Employing a novel approach that combined time-resolved representational similarity and multivariate transfer entropy analyses, we examined phonemic and prosodic representations at different abstraction levels in space and time, as well as their information flow. Our findings reveal that the abstraction of phonemic and prosodic information does not fully align with a serial hierarchical organization. Instead, we observed successive yet joint and distributed processing of acoustic and categorical information: phonemes along the ventral stream, and prosody along both dorsal and ventral streams. Notably, while acoustic representations of both linguistic units similarly spanned bilateral temporal areas, their categorical representations exhibited distinct spatial extents. Specifically, categorical representations of phonemes were localized to posterior temporal areas, whereas those of prosody extended broadly to anterior temporal and right premotor regions. These results highlight shared processing principles, yet partially distinct mechanisms, for the abstraction of phonemes and prosody, which may support their robust and concurrent perception.
Keywords: MEG, multivariate analysis, source localization, modeling, speech perception, language, hearing
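As a rough illustration of the first analysis stage named in the abstract, the sketch below implements a minimal time-resolved representational similarity analysis (RSA) in Python: at each time point, pairwise distances between trial-wise sensor patterns form a neural RDM, which is then rank-correlated with model RDMs coding acoustic distance and phoneme category. All data, array shapes, labels, and distance metrics here are invented placeholders under stated assumptions, not the authors' stimuli or pipeline.

```python
# Minimal time-resolved RSA sketch with synthetic stand-in data.
# Shapes, metrics, and the model RDMs are illustrative assumptions.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_trials, n_sensors, n_times = 80, 64, 120
meg = rng.standard_normal((n_trials, n_sensors, n_times))  # trials x sensors x time

# Hypothetical model RDMs over trial pairs: a graded acoustic distance
# and a binary category (e.g., "Bar" vs. "Paar" word-initial phoneme).
acoustic_feature = rng.standard_normal((n_trials, 1))
category = (np.arange(n_trials) % 2).reshape(-1, 1).astype(float)
model_rdms = {
    "acoustic": pdist(acoustic_feature, metric="euclidean"),
    "categorical": pdist(category, metric="hamming"),
}

# At each time point, correlate the neural RDM (pairwise correlation
# distances between sensor patterns) with each model RDM.
rsa = {name: np.empty(n_times) for name in model_rdms}
for t in range(n_times):
    neural_rdm = pdist(meg[:, :, t], metric="correlation")
    for name, model_rdm in model_rdms.items():
        rsa[name][t], _ = spearmanr(neural_rdm, model_rdm)

for name, tc in rsa.items():
    print(f"{name}: peak Spearman rho = {tc.max():.3f} at sample {tc.argmax()}")
```

The directed information-flow stage can be caricatured in the same spirit. The following is a basic histogram-based, bivariate transfer entropy estimator at lag 1; the multivariate analysis named in the abstract would in practice rely on a dedicated estimator (e.g., the IDTxl toolbox), and none of the parameters here reflect the authors' settings.

```python
# Bivariate transfer entropy (lag 1, in bits) via equiprobable binning.
# A deliberately simplified stand-in for multivariate transfer entropy.
import numpy as np

def transfer_entropy(source, target, n_bins=4):
    """TE(source -> target) = sum p(x', x, s) * log2[ p(x'|x, s) / p(x'|x) ]."""
    def discretize(v):
        ranks = np.argsort(np.argsort(v))   # rank-transform the signal ...
        return (ranks * n_bins) // len(v)   # ... into equiprobable bins
    s, x = discretize(source), discretize(target)
    joint = np.zeros((n_bins, n_bins, n_bins))  # counts over (x_{t+1}, x_t, s_t)
    for xf, xp, sp in zip(x[1:], x[:-1], s[:-1]):
        joint[xf, xp, sp] += 1
    joint /= joint.sum()
    p_xp_sp = joint.sum(axis=0)    # p(x_t, s_t)
    p_xf_xp = joint.sum(axis=2)    # p(x_{t+1}, x_t)
    p_xp = joint.sum(axis=(0, 2))  # p(x_t)
    te = 0.0
    for xf in range(n_bins):
        for xp in range(n_bins):
            for sp in range(n_bins):
                p = joint[xf, xp, sp]
                if p > 0:
                    te += p * np.log2(p * p_xp[xp] / (p_xf_xp[xf, xp] * p_xp_sp[xp, sp]))
    return te

# Toy check: a signal that lags another by one sample should receive
# more transfer entropy from its driver than it sends back.
rng = np.random.default_rng(1)
drive = rng.standard_normal(2000)
follow = np.roll(drive, 1) + 0.5 * rng.standard_normal(2000)
print(f"TE drive->follow: {transfer_entropy(drive, follow):.3f} bits")
print(f"TE follow->drive: {transfer_entropy(follow, drive):.3f} bits")
```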