Mid-level vision in complex scenes
Mon-HS1-Talk I-05
Presented by: Dirk Bernhardt-Walther
Human observers can rapidly perceive complex real-world scenes. Grouping visual elements into meaningful units is an integral part of this process. Here we introduce a new, image-computable approach for detecting mid-level features in complex, real-world scenes. Specifically, we manipulated the local parallelism content of real-world scenes. We decoded scene categories from patterns of brain activity obtained via functional magnetic resonance imaging (fMRI) in 38 human observers while they viewed the manipulated scenes. Decoding was significantly more accurate for scenes with strong local parallelism than for scenes with weak local parallelism in the parahippocampal place area (PPA), indicating a central role of parallelism in scene perception. To investigate the origin of the parallelism signal, we performed a model-based fMRI analysis of the public BOLD5000 dataset, searching for voxels whose activation time courses matched the local parallelism content of the 4916 photographs viewed by the participants in that experiment. We found a strong relationship with average local parallelism in visual areas V1-V4, the PPA, and retrosplenial cortex (RSC). Notably, the parallelism-related signal peaked first in V4, suggesting V4 as the site where parallelism is extracted from the visual input. We conclude that local parallelism is a perceptual grouping cue that influences neuronal activity throughout the visual hierarchy, presumably starting in V4. Parallelism plays a key role in the representation of scene categories in the PPA. Our suite of computational tools is publicly available as a software toolbox for the analysis of real-world images.
Keywords: perceptual grouping, scene perception, parallelism, mid-level vision
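For readers who want a concrete picture of the decoding step described above, the sketch below shows one common way such an analysis is set up. It is not the authors' pipeline: the data arrays, array sizes, and cross-validation scheme are hypothetical stand-ins, and only the general approach (multivariate classification of scene category from PPA voxel patterns, compared between strong- and weak-parallelism scenes) follows the abstract.

# Minimal sketch of scene-category decoding from region-of-interest voxel patterns.
# All data are random placeholders; names and sizes are illustrative assumptions,
# not values from the study.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)

# Hypothetical single-trial PPA response patterns: rows = trials, columns = voxels.
# In a real analysis these would be beta estimates for each scene presentation.
n_trials, n_voxels, n_runs, n_categories = 120, 300, 6, 4
X_strong = rng.standard_normal((n_trials, n_voxels))   # strong-parallelism scenes
X_weak = rng.standard_normal((n_trials, n_voxels))     # weak-parallelism scenes
y = rng.integers(0, n_categories, size=n_trials)       # scene-category labels
runs = np.repeat(np.arange(n_runs), n_trials // n_runs)  # scanner-run labels

# Linear classifier with leave-one-run-out cross-validation.
clf = make_pipeline(StandardScaler(), LinearSVC(max_iter=10000))
cv = LeaveOneGroupOut()

acc_strong = cross_val_score(clf, X_strong, y, groups=runs, cv=cv).mean()
acc_weak = cross_val_score(clf, X_weak, y, groups=runs, cv=cv).mean()
print(f"Decoding accuracy, strong parallelism: {acc_strong:.3f}")
print(f"Decoding accuracy, weak parallelism:   {acc_weak:.3f}")

Leave-one-run-out cross-validation keeps training and test trials in separate scanner runs, a standard safeguard against temporal autocorrelation inflating decoding accuracy; with placeholder random data both accuracies sit near chance (1/4).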