Object Presence Informs Scene Affordance Estimations
Tue-Main hall - Z3-Poster 2-6002
Presented by: Lea Müller Karoza
Efficiently engaging with the objects in our surroundings comes naturally to us, as their design enables effortless action completion. Functionally related objects are placed close together, and at the centre of these object clusters we find anchor objects, which predict the presence and location of surrounding local objects. However, it has been suggested that scenes are best categorized by their function rather than by the objects they contain.
To investigate whether this object hierarchy in scenes informs affordance understanding, we studied the role of anchor object presence.
In two studies, participants decided whether action words (e.g., “washing hands”) matched scenes presented on screen. The scenes could be missing either related anchors (REL; e.g., a sink), unrelated anchors (UNREL; e.g., a toilet), or random objects (RAND; e.g., a mirror). In the first series of experiments, participants saw photographs of real scenes in which the respective objects were masked by scrambling their pixels; in the second study, 3D-rendered scenes were used in which the objects were removed entirely. Participants were consistently slower and less accurate in the REL condition than in the UNREL and RAND conditions.
To control for confounding factors and ensure the comparability of the two stimulus sets, we conducted a scene categorization task on both scene types, each lacking either anchors or random objects. No interaction between scene type and missing object was observed, suggesting that object removal affected both stimulus sets similarly.
These studies suggest that affordance understanding relies on object-level rather than scene-level information. Scene affordance appears to be strongly determined by the objects a scene contains.
Keywords: scene perception, affordances, scene grammar, actions