Actions in Context: The Interrelation Between Scene Functions, Actions and Object Phrases
Wed—HZ_7—Talks7—6806
Presented by: Lea Müller Karoza
Most actions occur in interaction with our environment, where scenes and objects provide cues for action possibilities. While scenes are best categorized by actions, completing actions usually requires objects. The scene grammar framework suggests that objects are grouped into "phrases," potentially reflecting functional organization. However, whether phrases are relevant for understanding actions remains largely unexplored. This study examines how objects and scene context influence action identification and scene-action fit assessment.
Across three experiments, participants viewed scenes, named actions, and identified necessary objects. Objects used for the same action were spatially closer to each other than objects used for different actions, supporting the notion of functionally organized phrases. Sparse scenes led to object-level actions (e.g., opening windows), while richer scenes evoked higher-level action schemata involving multiple objects. Purely lexical priming predominantly elicited specific, higher-level actions (e.g., cooking with utensils and ingredients). Thus, the richness of our surroundings influences how we conceptualize actions; our scene concepts prioritize functions, whereas visual prompts shift the focus towards lower-level actions.
In an independent eye-tracking study, participants viewed scenes from the previous experiments, paired with the most frequently named and spatially distinct actions. Results showed that participants focused on objects in the action-related phrases when assessing scene-action fit.
Our findings suggest that action identification and scene-action fit assessment rely on overlapping information. Participants predominantly selected objects within the same phrase when identifying items necessary for actions, and eye-tracking data confirmed a focus on these phrases during scene-action fit assessment. This highlights the functional organization of phrases in scenes.
Keywords: scene perception, object perception, action, scene grammar