Revealing the Dimensions Underlying Human Action Recognition
Wed-H4-Talk 7-7205
Presented by: André Bockes
How do we understand the actions performed by other agents? Examining the dimensions underlying the organization of observed actions can yield insights into how this task is achieved. To reveal such dimensions, we combined crowd-based similarity ratings of a large number of actions, depicted as short video clips, with a computational model.
First, we generated and characterised a large stimulus database of 768 one-second video clips selected from the Moments in Time dataset, spanning 256 action categories with three exemplars each. To aid in selecting and editing the final set of videos, we used a customized neural network model based on ResNet-50.
Second, we gathered 1.25 million similarity ratings using an online triplet odd-one-out task. Ratings were fed into a computational model (SPoSE), which revealed 28 meaningful action dimensions, including hand-/tool-relatedness, sport and communication. These dimensions were further tested for interpretability and usability in a separate group of participants. In this talk we will discuss how these dimensions contribute to the existing literature regarding the organization of human actions.
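To illustrate how triplet odd-one-out ratings relate to embedding dimensions, the following is a minimal sketch, not the authors' implementation. It assumes the commonly described SPoSE formulation, in which each item has a non-negative embedding vector, pairwise similarity is the dot product, and the probability of a choice is a softmax over the three pairwise similarities in a triplet. The function name, the three-dimensional embeddings, and the example actions are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def odd_one_out_probs(x_i, x_j, x_k):
    """Probability that item i, j, or k is picked as the odd one out.

    Assumption: the pair with the highest similarity is judged to belong
    together, so the remaining item is the odd one out. Probabilities
    come from a softmax over the three pairwise dot products.
    """
    s_ij = dot(x_i, x_j)
    s_ik = dot(x_i, x_k)
    s_jk = dot(x_j, x_k)

    # Subtract the maximum before exponentiating for numerical stability.
    m = max(s_ij, s_ik, s_jk)
    e_ij = math.exp(s_ij - m)
    e_ik = math.exp(s_ik - m)
    e_jk = math.exp(s_jk - m)
    z = e_ij + e_ik + e_jk

    # If pair (j, k) is the most similar, item i is the odd one out, etc.
    return {"i": e_jk / z, "j": e_ik / z, "k": e_ij / z}


# Hypothetical embeddings on three made-up dimensions
# (tool-related, sport, communication).
waving = [0.1, 0.0, 0.9]
talking = [0.0, 0.1, 0.8]
hammering = [0.9, 0.1, 0.0]

probs = odd_one_out_probs(waving, talking, hammering)
print(probs)
```

In this toy example the two communication-heavy items share the largest dot product, so the third item receives the highest odd-one-out probability. Fitting such a model would adjust the embeddings so that these probabilities match the collected ratings.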
Keywords: action-recognition, action-dimensions, computational model