MOMA: Multi-Object Multi-Actor Activity Parsing

Jan-17-2025, 17:14:07 GMT–Neural Information Processing Systems

Complex activities often involve multiple humans utilizing different objects to complete actions (e.g., in healthcare settings, physicians, nurses, and patients interact with each other and various medical devices). Recognizing activities poses a challenge that requires a detailed understanding of actors' roles, objects' affordances, and their associated relationships. Furthermore, these purposeful activities are composed of multiple achievable steps, including sub-activities and atomic actions, which jointly define a hierarchy of action parts. This paper introduces Activity Parsing as the overarching task of temporal segmentation and classification of activities, sub-activities, atomic actions, along with an instance-level understanding of actors, objects, and their relationships in videos. Involving multiple entities (actors and objects), we argue that traditional pair-wise relationships, often used in scene or action graphs, do not appropriately represent the dynamics between them.

activity parsing, moma, multi-object multi-actor activity parsing, (3 more...)

Neural Information Processing Systems

Jan-17-2025, 17:14:07 GMT

Conferences Web Page

Add feedback

Industry:
- Health & Medicine (1.00)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.73)