Multimodal Datasets and Benchmarks for Reasoning about Dynamic Spatio-Temporality in Everyday Environments
Ugai, Takanori, Hara, Kensho, Egami, Shusaku, Fukuda, Ken
arXiv.org Artificial Intelligence
We used a 3D simulator to create synthetic video data with standardized annotations, with the aim of supporting the development of Embodied AI. Our question answering (QA) dataset measures how well a robot can understand human behavior and the environment in a home setting. Preliminary experiments suggest that the dataset is useful for measuring AI's comprehension of daily life.
Sep-16-2024
- Country:
  - Asia > Japan
    - Honshū > Kantō
      - Kanagawa Prefecture (0.04)
      - Tokyo Metropolis Prefecture > Tokyo (0.04)
  - North America
    - Puerto Rico > Peñuelas
      - Peñuelas (0.04)
    - United States > Washington
      - King County > Seattle (0.04)
- Genre:
  - Research Report (1.00)
- Technology:
  - Information Technology > Artificial Intelligence
    - Machine Learning > Neural Networks (0.48)
    - Natural Language > Large Language Model (0.47)
    - Vision (1.00)