COGITAO: A Visual Reasoning Framework To Study Compositionality & Generalization
Yassine Taoudi-Benchekroun, Klim Troyan, Pascal Sager, Stefan Gerber, Lukas Tuggener, Benjamin Grewe
–arXiv.org Artificial Intelligence
The ability to compose learned concepts and apply them in novel settings is key to human intelligence, but remains a persistent limitation of state-of-the-art machine learning models. To address this issue, we introduce COGITAO, a modular and extensible data generation framework and benchmark designed to systematically study compositionality and generalization in visual domains. Drawing inspiration from ARC-AGI's problem setting, COGITAO constructs rule-based tasks that apply a set of transformations to objects in grid-like environments. It supports composition, at adjustable depth, over a set of 28 interoperable transformations, along with extensive control over grid parametrization and object properties. This flexibility enables the creation of millions of unique task rules (several orders of magnitude more than concurrent datasets) across a wide range of difficulties, while allowing virtually unlimited sample generation per rule. We provide baseline experiments using state-of-the-art vision models, highlighting their consistent failure to generalize to novel combinations of familiar elements, despite strong in-domain performance. COGITAO is fully open-sourced, including all code and datasets, to support continued research in this field.
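The idea of composing grid transformations at adjustable depth can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three transformations, the `TRANSFORMS` table, and the `compose` and `enumerate_rules` helpers are hypothetical stand-ins and are not COGITAO's actual transformation set or API.

```python
from functools import reduce
from itertools import product

import numpy as np

# Hypothetical stand-ins for grid transformations (NOT COGITAO's real names or API).
TRANSFORMS = {
    "rot90": np.rot90,       # rotate the grid 90 degrees counterclockwise
    "mirror_h": np.fliplr,   # flip left/right
    "mirror_v": np.flipud,   # flip up/down
}


def compose(names):
    """Return a function that applies the named transformations left to right."""
    return lambda grid: reduce(lambda g, name: TRANSFORMS[name](g), names, grid)


def enumerate_rules(depth):
    """List every ordered sequence of `depth` transformations (repeats allowed)."""
    return list(product(TRANSFORMS, repeat=depth))


# A 2x2 grid with a single "object" pixel in the top-left corner.
grid = np.array([[1, 0],
                 [0, 0]])

rule = compose(["rot90", "mirror_h"])  # a depth-2 rule
out = rule(grid)                       # the pixel ends up in the bottom-right corner

n_rules_depth_2 = len(enumerate_rules(2))  # 3 ** 2 ordered sequences
```

Under this naive counting, n transformations composed at depth d give n**d ordered sequences, so 28 transformations at depth 4 would already yield 28**4 = 614,656. How the framework itself counts unique rules may differ, since grid and object parameters also vary.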
Sep-8-2025
- Country:
- Asia > China
- Guangxi Province > Nanning (0.04)
- Europe
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Italy > Tuscany
- Florence (0.04)
- Switzerland > Zürich
- Zürich (0.04)
- North America
- Canada > Ontario
- Toronto (0.04)
- United States
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Massachusetts > Suffolk County
- Boston (0.04)
- Oceania > Australia
- Genre:
- Research Report > New Finding (0.46)
- Technology: