CLiMB: AContinualLearningBenchmark forVision-and-Language Tasks
–Neural Information Processing Systems
This assumption means learning separate models for language-only, vision-only, and vision-language tasks, as opposed to a single "generalist" model that can handle all modalities or subsets of them [Reed et al., 2022]. Yet, existing work suggests that knowledge grounded in multiple modalities can benefit unimodal tasks [Desai and Johnson, 2021, Jin et al., 2022].
Neural Information Processing Systems
Feb-11-2026, 16:13:02 GMT
- Country:
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Technology: