Allen AI & UW Propose Unified-IO: A High-Performance, Task-Agnostic Model for CV, NLP, and Multi-Modal Tasks
Building a general-purpose unified model that can solve diverse tasks in different modalities while maintaining high performance is a long-standing challenge in the machine learning research community. A conventional approach in this direction is building models with task-specialized heads on top of a shared architectural backbone, but such models require expert knowledge to design a specialized head for each task, and their lack of parameter-sharing for new tasks limits their transfer-learning capabilities.

In the new paper Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks, a research team from the Allen Institute for AI and the University of Washington introduces UNIFIED-IO, a neural model with no task- or modality-specific branches that achieves competitive performance across a wide variety of computer vision (CV), natural language processing (NLP), and multi-modal benchmark tasks without fine-tuning.

The researchers set out to build a unified neural architecture that ML practitioners with little or no knowledge of the underlying machinery could use to efficiently and effectively train their models for new NLP and CV tasks. For models to support a variety of modalities (images, language, boxes, binary masks, segmentation, etc.), they must represent all modalities in a shared space.
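To give a concrete flavor of what "a shared space" means here: the paper casts every input and output as a sequence of discrete tokens (dense images via a learned image tokenizer, sparse structures such as bounding boxes via quantized location tokens). The sketch below is purely illustrative, not the authors' code; the `<loc_N>` token format and the bin count of 1000 are assumptions for demonstration.

```python
# Illustrative sketch (hypothetical helper, not from the paper's codebase):
# map a bounding box to discrete "location" tokens by quantizing each
# coordinate into a fixed number of bins, so the box can be interleaved
# with ordinary text tokens in one unified sequence.

def box_to_tokens(box, image_size, num_bins=1000):
    """Quantize an (x1, y1, x2, y2) box into discrete location tokens."""
    w, h = image_size
    tokens = []
    for coord, extent in zip(box, (w, h, w, h)):
        # Normalize the coordinate to [0, 1), then bucket it.
        bin_idx = min(int(coord / extent * num_bins), num_bins - 1)
        tokens.append(f"<loc_{bin_idx}>")
    return tokens

# A box in a 640x480 image becomes four location tokens.
print(box_to_tokens((64, 48, 320, 240), (640, 480)))
# → ['<loc_100>', '<loc_100>', '<loc_500>', '<loc_500>']
```

With every modality reduced to token sequences like this, a single encoder-decoder can be trained on all tasks at once, which is what removes the need for task-specific heads.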
AI2's Unified-IO can complete a range of AI tasks – TechCrunch
The Allen Institute for AI (AI2), the division within the nonprofit Allen Institute focused on machine learning research, today published its work on an AI system, called Unified-IO, that it claims is among the first to perform a "large and diverse" set of AI tasks. Unified-IO can process and create images, text and other structured data, a feat that the research team behind it says is a step toward building capable, unified general-purpose AI systems.

"We are interested in building task-agnostic [AI systems], which can enable practitioners to train [machine learning] models for new tasks with little to no knowledge of the underlying machinery," Jiasen Lu, a research scientist at AI2 who worked on Unified-IO, told TechCrunch via email. "Such unified architectures alleviate the need for task-specific parameters and system modifications, can be jointly trained to perform a large variety of tasks and can share knowledge across tasks to boost performance."

AI2's early efforts in building unified AI systems led to GPV-1 and GPV-2, two general-purpose, "vision-language" systems that supported a handful of workloads, including captioning images and answering questions.