Gato



On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective

Wen, Ying, Wan, Ziyu, Zhou, Ming, Hou, Shufang, Cao, Zhe, Le, Chenyang, Chen, Jingxiao, Tian, Zheng, Zhang, Weinan, Wang, Jun

arXiv.org Artificial Intelligence

The pervasive uncertainty and dynamic nature of real-world environments present significant challenges for the widespread implementation of machine-driven Intelligent Decision-Making (IDM) systems. Consequently, IDM should possess the ability to continuously acquire new skills and effectively generalize across a broad range of applications. The advancement of Artificial General Intelligence (AGI) that transcends task and application boundaries is critical for enhancing IDM. Recent studies have extensively investigated the Transformer neural architecture as a foundational model for various tasks, including computer vision, natural language processing, and reinforcement learning. We propose that a Foundation Decision Model (FDM) can be developed by formulating diverse decision-making tasks as sequence decoding tasks using the Transformer architecture, offering a promising solution for expanding IDM applications in complex real-world situations. In this paper, we discuss the efficiency and generalization improvements offered by a foundation decision model for IDM and explore its potential applications in multi-agent game AI, production scheduling, and robotics tasks. Lastly, we present a case study demonstrating our FDM implementation, DigitalBrain (DB1) with 1.3 billion parameters, achieving human-level performance in 870 tasks, such as text generation, image captioning, video game playing, robotic control, and traveling salesman problems. As a foundation decision model, DB1 represents an initial step toward more autonomous and efficient real-world IDM applications.
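One way to read "formulating diverse decision-making tasks as sequence decoding tasks" is the Decision-Transformer-style flattening of trajectories into interleaved (return-to-go, observation, action) tokens, which a Transformer can then decode autoregressively. The sketch below illustrates that idea only; the function and tag names are hypothetical, not taken from the DB1 paper:

```python
def returns_to_go(rewards):
    """Compute the cumulative future reward at each timestep."""
    total, out = 0.0, []
    for r in reversed(rewards):
        total += r
        out.append(total)
    return list(reversed(out))

def flatten_trajectory(observations, actions, rewards):
    """Interleave (return-to-go, observation, action) triples into one flat
    sequence, so next-action prediction becomes ordinary sequence decoding."""
    seq = []
    for g, obs, act in zip(returns_to_go(rewards), observations, actions):
        seq.extend([("RTG", g), ("OBS", obs), ("ACT", act)])
    return seq

# A two-step toy trajectory becomes a six-token sequence.
seq = flatten_trajectory(["s0", "s1"], ["a0", "a1"], [0.0, 1.0])
```

Conditioning each predicted action on the remaining return is one common choice; the specific tokenization and conditioning used by DB1 may differ.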


Closer to AGI? – O'Reilly

#artificialintelligence

DeepMind's new model, Gato, has sparked a debate on whether artificial general intelligence (AGI) is near at hand, perhaps just a matter of scale. Gato is a model that can solve multiple unrelated problems: it can play a large number of different games, label images, chat, operate a robot, and more. Not so many years ago, one problem with AI was that AI systems were only good at one thing. After IBM's Deep Blue defeated Garry Kasparov in chess, it was easy to say "But the ability to play chess isn't really what we mean by intelligence." A model that plays chess can't also play space wars.


The biggest AI breakthroughs of the last year

#artificialintelligence

In 2022, we were presented with several stunning developments in artificial intelligence (AI). Some believe that these advances push the limits of what we have now (narrow AI) toward the holy grail of artificial general intelligence (a machine that can mimic the thinking and problem-solving capacities of humans, but faster and more accurately). Among the many developments in 2022, four breakthroughs are of note and will be significant in 2023 and beyond, both within the discussions on responsible design, development, and use of AI and in the transformative power they have for our societies. First came DALL-E, the AI that can create pictures from language prompts. Many of us enjoyed playing with the tool and embracing the ability it gave us to design in new ways. Others worried about AI taking over our human creativity.


A Generalist Agent

Reed, Scott, Zolna, Konrad, Parisotto, Emilio, Colmenarejo, Sergio Gomez, Novikov, Alexander, Barth-Maron, Gabriel, Gimenez, Mai, Sulsky, Yury, Kay, Jackie, Springenberg, Jost Tobias, Eccles, Tom, Bruce, Jake, Razavi, Ali, Edwards, Ashley, Heess, Nicolas, Chen, Yutian, Hadsell, Raia, Vinyals, Oriol, Bordbar, Mahyar, de Freitas, Nando

arXiv.org Artificial Intelligence

Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks with a real robot arm and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens. In this report we describe the model and the data, and document the current capabilities of Gato.


Four thoughts on AI deep learning in 2022

#artificialintelligence

This article is part of a VB special issue. Read the full series here: How Data Privacy Is Transforming Marketing. We're putting another year of exciting developments in artificial intelligence (AI) deep learning behind us – one filled with remarkable progress, controversies and, of course, disputes. As we wrap up 2022 and prepare to embrace what 2023 has in store, here are some of the most notable overarching trends that marked this year in deep learning. One theme that has remained constant in deep learning over the past few years is the drive to create bigger neural networks.


AI2's Unified-IO can complete a range of AI tasks – TechCrunch

#artificialintelligence

The Allen Institute for AI (AI2), the division within the nonprofit Allen Institute focused on machine learning research, today published its work on an AI system, called Unified-IO, that it claims is among the first to perform a "large and diverse" set of AI tasks. Unified-IO can process and create images, text and other structured data, a feat that the research team behind it says is a step toward building capable, unified general-purpose AI systems. "We are interested in building task-agnostic [AI systems], which can enable practitioners to train [machine learning] models for new tasks with little to no knowledge of the underlying machinery," Jiasen Lu, a research scientist at AI2 who worked on Unified-IO, told TechCrunch via email. "Such unified architectures alleviate the need for task-specific parameters and system modifications, can be jointly trained to perform a large variety of tasks and can share knowledge across tasks to boost performance." AI2's early efforts in building unified AI systems led to GPV-1 and GPV-2, two general-purpose, "vision-language" systems that supported a handful of workloads including captioning images and answering questions.


Why Gato from Deepmind is a game changer - DataScienceCentral.com

#artificialintelligence

While no agent can be expected to excel in all imaginable control tasks, especially those far outside of its training distribution, we here test the hypothesis that training an agent which is generally capable on a large number of tasks is possible; and that this general agent can be adapted with little extra data to succeed at an even larger number of tasks. We hypothesize that such an agent can be obtained through scaling data, compute and model parameters, continually broadening the training distribution while maintaining performance, towards covering any task, behavior and embodiment of interest. In this setting, natural language can act as a common grounding across otherwise incompatible embodiments, unlocking combinatorial generalization to new behaviors. The guiding design principle of Gato is to train on the widest variety of relevant data possible, including diverse modalities such as images, text, proprioception, joint torques, button presses, and other discrete and continuous observations and actions. To enable processing this multi-modal data, we serialize all data into a flat sequence of tokens.
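The Gato paper describes discretizing continuous observations and actions (after mu-law companding) and shifting them past the text vocabulary, so everything lands in one flat token sequence. The sketch below is a simplified illustration of that scheme; the exact constants and helper names are assumptions, not the paper's implementation:

```python
import math

TEXT_VOCAB = 32_000   # assume text tokens occupy ids [0, 32000)
NUM_BINS = 1024       # continuous values discretized into 1024 uniform bins

def mu_law(x, mu=100.0, m=256.0):
    """Mu-law companding squashes a continuous value toward [-1, 1]."""
    return math.copysign(math.log(abs(x) * mu + 1.0) / math.log(m * mu + 1.0), x)

def tokenize_continuous(x):
    """Map one continuous observation/action value to a discrete token id."""
    v = max(-1.0, min(1.0, mu_law(x)))               # squash and clip to [-1, 1]
    bin_idx = int((v + 1.0) / 2.0 * (NUM_BINS - 1))  # uniform bins over [-1, 1]
    return TEXT_VOCAB + bin_idx                      # shift past the text vocab

def serialize(text_token_ids, continuous_values):
    """Flatten one timestep's text tokens and continuous values into one sequence."""
    return list(text_token_ids) + [tokenize_continuous(x) for x in continuous_values]

# Text ids pass through unchanged; continuous values become shifted bin ids.
seq = serialize([17, 512], [0.0, 3.5, -3.5])
```

Because every modality ends up as integer tokens in a single shared range, one Transformer with one output head can model all of them autoregressively.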


Artificial General Intelligence Is Not as Imminent as You Might Think

#artificialintelligence

To the average person, it must seem as if the field of artificial intelligence is making immense progress. According to the press releases, and some of the more gushing media accounts, OpenAI's DALL-E 2 can seemingly create spectacular images from any text; another OpenAI system called GPT-3 can talk about just about anything; and a system called Gato that was released in May by DeepMind, a division of Alphabet, seemingly worked well on every task the company could throw at it. One of DeepMind's high-level executives even went so far as to brag that in the quest for artificial general intelligence (AGI), AI that has the flexibility and resourcefulness of human intelligence, "The Game is Over!" And Elon Musk said recently that he would be surprised if we didn't have artificial general intelligence by 2029. Machines may someday be as smart as people, and perhaps even smarter, but the game is far from over.


Deepmind: Is "Gato" a precursor for general artificial intelligence?

#artificialintelligence

Deepmind's Gato solves many tasks, but none of them really well. Does the new AI system nevertheless lead the way for general artificial intelligence? Hot on the heels of OpenAI's DALL-E 2, Google's PaLM, LaMDA 2, and Deepmind's Chinchilla and Flamingo, the London-based AI company is showing off another large AI model that outperforms existing systems. Yet Deepmind's Gato is different: The model can't generate text better, describe images better, play Atari better, control robotic arms better, or orient itself in 3D spaces better than other AI systems. But Gato can do a bit of everything. Deepmind trained the Transformer-based generalist with images, text, proprioception, joint torques, keystrokes, and other discrete and continuous observations and actions.


The long, hype-strewn road to general artificial intelligence

#artificialintelligence

This story was originally published by Undark and is reproduced here as part of the Climate Desk collaboration. Last month, DeepMind, a subsidiary of technology giant Alphabet, set Silicon Valley abuzz when it announced Gato, perhaps the most versatile artificial intelligence model in existence. Billed as a "generalist agent," Gato can perform over 600 different tasks. It can drive a robot, caption images, identify objects in pictures, and more. It is probably the most advanced AI system on the planet that isn't dedicated to a singular function.