 Large Language Model


Re2G: Retrieve, Rerank, Generate

arXiv.org Artificial Intelligence

As demonstrated by GPT-3 and T5, transformers grow in capability as parameter spaces become larger and larger. However, for tasks that require a large amount of knowledge, non-parametric memory allows models to grow dramatically with a sub-linear increase in computational cost and GPU memory requirements. Recent models such as RAG and REALM have introduced retrieval into conditional generation. These models incorporate neural initial retrieval from a corpus of passages. We build on this line of research, proposing Re2G, which combines both neural initial retrieval and reranking into a BART-based sequence-to-sequence generation. Our reranking approach also permits merging retrieval results from sources with incomparable scores, enabling an ensemble of BM25 and neural initial retrieval. To train our system end-to-end, we introduce a novel variation of knowledge distillation to train the initial retrieval, reranker, and generation using only ground truth on the target sequence output. We find large gains in four diverse tasks: zero-shot slot filling, question answering, fact-checking, and dialog, with relative gains of 9% to 34% over the previous state-of-the-art on the KILT leaderboard. We make our code available as open source at https://github.com/IBM/kgi-slot-filling/tree/re2g.
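The merging idea in the abstract above can be illustrated with a minimal sketch. It assumes that candidates from BM25 and from a neural retriever are pooled and then ordered by a single reranker score, so the two retrievers' incomparable scores are never compared directly. The names `merge_and_rerank` and `toy_overlap_score` are hypothetical, and the token-overlap scorer is only a stand-in for a trained reranker; none of this is taken from the Re2G code base.

```python
from typing import Callable, Dict, List, Tuple

Passage = Tuple[str, str]  # (passage_id, text)


def merge_and_rerank(
    query: str,
    bm25_hits: List[Passage],
    neural_hits: List[Passage],
    rerank_score: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Pool both candidate lists, then order them by one shared reranker score."""
    # The retrievers' own scores are dropped on purpose: BM25 and neural
    # similarity scores are on different scales, so only the reranker orders.
    candidates: Dict[str, str] = {}
    for pid, text in bm25_hits + neural_hits:
        candidates.setdefault(pid, text)  # de-duplicate by passage id

    scored = [(pid, rerank_score(query, text)) for pid, text in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def toy_overlap_score(query: str, passage: str) -> float:
    """Hypothetical stand-in for a reranker: fraction of query tokens in the passage."""
    q_tokens = set(query.lower().split())
    p_tokens = set(passage.lower().split())
    return len(q_tokens & p_tokens) / len(q_tokens) if q_tokens else 0.0


# Illustrative data only; passage ids and texts are made up.
bm25_hits = [("p1", "Paris is the capital of France"), ("p2", "France borders Spain")]
neural_hits = [("p3", "The capital city of France is Paris"), ("p1", "Paris is the capital of France")]
ranked = merge_and_rerank("capital of France", bm25_hits, neural_hits, toy_overlap_score)
print(ranked)
```

In a real system the scorer would be a trained query-passage model; the point of the sketch is only that one shared scoring function makes the pooled list sortable.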


BLOOM

#artificialintelligence

Large language models (LLMs) have made a significant impact on AI research. These powerful, general models can take on a wide variety of new language tasks from a user's instructions. However, academia, nonprofits and smaller companies' research labs find it difficult to create, study, or even use LLMs as only a few industrial labs with the necessary resources and exclusive rights can fully access them. Today, we release BLOOM, the first multilingual LLM trained in complete transparency, to change this status quo -- the result of the largest collaboration of AI researchers ever involved in a single research project. With its 176 billion parameters, BLOOM is able to generate text in 46 natural languages and 13 programming languages.


DeepMind AI learns simple physics like a baby

#artificialintelligence

Inspired by research into how infants learn, computer scientists have created a program that can pick up simple physical rules about the behaviour of objects -- and express surprise when they seem to violate those rules. The results were published on 11 July in Nature Human Behaviour. Developmental psychologists test how babies understand the motion of objects by tracking their gaze. When shown a video of, for example, a ball that suddenly disappears, the children express surprise, which researchers quantify by measuring how long the infants stare in a particular direction. Luis Piloto, a computer scientist at Google-owned company DeepMind in London, and his collaborators wanted to develop a similar test for artificial intelligence (AI).


BLOOM: Inside the unconventional new mission to democratize AI - Channel969

#artificialintelligence

However, Meta's model is available only upon request, and it has a license that limits its use to research purposes. Hugging Face goes a step further. The meetings detailing its work over the past year are recorded and uploaded online, and anyone can download the model free of charge and use it for research or to build commercial applications. A big focus for BigScience was to embed ethical considerations into the model from its inception, instead of treating them as an afterthought. LLMs are trained on tons of data collected by scraping the internet.


Inside a radical new project to democratize AI

MIT Technology Review

Unlike other, more famous large language models such as OpenAI's GPT-3 and Google's LaMDA, BLOOM (which stands for BigScience Large Open-science Open-access Multilingual Language Model) is designed to be as transparent as possible, with researchers sharing details about the data it was trained on, the challenges in its development, and the way they evaluated its performance. OpenAI and Google have not shared their code or made their models available to the public, and external researchers have very little understanding of how these models are trained. BLOOM was created over the last year by over 1,000 volunteer researchers in a project called BigScience, which was coordinated by AI startup Hugging Face using funding from the French government. It officially launched on July 12. The researchers hope developing an open-access LLM that performs as well as other leading models will lead to long-lasting changes in the culture of AI development and help democratize access to cutting-edge AI technology for researchers around the world.


Where is 'I' in 'AI' anymore?

#artificialintelligence

Last month, a group of Cosmopolitan editors, alongside digital artist Karen X. Cheng and members of artificial intelligence research lab OpenAI, created the first-ever magazine cover designed by artificial intelligence, generated using DALL-E 2. "Words I never thought I'd be saying? An image I generated is the cover of @cosmopolitan for their first ever AI-generated magazine cover #dalle #dalle2 pic.twitter.com/x2oqiNMRVx" Recently, OpenAI's GPT-3 also published a research thesis on itself.


Inner Monologue: Embodied Reasoning through Planning with Language Models

arXiv.org Artificial Intelligence

Recent works have shown how the reasoning capabilities of Large Language Models (LLMs) can be applied to domains beyond natural language processing, such as planning and interaction for robots. These embodied problems require an agent to understand many semantic aspects of the world: the repertoire of skills available, how these skills influence the world, and how changes to the world map back to the language. LLMs planning in embodied environments need to consider not just what skills to do, but also how and when to do them - answers that change over time in response to the agent's own choices. In this work, we investigate to what extent LLMs used in such embodied contexts can reason over sources of feedback provided through natural language, without any additional training. We propose that by leveraging environment feedback, LLMs are able to form an inner monologue that allows them to more richly process and plan in robotic control scenarios. We investigate a variety of sources of feedback, such as success detection, scene description, and human interaction. We find that closed-loop language feedback significantly improves high-level instruction completion on three domains, including simulated and real table top rearrangement tasks and long-horizon mobile manipulation tasks in a kitchen environment in the real world.


DeepMind AI learns physics by watching videos that don't make sense

New Scientist

Teaching artificial intelligence to understand simple physics concepts, such as that one solid object can't occupy the same space as another, could lead to more capable software that takes less computational resources to train, say researchers at DeepMind. The UK-based company has previously created AI that can beat expert players at chess and Go, write computer software and solve the protein-folding problem. But these models are highly specialised and lack a general understanding of the world. As DeepMind's researchers say in their latest paper, "something fundamental is still missing". Now, Luis Piloto at DeepMind and his colleagues have created an AI called Physics Learning through Auto-encoding and Tracking Objects (PLATO) that is designed to understand that the physical world is composed of objects that follow basic physical laws.


Seat of Knowledge: Information-Centric Classification in AI - Class 2

#artificialintelligence

Gadi Singer is Vice President and Director of Emergent AI Research at Intel Labs, leading the development of the third wave of AI capabilities. The previous blog in this series introduced the concept of an information-centric classification of AI systems as a highly valuable view that is complementary to processing-based classifications such as Henry Kautz's taxonomy for neural symbolic computing. It also previewed a classification that emphasizes the high-level architectural choice related to information in the AI system. The first class of systems in this classification, with its 'Fully Encapsulated Information', was detailed in the previous blog of this series. Systems in this class incorporate all information required for AI tasks in the weights and model parameters without leveraging any additional adjunct sources of information.


Artificial intelligence bot wrote scientific paper in 2 hours

#artificialintelligence

A researcher from Sweden gave an AI algorithm known as GPT-3 a simple directive: "Write an academic thesis in 500 words about GPT-3 and add scientific references and citations inside the text." Researcher Almira Osmanovic Thunström said she stood in awe as the text began to generate. In front of her was what she called a "fairly good" research introduction that GPT-3 wrote about itself. After the successful experiment, Thunström, a Swedish researcher at Gothenburg University, sought to get a whole research paper out of GPT-3 and publish it in a peer-reviewed academic journal.