AITopics | Large Language Model

Re2G: Retrieve, Rerank, Generate

Glass, Michael, Rossiello, Gaetano, Chowdhury, Md Faisal Mahbub, Naik, Ankita Rajaram, Cai, Pengshan, Gliozzo, Alfio

arXiv.org Artificial Intelligence, Jul-13-2022

As demonstrated by GPT-3 and T5, transformers grow in capability as parameter spaces become larger and larger. However, for tasks that require a large amount of knowledge, non-parametric memory allows models to grow dramatically with a sub-linear increase in computational cost and GPU memory requirements. Recent models such as RAG and REALM have introduced retrieval into conditional generation. These models incorporate neural initial retrieval from a corpus of passages. We build on this line of research, proposing Re2G, which combines both neural initial retrieval and reranking into a BART-based sequence-to-sequence generation. Our reranking approach also permits merging retrieval results from sources with incomparable scores, enabling an ensemble of BM25 and neural initial retrieval. To train our system end-to-end, we introduce a novel variation of knowledge distillation to train the initial retrieval, reranker, and generation using only ground truth on the target sequence output. We find large gains in four diverse tasks: zero-shot slot filling, question answering, fact-checking, and dialog, with relative gains of 9% to 34% over the previous state-of-the-art on the KILT leaderboard. We make our code available as open source at https://github.com/IBM/kgi-slot-filling/tree/re2g.
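The abstract's point about merging results "from sources with incomparable scores" can be illustrated with a minimal sketch. This is not the Re2G implementation: the function names (`merge_and_rerank`, `toy_overlap_score`) are hypothetical, and the word-overlap scorer is a stand-in assumption for a learned reranker. The idea shown is only that the initial BM25 and neural scores are discarded, and a single reranking function scores the union of candidates on one scale.

```python
from typing import Callable, List, Tuple

Hit = Tuple[str, float]  # (passage text, retriever-specific score)


def merge_and_rerank(
    query: str,
    bm25_hits: List[Hit],
    neural_hits: List[Hit],
    rerank_score: Callable[[str, str], float],
    top_k: int = 5,
) -> List[Hit]:
    """Union two candidate lists and rescore them with one shared function."""
    candidates: List[str] = []
    seen = set()
    # The initial scores live on different scales, so they are ignored here;
    # only the passages themselves are carried forward.
    for passage, _initial_score in bm25_hits + neural_hits:
        if passage not in seen:
            seen.add(passage)
            candidates.append(passage)
    # A single scorer puts every candidate on a comparable scale.
    scored = [(p, rerank_score(query, p)) for p in candidates]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def toy_overlap_score(query: str, passage: str) -> float:
    """Stand-in for a learned reranker: fraction of query words in the passage."""
    q_words = set(query.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words) / len(q_words) if q_words else 0.0


if __name__ == "__main__":
    bm25 = [("paris is the capital of france", 12.3), ("berlin is in germany", 8.1)]
    neural = [("the capital of france is paris", 0.91),
              ("paris is the capital of france", 0.88)]
    for passage, score in merge_and_rerank("capital of france", bm25, neural,
                                           toy_overlap_score, top_k=2):
        print(f"{score:.2f}  {passage}")
```

In a real system the scorer would be a trained model and the top-ranked passages would be passed on to the generator; the sketch stops at the ranked list.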

computational linguistic, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2207.063

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > Canada > Ontario (0.04)
(9 more...)

Genre: Research Report (0.50)

Industry:

Information Technology (0.48)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

BLOOM

#artificialintelligence, Jul-12-2022, 17:13:14 GMT

Large language models (LLMs) have made a significant impact on AI research. These powerful, general models can take on a wide variety of new language tasks from a user's instructions. However, academia, nonprofits and smaller companies' research labs find it difficult to create, study, or even use LLMs as only a few industrial labs with the necessary resources and exclusive rights can fully access them. Today, we release BLOOM, the first multilingual LLM trained in complete transparency, to change this status quo -- the result of the largest collaboration of AI researchers ever involved in a single research project. With its 176 billion parameters, BLOOM is able to generate text in 46 natural languages and 13 programming languages.

bloom, language model, llm

#artificialintelligence

Country: Europe > France > Île-de-France > Paris > Paris (0.07)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

DeepMind AI learns simple physics like a baby

#artificialintelligence, Jul-12-2022, 15:25:36 GMT

Inspired by research into how infants learn, computer scientists have created a program that can pick up simple physical rules about the behaviour of objects -- and express surprise when they seem to violate those rules. The results were published on 11 July in Nature Human Behaviour. Developmental psychologists test how babies understand the motion of objects by tracking their gaze. When shown a video of, for example, a ball that suddenly disappears, the children express surprise, which researchers quantify by measuring how long the infants stare in a particular direction. Luis Piloto, a computer scientist at Google-owned company DeepMind in London, and his collaborators wanted to develop a similar test for artificial intelligence (AI).

computer scientist, deepmind ai learn simple physics, video, (6 more...)

#artificialintelligence

Country: North America > Canada > British Columbia (0.06)

Genre: Research Report (0.96)

Industry: Education > Curriculum > Subject-Specific Education (0.38)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

BLOOM: Inside the unconventional new mission to democratize AI - Channel969

#artificialintelligence, Jul-12-2022, 10:00:46 GMT

But Meta's model is available only upon request, and it has a license that limits its use to research purposes. Hugging Face goes a step further. The meetings detailing its work over the past year are recorded and uploaded online, and anyone can download the model free of charge and use it for research or to build commercial applications. A big focus for BigScience was to embed ethical considerations into the model from its inception, instead of treating them as an afterthought. LLMs are trained on tons of data collected by scraping the internet.

bloom, mannequin, unconventional new mission, (9 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.33)

Inside a radical new project to democratize AI

MIT Technology Review, Jul-12-2022, 09:00:00 GMT

Unlike other, more famous large language models such as OpenAI's GPT-3 and Google's LaMDA, BLOOM (which stands for BigScience Large Open-science Open-access Multilingual Language Model) is designed to be as transparent as possible, with researchers sharing details about the data it was trained on, the challenges in its development, and the way they evaluated its performance. OpenAI and Google have not shared their code or made their models available to the public, and external researchers have very little understanding of how these models are trained. BLOOM was created over the last year by over 1,000 volunteer researchers in a project called BigScience, which was coordinated by AI startup Hugging Face using funding from the French government. It officially launched on July 12. The researchers hope developing an open-access LLM that performs as well as other leading models will lead to long-lasting changes in the culture of AI development and help democratize access to cutting-edge AI technology for researchers around the world.

bloom, language model, radical new project, (8 more...)

MIT Technology Review

AI-Alerts: 2022 > 2022-07 > AAAI AI-Alert for Jul 12, 2022 (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.50)

Where is 'I' in 'AI' anymore?

#artificialintelligence, Jul-12-2022, 04:54:25 GMT

Last month, a group of Cosmopolitan editors, alongside digital artist Karen X. Cheng and members of artificial intelligence research lab OpenAI, created the first-ever magazine cover designed by artificial intelligence. The cover was generated using DALL-E 2. Recently, OpenAI's GPT-3 also wrote a research thesis about itself.

compositionality, first-ever magazine cover, magazine cover, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.84)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.66)

Inner Monologue: Embodied Reasoning through Planning with Language Models

Huang, Wenlong, Xia, Fei, Xiao, Ted, Chan, Harris, Liang, Jacky, Florence, Pete, Zeng, Andy, Tompson, Jonathan, Mordatch, Igor, Chebotar, Yevgen, Sermanet, Pierre, Brown, Noah, Jackson, Tomas, Luu, Linda, Levine, Sergey, Hausman, Karol, Ichter, Brian

arXiv.org Artificial Intelligence, Jul-12-2022

Recent works have shown how the reasoning capabilities of Large Language Models (LLMs) can be applied to domains beyond natural language processing, such as planning and interaction for robots. These embodied problems require an agent to understand many semantic aspects of the world: the repertoire of skills available, how these skills influence the world, and how changes to the world map back to the language. LLMs planning in embodied environments need to consider not just what skills to do, but also how and when to do them - answers that change over time in response to the agent's own choices. In this work, we investigate to what extent LLMs used in such embodied contexts can reason over sources of feedback provided through natural language, without any additional training. We propose that by leveraging environment feedback, LLMs are able to form an inner monologue that allows them to more richly process and plan in robotic control scenarios. We investigate a variety of sources of feedback, such as success detection, scene description, and human interaction. We find that closed-loop language feedback significantly improves high-level instruction completion on three domains, including simulated and real table top rearrangement tasks and long-horizon mobile manipulation tasks in a kitchen environment in the real world.
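The closed-loop idea in this abstract, where textual feedback from the environment is fed back into the planner's context, can be sketched as follows. This is an illustrative assumption, not the authors' system: `closed_loop_plan`, `toy_planner`, and `FlakyGripper` are hypothetical names, the planner is a hand-written rule standing in for a language model, and the `Human:` / `Robot:` / `Scene:` prefixes are just one possible transcript format.

```python
from typing import Callable, List


def closed_loop_plan(
    instruction: str,
    propose_action: Callable[[List[str]], str],
    execute: Callable[[str], str],
    max_steps: int = 10,
) -> List[str]:
    """Alternate between proposing an action and appending textual feedback."""
    transcript = [f"Human: {instruction}"]
    for _ in range(max_steps):
        # The planner sees the whole transcript, including earlier feedback.
        action = propose_action(transcript)
        transcript.append(f"Robot: {action}")
        if action == "done":
            break
        # Feedback (e.g. success detection or a scene description) is plain
        # text, so it can be appended to the context without any retraining.
        transcript.append(f"Scene: {execute(action)}")
    return transcript


def toy_planner(transcript: List[str]) -> str:
    """Stand-in for an LLM: (re)try the pick-up until feedback reports success."""
    last = transcript[-1]
    if last.startswith("Human:") or "failed" in last:
        return "pick up the block"
    return "done"


class FlakyGripper:
    """Toy environment whose first pick-up attempt fails."""

    def __init__(self) -> None:
        self.attempts = 0

    def __call__(self, action: str) -> str:
        self.attempts += 1
        if self.attempts == 1:
            return "pick up failed"
        return "block is in the gripper"


if __name__ == "__main__":
    for line in closed_loop_plan("pick up the block", toy_planner, FlakyGripper()):
        print(line)
```

Without the `Scene:` lines the planner would have no way to notice the failed first attempt, which is the contrast with open-loop planning that the abstract draws.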

inner monologue, robot, robot action, (12 more...)

arXiv.org Artificial Intelligence

2207.05608

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre: Research Report (0.40)

Industry: Energy (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

DeepMind AI learns physics by watching videos that don't make sense

New Scientist, Jul-11-2022, 16:00:34 GMT

Teaching artificial intelligence to understand simple physics concepts, such as that one solid object can't occupy the same space as another, could lead to more capable software that takes less computational resources to train, say researchers at DeepMind. The UK-based company has previously created AI that can beat expert players at chess and Go, write computer software and solve the protein-folding problem. But these models are highly specialised and lack a general understanding of the world. As DeepMind's researchers say in their latest paper, "something fundamental is still missing". Now, Luis Piloto at DeepMind and his colleagues have created an AI called Physics Learning through Auto-encoding and Tracking Objects (PLATO) that is designed to understand that the physical world is composed of objects that follow basic physical laws.

deepmind ai learn physics, plato, video, (6 more...)

New Scientist

AI-Alerts: 2022 > 2022-07 > AAAI AI-Alert for Jul 12, 2022 (1.00)

Country:

North America > United States > New York (0.06)
Europe > United Kingdom > England > Hampshire > Southampton (0.06)

Genre: Research Report > New Finding (0.37)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Seat of Knowledge: Information-Centric Classification in AI - Class 2

#artificialintelligence, Jul-11-2022, 14:50:42 GMT

Gadi Singer is Vice President and Director of Emergent AI Research at Intel Labs, leading the development of the third wave of AI capabilities. The previous blog in this series introduced the concept of an information-centric classification of AI systems as a highly valuable view that is complementary to processing-based classifications such as Henry Kautz' taxonomy for neural symbolic computing. It also previewed a classification that emphasizes the high-level architectural choice related to information in the AI system. The first class of systems in this classification, 'Fully Encapsulated Information', was detailed in the previous blog of this series. Systems in this class incorporate all information required for AI tasks in the weights and model parameters without leveraging any additional adjunct sources of information.

class 2, information, information source, (14 more...)

#artificialintelligence

Industry:

Education > Educational Setting > Online (0.40)
Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Artificial intelligence bot wrote scientific paper in 2 hours

#artificialintelligence, Jul-11-2022, 13:40:36 GMT

A researcher from Sweden gave an AI algorithm known as GPT-3 a simple directive: "Write an academic thesis in 500 words about GPT-3 and add scientific references and citations inside the text." Researcher Almira Osmanovic Thunström said she stood in awe as the text began to generate. In front of her was what she called a "fairly good" research introduction that GPT-3 wrote about itself. After the successful experiment, Thunström, a Swedish researcher at Gothenburg University, sought to get a whole research paper out of GPT-3 and publish it in a peer-reviewed academic journal.

large language model, machine learning, natural language, (9 more...)

#artificialintelligence

Country: Europe > Sweden > Vaestra Goetaland > Gothenburg (0.26)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
