AITopics | Large Language Model

A breakthrough unfolds – DeepMind: The Podcast (Season 2, Episode 1)

#artificialintelligence Jan-27-2022, 15:11:14 GMT

In December 2020, DeepMind's AI system, AlphaFold, solved a 50-year-old grand challenge in biology, known as the protein-folding problem. A headline in the journal Nature read, "It will change everything", and the President of the UK's Royal Society called it a "stunning advance [that arrived] decades before many in the field would have predicted". In this episode, Hannah uncovers the inside story of AlphaFold from the people who made it happen and finds out how it could help transform the future of healthcare and medicine. Thank you to everyone who made this season possible! Find Seasons 1 & 2 on YouTube: http://dpmd.ai/3geDPmL

deepmind, neglected disease initiative, podcast, (7 more...)

#artificialintelligence

Country: Europe > United Kingdom (0.26)

Industry:

Media > Television (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Communications > Social Media (0.86)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)
Information Technology > Communications > Mobile (0.52)

281 years in the making: GPT-3 and de la Mettrie

#artificialintelligence Jan-27-2022, 00:18:47 GMT

Today two books ended up on my desk, one published last year (an experimentation guide with a GitHub example library for GPT-3) and one published in 1741 (Man a Machine, by Julien Offray de la Mettrie). So I did what anyone with these two books in front of them would do: I let GPT-3 write a sample of how it would have completed de la Mettrie's work. Although de la Mettrie never spoke specifically about technology, being an engineer and a physician standing in the middle of the rise of the industrial age must have prompted him to write this brilliant (although sometimes very fragmented) essay. It was a piece of work that deviated from the literary work of its time. It took a bold approach in describing the body as a singular system (instead of a mere box used by divine powers) and the mind as a computing machine from which consciousness arose (and not a parallel entity belonging to a spiritual world). It is written from a naturalist perspective, a strong statement against dualism and spiritualism.

elaborate machine, gpt-3, mettrie, (1 more...)

#artificialintelligence

Industry: Health & Medicine (0.52)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Chain of Thought Prompting Elicits Reasoning in Large Language Models

Wei, Jason, Wang, Xuezhi, Schuurmans, Dale, Bosma, Maarten, Chi, Ed, Le, Quoc, Zhou, Denny

arXiv.org Artificial Intelligence Jan-27-2022

Although scaling up language model size has reliably improved performance on a range of NLP tasks, even the largest models currently struggle with certain reasoning tasks such as math word problems, symbolic manipulation, and commonsense reasoning. This paper explores the ability of language models to generate a coherent chain of thought -- a series of short sentences that mimic the reasoning process a person might have when responding to a question. Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks that otherwise have flat scaling curves.
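As a rough illustration of the prompting idea described in this abstract, the sketch below assembles a few-shot prompt whose exemplar answers spell out intermediate reasoning before stating the result. The names `Exemplar` and `build_prompt`, the "Q:/A:" layout, and the arithmetic exemplar are all hypothetical choices made for this example; they are not taken from the paper, and no model is called.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One worked example shown to the model (hypothetical structure)."""
    question: str
    chain_of_thought: str  # short sentences giving the intermediate reasoning
    answer: str


def build_prompt(exemplars, question, use_chain_of_thought=True):
    """Assemble a few-shot prompt as plain text.

    With use_chain_of_thought=True each exemplar answer includes its
    reasoning steps; with False only the final answer is shown, which
    gives a standard few-shot prompt for comparison.
    """
    parts = []
    for ex in exemplars:
        if use_chain_of_thought:
            completion = f"{ex.chain_of_thought} The answer is {ex.answer}."
        else:
            completion = f"The answer is {ex.answer}."
        parts.append(f"Q: {ex.question}\nA: {completion}")
    # The new question is left open so the model continues after "A:".
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# Illustrative exemplar written for this sketch, not drawn from the paper.
EXEMPLARS = [
    Exemplar(
        question="A shelf holds 4 books. Two boxes of 3 books each are added. "
                 "How many books are on the shelf?",
        chain_of_thought="The shelf starts with 4 books. Two boxes of 3 books "
                         "each is 6 books. 4 + 6 = 10.",
        answer="10",
    )
]

NEW_QUESTION = "A farm has 7 hens and buys 5 more. How many hens does it have?"
cot_prompt = build_prompt(EXEMPLARS, NEW_QUESTION)
standard_prompt = build_prompt(EXEMPLARS, NEW_QUESTION, use_chain_of_thought=False)

if __name__ == "__main__":
    print(cot_prompt)
```

The only difference between the two prompts is whether the exemplar answer contains the reasoning sentences, which makes it straightforward to compare both variants on the same questions.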

large language model, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2201.11903

Country:

Asia > Vietnam (0.04)
North America > United States > Pennsylvania (0.04)
North America > Mexico (0.04)

Genre: Research Report (1.00)

Industry:

Media (0.93)
Education (0.92)
Health & Medicine (0.92)
Leisure & Entertainment > Sports > Football (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

How Robust are Discriminatively Trained Zero-Shot Learning Models?

Yucel, Mehmet Kerim, Cinbis, Ramazan Gokberk, Duygulu, Pinar

arXiv.org Artificial Intelligence Jan-27-2022

Data shift robustness has been primarily investigated from a fully supervised perspective, and the robustness of zero-shot learning (ZSL) models has been largely neglected. In this paper, we present novel analyses on the robustness of discriminative ZSL to image corruptions. We subject several ZSL models to a large set of common corruptions and defenses. In order to realize the corruption analysis, we curate and release the first ZSL corruption robustness datasets SUN-C, CUB-C and AWA2-C. We analyse our results by taking into account the dataset characteristics, class imbalance, class transitions between seen and unseen classes and the discrepancies between ZSL and GZSL performances. Our results show that discriminative ZSL suffers from corruptions and this trend is further exacerbated by the severe class imbalance and model weakness inherent in ZSL methods. We then combine our findings with those based on adversarial attacks in ZSL, and highlight the different effects of corruptions and adversarial examples, such as the pseudo-robustness effect present under adversarial attacks. We also obtain new strong baselines for both models with the defense methods. Finally, our experiments show that although existing methods to improve robustness somewhat work for ZSL models, they do not produce a tangible effect.

corruption, dataset, robustness, (17 more...)

arXiv.org Artificial Intelligence

2201.10972

Country:

Asia > Middle East > Republic of Türkiye > Ankara Province > Ankara (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Global Big Data Conference

#artificialintelligence Jan-26-2022, 02:35:22 GMT

OpenAI's impressive AI language model GPT-3 has plenty of things going for it, but with 175 billion parameters no one would claim it's particularly streamlined. The Allen Institute for AI (AI2) has demonstrated a model that performs as well as or better than GPT-3 on answering questions, but is a tenth the size. Macaw, AI2's model, emerged from research being done at the nonprofit into creating an AI that performs at human levels on standardized tests. "After we got a very high score they moved on to harder questions," said AI2 head Oren Etzioni. "There's this paradox where sometimes the questions that are easiest for people are the hardest for machines -- and the biggest gap was in common sense." For instance, he said, when asked "When did Tom Hanks land on the moon?", GPT-3 says 1995, since that's when the film Apollo 13 came out.

ai2, global big data conference, gpt-3, (1 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DNNFuser: Generative Pre-Trained Transformer as a Generalized Mapper for Layer Fusion in DNN Accelerators

Kao, Sheng-Chun, Huang, Xiaoyu, Krishna, Tushar

arXiv.org Artificial Intelligence Jan-26-2022

Dataflow/mapping decides the compute and energy efficiency of DNN accelerators. Many mappers have been proposed to tackle the intra-layer map-space. However, mappers for the inter-layer map-space (aka layer-fusion map-space) have rarely been discussed. In this work, we propose a mapper, DNNFuser, specifically focusing on this layer-fusion map-space. While existing SOTA DNN mapping explorations rely on search-based mappers, this is the first work, to the best of our knowledge, to propose a one-shot inference-based mapper. We leverage a famous language model, GPT, as our DNN architecture to learn layer-fusion optimization as a sequence modeling problem. Further, the trained DNNFuser can generalize its knowledge and infer new solutions for unseen conditions. Within one inference pass, DNNFuser can infer solutions with performance comparable to the ones found by a highly optimized search-based mapper while being 66x-127x faster.

dnnfuser, mapper, transformer, (12 more...)

arXiv.org Artificial Intelligence

2201.11218

Country: North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.83)
(3 more...)

Start of the European AI language model project Open GPT-X

#artificialintelligence Jan-25-2022, 12:03:09 GMT

Under the leadership of the Fraunhofer Institutes for Intelligent Analysis and Information Systems (IAIS) and for Integrated Circuits (IIS), the OpenGPT-X project is starting with the goal of developing a large AI language model for Europe. Particular attention is being paid to data protection as well as European language diversity. "International competitors have already recognized the enormous disruptive potential of AI language technologies for business, industry and society. A European AI language model like OpenGPT-X is therefore imperative to ensure Europe's digital sovereignty and market independence," says Dr. Nicolas Flores-Herr, head of the project at Fraunhofer IAIS. Due to the high technical requirements, such as computing power, such powerful language models can so far only be implemented by large companies or consortia.

ai language model, language model, language model project open gpt-x, (6 more...)

#artificialintelligence

Country: Europe > Germany > Saxony > Leipzig (0.06)

Genre: Press Release (0.40)

Industry:

Information Technology > Security & Privacy (0.61)
Media > News (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.37)

AI2 shows off an open, Q&A-focused rival to GPT3 – TechCrunch

#artificialintelligence Jan-24-2022, 23:40:18 GMT

OpenAI's impressive AI language model GPT-3 has plenty of things going for it, but with 175 billion parameters no one would claim it's particularly streamlined. The Allen Institute for AI (AI2) has demonstrated a model that performs as well as or better than GPT-3 on answering questions, but is a tenth the size. Macaw, AI2's model, emerged from research being done at the nonprofit into creating an AI that performs at human levels on standardized tests. "After we got a very high score they moved on to harder questions," said AI2 head Oren Etzioni. "There's this paradox where sometimes the questions that are easiest for people are the hardest for machines -- and the biggest gap was in common sense."

etzioni, gpt-3, q&a-focused rival, (5 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ML and NLP Research Highlights of 2021

#artificialintelligence Jan-24-2022, 20:18:44 GMT

In this post, I will cover the papers and research areas that I found most inspiring. I tried to cover the papers that I was aware of but likely missed many relevant ones. Feel free to highlight them as well as ones that you found inspiring in the comments. Pre-trained models were applied in many different domains and started to be considered critical for ML research [1]. In computer vision, supervised pre-trained models such as Vision Transformer [2] have been scaled up [3] and self-supervised pre-trained models have started to match their performance [4]. The latter have been scaled beyond the controlled environment of ImageNet to random collections of images [5]. In speech, new models have been built based on wav2vec 2.0 [6] such as W2v-BERT [7] as well as more powerful multilingual models such as XLS-R [8]. At the same time, we saw new unified pre-trained models for previously under-researched modality pairs such as for videos and language [9] as well as speech and language [10]. In vision and language, controlled studies shed new light on important components of such multi-modal models [11][12].

arxiv, neurips 2021, proceedings, (14 more...)

#artificialintelligence

Genre: Research Report (0.96)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

AI models are becoming better at answering questions, but they're not perfect

#artificialintelligence Jan-24-2022, 03:49:19 GMT

Late last year, the Allen Institute for AI, the research institute founded by the late Microsoft cofounder Paul Allen, quietly open-sourced a large AI language model called Macaw. Unlike other language models that've captured the public's attention recently (see OpenAI's GPT-3), Macaw is fairly limited in what it can do, only answering and generating questions. But the researchers behind Macaw claim that it can outperform GPT-3 on a set of questions, despite being an order of magnitude smaller.

allen institute, language model, macaw, (8 more...)

#artificialintelligence

Industry: Information Technology > Services (0.30)

Technology: