AITopics | Large Language Model

Large Language Model

Lost in Space Marking

arXiv.org Artificial Intelligence Aug-2-2022

Modern NLP is dominated by large pre-trained models, systems which are large, complex, and costly to train. As a result, much research effort is put into questions of tuning and configuring the various layers and training regimes for improving prediction quality on a growing number of tasks (Rogers et al., 2020). Unfortunately, not as much research asks questions about the decisions made at the most upstream parts of the models, those that deal with input tokenization and subword vocabulary creation. In this exploratory work, we isolate a single decision point which appears to be resolved arbitrarily by existing model developers, with no consensus but also no underlying theory: should subword tokenizers mark word boundaries at the beginning or the end?

Such a claim requires empirical support, but consideration of common practice can also be offered to challenge it: for one, pre-tokenization such as punctuation separation and accent normalization is not always applied consistently when moving on to a downstream text. A model that was trained on untreated text may find it difficult to process an NER dataset (for example) where punctuation is separated from preceding words, rendering a word-final marking tokenizer more robust to change; some tokenizers like BERT's Wordpiece (Devlin et al., 2019) "mark" a class of tokens by omission, i.e. marking the non-initial pieces rather than initial ones. This discrepancy surfaces edge case effects when compared with a seemingly-equivalent tokenizer like GPT-2's (Radford et al., 2019), which marks initial pieces but only if they are prepended by a space
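The marking conventions mentioned in this excerpt can be illustrated with a toy sketch. It is not the authors' code and not any library's tokenizer: `split_pieces` is a hypothetical fixed-width segmenter standing in for a learned subword vocabulary, and the `_` suffix used for word-final marking is an assumed symbol chosen only for illustration. The `##` prefix and the `Ġ` space marker follow the conventions commonly associated with WordPiece and GPT-2 respectively.

```python
def split_pieces(word, size=3):
    """Hypothetical segmenter: chop a word into fixed-width chunks."""
    return [word[i:i + size] for i in range(0, len(word), size)]


def mark_non_initial(text):
    """WordPiece-like: mark by omission, prefixing non-initial pieces with '##'."""
    out = []
    for word in text.split():
        pieces = split_pieces(word)
        out.append(pieces[0])
        out.extend("##" + p for p in pieces[1:])
    return out


def mark_initial_after_space(text):
    """GPT-2-like: mark an initial piece only when a space precedes the word."""
    out = []
    for i, word in enumerate(text.split()):
        pieces = split_pieces(word)
        if i > 0:  # the first word of the text has no preceding space
            pieces[0] = "Ġ" + pieces[0]
        out.extend(pieces)
    return out


def mark_final(text, marker="_"):
    """Word-final marking: suffix the last piece of every word (assumed marker)."""
    out = []
    for word in text.split():
        pieces = split_pieces(word)
        pieces[-1] = pieces[-1] + marker
        out.extend(pieces)
    return out


if __name__ == "__main__":
    sample = "token marking"
    print(mark_non_initial(sample))
    print(mark_initial_after_space(sample))
    print(mark_final(sample))
```

In this sketch the first word of the input is left unmarked by the space-dependent scheme, while the other two schemes treat every word the same way. That is one simple instance of the edge-case behavior the excerpt alludes to.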

computational linguistic, morpheme, tokenizer, (13 more...)

arXiv.org Artificial Intelligence

2208.01561

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York > Erie County > Buffalo (0.04)
(5 more...)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Learning from flowsheets: A generative transformer model for autocompletion of flowsheets

Vogel, Gabriel, Balhorn, Lukas Schulze, Schweidtmann, Artur M.

arXiv.org Artificial Intelligence Aug-1-2022

We propose a novel method enabling autocompletion of chemical flowsheets. This idea is inspired by the autocompletion of text. We represent flowsheets as strings using the text-based SFILES 2.0 notation and learn the grammatical structure of the SFILES 2.0 language and common patterns in flowsheets using a transformer-based language model. We pre-train our model on synthetically generated flowsheets to learn the flowsheet language grammar. Then, we fine-tune our model in a transfer learning step on real flowsheet topologies. Finally, we use the trained model for causal language modeling to autocomplete flowsheets. Eventually, the proposed method can provide chemical engineers with recommendations during interactive flowsheet synthesis. The results demonstrate a high potential of this approach for future AI-assisted process synthesis.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2208.00859

Country:

Europe > Netherlands > South Holland > Delft (0.05)
Europe > Denmark (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Energy (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

Du, Nan, Huang, Yanping, Dai, Andrew M., Tong, Simon, Lepikhin, Dmitry, Xu, Yuanzhong, Krikun, Maxim, Zhou, Yanqi, Yu, Adams Wei, Firat, Orhan, Zoph, Barret, Fedus, Liam, Bosma, Maarten, Zhou, Zongwei, Wang, Tao, Wang, Yu Emma, Webster, Kellie, Pellat, Marie, Robinson, Kevin, Meier-Hellstern, Kathleen, Duke, Toju, Dixon, Lucas, Zhang, Kun, Le, Quoc V, Wu, Yonghui, Chen, Zhifeng, Cui, Claire

arXiv.org Artificial Intelligence Aug-1-2022

Scaling language models with more data, compute and parameters has driven significant progress in natural language processing. For example, thanks to scaling, GPT-3 was able to achieve strong results on in-context learning tasks. However, training these large dense models requires significant amounts of computing resources. In this paper, we propose and develop a family of language models named GLaM (Generalist Language Model), which uses a sparsely activated mixture-of-experts architecture to scale the model capacity while also incurring substantially less training cost compared to dense variants. The largest GLaM has 1.2 trillion parameters, which is approximately 7x larger than GPT-3. It consumes only 1/3 of the energy used to train GPT-3 and requires half of the computation flops for inference, while still achieving better overall zero-shot and one-shot performance across 29 NLP tasks.
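As a rough illustration of what "sparsely activated" means in a mixture-of-experts layer, the sketch below routes an input to only the top-scoring experts and skips the rest. It is a generic top-k gating example, not GLaM's implementation: the choice of `k=2`, the four toy experts, the renormalization of the selected weights, and passing `gate_logits` in directly (rather than computing them from the input with a learned gate) are all assumptions made for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_logits, k=2):
    """Pick the k highest-probability experts and renormalize their weights."""
    probs = softmax(gate_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(x, gate_logits, experts, k=2):
    """Weighted sum of the selected experts' outputs; other experts never run."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))


# Toy experts: each one just scales its input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

if __name__ == "__main__":
    logits = [0.0, 1.0, 3.0, 2.0]
    print(top_k_route(logits))
    print(moe_layer(1.0, logits, experts))
```

Because only `k` of the experts are evaluated per input, total parameter count can grow with the number of experts while per-input computation stays roughly fixed, which is the general trade-off the abstract describes.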

aclanthology, computational linguistic, glam, (14 more...)

arXiv.org Artificial Intelligence

2112.06905

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > New York > New York County > New York City (0.04)
(20 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Large language models can't plan, even if they write fancy essays

#artificialintelligence Jul-31-2022, 23:25:18 GMT

This article is part of our coverage of the latest in AI research. Large language models like GPT-3 have advanced to the point that it has become difficult to measure the limits of their capabilities. When you have a very large neural network that can generate articles, write software code, and engage in conversations about sentience and life, you should expect it to be able to reason about tasks and plan as a human does, right? A study by researchers at Arizona State University, Tempe, shows that when it comes to planning and thinking methodically, LLMs perform very poorly and suffer from many of the same failures observed in current deep learning systems. Interestingly, the study finds that, while very large LLMs like GPT-3 and PaLM pass many of the tests that were meant to evaluate the reasoning capabilities of artificial intelligence systems, they do so because these benchmarks are either too simplistic or too flawed and can be "cheated" through statistical tricks, something that deep learning systems are very good at.

benchmark, kambhampati, reasoning, (15 more...)

#artificialintelligence

Country: North America > United States > Arizona (0.26)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AlphaFold reveals the structure of the protein universe

#artificialintelligence Jul-31-2022, 07:45:24 GMT

To read about all our work on solving protein folding, go to deepmind.com/AlphaFold. It's been one year since we released and open sourced AlphaFold, our AI system to predict the 3D structure of a protein just from its 1D amino acid sequence, and created the AlphaFold Protein Structure Database (AlphaFold DB) to freely share this scientific knowledge with the world. Proteins are the building blocks of life; they underpin every biological process in every living thing. And, because a protein's shape is closely linked with its function, knowing a protein's structure unlocks a greater understanding of what it does and how it works. We hoped this groundbreaking resource would help accelerate scientific research and discovery globally, and that other teams could learn from and build on the advances we made with AlphaFold to create further breakthroughs.

alphafold, biology, protein, (14 more...)

#artificialintelligence

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.96)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

'The entire protein universe': AI predicts shape of nearly every known protein

#artificialintelligence Jul-31-2022, 06:00:48 GMT

The structure of the vitellogenin protein -- a precursor of egg yolk -- as predicted by the AlphaFold tool. Credit: DeepMind

From today, determining the 3D shape of almost any protein known to science will be as simple as typing in a Google search. Researchers have used AlphaFold -- the revolutionary artificial-intelligence (AI) network -- to predict the structures of some 200 million proteins from 1 million species, covering nearly every known protein on the planet. The data dump will be freely available on a database set up by DeepMind, Google's London-based AI company that developed AlphaFold, and the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), an intergovernmental organization near Cambridge, UK. "Essentially you can think of it covering the entire protein universe," DeepMind CEO Demis Hassabis said at a press briefing. The 3D shape, or structure, of a protein is what determines its function in cells.

database, prediction, protein, (13 more...)

#artificialintelligence

AI-Alerts: 2022 > 2022-08 > AAAI AI-Alert for Aug 2, 2022 (1.00)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.25)
Europe > Germany (0.05)
Asia > South Korea > Seoul > Seoul (0.05)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.51)
Information Technology > Services (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.80)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)

How We Accidentally Gave our Bots Their Personalities

#artificialintelligence Jul-31-2022, 04:21:01 GMT

A couple of months ago, we noticed that the process of optimizing our computer models to evaluate text can produce pretty cool personalities for different bots, so we figured we'd share what we've learned so far. We hope these bots can help with some of the challenges we are facing with getting persistent state out of natural language generation. We also hope that writing about how we developed them can provide some tips for our users who are helping us create new bots. So what do we mean by personalities? Here are a few examples of the intermediate step and the desired output (a score) that we generated when we played a game where we told some of our earlier bots that we were writing this blog post.

bot, commentary, evaluation bot, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.55)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.37)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.37)

This Week's Awesome Tech Stories From Around the Web (Through July 30)

#artificialintelligence Jul-31-2022, 04:21:00 GMT

ARTIFICIAL INTELLIGENCE · DeepMind Has Predicted the Structure of Almost Every Protein Known to Science Melissa Heikkilä | MIT Technology Review

awesome tech story, week

#artificialintelligence

Industry: Media > News (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.56)

Machine Learning Artificial intelligence Market 2022 by Top Key Players and Vendors

#artificialintelligence Jul-31-2022, 04:21:00 GMT

Machine Learning Artificial intelligence Market 2022 by Top Key Players and Vendors: AIBrain, Amazon, Anki, CloudMinds, Deepmind, etc…

learning artificial intelligence market 2022, top key player and vendor

#artificialintelligence

Industry: Media > News (0.73)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.57)

Machine Learning Artificial intelligence Market 2022 by Top Key Players and Vendors: AIBrain, Amazon, Anki, CloudMinds, Deepmind, etc – The Post Newspaper

#artificialintelligence Jul-31-2022, 00:51:20 GMT

The study provides an overview of how business professionals in the Global Machine Learning Artificial intelligence Market report have established a globally unique model to strategize policies that contain the deleterious impact of the COVID-19 pandemic. The report highlights the market sectors that present a positive growth trend and a positive future outlook for market participants over 2022-2027. It also covers the segments that have witnessed an increase in annual sales and market share because of significant factors like trade and other. The Machine Learning Artificial intelligence report outlines the business models and marketing strategies adopted by market players to sustain the competition and accelerate business growth in the market. Through various market scenarios, it recommends some solutions to implement in the future to stay ahead of the competition and gives detailed insights into the COVID-19 impact on the market.

learning artificial intelligence market 2022, machine learning artificial intelligence market, top key player and vendor, (6 more...)

#artificialintelligence

Country: North America > United States > Texas > Dallas County > Dallas (0.06)

Genre:

Overview (0.61)
Research Report (0.39)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.49)
Health & Medicine > Therapeutic Area > Immunology (0.49)
Health & Medicine > Epidemiology (0.49)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.42)
