AITopics | Large Language Model

Large Language Model

LM-CORE: Language Models with Contextually Relevant External Knowledge

Kaur, Jivat Neet, Bhatia, Sumit, Aggarwal, Milan, Bansal, Rachit, Krishnamurthy, Balaji

arXiv.org Artificial Intelligence | Aug-12-2022

Large transformer-based pre-trained language models have achieved impressive performance on a variety of knowledge-intensive tasks and can capture factual knowledge in their parameters. We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements. We posit that a more efficient alternative is to provide explicit access to contextually relevant structured knowledge to the model and train it to use that knowledge. We present LM-CORE -- a general framework to achieve this -- that allows decoupling of the language model training from the external knowledge source and allows the latter to be updated without affecting the already trained model. Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks; can effectively handle knowledge updates; and performs well on two downstream tasks. We also present a thorough error analysis highlighting the successes and failures of LM-CORE.
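The idea of handing a model contextually relevant structured knowledge, kept separate from its parameters, can be illustrated with a minimal sketch. This is not the LM-CORE implementation: the toy triple store, the token-overlap ranking (a stand-in for whatever retriever the paper uses), the `[SEP]` separator, and the helper names `retrieve_triples` and `build_input` are all assumptions made for illustration.

```python
import re

# Hypothetical toy knowledge base of (subject, relation, object) triples.
KNOWLEDGE_BASE = [
    ("Paris", "capital of", "France"),
    ("Berlin", "capital of", "Germany"),
    ("Marie Curie", "born in", "Warsaw"),
]


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_triples(context, kb, top_k=2):
    """Return up to top_k triples that share a token with the context,
    ranked by token overlap (assumed stand-in for a real retriever)."""
    context_tokens = tokenize(context)
    scored = []
    for triple in kb:
        overlap = len(context_tokens & tokenize(" ".join(triple)))
        if overlap > 0:
            scored.append((overlap, triple))
    # Stable sort: ties keep their knowledge-base order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [triple for _, triple in scored[:top_k]]


def build_input(context, kb, top_k=2):
    """Prefix the context with retrieved facts; the model itself is untouched."""
    facts = retrieve_triples(context, kb, top_k)
    knowledge = " ".join(f"{s} {r} {o}." for s, r, o in facts)
    return f"{knowledge} [SEP] {context}" if knowledge else context


print(build_input("Marie Curie was born in [MASK].", KNOWLEDGE_BASE, top_k=1))
```

Because the knowledge lives in `KNOWLEDGE_BASE` rather than in model weights, editing a triple changes the text the model sees without any retraining, which is the decoupling the abstract describes.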

computational linguistic, knowledge, lm-core, (16 more...)

arXiv.org Artificial Intelligence

2208.06458

Country:

Asia > India (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.05)
(32 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Leisure & Entertainment > Sports (0.93)
Media > Television (0.68)
Government > Regional Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Best AI Writing Tools

#artificialintelligence | Aug-11-2022, 11:57:46 GMT

On the other hand, some AI writing tools process text using a third-party NLP model created by a company that specializes in that technology. This option has become the norm for many of the most recently developed AI writing tools. In fact, more and more AI writing solutions depend on a specific model known as Generative Pre-trained Transformer 3, or GPT-3. Reputed to be one of the most powerful language models around, GPT-3 has 175 billion parameters and was trained on hundreds of billions of tokens of text in order to model how people write.

best ai, gpt-3

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.65)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)

A beginner's guide to the AI apocalypse: The democratization of 'expertise'

#artificialintelligence | Aug-11-2022, 03:05:15 GMT

In this series we examine some of the most popular doomsday scenarios prognosticated by modern AI experts. Previous articles include Misaligned Objectives, Artificial Stupidity, Wall-E Syndrome, Humanity Joins the Hivemind, and Killer Robots. We've covered a lot of ground in this series (see above), but nothing comes close to our next topic. The "democratization of expertise" might sound like a good thing -- democracy, expertise, what's not to like? But it's our intent to convince you that it's the single greatest AI-related threat our species faces by the time you finish reading this article.

chatbot, democratization, expertise, (15 more...)

#artificialintelligence

Country: Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.05)

Industry:

Leisure & Entertainment (0.70)
Media (0.48)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.71)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.50)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)

Rather than threaten jobs, artificial intelligence should collaborate with human writers – Stuff

#artificialintelligence | Aug-10-2022, 21:52:14 GMT

In Sept. 2020, The Guardian published an opinion piece written by a program.

artificial intelligence, collaborate, threaten job

#artificialintelligence

Industry: Media > News (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.36)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

New-and-Improved Content Moderation Tooling

#artificialintelligence | Aug-10-2022, 21:51:16 GMT

We are introducing a new-and-improved content moderation tool: The Moderation endpoint improves upon our previous content filter, and is available for free today to OpenAI API developers. To help developers protect their applications against possible misuse, we are introducing the faster and more accurate Moderation endpoint. This endpoint provides OpenAI API developers with free access to GPT-based classifiers that detect undesired content -- an instance of using AI systems to assist with human supervision of these systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation. When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm -- content prohibited by our content policy.
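A rough sketch of how a developer might call such an endpoint and read its verdict follows. The URL `https://api.openai.com/v1/moderations`, the `{"input": ...}` request body, and the `results[0].flagged` / `results[0].categories` response shape are written from memory of OpenAI's public API reference and should be checked against the current documentation. The helper names and `SAMPLE_RESPONSE` are hypothetical, and the sample is hand-written for illustration, not real API output. The code only builds the request; it does not send it.

```python
import json
import urllib.request

# Assumed endpoint URL; verify against the current OpenAI API reference.
MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_request(text, api_key):
    """Build (but do not send) a POST request asking for a moderation verdict."""
    body = json.dumps({"input": text}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(MODERATION_URL, data=body, headers=headers, method="POST")


def flagged_categories(response):
    """Return the sorted names of categories marked true in the first result."""
    result = response["results"][0]
    return sorted(name for name, hit in result["categories"].items() if hit)


# Hypothetical, hand-written response in the assumed shape (not real output).
SAMPLE_RESPONSE = {
    "results": [
        {
            "flagged": True,
            "categories": {"hate": False, "self-harm": False, "sexual": False, "violence": True},
        }
    ]
}

print(flagged_categories(SAMPLE_RESPONSE))
```

To send the request, pass the object returned by `build_request` to `urllib.request.urlopen` and decode the JSON body before handing it to `flagged_categories`.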

classifier, moderation endpoint, new-and-improved content moderation tooling, (5 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.99)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.99)

MIT Researchers use OpenAI Codex to Build an ML-based Mathematics Problem-generator

#artificialintelligence | Aug-10-2022, 13:10:44 GMT

OpenAI Codex is one of the most powerful language-to-code GPT-3-based neural networking platforms for high-speed programming. OpenAI Codex is used in a large number of AI and machine learning projects in a safe AGI environment. As the demand for Codex programmers increases in the current era, we are witnessing a large number of AI researchers also taking to OpenAI's GPT-3 offering to improve their understanding of neural networks for complex problems. In one such development, a group of machine learning researchers and faculty members from MIT, Columbia University, Harvard University, and the University of Waterloo have built a machine learning algorithm using OpenAI Codex. This new algorithm can solve, explain and generate complex mathematical problems.

codex, mathematical problem, openai codex, (9 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Trends in AI -- August 2022

#artificialintelligence | Aug-10-2022, 11:26:24 GMT

While blockbuster research has slowed down slightly in the past month, probably because of the summer season, conferences are back at full speed in person: NAACL in Seattle, SIGIR in Madrid, and also ICML, for which we created a special guide with the help of GPT-3. Other news we'd like to highlight, to begin with, are: Every month we analyze the most recent research literature and select a varied set of 10 papers you should know of. Why: scaling laws are a pervasive empirical phenomenon in modern neural networks, where the error is observed to fall off as a power of the training set size, model size, or both. While some have embraced this fact to devise a research agenda focused on scaling up, many think there must be ways to build better models without the need for outrageous scale. This paper explores a technique -- data pruning -- that can improve the learning efficiency of NNs, "beating" scaling laws.
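The power-law relationship mentioned here, error ≈ a · N^(−α) for training set size N, becomes a straight line in log-log space, so the exponent α can be estimated with ordinary least squares. The sketch below shows that fit on synthetic numbers. The function name `fit_power_law`, the data, and the constants a = 2.0 and α = 0.5 are assumptions chosen for illustration and are not results from the paper.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error ≈ a * size**(-alpha) by least squares on log(size), log(error).

    Returns the pair (a, alpha).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance          # slope of the log-log line is -alpha
    intercept = mean_y - slope * mean_x    # intercept is log(a)
    return math.exp(intercept), -slope


# Synthetic, noise-free data generated from a = 2.0, alpha = 0.5 (assumed values).
sizes = [1e3, 1e4, 1e5, 1e6]
errors = [2.0 * n ** -0.5 for n in sizes]

a, alpha = fit_power_law(sizes, errors)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

Under such a law with α = 0.5, halving the error requires four times as much data, which is why techniques that improve on power-law scaling are attractive.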

architecture, key insight, training dataset, (15 more...)

#artificialintelligence

Country: Europe > Spain > Galicia > Madrid (0.24)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)

Eliminate the stress of repetitive tasks with AI No one wants to - Daniel Czarnecki on LinkedIn

#artificialintelligence | Aug-10-2022, 07:55:48 GMT

The printed version of the GPT-3 book is here, and it feels *SO* rewarding to hold it in my hands! 1 year of my and Shubham's work materialised in a sleek white book with a West African giraffe cover. It's been quite a journey filled with sleepless nights, brainstorming sessions, and countless revisions. I kept thinking how fortunate I am to be able to explore the most exciting artificial intelligence ecosystem in the format of an O'Reilly book. I hope it will become a useful resource to the NLP community. If you want to get up to speed with the latest and greatest NLP (no matter your background!), this book is for you! Massive thank you to everyone who helped make this book possible.

daniel czarnecki, linkedin, repetitive task

#artificialintelligence

Technology:

Information Technology > Communications > Social Media (0.85)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.39)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.39)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.39)

Reducing Retraining by Recycling Parameter-Efficient Prompts

Lester, Brian, Yurtsever, Joshua, Shakeri, Siamak, Constant, Noah

arXiv.org Artificial Intelligence | Aug-10-2022

Parameter-efficient methods are able to use a single frozen pre-trained large language model (LLM) to perform many tasks by learning task-specific soft prompts that modulate model behavior when concatenated to the input text. However, these learned prompts are tightly coupled to a given frozen model -- if the model is updated, corresponding new prompts need to be obtained. In this work, we propose and investigate several approaches to "Prompt Recycling" where a prompt trained on a source model is transformed to work with the new target model. Our methods do not rely on supervised pairs of prompts, task-specific data, or training updates with the target model, which would be just as costly as re-tuning prompts with the target model from scratch. We show that recycling between models is possible (our best settings are able to successfully recycle 88.9% of prompts, producing a prompt that out-performs baselines), but significant performance headroom remains, requiring improved recycling techniques.
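The mechanics of a soft prompt can be sketched in a few lines: learned vectors are placed in front of the embedded input tokens, and the frozen model processes the combined sequence. The example below uses plain Python lists as stand-ins for embedding vectors. The function name `prepend_soft_prompt` and the toy numbers are hypothetical, and the sketch shows only the concatenation step, not the paper's recycling methods.

```python
def prepend_soft_prompt(soft_prompt, input_embeddings):
    """Concatenate learned prompt vectors in front of the embedded input tokens.

    Both arguments are lists of equal-length vectors (lists of floats).
    """
    if not soft_prompt:
        return list(input_embeddings)
    dim = len(soft_prompt[0])
    for vector in list(soft_prompt) + list(input_embeddings):
        if len(vector) != dim:
            raise ValueError("prompt and input embeddings must share one dimension")
    return list(soft_prompt) + list(input_embeddings)


# Toy values: a 2-vector prompt and a 2-token input, all of dimension 3.
soft_prompt = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
input_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

sequence = prepend_soft_prompt(soft_prompt, input_embeddings)
print(len(sequence))  # 2 prompt vectors + 2 token vectors
```

A mismatch in embedding size is the most visible sign of the coupling the abstract describes, but even when dimensions match, a prompt tuned against one model's embedding space is not expected to carry the same meaning in another's, hence the need for a transformation.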

linguistic, recycling, target model, (14 more...)

arXiv.org Artificial Intelligence

2208.05577

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)

Debiased Large Language Models Still Associate Muslims with Uniquely Violent Acts

Hemmatian, Babak, Varshney, Lav R.

arXiv.org Artificial Intelligence | Aug-10-2022

Recent work demonstrates a bias in the GPT-3 model towards generating violent text completions when prompted about Muslims, compared with Christians and Hindus. Two pre-registered replication attempts, one exact and one approximate, found only the weakest bias in the more recent Instruct Series version of GPT-3, fine-tuned to eliminate biased and toxic outputs. Few violent completions were observed. Additional pre-registered experiments, however, showed that using common names associated with the religions in prompts yields a highly significant increase in violent completions, also revealing a stronger second-order bias against Muslims. Names of Muslim celebrities from non-violent domains resulted in relatively fewer violent completions, suggesting that access to individualized information can steer the model away from using stereotypes. Nonetheless, content analysis revealed religion-specific violent themes containing highly offensive ideas regardless of prompt format. Our results show the need for additional debiasing of large language models to address higher-order schemas and associations.

completion, supplementary information, violent completion, (14 more...)

arXiv.org Artificial Intelligence

2208.04417

Country:

North America > United States > Illinois > Champaign County > Urbana (0.05)
Asia > Middle East > Palestine > Gaza Strip > Gaza Governorate > Gaza (0.05)
North America > United States > New Mexico (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.70)
Media > News (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)
