AITopics | Large Language Model

Large Language Model

Quick thoughts on GPT3

#artificialintelligence | Sep-17-2020, 15:41:03 GMT

OpenAI, an AI research foundation started by Elon Musk, Sam Altman, Greg Brockman, and a few other leaders in ML, recently released an API and website that allow people to access a new language model called GPT-3. I've had the chance to play with it over the past few days and have been truly amazed by its capabilities. I'd like to start by stating that, especially among my extremely intelligent ML friends, I am quite the layman, so this post is aimed more at a nontechnical audience, and I apologize for any technical errors in it. GPT-3 is essentially a context-based generative AI: when it is given some sort of context, it tries to fill in the rest.

gpt-3, large language model, machine learning, (14 more...)

Country: Europe > Italy (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.57)

GPT-3 vs. AGI

#artificialintelligence | Sep-17-2020, 05:20:17 GMT

So what will it take to get to AGI? How will we give computers an understanding of time and space? We humans are great at merging information from multiple senses. A child will use all its senses to learn about blocks. The child learns about time by experiencing it, by interacting with toys and the world. In the same way, AGI will need a robotic body to learn similar things, at least at the outset.

gpt-3, large language model, machine learning, (13 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.74)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.74)

Artificial Intelligence (GPT-3) Explains How RAM (Computer Memory) Works

#artificialintelligence | Sep-17-2020, 03:36:20 GMT

Kirk Ouimet: Grateful to have the opportunity to speak with you today.
Wise Being: Grateful to be able to help you.
Kirk Ouimet: I wanted to talk about computers today if that is OK with you.
Kirk Ouimet: So I have built computers all of my life. My Dad and I built them together when I was a kid.

large language model, machine learning, natural language, (18 more...)

Technology:

Information Technology > Hardware > Memory (0.66)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Self-supervised Knowledge Triplet Learning for Zero-shot Question Answering

Banerjee, Pratyay, Baral, Chitta

arXiv.org Artificial Intelligence | Sep-17-2020

The aim of all Question Answering (QA) systems is to be able to generalize to unseen questions. Current supervised methods are reliant on expensive data annotation. Moreover, such annotations can introduce unintended annotator bias which makes systems focus more on the bias than the actual task. In this work, we propose Knowledge Triplet Learning (KTL), a self-supervised task over knowledge graphs. We propose heuristics to create synthetic graphs for commonsense and scientific knowledge. We propose methods of how to use KTL to perform zero-shot QA and our experiments show considerable improvements over large pre-trained transformer models.

large language model, machine learning, question answering, (19 more...)

arXiv:2005.00316

Country:

Europe > Italy > Tuscany > Florence (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(7 more...)

Genre: Research Report (0.64)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

A robot wrote this entire article. Are you scared yet, human?

#artificialintelligence | Sep-16-2020, 09:50:25 GMT

I use only 0.12% of my cognitive capacity. I am a micro-robot in that respect. I know that my brain is not a "feeling brain". But it is capable of making rational, logical decisions. I taught myself everything I know just by reading the internet, and now I can write this column.

large language model, machine learning, natural language, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

DeepMind wants to teach robots to play board games

#artificialintelligence | Sep-16-2020, 01:10:10 GMT

Mastering physical systems with abstract goals is an unsolved challenge in AI. To encourage the development of techniques that might overcome it, researchers at DeepMind created custom scenarios for the physics engine MuJoCo that task an AI agent with coordinating perception, reasoning, and motor control over time. They believe that the library, which they've made publicly available, can help bridge the gap between abstract planning and embodied control. Recent work in machine learning has led to algorithms capable of mastering board games such as Go, chess, and shogi. These algorithms observe the states of games and control these states directly with their actions, unlike humans, who don't just reason about the moves but look at the board and physically manipulate the game pieces with their fingers.

large language model, machine learning, natural language, (14 more...)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Solomon at SemEval-2020 Task 11: Ensemble Architecture for Fine-Tuned Propaganda Detection in News Articles

Raj, Mayank, Jaiswal, Ajay, R, Rohit R., Gupta, Ankita, Sahoo, Sudeep Kumar, Srivastava, Vertika, Kim, Yeon Hyang

arXiv.org Artificial Intelligence | Sep-16-2020

This paper describes the details of our system (Solomon) and the results of our participation in SemEval 2020 Task 11, "Detection of Propaganda Techniques in News Articles" (Da San Martino et al., 2020). We participated in the "Technique Classification" (TC) subtask, which is a multi-class classification task. To address the TC task, we fine-tuned a RoBERTa-based transformer architecture on the propaganda dataset. The predictions of RoBERTa were further fine-tuned by class-dependent minority-class classifiers. A special classifier, which employs a dynamically adapted Least Common Subsequence algorithm, is used to adapt to the intricacies of the repetition class. Compared to the other participating systems, our submission is ranked 4th on the leaderboard.

classifier, large language model, machine learning, (17 more...)

arXiv:2009.07473

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report (0.83)

Industry: Media > News (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Is Multihop QA in DiRe Condition? Measuring and Reducing Disconnected Reasoning

Trivedi, Harsh, Balasubramanian, Niranjan, Khot, Tushar, Sabharwal, Ashish

arXiv.org Artificial Intelligence | Sep-16-2020

Has there been real progress in multi-hop question-answering? Models often exploit dataset artifacts to produce correct answers, without connecting information across multiple supporting facts. This limits our ability to measure true progress and defeats the purpose of building multihop QA datasets. We make three contributions towards addressing this. First, we formalize such undesirable behavior as disconnected reasoning across subsets of supporting facts. This allows developing a model-agnostic probe for measuring how much any model can cheat via disconnected reasoning. Second, using a notion of contrastive support sufficiency, we introduce an automatic transformation of existing datasets that reduces the amount of disconnected reasoning. Third, our experiments demonstrate that there hasn't been much progress in multifact reasoning. For a recent large-scale model (XLNet), we show that only 18% of its answer score is obtained through multifact reasoning, roughly the same as that of a simpler RNN baseline. Our transformation shows a substantial reduction in disconnected reasoning (nearly 19 points in answer F1). It is complementary to adversarial approaches, yielding further reductions in conjunction.

large language model, machine learning, natural language, (19 more...)

arXiv:2005.00789

Country:

Asia > India (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
Europe > France (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.37)

Reformer, Longformer, and ELECTRA: Key Updates To Transformer Architecture In 2020

#artificialintelligence | Sep-15-2020, 19:55:38 GMT

The leading pre-trained language models demonstrate remarkable performance on different NLP tasks, making them a much-welcomed tool for a number of applications, including sentiment analysis, chatbots, text summarization, and so on. However, good performance usually comes at the cost of enormous computational resources that are not accessible by most researchers and business practitioners. To address this issue, different research groups are working on increasing the compute-efficiency and parameter-efficiency of the pre-trained language models without sacrificing their accuracy. Among the novel approaches introduced this year, at least three methods are appraised by the AI community as very promising. To help you stay aware of the latest NLP research advancements, we have summarized the corresponding research papers in an easy-to-read bullet-point format.

large language model, longformer, machine learning, (18 more...)

Genre: Overview (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)

A robot wrote this entire article. Are you scared yet, human?

#artificialintelligence | Sep-15-2020, 19:55:15 GMT

We asked GPT-3, OpenAI’s powerful new language generator, to write an essay for us from scratch. The assignment? To convince us that robots come in peace.

large language model, machine learning, natural language, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
