AITopics | Large Language Model


'Existential catastrophe' from AI is likely unavoidable, DeepMind researcher warns

#artificialintelligence Sep-15-2022, 15:23:23 GMT

Researchers from the University of Oxford and Google's artificial intelligence division DeepMind have claimed that there is a high probability of advanced forms of AI becoming "existentially dangerous to life on Earth". In a recent article in the peer-reviewed journal AI Magazine, the researchers warned that there would be "catastrophic consequences" if the development of certain AI agents continues. Leading philosophers like Oxford University's Nick Bostrom have previously spoken of the threat posed by advanced forms of artificial intelligence, though one of the authors of the new paper claimed such warnings did not go far enough.

advanced form, deepmind researcher, existential catastrophe, (1 more...)

#artificialintelligence

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.74)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)


Natural Language Inference Prompts for Zero-shot Emotion Classification in Text across Corpora

Plaza-del-Arco, Flor Miriam, Martín-Valdivia, María-Teresa, Klinger, Roman

arXiv.org Artificial Intelligence Sep-15-2022

Within textual emotion classification, the set of relevant labels depends on the domain and application scenario and might not be known at the time of model development. This conflicts with the classical paradigm of supervised learning in which the labels need to be predefined. A solution to obtain a model with a flexible set of labels is to use the paradigm of zero-shot learning as a natural language inference task, which has the added advantage of not needing any labeled training data. This raises the question of how to prompt a natural language inference model for zero-shot emotion classification. Options for prompt formulations include the emotion name "anger" alone or the statement "This text expresses anger". With this paper, we analyze how sensitive a natural language inference-based zero-shot-learning classifier is to such changes to the prompt, taking the corpus into consideration: How carefully does the prompt need to be selected? We perform experiments on an established set of emotion datasets presenting different language registers according to different sources (tweets, events, blogs) with three natural language inference models and show that indeed the choice of a particular prompt formulation needs to fit the corpus. We show that this challenge can be tackled with combinations of multiple prompts. Such an ensemble is more robust across corpora than individual prompts and shows nearly the same performance as the individual best prompt for a particular corpus.
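The prompt-ensemble idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two templates are taken from the examples in the abstract, averaging is only one possible way to combine prompts, and `toy_entail_prob` is a hypothetical stand-in for a real natural language inference model.

```python
from statistics import mean
from typing import Callable, List

# Templates based on the two examples in the abstract: the bare emotion
# name and the statement "This text expresses <emotion>".
TEMPLATES: List[str] = ["{label}", "This text expresses {label}."]


def classify(
    text: str,
    labels: List[str],
    entail_prob: Callable[[str, str], float],
    templates: List[str] = TEMPLATES,
) -> str:
    """Return the label whose hypotheses get the highest mean entailment score.

    Averaging over templates is an assumption made for this sketch.
    """
    scores = {
        label: mean(entail_prob(text, t.format(label=label)) for t in templates)
        for label in labels
    }
    return max(scores, key=scores.get)


def toy_entail_prob(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an NLI model: share of hypothesis words in the premise."""
    premise_words = set(premise.lower().split())
    words = hypothesis.lower().rstrip(".").split()
    return sum(w in premise_words for w in words) / len(words)
```

In practice `entail_prob` would wrap a trained NLI model; the toy scorer only keeps the sketch self-contained.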

computational linguistic, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2209.06701

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(18 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)


CommunityLM: Probing Partisan Worldviews from Language Models

Jiang, Hang, Beeferman, Doug, Roy, Brandon, Roy, Deb

arXiv.org Artificial Intelligence Sep-15-2022

As political attitudes have diverged ideologically in the United States, political speech has diverged linguistically. The ever-widening polarization between the US political parties is accelerated by an erosion of mutual understanding between them. We aim to make these communities more comprehensible to each other with a framework that probes community-specific responses to the same survey questions using community language models (CommunityLM). In our framework, we identify committed partisan members for each community on Twitter and fine-tune LMs on the tweets authored by them. We then assess the worldviews of the two groups using prompt-based probing of their corresponding LMs, with prompts that elicit opinions about public figures and groups surveyed by the American National Election Studies (ANES) 2020 Exploratory Testing Survey. We compare the responses generated by the LMs to the ANES survey results, and find a level of alignment that greatly exceeds several baseline methods. Our work aims to show that we can use community LMs to query the worldview of any group of people given a sufficiently large sample of their social media discussions or media diet.
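One way to picture the comparison between probed LM responses and survey results is the sketch below. It is an illustration under stated assumptions, not the paper's evaluation code: the per-response favorability scores, the community names "A" and "B", and the agreement metric are all hypothetical.

```python
from statistics import mean
from typing import Dict, List


def preferred_community(scores_a: List[float], scores_b: List[float]) -> str:
    """Name the community whose generated responses are more favorable on average.

    The scores are assumed to come from some favorability scorer applied to
    text generated by each community's LM for the same prompt.
    """
    return "A" if mean(scores_a) > mean(scores_b) else "B"


def agreement_rate(model_pref: Dict[str, str], survey_pref: Dict[str, str]) -> float:
    """Share of targets where the model-derived preference matches the survey."""
    shared = model_pref.keys() & survey_pref.keys()
    if not shared:
        raise ValueError("no targets in common")
    return sum(model_pref[t] == survey_pref[t] for t in shared) / len(shared)
```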

computational linguistic, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2209.07065

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (0.68)

Industry:

Media (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.56)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.47)


Pan More Gold from the Sand: Refining Open-domain Dialogue Training with Noisy Self-Retrieval Generation

Wang, Yihe, Li, Yitong, Wang, Yasheng, Mi, Fei, Zhou, Pingyi, Wang, Xin, Liu, Jin, Jiang, Xin, Liu, Qun

arXiv.org Artificial Intelligence Sep-15-2022

Real human conversation data are complicated, heterogeneous, and noisy, which makes building open-domain dialogue systems from them a challenging task. Such dialogue data nevertheless contain a wealth of information and knowledge that has not been fully explored. In this paper, we show that existing open-domain dialogue generation methods, which memorize context-response paired data with autoregressive or encoder-decoder language models, underutilize the training data. Unlike current approaches that use external knowledge, we explore a retrieval-generation training framework that can take advantage of the heterogeneous and noisy training data by considering them as "evidence". In particular, we use BERTScore for retrieval, which improves the quality of both the evidence and the generation. Experiments over publicly available datasets demonstrate that our method can help models generate better responses, even though such training data are usually regarded as low-quality. The performance gain is comparable to, or even better than, that obtained by enlarging the training set. We also found that model performance has a positive correlation with the relevance of the retrieved evidence. Moreover, our method performed well in zero-shot experiments, which indicates that it can be more robust to real-world data.
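The retrieval step, in which training pairs are treated as "evidence", might look roughly like the following sketch. It makes several assumptions: token-overlap F1 is a crude stand-in for BERTScore (which requires a neural model), skipping exact self-matches is a guess at how self-retrieval avoids trivial hits, and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def overlap_f1(a: str, b: str) -> float:
    """Token-overlap F1 between two strings; a simple substitute for BERTScore."""
    ta, tb = Counter(a.lower().split()), Counter(b.lower().split())
    common = sum((ta & tb).values())
    if common == 0:
        return 0.0
    precision = common / sum(tb.values())
    recall = common / sum(ta.values())
    return 2 * precision * recall / (precision + recall)


def retrieve_evidence(
    context: str, pairs: List[Tuple[str, str]], k: int = 2
) -> List[Tuple[str, str]]:
    """Return up to k (context, response) training pairs most similar to `context`.

    Pairs whose context equals the query are skipped (an assumption of this sketch).
    """
    candidates = [p for p in pairs if p[0] != context]
    ranked = sorted(candidates, key=lambda p: overlap_f1(context, p[0]), reverse=True)
    return ranked[:k]
```

The retrieved pairs would then be supplied to the generator alongside the dialogue context.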

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2201.11367

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)


Google Deepmind Researcher Co-Authors Paper Saying AI Will Eliminate Humanity

#artificialintelligence Sep-14-2022, 21:56:30 GMT

Update: After publication, Google said in an email that this work was not done as part of co-author Marcus Hutter's work at DeepMind--rather, under his position at Australian National University--and that the DeepMind affiliation listed in the journal was an "error." Google sent the following statement: "DeepMind was not involved in this work and the paper's authors have requested corrections to reflect this. There are a wide range of views and academic interests at DeepMind, and many on our team also hold university professorships and pursue academic research separate to their work at DeepMind, through their university affiliations. While DeepMind was not involved in this work, we think deeply about the safety, ethics and wider societal impacts of AI and research and develop AI models that are safe, effective and aligned with human values. Alongside pursuing opportunities where AI can unlock widespread societal benefit, we also invest equal efforts in guarding against harmful uses."

affiliation, eliminate humanity, google deepmind researcher co-author paper

#artificialintelligence

Industry: Social Sector (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Google's DeepMind Has a Long-term Goal of Artificial General Intelligence

#artificialintelligence Sep-14-2022, 16:38:20 GMT

When DeepMind, an Alphabet subsidiary, started off more than a decade ago, solving some of the most pressing research questions and problems with AI wasn't at the top of the company's mind. Instead, the company started its AI research with computer games. Every score and win was a measuring stick of success, and pointed to DeepMind's AI going in the right direction. "Five years ago, we conquered the game of Go. This was a great moment," said Colin Murdoch, the chief business officer, during a fireside chat on Tuesday at the AI Hardware Summit being held in Santa Clara, California.

deepmind, google, murdoch, (10 more...)

#artificialintelligence

Country: North America > United States > California > Santa Clara County > Santa Clara (0.25)

Industry: Leisure & Entertainment > Games (0.91)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


PainPoints: A Framework for Language-based Detection of Chronic Pain and Expert-Collaborative Text-Summarization

Fadnavis, Shreyas, Dhurandhar, Amit, Norel, Raquel, Reinen, Jenna M, Agurto, Carla, Secchettin, Erica, Schweiger, Vittorio, Perini, Giovanni, Cecchi, Guillermo

arXiv.org Artificial Intelligence Sep-14-2022

Chronic pain is a pervasive disorder which is often very disabling and is associated with comorbidities such as depression and anxiety. Neuropathic Pain (NP) is a common sub-type which is often caused by nerve damage and has a known pathophysiology. Another common sub-type is Fibromyalgia (FM), which is described as musculoskeletal, diffuse pain that is widespread through the body. The pathophysiology of FM is poorly understood, making it very hard to diagnose. Standard medications and treatments for FM and NP differ from one another, and a misdiagnosis can cause an increase in symptom severity. To overcome this difficulty, we propose a novel framework, PainPoints, which accurately detects the sub-type of pain and generates clinical notes by summarizing the patient interviews. Specifically, PainPoints makes use of large language models to perform sentence-level classification of the text obtained from interviews of FM and NP patients with a reliable AUC of 0.83. Using a sufficiency-based interpretability approach, we explain how the fine-tuned model accurately picks up on the nuances that patients use to describe their pain. Finally, we generate summaries of these interviews via expert interventions by introducing a novel facet-based approach. PainPoints thus enables practitioners to add/drop facets and generate a custom summary based on the notion of "facet-coverage", which is also introduced in this work.
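For readers unfamiliar with the reported metric, the sketch below shows what an AUC value measures: the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counted as half. The label coding (1 for one pain sub-type, 0 for the other) and the example scores are hypothetical and are not taken from the paper.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Each positive/negative pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.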

facet, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2209.09814

Country:

North America > United States (0.04)
Europe > Italy > Veneto (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Can Offline Reinforcement Learning Help Natural Language Understanding?

Zhang, Ziqi, Wang, Yile, Zhang, Yue, Wang, Donglin

arXiv.org Artificial Intelligence Sep-14-2022

Pre-training has been a useful method for learning implicit transferable knowledge, and it shows the benefit of offering complementary features across different modalities. Recent work mainly focuses on modalities such as image and text; for example, studies show that visual features learned from images can help visually grounded language understanding. In this paper, we investigate the potential connection between offline reinforcement learning (RL) and language modeling (LM). Intuitively, RL and LM are similar in predicting the next states based on the current and previous states, which relies on both local and long-range dependencies across states. To validate this assumption, we pre-train Transformer models on different offline RL tasks and then evaluate these models on various language-related tasks. Experimental results show that our RL pre-trained models achieve performance close to that of models using the LM training objective, showing that there exist common useful features across these two modalities. To further explore the potential relationship, we investigate some factors such as the Markov property and the sequential nature of RL trajectories.

large language model, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2212.03864

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)


Out of One, Many: Using Language Models to Simulate Human Samples

Argyle, Lisa P., Busby, Ethan C., Fulda, Nancy, Gubler, Joshua, Rytting, Christopher, Wingate, David

arXiv.org Artificial Intelligence Sep-14-2022

We propose and explore the possibility that language models can be studied as effective proxies for specific human sub-populations in social science research. Practical and research applications of artificial intelligence tools have sometimes been limited by problematic biases (such as racism or sexism), which are often treated as uniform properties of the models. We show that the "algorithmic bias" within one such tool -- the GPT-3 language model -- is instead both fine-grained and demographically correlated, meaning that proper conditioning will cause it to accurately emulate response distributions from a wide variety of human subgroups. We term this property "algorithmic fidelity" and explore its extent in GPT-3. We create "silicon samples" by conditioning the model on thousands of socio-demographic backstories from real human participants in multiple large surveys conducted in the United States. We then compare the silicon and human samples to demonstrate that the information contained in GPT-3 goes far beyond surface similarity. It is nuanced, multifaceted, and reflects the complex interplay between ideas, attitudes, and socio-cultural context that characterize human attitudes. We suggest that language models with sufficient algorithmic fidelity thus constitute a novel and powerful tool to advance understanding of humans and society across a variety of disciplines.
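Conditioning a model on a socio-demographic backstory can be sketched as plain prompt construction. The field names, the "My ... is ..." phrasing, and the Q/A layout below are assumptions for illustration only; the abstract does not specify the prompt format.

```python
from typing import Dict


def backstory_prompt(profile: Dict[str, str], question: str) -> str:
    """Render a short first-person backstory followed by a question to complete.

    `profile` maps hypothetical survey fields (e.g. "age") to a respondent's answers.
    """
    sentences = [f"My {field} is {value}." for field, value in profile.items()]
    return " ".join(sentences) + f"\nQ: {question}\nA:"
```

Sampling one completion per real respondent profile would yield a "silicon sample" whose answer distribution can be compared with the human survey responses.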

gpt-3, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2022.acl-long.60

2209.06899

Country:

North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Wisconsin (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.67)

Industry:

Law (1.00)
Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning

Xie, Yujia, Zhou, Luowei, Dai, Xiyang, Yuan, Lu, Bach, Nguyen, Liu, Ce, Zeng, Michael

arXiv.org Artificial Intelligence Sep-14-2022

People say, "A picture is worth a thousand words". How, then, can we get the rich information out of an image? We argue that by using visual clues to bridge large pretrained vision foundation models and language models, we can do so without any extra cross-modal training. Thanks to the strong zero-shot capability of foundation models, we start by constructing a rich semantic representation of the image (e.g., image tags, object attributes / locations, captions) as a structured textual prompt, called visual clues, using a vision foundation model. Based on the visual clues, we use a large language model to produce a series of comprehensive descriptions of the visual content, which are then verified by the vision model to select the candidate that aligns best with the image. We evaluate the quality of the generated descriptions by quantitative and qualitative measurement. The results demonstrate the effectiveness of such a structured semantic representation.
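The two stages described here, building a structured textual prompt and then selecting the best candidate description, could be sketched as follows. The prompt layout, the function names, and the scoring callable are hypothetical; a real system would obtain the tags, attributes, and captions from a vision foundation model and use an image-text alignment score for selection.

```python
from typing import Callable, Dict, List


def build_visual_clues(
    tags: List[str], attributes: Dict[str, str], captions: List[str]
) -> str:
    """Format image tags, object attributes, and captions as a textual prompt."""
    lines = [
        "Tags: " + ", ".join(tags),
        "Attributes: " + "; ".join(f"{obj}: {attr}" for obj, attr in attributes.items()),
        "Captions: " + " | ".join(captions),
        "Describe the image in a paragraph:",
    ]
    return "\n".join(lines)


def select_best(candidates: List[str], image_text_score: Callable[[str], float]) -> str:
    """Pick the candidate description with the highest image-text score."""
    return max(candidates, key=image_text_score)
```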

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2206.01843

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
Asia > Malaysia (0.04)
(6 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Rail (1.00)
Leisure & Entertainment > Sports > Tennis (0.94)
Transportation > Ground > Road (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
