AITopics | Large Language Model

Large Language Model

Could Elon Musk's xAI be exactly what the world needs?

New Scientist | Jul-26-2023, 18:00:00 GMT

ELON MUSK, not content with helming recent purchase Twitter alongside SpaceX and his other long-standing firms, has announced an artificial intelligence start-up called xAI. People have speculated that it might be an attempt to challenge OpenAI's ChatGPT, an AI-powered chatbot that has grown to 100 million monthly users in the blink of an eye. But a veil of mystery hangs over the venture, whose goal is "to understand the true nature of the universe". Musk isn't averse to grandiose statements and marketing bluff – a SpaceX mission to Mars is seemingly always just on the horizon – but a …

elon musk

New Scientist

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

AI Leaders Create Industry Watchdog as Government Scrutiny Grows

TIME - Tech | Jul-26-2023, 11:33:22 GMT

Facing calls to put guardrails on artificial intelligence development, a group of tech companies including Alphabet Inc.'s Google and OpenAI Inc. are creating an industry body to ensure that AI models are safe. The effort, also backed by AI startup Anthropic and Microsoft Corp., aims to consolidate the expertise of member companies and create benchmarks for the industry, according to a statement Wednesday. The group, known as the Frontier Model Forum, said it welcomed participation from other organizations working on large-scale machine-learning platforms. The fast proliferation of generative AI tools such as OpenAI's ChatGPT, which can create text, photos and even video based on simple prompts, has put pressure on tech giants to tread carefully. The companies involved in the Frontier Model Forum have already agreed to put safeguards in place -- at the urging of the White House -- before Congress potentially passes binding regulations.

ai leader create industry watchdog, frontier model forum, openai, (1 more...)

TIME - Tech

Country:

North America > United States (0.75)
Europe > United Kingdom (0.07)

Industry:

Information Technology (1.00)
Government > Regional Government > North America Government > United States Government (0.55)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.89)

Congratulations to the #ICML2023 outstanding paper award winners

AIHub | Jul-26-2023, 10:55:52 GMT

This year's International Conference on Machine Learning (ICML) is taking place in Honolulu, Hawai'i from 23-29 July. The winners of the outstanding paper awards for 2023 have now been announced. One winning paper introduces an approach to obtaining a learning-rate-free optimal bound for non-smooth stochastic convex optimization. The authors propose a method that overcomes the limitations imposed by traditional learning rate selection in such problems, making a practical contribution to the field of optimization.

congratulation, icml2023 outstanding paper award winner, university, (12 more...)

AIHub

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.25)
North America > United States > North Carolina (0.07)
North America > United States > Maryland (0.05)

Genre: Personal > Honors > Award (0.71)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.33)

How User Language Affects Conflict Fatality Estimates in ChatGPT

Kazenwadel, Daniel; Steinert, Christoph V.

arXiv.org Artificial Intelligence | Jul-26-2023

OpenAI's ChatGPT language model has gained popularity as a powerful tool for complex problem-solving and information retrieval. However, concerns arise about the reproduction of biases present in the language-specific training data. In this study, we address this issue in the context of the Israeli-Palestinian and Turkish-Kurdish conflicts. Using GPT-3.5, we employed an automated query procedure to inquire about casualties in specific airstrikes, in both Hebrew and Arabic for the former conflict and Turkish and Kurdish for the latter. Our analysis reveals that GPT-3.5 provides 27 ± 11 percent lower fatality estimates when queried in the language of the attacker than in the language of the targeted group. Evasive answers denying the existence of such attacks further increase the discrepancy, creating a novel bias mechanism not present in regular search engines. This language bias has the potential to amplify existing media biases and contribute to information bubbles, ultimately reinforcing conflicts.

airstrike, conflict, information, (15 more...)

arXiv.org Artificial Intelligence

2308.00072

Country:

Asia > Middle East > Republic of Türkiye (0.28)
Asia > Middle East > Syria (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(13 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Media > News (1.00)
Law Enforcement & Public Safety (1.00)
Law (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning

Brahman, Faeze; Bhagavatula, Chandra; Pyatkin, Valentina; Hwang, Jena D.; Li, Xiang Lorraine; Arai, Hirona J.; Sanyal, Soumya; Sakaguchi, Keisuke; Ren, Xiang; Choi, Yejin

arXiv.org Artificial Intelligence | Jul-26-2023

Procedural planning, which entails decomposing a high-level goal into a sequence of temporally ordered steps, is an important yet intricate task for machines. It involves integrating common-sense knowledge to reason about complex contextualized situations that are often counterfactual, e.g. "scheduling a doctor's appointment without a phone". While current approaches show encouraging results using large language models (LLMs), they are hindered by drawbacks such as costly API calls and reproducibility issues. In this paper, we advocate planning using smaller language models. We present PlaSma, a novel two-pronged approach to endow small language models with procedural knowledge and (counterfactual) planning capabilities. More concretely, we develop symbolic procedural knowledge distillation to enhance the implicit knowledge in small language models and an inference-time algorithm to facilitate more structured and accurate reasoning. In addition, we introduce a novel task, Counterfactual Planning, that requires a revision of a plan to cope with a counterfactual situation. In both the original and counterfactual settings, we show that orders-of-magnitude smaller models (770M-11B parameters) can compete with and often surpass the capabilities of their larger teacher models.

computational linguistic, language model, step 1, (16 more...)

arXiv.org Artificial Intelligence

2305.19472

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > California (0.14)
North America > Dominican Republic (0.04)
(13 more...)

Genre:

Workflow (1.00)
Research Report (1.00)

Industry:

Health & Medicine (1.00)
Leisure & Entertainment > Games (0.46)
Education > Educational Setting (0.46)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

A Sentence is Worth a Thousand Pictures: Can Large Language Models Understand Human Language?

Marcus, Gary; Leivada, Evelina; Murphy, Elliot

arXiv.org Artificial Intelligence | Jul-26-2023

Artificial Intelligence applications show great potential for language-related tasks that rely on next-word prediction. The current generation of large language models has been linked to claims about human-like linguistic performance, and their applications are hailed both as a key step towards Artificial General Intelligence and as a major advance in understanding the cognitive, and even neural, basis of human language. We analyze the contribution of large language models as theoretically informative representations of a target system vs. atheoretical powerful mechanistic tools, and we identify the key abilities that are still missing from the current state of development and exploitation of these models.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2308.00109

Country:

North America > United States > New York (0.06)
North America > United States > Texas (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.47)
Media > News (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Building and Testing a General Intelligence Embodied in a Humanoid Robot

Gildert, Suzanne; Rose, Geordie

arXiv.org Artificial Intelligence | Jul-26-2023

Machines with human-level intelligence should be able to do most economically valuable work. This aligns a major economic incentive with the scientific grand challenge of building a human-like mind. Here we describe our approach to building and testing such a system. Our approach comprises a physical humanoid robotic system; a software-based control system for robots of this type; a performance metric, which we call g+, designed to be a measure of human-like intelligence in humanoid robots; and an evolutionary algorithm for incrementally increasing scores on this performance metric. We introduce and describe the current status of each of these. We report on current and historical measurements of the g+ metric on the systems described here.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2307.16770

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > France (0.04)
(5 more...)

Genre: Workflow (1.00)

Industry:

Leisure & Entertainment > Games (0.93)
Health & Medicine > Therapeutic Area (0.93)
Law (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.68)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.61)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Utilizing Large Language Models for Natural Interface to Pharmacology Databases

Lu, Hong; Li, Chuan; Li, Yinheng; Zhao, Jie

arXiv.org Artificial Intelligence | Jul-26-2023

The drug development process necessitates that pharmacologists undertake various tasks, such as reviewing literature, formulating hypotheses, designing experiments, and interpreting results. Each stage requires accessing and querying vast amounts of information. In this abstract, we introduce a Large Language Model (LLM)-based Natural Language Interface designed to interact with structured information stored in databases. Our experiments demonstrate the feasibility and effectiveness of the proposed framework. This framework can generalize to query a wide range of pharmaceutical data and knowledge bases.

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2307.15717

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.56)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Controllable Generation of Dialogue Acts for Dialogue Systems via Few-Shot Response Generation and Ranking

Ramirez, Angela; Agarwal, Karik; Juraska, Juraj; Garg, Utkarsh; Walker, Marilyn A.

arXiv.org Artificial Intelligence | Jul-26-2023

Dialogue systems need to produce responses that realize multiple types of dialogue acts (DAs) with high semantic fidelity. In the past, natural language generators (NLGs) for dialogue were trained on large parallel corpora that map from a domain-specific DA and its semantic attributes to an output utterance. Recent work shows that pretrained language models (LLMs) offer new possibilities for controllable NLG using prompt-based learning. Here we develop a novel few-shot overgenerate-and-rank approach that achieves the controlled generation of DAs. We compare eight few-shot prompt styles that include a novel method of generating from textual pseudo-references using a textual style transfer approach. We develop six automatic ranking functions that identify outputs with both the correct DA and high semantic accuracy at generation time. We test our approach on three domains and four LLMs. To our knowledge, this is the first work on NLG for dialogue that automatically ranks outputs using both DA and attribute accuracy. For completeness, we compare our results to fine-tuned few-shot models trained with 5 to 100 instances per DA. Our results show that several prompt settings achieve perfect DA accuracy and near-perfect semantic accuracy (99.81%), and perform better than few-shot fine-tuning.

accuracy, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2307.14440

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.04)
Asia > India > Karnataka > Bengaluru (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Skill-it! A Data-Driven Skills Framework for Understanding and Training Language Models

Chen, Mayee F.; Roberts, Nicholas; Bhatia, Kush; Wang, Jue; Zhang, Ce; Sala, Frederic; Ré, Christopher

arXiv.org Artificial Intelligence | Jul-26-2023

The quality of training data impacts the performance of pre-trained large language models (LMs). Given a fixed budget of tokens, we study how to best select data that leads to good downstream model performance across tasks. We develop a new framework based on a simple hypothesis: just as humans acquire interdependent skills in a deliberate order, language models also follow a natural order when learning a set of skills from their training data. If such an order exists, it can be utilized for improved understanding of LMs and for data-efficient training. Using this intuition, our framework formalizes the notion of a skill and of an ordered set of skills in terms of the associated data. First, using both synthetic and real data, we demonstrate that these ordered skill sets exist, and that their existence enables more advanced skills to be learned with less data when we train on their prerequisite skills. Second, using our proposed framework, we introduce an online data sampling algorithm, Skill-It, over mixtures of skills for both continual pre-training and fine-tuning regimes, where the objective is to efficiently learn multiple skills in the former and an individual skill in the latter. On the LEGO synthetic in the continual pre-training setting, Skill-It obtains 36.5 points higher accuracy than random sampling. On the Natural Instructions dataset in the fine-tuning setting, Skill-It reduces the validation loss on the target skill by 13.6% versus training on data associated with the target skill itself. We apply our skills framework on the recent RedPajama dataset to continually pre-train a 3B-parameter LM, achieving higher accuracy on the LM Evaluation Harness with 1B tokens than the baseline approach of sampling uniformly over data sources with 3B tokens.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2307.14430

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Law (0.93)
Education (0.92)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)
