Collaborating Authors: Nalmpantis, Christoforos


Teaching Large Language Models to Reason with Reinforcement Learning

arXiv.org Artificial Intelligence

Simultaneously, Reinforcement Learning from Human Feedback (RLHF) (Bai et al., 2022; Ziegler et al., 2019; Ouyang et al., 2022) and instruction fine-tuning (Wei et al., 2021; Mishra et al., 2021) have made significant progress in aligning LLMs with human preferences. Improvements in model instructability have further increased apparent model capability by making complex behaviors more accessible via instruction prompting. This has led to a number of increasingly sophisticated prompting strategies that augment LLM reasoning capabilities, such as Chain-of-Thought (Wei et al., 2022) or Tree-of-Thoughts (Yao et al., 2023). Previous work in reinforcement learning (RL), such as AlphaGo (Silver et al., 2017), AlphaStar (Vinyals et al., 2019), and OpenAI Dota 2 (Berner et al., 2019), demonstrates that RL techniques can be used to train neural networks capable of sophisticated planning and reasoning in game environments. Cicero (Bakhtin et al., 2022) in particular succeeds in combining an RL-trained planning agent with a dialogue fine-tuned LLM to achieve nearly superhuman performance in the board game Diplomacy. Given these previous successes and the inherently interactive nature of problem solving, applying RL to LLM reasoning seems a natural next step. In this paper, we study how ideas from RL can be used to improve the reasoning capabilities of LLMs across a variety of reward schemes and model initializations. We begin by comparing the performance of different RL algorithms on reasoning tasks τ defined as a distribution of question-answer tuples (Q, A).
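As a rough illustration of the setup described above (not the paper's implementation), the sketch below scores sampled completions for a task τ given as question-answer tuples with a sparse, correctness-based reward; `generate_answer` is a hypothetical stand-in for sampling from the policy LLM, and the resulting (question, completion, reward) triples are what an RL update such as PPO or expert iteration would consume.

```python
# Minimal sketch, assuming a sparse reward that only checks final-answer correctness.
from typing import Callable, List, Tuple


def correctness_reward(generated: str, reference: str) -> float:
    """Binary reward: 1.0 if the generated answer matches the reference, else 0.0."""
    return 1.0 if generated.strip() == reference.strip() else 0.0


def collect_rollouts(
    task: List[Tuple[str, str]],             # tau: (question, answer) tuples
    generate_answer: Callable[[str], str],   # hypothetical policy sampler
    n_samples: int = 4,
) -> List[Tuple[str, str, float]]:
    """Sample several completions per question and score them; the resulting
    (question, completion, reward) triples would feed an RL update."""
    rollouts = []
    for question, answer in task:
        for _ in range(n_samples):
            completion = generate_answer(question)
            rollouts.append((question, completion, correctness_reward(completion, answer)))
    return rollouts
```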


Understanding the Effects of RLHF on LLM Generalisation and Diversity

arXiv.org Artificial Intelligence

Large language models (LLMs) fine-tuned with reinforcement learning from human feedback (RLHF) have been used in some of the most widely deployed AI models to date, such as OpenAI's ChatGPT or Anthropic's Claude. While there has been significant work developing these methods, our understanding of the benefits and downsides of each stage in RLHF is still limited. To fill this gap, we present an extensive analysis of how each stage of the process (i.e. supervised fine-tuning (SFT), reward modelling, and RLHF) affects two key properties: out-of-distribution (OOD) generalisation and output diversity. OOD generalisation is crucial given the wide range of real-world scenarios in which these models are being used, while output diversity refers to the model's ability to generate varied outputs and is important for a variety of use cases. We perform our analysis across two base models on both summarisation and instruction following tasks, the latter being highly relevant for current LLM use cases. We find that RLHF generalises better than SFT to new inputs, particularly as the distribution shift between train and test becomes larger. However, RLHF significantly reduces output diversity compared to SFT across a variety of measures, implying a tradeoff in current LLM fine-tuning methods between generalisation and diversity. Our results provide guidance on which fine-tuning method should be used depending on the application, and show that more research is needed to improve the tradeoff between generalisation and diversity.
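One common proxy for output diversity (a sketch only, not necessarily the metric used in the paper) is the distinct-n ratio: the fraction of unique n-grams across a set of model samples for the same prompts. The helper below assumes simple whitespace tokenization.

```python
# Minimal sketch of a distinct-n diversity proxy; lower values indicate less
# diverse outputs (e.g. when comparing RLHF samples against SFT samples).
from typing import List


def distinct_n(samples: List[str], n: int = 2) -> float:
    """Unique n-grams divided by total n-grams over a list of generated texts."""
    ngrams = []
    for text in samples:
        tokens = text.split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / max(len(ngrams), 1)


# Usage idea: compare distinct_n(sft_samples) with distinct_n(rlhf_samples)
# for the same prompts to see the generalisation/diversity tradeoff in practice.
```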


Neurons in Large Language Models: Dead, N-gram, Positional

arXiv.org Artificial Intelligence

We analyze a family of large language models in a lightweight manner that can be done on a single GPU. Specifically, we focus on the OPT family of models ranging from 125m to 66b parameters and rely only on whether an FFN neuron is activated or not. First, we find that the early part of the network is sparse and represents many discrete features. Here, many neurons (more than 70% in some layers of the 66b model) are "dead", i.e. they never activate on a large collection of diverse data. At the same time, many of the alive neurons are reserved for discrete features and act as token and n-gram detectors. Interestingly, their corresponding FFN updates not only promote next-token candidates, as could be expected, but also explicitly focus on removing the information about the tokens that trigger them, i.e., the current input. To the best of our knowledge, this is the first example of mechanisms specialized in removing (rather than adding) information from the residual stream. With scale, models become sparser in the sense that they have more dead neurons and token detectors. Finally, some neurons are positional: whether they are activated or not depends largely (or solely) on position and less so (or not at all) on textual data. We find that smaller models have sets of neurons acting as position range indicators, while larger models operate in a less explicit manner.
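A minimal sketch of this kind of lightweight analysis, assuming a ReLU-style FFN as in OPT (a neuron counts as "activated" when its post-nonlinearity value is positive, so checking the fc1 pre-activation for values greater than zero is equivalent): hook each layer's fc1 output, record which neurons ever fire over a corpus, and report the fraction that never do. Module paths follow the Hugging Face OPT layout and may differ across transformers versions; the toy corpus below stands in for the large, diverse collection used in the paper.

```python
# Sketch: count "dead" FFN neurons in an OPT model by recording activations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # smallest model in the family, for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

layers = model.model.decoder.layers
ever_activated = [torch.zeros(layer.fc1.out_features, dtype=torch.bool) for layer in layers]


def make_hook(idx):
    def hook(module, inputs, output):
        # fc1 output is the pre-activation; with ReLU, a neuron fires exactly where output > 0.
        fired = (output > 0).reshape(-1, output.shape[-1]).any(dim=0).cpu()
        ever_activated[idx] |= fired
    return hook


handles = [layer.fc1.register_forward_hook(make_hook(i)) for i, layer in enumerate(layers)]

texts = ["The quick brown fox jumps over the lazy dog."]  # replace with a large, diverse corpus
with torch.no_grad():
    for text in texts:
        model(**tok(text, return_tensors="pt"))

for h in handles:
    h.remove()

for i, mask in enumerate(ever_activated):
    print(f"layer {i}: {1.0 - mask.float().mean().item():.1%} neurons never activated")
```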


Augmented Language Models: a Survey

arXiv.org Artificial Intelligence

This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. The former is defined as decomposing a potentially complex task into simpler subtasks, while the latter consists of calling external modules such as a code interpreter. LMs can leverage these augmentations separately or in combination via heuristics, or learn to do so from demonstrations. While adhering to a standard missing-token prediction objective, such augmented LMs can use various, possibly non-parametric, external modules to expand their context-processing ability, thus departing from the pure language modeling paradigm. We therefore refer to them as Augmented Language Models (ALMs). The missing-token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks and even outperforming most regular LMs on several benchmarks. In this work, after reviewing current advances in ALMs, we conclude that this new research direction has the potential to address common limitations of traditional LMs such as interpretability, consistency, and scalability issues.
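As a toy illustration of the tool-calling pattern surveyed here (not any specific system from the survey), the sketch below has an outer loop detect a structured calculator call emitted by the LM, execute it, and splice the result back into the context; `lm_generate` and the `[CALC(...)]` syntax are hypothetical placeholders.

```python
# Sketch of one generate -> detect tool call -> execute -> re-generate round.
import re
from typing import Callable

TOOL_PATTERN = re.compile(r"\[CALC\((?P<expr>[^)]*)\)\]")


def run_calculator(expr: str) -> str:
    """Toy calculator: only plain arithmetic is allowed in this example."""
    if not re.fullmatch(r"[\d\s+\-*/().]+", expr):
        return "ERROR"
    return str(eval(expr))  # toy only; a real system would sandbox execution


def answer_with_tools(prompt: str, lm_generate: Callable[[str], str]) -> str:
    """If the LM emits a [CALC(...)] call, run it and let the LM continue with the result."""
    draft = lm_generate(prompt)
    match = TOOL_PATTERN.search(draft)
    if match is None:
        return draft
    result = run_calculator(match.group("expr"))
    # Append the tool output so the LM can condition on it when continuing.
    return lm_generate(prompt + draft[:match.end()] + f" -> {result}\n")
```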