Chételat, Didier
InnerThoughts: Disentangling Representations and Predictions in Large Language Models
Chételat, Didier, Cotnareanu, Joseph, Thompson, Rylee, Zhang, Yingxue, Coates, Mark
Large language models (LLMs) contain substantial factual knowledge, which is commonly elicited by multiple-choice question-answering prompts. Internally, such models process the prompt through multiple transformer layers, building varying representations of the problem within their hidden states. Ultimately, however, only the hidden state corresponding to the final layer and token position is used to predict the answer label. In this work, we propose instead to learn a small, separate neural network predictor module on a collection of training questions, which takes the hidden states from all layers at the last temporal position as input and outputs predictions. In effect, such a framework disentangles the representational abilities of LLMs from their predictive abilities. On a collection of hard benchmarks, our method achieves considerable improvements in performance, sometimes comparable to supervised fine-tuning procedures, but at a fraction of the computational cost.
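For illustration, here is a minimal PyTorch sketch of the kind of predictor module described above, not the authors' implementation: the layer count, hidden size, label count, and the learned layer-mixing scheme are all assumptions made for the example.

import torch
import torch.nn as nn

class HiddenStatePredictor(nn.Module):
    def __init__(self, num_layers=33, hidden_size=4096, num_labels=4, width=256):
        super().__init__()
        # One learned weight per layer, softmax-normalized to mix the per-layer states.
        self.layer_weights = nn.Parameter(torch.zeros(num_layers))
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, width),
            nn.ReLU(),
            nn.Linear(width, num_labels),
        )

    def forward(self, hidden_states):
        # hidden_states: (batch, num_layers, hidden_size), last token position only.
        weights = torch.softmax(self.layer_weights, dim=0)
        mixed = torch.einsum("l,blh->bh", weights, hidden_states)
        return self.mlp(mixed)

# Usage sketch: run the frozen LLM with output_hidden_states=True, keep the hidden
# state at the last token position for every layer, then train this module on
# labelled multiple-choice questions with a standard cross-entropy loss.
predictor = HiddenStatePredictor()
logits = predictor(torch.randn(2, 33, 4096))  # toy batch of 2 questions
print(logits.shape)  # torch.Size([2, 4])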
Hint Marginalization for Improved Reasoning in Large Language Models
Pal, Soumyasundar, Chételat, Didier, Zhang, Yingxue, Coates, Mark
Large Language Models (LLMs) have exhibited an impressive capability to perform reasoning tasks, especially if they are encouraged to generate a sequence of intermediate steps. Reasoning performance can be improved by suitably combining multiple LLM responses, generated either in parallel in a single query, or via sequential interactions with LLMs throughout the reasoning process. Existing strategies for combination, such as self-consistency and progressive-hint-prompting, make inefficient use of the LLM responses. We present Hint Marginalization, a novel and principled algorithmic framework to enhance the reasoning capabilities of LLMs. Our approach can be viewed as an iterative sampling strategy for forming a Monte Carlo approximation of an underlying distribution of answers, with the goal of identifying the mode -- the most likely answer. Empirical evaluation on several benchmark datasets for arithmetic reasoning demonstrates the superiority of the proposed approach.
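As a toy illustration of the iterative sample-and-hint loop described above (not the paper's implementation), the sketch below accumulates a Monte Carlo estimate of the answer distribution and feeds its current mode back as a hint; query_llm is a hypothetical placeholder, and the round and sample counts are arbitrary assumptions.

from collections import Counter
import random

def query_llm(question, hint=None):
    # Placeholder: a real implementation would call an LLM with temperature > 0
    # and parse the final answer from the generated chain of thought.
    return random.choice([41, 42, 42, 43]) if hint is None else 42

def hint_marginalization(question, rounds=3, samples_per_round=5):
    counts = Counter()
    hint = None
    for _ in range(rounds):
        for _ in range(samples_per_round):
            counts[query_llm(question, hint)] += 1
        hint = counts.most_common(1)[0][0]  # current mode becomes the next hint
    return hint  # final estimate of the most likely answer

print(hint_marginalization("What is 6 * 7?"))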
Exploring the Power of Graph Neural Networks in Solving Linear Optimization Problems
Qian, Chendi, Chételat, Didier, Morris, Christopher
Recently, machine learning, particularly message-passing graph neural networks (MPNNs), has gained traction in enhancing exact optimization algorithms. For example, MPNNs speed up solving mixed-integer optimization problems by imitating computationally intensive heuristics like strong branching, which entails solving multiple linear optimization problems (LPs). Despite the empirical success, the reasons behind MPNNs' effectiveness in emulating linear optimization remain largely unclear. Here, we show that MPNNs can simulate standard interior-point methods for LPs, explaining their practical success. Furthermore, we highlight how MPNNs can serve as a lightweight proxy for solving LPs, adapting to a given problem instance distribution. Empirically, we show that MPNNs solve LP relaxations of standard combinatorial optimization problems close to optimality, often surpassing conventional solvers and competing approaches in solving time.
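As a rough sketch of message passing on the variable-constraint bipartite graph of an LP (min c^T x s.t. Ax <= b, x >= 0), the example below routes messages through the constraint matrix and outputs a per-variable estimate; the feature choices, update rules, and sizes are assumptions for illustration, not the architecture studied in the paper.

import torch
import torch.nn as nn

class BipartiteMPNN(nn.Module):
    def __init__(self, dim=64, rounds=4):
        super().__init__()
        self.var_in = nn.Linear(1, dim)    # variable feature: objective coefficient
        self.con_in = nn.Linear(1, dim)    # constraint feature: right-hand side
        self.var_upd = nn.GRUCell(dim, dim)
        self.con_upd = nn.GRUCell(dim, dim)
        self.out = nn.Linear(dim, 1)
        self.rounds = rounds

    def forward(self, A, b, c):
        h_v = self.var_in(c.unsqueeze(-1))          # (n, dim) variable embeddings
        h_c = self.con_in(b.unsqueeze(-1))          # (m, dim) constraint embeddings
        for _ in range(self.rounds):
            h_c = self.con_upd(A @ h_v, h_c)        # constraints aggregate variables
            h_v = self.var_upd(A.t() @ h_c, h_v)    # variables aggregate constraints
        return self.out(h_v).squeeze(-1)            # predicted solution estimate

model = BipartiteMPNN()
A, b, c = torch.rand(3, 5), torch.rand(3), torch.rand(5)
print(model(A, b, c))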
Learning to Compare Nodes in Branch and Bound with Graph Neural Networks
Labassi, Abdel Ghani, Chételat, Didier, Lodi, Andrea
Branch-and-bound approaches in integer programming require ordering portions of the space to explore next, a problem known as node comparison. We propose a new siamese graph neural network model to tackle this problem, where the nodes are represented as bipartite graphs with attributes. Similar to prior work, we train our model to imitate a diving oracle that plunges towards the optimal solution. We evaluate our method by solving the instances in a plain framework where the nodes are explored according to their rank. On three NP-hard benchmarks chosen to be particularly primal-difficult, our approach leads to faster solving and smaller branch-and-bound trees than the default ranking function of the open-source solver SCIP, as well as competing machine learning methods. Moreover, these results generalize to instances larger than used for training. Code for reproducing the experiments can be found at https://github.com/ds4dm/learn2comparenodes.
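A toy sketch of the siamese comparison idea (illustrative only, not the released code): the shared GNN encoder is abstracted into precomputed node embeddings, and the embedding size and training target are assumptions.

import torch
import torch.nn as nn

class SiameseNodeComparator(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        # The same scoring head is applied to both nodes (siamese weight sharing).
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, emb_a, emb_b):
        # Difference of shared scores -> probability that node A ranks ahead of node B.
        return torch.sigmoid(self.score(emb_a) - self.score(emb_b))

comparator = SiameseNodeComparator()
p = comparator(torch.randn(1, 64), torch.randn(1, 64))
# Trained with binary cross-entropy against an oracle label indicating which node
# lies on the path a diving heuristic takes towards the optimal solution.
print(p)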
Combinatorial optimization and reasoning with graph neural networks
Cappart, Quentin, Chételat, Didier, Khalil, Elias, Lodi, Andrea, Morris, Christopher, Veličković, Petar
Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring the fact that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning, especially graph neural networks (GNNs), as a key building block for combinatorial tasks, either as solvers or as helper functions. GNNs provide an inductive bias that effectively encodes combinatorial and relational input, owing to their permutation invariance and sparsity awareness. This paper presents a conceptual review of recent key advancements in this emerging field, aimed at both optimization and machine learning researchers.
Change Point Detection by Cross-Entropy Maximization
Serre, Aurélien, Chételat, Didier, Lodi, Andrea
Many offline unsupervised change point detection algorithms rely on minimizing a penalized sum of segment-wise costs. We extend this framework by proposing to minimize a sum of discrepancies between segments. In particular, we propose to select the change points so as to maximize the cross-entropy between successive segments, balanced by a penalty for introducing new change points. We propose a dynamic programming algorithm to solve this problem and analyze its complexity. Experiments on two challenging datasets demonstrate the advantages of our method compared to three state-of-the-art approaches.
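A small illustrative dynamic program in the same spirit as the method above (not the paper's algorithm verbatim): the squared difference of segment means stands in for the cross-entropy discrepancy between successive segments, and the penalty value is an assumption.

import numpy as np

def discrepancy(seg_a, seg_b):
    # Stand-in for the cross-entropy between distributions fitted to the two segments.
    return (np.mean(seg_a) - np.mean(seg_b)) ** 2

def segment(x, beta=0.5):
    n = len(x)
    # best[(s, t)]: best objective over segmentations of x[:t] whose last segment is x[s:t].
    best = {(0, t): 0.0 for t in range(1, n + 1)}  # one segment, no change points yet
    back = {}
    for s in range(1, n):
        for t in range(s + 1, n + 1):
            value, prev = max(
                (best[(r, s)] + discrepancy(x[r:s], x[s:t]) - beta, r)
                for r in range(s)
            )
            best[(s, t)] = value
            back[(s, t)] = prev
    # Pick the best segmentation of the full series, then backtrack its change points.
    s, t = max((k for k in best if k[1] == n), key=lambda k: best[k])
    change_points = []
    while s > 0:
        change_points.append(s)
        s, t = back[(s, t)], s
    return sorted(change_points)

x = np.concatenate([np.zeros(10), np.ones(10) * 3.0, np.zeros(10)])
print(segment(x))  # expected: [10, 20]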
Exact Combinatorial Optimization with Graph Convolutional Neural Networks
Gasse, Maxime, Chételat, Didier, Ferroni, Nicola, Charlin, Laurent, Lodi, Andrea
Combinatorial optimization problems are typically tackled by the branch-and-bound paradigm. We propose a new graph convolutional neural network model for learning branch-and-bound variable selection policies, which leverages the natural variable-constraint bipartite graph representation of mixed-integer linear programs. We train our model via imitation learning from the strong branching expert rule, and demonstrate on a series of hard problems that our approach produces policies that improve upon state-of-the-art machine-learning methods for branching and generalize to instances significantly larger than seen during training. Moreover, we improve for the first time over expert-designed branching rules implemented in a state-of-the-art solver on large problems. Code for reproducing all the experiments can be found at https://github.com/ds4dm/learn2branch.
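A toy sketch of the imitation-learning step (illustrative, not the released code at the URL above): the graph convolutional network is abstracted into a stand-in scorer over per-variable embeddings, and the feature size and optimizer settings are assumptions.

import torch
import torch.nn as nn

policy = nn.Linear(8, 1)  # stand-in for the graph convolutional network
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def imitation_step(variable_features, expert_choice):
    # variable_features: (num_vars, 8) per-variable embeddings computed from the
    # bipartite variable-constraint graph; expert_choice: index of the variable
    # selected by the strong branching expert at this node.
    scores = policy(variable_features).squeeze(-1)            # one score per variable
    loss = nn.functional.cross_entropy(scores.unsqueeze(0),   # variables act as classes
                                       torch.tensor([expert_choice]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

print(imitation_step(torch.randn(20, 8), expert_choice=3))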