Labeau, Matthieu
Learning Differentiable Surrogate Losses for Structured Prediction
Yang, Junjie, Labeau, Matthieu, d'Alché-Buc, Florence
Structured prediction involves learning to predict complex structures rather than simple scalar values. The main challenge arises from the non-Euclidean nature of the output space, which generally requires relaxing the problem formulation. Surrogate methods build on kernel-induced losses or, more generally, loss functions admitting an Implicit Loss Embedding, and convert the original problem into a regression task followed by a decoding step. However, designing effective losses for objects with complex structures presents significant challenges and often requires domain-specific expertise. In this work, we introduce a novel framework in which a structured loss function, parameterized by neural networks, is learned directly from output training data through Contrastive Learning, prior to addressing the supervised surrogate regression problem. As a result, the differentiable loss not only enables the training of neural networks, thanks to the finite dimension of the surrogate space, but also allows new output structures to be predicted via a decoding strategy based on gradient descent. Numerical experiments on supervised graph prediction problems show that our approach achieves performance similar to, or even better than, methods based on a pre-defined kernel.
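A minimal PyTorch sketch of the pipeline described above, under simplifying assumptions of our own (this is not the authors' implementation): an output embedding network psi is first trained contrastively on output data, a surrogate regressor g is then fit to map inputs to psi(y), and a new structure is decoded by gradient descent on a candidate output.

```python
# Hypothetical sketch of the pipeline described above (simplified, not the authors' code):
# 1) learn an output embedding psi by contrastive learning on output structures,
# 2) regress x -> psi(y) in the finite-dimensional surrogate space,
# 3) decode a new prediction by gradient descent on a candidate output.
import torch
import torch.nn as nn
import torch.nn.functional as F

d_in, d_out, d_emb = 16, 32, 64                    # illustrative dimensions
psi = nn.Sequential(nn.Linear(d_out, 128), nn.ReLU(), nn.Linear(128, d_emb))  # output embedding
g = nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(), nn.Linear(128, d_emb))     # surrogate regressor

def info_nce(z, z_pos, temperature=0.1):
    """InfoNCE loss: positives on the diagonal, in-batch negatives elsewhere."""
    logits = F.normalize(z, dim=-1) @ F.normalize(z_pos, dim=-1).T / temperature
    return F.cross_entropy(logits, torch.arange(z.size(0)))

x, y = torch.randn(256, d_in), torch.randn(256, d_out)   # placeholder training data
y_aug = y + 0.05 * torch.randn_like(y)                    # placeholder output augmentation

# step 1: contrastive training of the output embedding psi
opt = torch.optim.Adam(psi.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad(); info_nce(psi(y), psi(y_aug)).backward(); opt.step()

# step 2: surrogate regression x -> psi(y)
opt = torch.optim.Adam(g.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad(); F.mse_loss(g(x), psi(y).detach()).backward(); opt.step()

# step 3: decode a new structure by gradient descent on the learned, differentiable loss
x_test = torch.randn(1, d_in)
target = g(x_test).detach()
y_hat = torch.zeros(1, d_out, requires_grad=True)
dec_opt = torch.optim.Adam([y_hat], lr=1e-2)
for _ in range(200):
    dec_opt.zero_grad()
    ((psi(y_hat) - target) ** 2).sum().backward()
    dec_opt.step()
```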
Revisiting Hierarchical Text Classification: Inference and Metrics
Plaud, Roman, Labeau, Matthieu, Saillenfest, Antoine, Bonald, Thomas
Hierarchical text classification (HTC) is the task of assigning labels to a text within a structured space organized as a hierarchy. Recent works treat HTC as a conventional multilabel classification problem and therefore evaluate it as such. We instead propose to evaluate models based on specifically designed hierarchical metrics, and we demonstrate the intricacy of metric choice and prediction inference method. We introduce a new challenging dataset and we fairly evaluate recent sophisticated models, comparing them with a range of simple but strong baselines, including a new theoretically motivated loss. Finally, we show that those baselines are very often competitive with the latest models. This highlights the importance of carefully considering the evaluation methodology when proposing new methods for HTC. Code implementation and dataset are available at \url{https://github.com/RomanPlaud/revisitingHTC}.
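As a hedged illustration of what a hierarchy-aware metric looks like, the sketch below computes a standard hierarchical F1 by closing label sets under the ancestor relation before measuring set overlap; the paper's exact metrics and inference methods may differ.

```python
# Illustrative hierarchical F1: each label set is expanded with all its ancestors
# before computing set precision/recall. A common hierarchy-aware metric, not
# necessarily one of the exact metrics studied in the paper.
def with_ancestors(labels, parent):
    """Close a label set under the parent relation (parent maps child -> parent or None)."""
    closed = set()
    for lab in labels:
        while lab is not None and lab not in closed:
            closed.add(lab)
            lab = parent.get(lab)
    return closed

def hierarchical_f1(true_labels, pred_labels, parent):
    t, p = with_ancestors(true_labels, parent), with_ancestors(pred_labels, parent)
    inter = len(t & p)
    if inter == 0:
        return 0.0
    precision, recall = inter / len(p), inter / len(t)
    return 2 * precision * recall / (precision + recall)

# toy hierarchy: root -> {news, sports}, news -> politics
parent = {"news": None, "sports": None, "politics": "news"}
print(hierarchical_f1({"politics"}, {"news"}, parent))  # ~0.67: the correct ancestor is rewarded
```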
Any2Graph: Deep End-To-End Supervised Graph Prediction With An Optimal Transport Loss
Krzakala, Paul, Yang, Junjie, Flamary, Rémi, d'Alché-Buc, Florence, Laclau, Charlotte, Labeau, Matthieu
We propose Any2Graph, a generic framework for end-to-end Supervised Graph Prediction (SGP), i.e. a deep learning model that predicts an entire graph for any kind of input. The framework is built on a novel Optimal Transport loss, the Partially-Masked Fused Gromov-Wasserstein, which exhibits all the necessary properties (permutation invariance, differentiability and scalability) and is designed to handle any-sized graphs. Numerical experiments showcase the versatility of the approach, which outperforms existing competitors on a novel challenging synthetic dataset and on a variety of real-world tasks such as map construction from satellite images (Sat2Graph) or molecule prediction from fingerprints (Fingerprint2Graph).
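The Partially-Masked Fused Gromov-Wasserstein loss itself is beyond a short snippet, but the sketch below illustrates the key requirement of permutation invariance with a much simpler stand-in: predicted and target nodes are matched with the Hungarian algorithm on their features, and feature and adjacency errors are measured under that matching. All names and sizes are illustrative assumptions.

```python
# Simplified permutation-invariant graph loss (NOT the paper's PMFGW): nodes of the
# predicted and target graphs are matched with the Hungarian algorithm on node features,
# then feature and adjacency errors are measured under that matching.
import numpy as np
from scipy.optimize import linear_sum_assignment

def matched_graph_loss(F_pred, A_pred, F_true, A_true):
    """F_*: (n, d) node features, A_*: (n, n) adjacency; graphs padded to the same size n."""
    cost = ((F_pred[:, None, :] - F_true[None, :, :]) ** 2).sum(-1)   # pairwise node costs
    rows, cols = linear_sum_assignment(cost)                          # best node permutation
    feat_err = cost[rows, cols].mean()
    adj_err = ((A_pred[np.ix_(rows, rows)] - A_true[np.ix_(cols, cols)]) ** 2).mean()
    return feat_err + adj_err

n, d = 5, 3
F_t = np.random.rand(n, d)
A_t = np.triu((np.random.rand(n, n) > 0.5).astype(float), 1)
A_t = A_t + A_t.T                                                     # symmetric target adjacency
perm = np.random.permutation(n)                                       # node-permuted copy of the target
print(matched_graph_loss(F_t[perm], A_t[np.ix_(perm, perm)], F_t, A_t))  # ~0: invariant to node order
```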
The Impact of Word Splitting on the Semantic Content of Contextualized Word Representations
Soler, Aina Garí, Labeau, Matthieu, Clavel, Chloé
When deriving contextualized word representations from language models, a decision needs to be made on how to obtain one for out-of-vocabulary (OOV) words that are segmented into subwords. What is the best way to represent these words with a single vector, and are these representations of worse quality than those of in-vocabulary words? We carry out an intrinsic evaluation of embeddings from different models on semantic similarity tasks involving OOV words. Our analysis reveals, among other …
[Figure 1: Example of one of our settings, where we calculate the cosine similarity between the representations of an OOV word and a known word. We test different ways of creating one embedding for an OOV word (§4), such as AVG and LNG, on two similarity tasks (§3).]
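A small sketch of the AVG strategy mentioned in the figure caption, using the HuggingFace transformers library: a word split into several word pieces is represented by the average of their contextual vectors. The model name, layer choice, and matching logic are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of the AVG strategy: represent a word split into subwords by averaging the
# contextual vectors of its word pieces. Model and layer are illustrative choices.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_vector(sentence, word):
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]         # (seq_len, dim)
    pieces = tokenizer.tokenize(word)                      # word pieces of the target word
    toks = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    for i in range(len(toks) - len(pieces) + 1):           # simplistic subsequence matching
        if toks[i:i + len(pieces)] == pieces:
            return hidden[i:i + len(pieces)].mean(dim=0)   # AVG over subword vectors
    raise ValueError(f"{word!r} not found in tokenized sentence")

v1 = word_vector("The quokka smiled at the camera.", "quokka")   # OOV word -> several pieces
v2 = word_vector("The cat smiled at the camera.", "cat")         # in-vocabulary word
print(torch.cosine_similarity(v1, v2, dim=0).item())
```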
Exploiting Edge Features in Graphs with Fused Network Gromov-Wasserstein Distance
Yang, Junjie, Labeau, Matthieu, d'Alché-Buc, Florence
Pairwise comparison of graphs is key to many applications in machine learning, ranging from clustering and kernel-based classification/regression to, more recently, supervised graph prediction. Distances between graphs usually rely on informative representations of these structured objects, such as bags of substructures or other graph embeddings. A recently popular solution consists in representing graphs as metric measure spaces, which makes it possible to leverage Optimal Transport and its meaningful distances for comparing them: the Gromov-Wasserstein distances. However, this family of distances overlooks edge attributes, which are essential for many structured objects. In this work, we introduce an extension of the Gromov-Wasserstein distance for comparing graphs in which both nodes and edges have features. We propose novel algorithms for distance and barycenter computation. We empirically show the effectiveness of the novel distance on learning tasks where graphs occur in either the input space or the output space, such as classification and graph prediction.
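For reference, the standard fused Gromov-Wasserstein distance, which combines node-feature costs with structure costs but ignores edge features, can be computed with the POT library roughly as below (random toy graphs; the paper's extension to edge features goes beyond this baseline, and the exact POT call shown is an assumption about the library version).

```python
# Standard fused Gromov-Wasserstein distance with POT, combining a node-feature cost (M)
# and structure matrices (C1, C2). The paper's contribution additionally handles edge
# features, which this baseline does not.
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)
n1, n2, d = 6, 8, 4
X1, X2 = rng.random((n1, d)), rng.random((n2, d))        # node features of the two graphs
C1 = rng.random((n1, n1)); C1 = (C1 + C1.T) / 2          # symmetric structure matrices
C2 = rng.random((n2, n2)); C2 = (C2 + C2.T) / 2          # (e.g. adjacency or shortest paths)
p, q = ot.unif(n1), ot.unif(n2)                          # uniform node weights

M = ot.dist(X1, X2)                                      # pairwise node-feature costs
fgw_dist = ot.gromov.fused_gromov_wasserstein2(
    M, C1, C2, p, q, loss_fun="square_loss", alpha=0.5   # alpha trades features vs structure
)
print(fgw_dist)
```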
Improving Multimodal fusion via Mutual Dependency Maximisation
Colombo, Pierre, Chapuis, Emile, Labeau, Matthieu, Clavel, Chloe
Multimodal sentiment analysis is a trending area of research, and multimodal fusion is one of its most active topics. Acknowledging that humans communicate through a variety of channels (i.e. visual, acoustic, linguistic), multimodal systems aim at integrating different unimodal representations into a synthetic one. So far, considerable effort has been made on developing complex architectures allowing the fusion of these modalities. However, such systems are mainly trained by minimising simple losses such as $L_1$ or cross-entropy. In this work, we investigate unexplored penalties and propose a set of new objectives that measure the dependency between modalities. We demonstrate that our new penalties lead to a consistent improvement (up to $4.3$ accuracy points) across a large variety of state-of-the-art models on two well-known sentiment analysis datasets: \texttt{CMU-MOSI} and \texttt{CMU-MOSEI}. Our method not only achieves a new SOTA on both datasets but also produces representations that are more robust to modality drops. Finally, a by-product of our method is a statistical network which can be used to interpret the high-dimensional representations learnt by the model.
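A hedged sketch of one family of dependency measures that such objectives can build on: a MINE-style statistics network estimating a Donsker-Varadhan lower bound on the mutual information between two modality embeddings. This is a simplified illustration, not the paper's exact penalties, and all names are placeholders.

```python
# MINE-style lower bound on the mutual information between two modality embeddings,
# estimated with a small statistics network. A simplified illustration of a
# dependency-maximisation penalty, not the paper's exact objectives.
import math
import torch
import torch.nn as nn

class StatisticsNetwork(nn.Module):
    def __init__(self, d1, d2, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d1 + d2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, z1, z2):
        return self.net(torch.cat([z1, z2], dim=-1))

def mine_lower_bound(T, z1, z2):
    """Donsker-Varadhan bound: E_joint[T] - log E_marginals[exp(T)]."""
    joint = T(z1, z2).mean()
    z2_shuffled = z2[torch.randperm(z2.size(0))]          # break the pairing -> product of marginals
    scores = T(z1, z2_shuffled).squeeze(-1)
    marginal = torch.logsumexp(scores, dim=0) - math.log(z2.size(0))
    return joint - marginal

# toy usage: maximise dependency between (placeholder) text and audio embeddings
text_emb, audio_emb = torch.randn(64, 32), torch.randn(64, 48)
T = StatisticsNetwork(32, 48)
opt = torch.optim.Adam(T.parameters(), lr=1e-4)
for _ in range(10):
    opt.zero_grad()
    loss = -mine_lower_bound(T, text_emb, audio_emb)      # maximise the bound
    loss.backward(); opt.step()
```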
Code-switched inspired losses for generic spoken dialog representations
Chapuis, Emile, Colombo, Pierre, Labeau, Matthieu, Clavel, Chloe
While there has been a growing interest in pretraining for dialog (Mehri et al., 2019; Zhang et al., 2019d), the focus has mainly been on English datasets. Thus, these works cannot be directly applied to our multilingual setting. Additionally, available multilingual pretraining objectives (Lample and Conneau, 2019; Liu et al., 2020; Xue et al., 2020; Qi et al., 2021) face two main limitations when applied to dialog modeling: (1) they are a generalization of monolingual objectives that use flat input text, whereas hierarchy has been shown to be a powerful prior for dialog modeling. This is a reflection of a dialog itself; for example, context plays an essential role in the labeling of dialog acts.
A crucial step in conversational AI is the identification of underlying information in the user's utterance (e.g. communicative intent or dialog acts, and emotions). This requires modeling utterance-level information (Mitkov, 2014; Williams et al., 2014), to capture immediate nuances of the user utterance, and discourse-level features (Thornbury and Slade, 2006), to capture patterns over long ranges of the conversation. An added difficulty to this modeling problem is that most people in the world are bilingual (Grosjean and Li, 2013): therefore, progress on these systems is limited by their inability to process more than one language (English being the …
Hierarchical Pre-training for Sequence Labelling in Spoken Dialog
Chapuis, Emile, Colombo, Pierre, Manica, Matteo, Labeau, Matthieu, Clavel, Chloe
Sequence labelling tasks like Dialog Act and Emotion/Sentiment identification are a key component of spoken dialog systems. In this work, we propose a new approach to learn generic representations adapted to spoken dialog, which we evaluate on a new benchmark we call Sequence labellIng evaLuatIon benChmark fOr spoken laNguagE (\texttt{SILICONE}). \texttt{SILICONE} is model-agnostic and contains 10 different datasets of various sizes. We obtain our representations with a hierarchical encoder based on transformer architectures, for which we extend two well-known pre-training objectives. Pre-training is performed on OpenSubtitles, a large corpus of spoken dialog containing over $2.3$ billion tokens. We demonstrate how hierarchical encoders achieve competitive results with consistently fewer parameters compared to state-of-the-art models, and we show their importance for both pre-training and fine-tuning.
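A hedged sketch of the hierarchical encoding idea: a word-level transformer encodes each utterance, and a second, dialog-level transformer contextualises the resulting utterance vectors. Dimensions, pooling, and the absence of the pre-training objectives are simplifying assumptions, not the paper's exact architecture.

```python
# Hierarchical dialog encoder sketch: a word-level transformer encodes each utterance,
# then a dialog-level transformer contextualises the resulting utterance vectors.
# Dimensions and pooling choices are illustrative, not the paper's exact setup.
import torch
import torch.nn as nn

class HierarchicalDialogEncoder(nn.Module):
    def __init__(self, vocab_size=30522, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.utterance_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers)  # word level
        self.dialog_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers)  # utterance level

    def forward(self, token_ids):
        # token_ids: (batch, n_utterances, n_tokens)
        b, u, t = token_ids.shape
        words = self.embed(token_ids.view(b * u, t))                 # (b*u, t, d_model)
        utt_vecs = self.utterance_encoder(words).mean(dim=1)         # mean-pool each utterance
        dialog = self.dialog_encoder(utt_vecs.view(b, u, -1))        # contextualise across turns
        return dialog                                                # (b, n_utterances, d_model)

enc = HierarchicalDialogEncoder()
out = enc(torch.randint(0, 30522, (2, 6, 20)))   # 2 dialogs, 6 utterances, 20 tokens each
print(out.shape)                                 # torch.Size([2, 6, 256])
```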