Adaptations of AI models for querying the LandMatrix database in natural language
Kbir, Fatiha Ait, Bourgoin, Jérémy, Decoupes, Rémy, Gradeler, Marie, Interdonato, Roberto
The Land Matrix initiative (https://landmatrix.org) and its global observatory aim to provide reliable data on large-scale land acquisitions to inform debates and actions in sectors such as agriculture, extraction, or energy in low- and middle-income countries. Although these data are recognized in academia, they remain underutilized in public policy, mainly because accessing and exploiting them requires technical expertise and a good understanding of the database schema. The objective of this work is to simplify access to data from different database systems; the methods proposed in this article are evaluated on data from the Land Matrix. This work presents comparisons of Large Language Models (LLMs) as well as combinations of LLM adaptations (Prompt Engineering, RAG, Agents) for querying different database systems through GraphQL and REST interfaces. The experiments are reproducible, and a demonstration is available online: https://github.com/tetis-nlp/landmatrix-graphql-python.
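The prompt-engineering adaptation can be illustrated with a minimal sketch: the LLM is shown a schema excerpt and asked to translate a natural-language question into a GraphQL query. The schema fields and the prompt wording below are illustrative assumptions, not the actual Land Matrix schema or the authors' prompts.

```python
# Toy text-to-GraphQL prompt builder. The Deal type below is a made-up
# excerpt, not the real Land Matrix schema.
SCHEMA_EXCERPT = """
type Deal {
  id: ID!
  country: String
  intention_of_investment: String
  deal_size: Int
}
"""

def build_text_to_graphql_prompt(question: str, schema: str = SCHEMA_EXCERPT) -> str:
    """Assemble the prompt sent to the LLM (prompt-engineering adaptation)."""
    return (
        "You translate questions into GraphQL queries.\n"
        f"Schema:\n{schema}\n"
        f"Question: {question}\n"
        "Answer with a single GraphQL query, nothing else."
    )

prompt = build_text_to_graphql_prompt("List deals larger than 1000 ha in Senegal")
print(prompt)
```

The same prompt, sent to any instruction-tuned LLM, yields a candidate query that can then be executed against the API; RAG and agent variants differ mainly in how the schema excerpt is retrieved and how execution errors are fed back.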
Semi Supervised Heterogeneous Domain Adaptation via Disentanglement and Pseudo-Labelling
Dantas, Cassio F., Gaetano, Raffaele, Ienco, Dino
Semi-supervised domain adaptation methods leverage information from a labelled source domain with the goal of generalizing over a scarcely labelled target domain. While this setting already poses challenges due to potential distribution shifts between domains, an even more complex scenario arises when source and target data differ in modality representation (e.g. they are acquired by sensors with different characteristics). For instance, in remote sensing, images may be collected via various acquisition modes (e.g. optical or radar), different spectral characteristics (e.g. RGB or multi-spectral) and spatial resolutions. Such a setting is denoted as Semi-Supervised Heterogeneous Domain Adaptation (SSHDA) and it exhibits an even more severe distribution shift due to modality heterogeneity across domains. To cope with the challenging SSHDA setting, here we introduce SHeDD (Semi-supervised Heterogeneous Domain adaptation via Disentanglement), an end-to-end neural framework tailored to learning a target domain classifier by leveraging both labelled and unlabelled data from heterogeneous data sources. SHeDD is designed to effectively disentangle domain-invariant representations, relevant for the downstream task, from domain-specific information that can hinder cross-modality transfer. Additionally, SHeDD adopts an augmentation-based consistency regularization mechanism that takes advantage of reliable pseudo-labels on the unlabelled target samples to further boost its generalization ability on the target domain. Empirical evaluations on two remote sensing benchmarks, encompassing heterogeneous data in terms of acquisition modes and spectral/spatial resolutions, demonstrate the quality of SHeDD compared to both baseline and state-of-the-art competing approaches. Our code is publicly available here: https://github.com/tanodino/SSHDA/
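The pseudo-labelling step can be illustrated independently of the network: an unlabelled target sample contributes to training only when the classifier's confidence exceeds a threshold, in which case its predicted class becomes the training target for the consistency loss. A minimal, framework-free sketch (the threshold value and function name are ours, not the paper's):

```python
def select_pseudo_labels(probs, threshold=0.95):
    """Given per-sample class-probability lists for unlabelled target
    samples, return (index, pseudo_label) pairs for samples whose top
    confidence exceeds the threshold; the rest are discarded for this
    training step."""
    selected = []
    for i, p in enumerate(probs):
        conf = max(p)
        if conf > threshold:
            selected.append((i, p.index(conf)))
    return selected

probs = [
    [0.97, 0.02, 0.01],  # confident -> pseudo-label 0
    [0.40, 0.35, 0.25],  # ambiguous -> discarded
    [0.01, 0.01, 0.98],  # confident -> pseudo-label 2
]
print(select_pseudo_labels(probs))  # [(0, 0), (2, 2)]
```

In the full method, the probabilities come from a weakly augmented view of the sample and the pseudo-label supervises a strongly augmented view, which is what makes the regularization a consistency constraint.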
Cooperative learning of Pl@ntNet's Artificial Intelligence algorithm: how does it work and how can we improve it?
Lefort, Tanguy, Affouard, Antoine, Charlier, Benjamin, Lombardo, Jean-Christophe, Chouet, Mathias, Goëau, Hervé, Salmon, Joseph, Bonnet, Pierre, Joly, Alexis
Deep learning models for plant species identification rely on large annotated datasets. The PlantNet system enables global data collection by allowing users to upload and annotate plant observations, leading to noisy labels due to diverse user skills. Achieving consensus is crucial for training, but the vast scale of collected data makes traditional label aggregation strategies challenging. Existing methods either retain all observations, resulting in noisy training data, or selectively keep those with sufficient votes, discarding valuable information. Additionally, as many species are rarely observed, user expertise cannot be estimated via inter-user agreement: otherwise, botanical experts would carry less weight in the AI training step than the average user. Our proposed label aggregation strategy aims to cooperatively train plant identification AI models. This strategy estimates user expertise as a trust score per user based on their ability to identify plant species from crowdsourced data. The trust score is recursively estimated from correctly identified species given the current estimated labels. This interpretable score exploits botanical experts' knowledge and the heterogeneity of users. Subsequently, our strategy removes unreliable observations but retains those with limited trusted annotations, unlike other approaches. We evaluate PlantNet's strategy on a released large subset of the PlantNet database focused on European flora, comprising over 6M observations and 800K users. We demonstrate that estimating users' skills based on the diversity of their expertise enhances labeling performance. Our findings emphasize the synergy of human annotation and data filtering in improving AI performance for a refined dataset. We explore incorporating AI-based votes alongside human input, which can further enhance human-AI interactions to detect unreliable observations.
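The recursive trust estimation can be illustrated as a fixed-point iteration: labels are trust-weighted majority votes, and each user's trust is their accuracy against the current labels. This is a deliberately simplified sketch (the actual Pl@ntNet aggregation is richer, e.g. in how rare species and sparse voters are handled):

```python
def aggregate_with_trust(votes, n_iter=10):
    """votes: dict observation_id -> list of (user_id, species) pairs.
    Alternates between (1) estimating labels by trust-weighted majority
    vote and (2) re-estimating each user's trust as the fraction of
    their votes that agree with the current labels."""
    users = {u for vs in votes.values() for u, _ in vs}
    trust = {u: 1.0 for u in users}
    labels = {}
    for _ in range(n_iter):
        # Label step: trust-weighted majority vote per observation.
        for obs, vs in votes.items():
            scores = {}
            for u, sp in vs:
                scores[sp] = scores.get(sp, 0.0) + trust[u]
            labels[obs] = max(scores, key=scores.get)
        # Trust step: per-user accuracy against current labels.
        for u in users:
            mine = [(obs, sp) for obs, vs in votes.items()
                    for uu, sp in vs if uu == u]
            correct = sum(1 for obs, sp in mine if labels[obs] == sp)
            trust[u] = correct / len(mine) if mine else 1.0
    return labels, trust

votes = {
    "obs1": [("expert", "Quercus ilex"), ("novice", "Quercus robur"),
             ("amateur", "Quercus ilex")],
    "obs2": [("expert", "Cistus albidus"), ("novice", "Cistus albidus")],
}
labels, trust = aggregate_with_trust(votes)
```

Here the novice's disagreement on obs1 lowers their trust score, so their future votes weigh less, which is the cooperative mechanism the abstract describes in miniature.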
A lexicon obtained and validated by a data-driven approach for organic residues valorization in emerging and developing countries
Rakotomalala, Christiane, Paillat, Jean-Marie, Feder, Frédéric, Avadí, Angel, Thuriès, Laurent, Vermeire, Marie-Liesse, Médoc, Jean-Michel, Wassenaar, Tom, Hottelart, Caroline, Kieffer, Lilou, Ndjie, Elisa, Picart, Mathieu, Tchamgoue, Jorel, Tulle, Alvin, Valade, Laurine, Boyer, Annie, Duchamp, Marie-Christine, Roche, Mathieu
The text mining method presented in this paper was used for the annotation of terms related to the biological transformation and valorization of organic residues in agriculture in low- and middle-income countries. A specialized lexicon was obtained through several steps: corpus building and term extraction, annotation of the extracted terms, and selection of relevant terms.
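The term-extraction step can be illustrated with a toy frequency-based candidate extractor (a simplified stand-in for the actual text-mining pipeline, whose tooling the abstract does not detail):

```python
import re
from collections import Counter

def candidate_terms(corpus, max_len=3, min_freq=2):
    """Toy candidate-term extraction: count word n-grams (n <= max_len)
    across documents and keep those occurring at least min_freq times,
    producing candidates for manual annotation and selection."""
    counts = Counter()
    for doc in corpus:
        words = re.findall(r"[a-zà-ÿ']+", doc.lower())
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return {t for t, c in counts.items() if c >= min_freq}

corpus = [
    "Composting of organic residues improves soil fertility.",
    "Organic residues valorization through composting.",
]
terms = candidate_terms(corpus)
```

On this two-document corpus, recurring candidates such as "composting" and "organic residues" survive the frequency filter, while one-off phrases are dropped before the annotation stage.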
A two-head loss function for deep Average-K classification
Garcin, Camille, Servajean, Maximilien, Joly, Alexis, Salmon, Joseph
Average-K classification is an alternative to top-K classification in which the number of labels returned varies with the ambiguity of the input image but must average to K over all the samples. A simple method to solve this task is to threshold the softmax output of a model trained with the cross-entropy loss. This approach is theoretically proven to be asymptotically consistent, but it is not guaranteed to be optimal for a finite set of samples. In this paper, we propose a new loss function based on a multi-label classification head in addition to the classical softmax. This second head is trained using pseudo-labels generated by thresholding the softmax head while guaranteeing that K classes are returned on average. We show that this approach allows the model to better capture ambiguities between classes and, as a result, to return more consistent sets of possible classes. Experiments on two datasets from the literature demonstrate that our approach outperforms the softmax baseline, as well as several other loss functions more generally designed for weakly supervised multi-label classification. The gains are larger the higher the uncertainty, especially for classes with few samples.
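The softmax-thresholding baseline the paper starts from fits in a few lines: pool all softmax scores over a calibration set and pick the threshold such that the average number of classes above it equals K. A minimal sketch with toy scores (function names are ours):

```python
def average_k_threshold(score_matrix, k):
    """Return the threshold t such that predicting every class whose
    score is >= t yields, on average over the samples, k classes per
    sample. With n samples this means keeping the n*k largest scores."""
    n = len(score_matrix)
    flat = sorted((s for row in score_matrix for s in row), reverse=True)
    return flat[n * k - 1]  # smallest kept score, used as inclusive threshold

def predict_sets(score_matrix, t):
    """Return, per sample, the list of class indices scoring >= t."""
    return [[j for j, s in enumerate(row) if s >= t] for row in score_matrix]

scores = [
    [0.70, 0.20, 0.10],  # confident sample -> small predicted set
    [0.40, 0.35, 0.25],  # ambiguous sample -> larger predicted set
]
t = average_k_threshold(scores, k=2)
sets = predict_sets(scores, t)
```

Note how the set size adapts per sample (1 class for the confident input, 3 for the ambiguous one) while averaging to K=2; the paper's two-head loss aims to make these sets more consistent than this plain thresholding.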
Structuring ontologies in a context of collaborative system modelling
Chaib, Romy Lynn, Thomopoulos, Rallou, Macombe, Catherine
Prospective studies require discussing and collaborating with stakeholders to create scenarios of the possible evolution of the studied value chain. However, stakeholders do not always use the same words when referring to one idea. Constructing an ontology and homogenizing vocabularies is thus crucial to identify the key variables that serve in the construction of the needed scenarios. Nevertheless, it is a very complex and time-consuming task. In this paper we present the method we used to manually build ontologies adapted to the needs of two complementary system-analysis models (namely the "Godet" and the "MyChoice" models), starting from interviews with the agri-food system's stakeholders.
Towards a Data-Driven Requirements Engineering Approach: Automatic Analysis of User Reviews
Wei, Jialiang, Courbis, Anne-Lise, Lambolais, Thomas, Xu, Binbin, Bernard, Pierre Louis, Dray, Gérard
We are concerned with Data-Driven Requirements Engineering, and in particular with the consideration of users' reviews. These online reviews are a rich source of information for extracting new needs and improvement requests. In this work, we provide an automated analysis using CamemBERT, a state-of-the-art language model for French. We created a multi-label classification dataset of 6,000 user reviews from three applications in the Health & Fitness field. The results are encouraging and suggest that it is possible to automatically identify reviews requesting new features. The dataset is available at: https://github.com/Jl-wei/APIA2022-French-user-reviews-classification-dataset.
Bounds of MIN_NCC and MAX_NCC and filtering scheme for graph domain variables
Justeau-Allaire, Dimitri, Birnbaum, Philippe, Lorca, Xavier
Graph domain variables and constraints are an extension of constraint programming introduced by Dooms et al. This approach was further investigated by Fages in his PhD thesis. On the other hand, Beldiceanu et al. presented a generic filtering scheme for global constraints based on graph properties. This scheme strongly relies on the computation of bounds on graph properties and can be used in the context of graph domain variables and constraints with a few adjustments. Bounds of MIN_NCC and MAX_NCC had been defined for the graph-based representation of global constraints for the path_with_loops graph class. In this note, we generalize those bounds to graph domain variables and to any graph class. We also provide a filtering scheme for any graph class and arbitrary bounds.
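A graph domain variable is framed by a kernel (mandatory vertices and edges) and an envelope (possible ones), and NCC counts connected components of an instantiation. The bound computations the note builds on repeatedly need the component count of such graphs; a compact union-find sketch of that basic ingredient (the toy kernel/envelope pair is illustrative, not from the note):

```python
def ncc(vertices, edges):
    """Number of connected components of an undirected graph,
    computed with union-find (path halving)."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(v) for v in vertices})

# Kernel (mandatory) and envelope (possible) graphs of a toy graph variable:
kernel_v, kernel_e = {1, 2, 4}, [(1, 2)]
env_v, env_e = {1, 2, 3, 4}, [(1, 2), (2, 3), (3, 4)]
print(ncc(kernel_v, kernel_e), ncc(env_v, env_e))  # 2 1
```

Any instantiation of the variable lies between these two graphs, which is why NCC bounds can be derived from structural reasoning over the kernel and envelope, and why tightening them enables the filtering scheme.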
The GeoLifeCLEF 2020 Dataset
Cole, Elijah, Deneu, Benjamin, Lorieul, Titouan, Servajean, Maximilien, Botella, Christophe, Morris, Dan, Jojic, Nebojsa, Bonnet, Pierre, Joly, Alexis
Understanding the geographic distribution of species is a key concern in conservation. By pairing species occurrences with environmental features, researchers can model the relationship between an environment and the species which may be found there. To facilitate research in this area, we present the GeoLifeCLEF 2020 dataset, which consists of 1.9 million species observations paired with high-resolution remote sensing imagery, land cover data, and altitude, in addition to traditional low-resolution climate and soil variables. We also discuss the GeoLifeCLEF 2020 competition, which aims to use this dataset to advance the state-of-the-art in location-based species recommendation.
A Language-Agnostic Model for Semantic Source Code Labeling
Gelman, Ben, Hoyle, Bryan, Moore, Jessica, Saxe, Joshua, Slater, David
Code search and comprehension have become more difficult in recent years due to the rapid expansion of available source code. Current tools lack a way to label arbitrary code at scale while maintaining up-to-date representations of new programming languages, libraries, and functionalities. Comprehensive labeling of source code enables users to search for documents of interest and obtain a high-level understanding of their contents. We use Stack Overflow code snippets and their tags to train a language-agnostic, deep convolutional neural network to automatically predict semantic labels for source code documents. On Stack Overflow code snippets, we demonstrate a mean area under ROC of 0.957 over a long-tailed list of 4,508 tags. We also manually validate the model outputs on a diverse set of unlabeled source code documents retrieved from GitHub, and we obtain a top-1 accuracy of 86.6%. This strongly indicates that the model successfully transfers its knowledge from Stack Overflow snippets to arbitrary source code documents.
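The reported metric (mean area under ROC across tags) treats each tag as an independent binary problem; for one tag, the AUC equals the probability that a randomly chosen positive snippet outscores a randomly chosen negative one. A dependency-free sketch of that computation (toy labels and scores, not the paper's model outputs):

```python
def roc_auc(labels, scores):
    """AUC for one binary tag: probability that a positive example
    outscores a negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_auc(label_matrix, score_matrix):
    """Macro-average the per-tag AUC over tags (columns)."""
    n_tags = len(label_matrix[0])
    aucs = [roc_auc([row[t] for row in label_matrix],
                    [row[t] for row in score_matrix])
            for t in range(n_tags)]
    return sum(aucs) / n_tags

labels = [[1, 0], [0, 1], [1, 1], [0, 0]]   # 4 snippets x 2 tags
scores = [[0.9, 0.2], [0.3, 0.8], [0.8, 0.6], [0.1, 0.4]]
print(mean_auc(labels, scores))
```

Macro-averaging over 4,508 tags, as here over two, gives every tag equal weight regardless of frequency, which is what makes 0.957 meaningful on a long-tailed tag distribution.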