AITopics | Boudiaf, Malik

Boudiaf, Malik

SaulLM-7B: A pioneering Large Language Model for Law

Colombo, Pierre, Pires, Telmo Pessoa, Boudiaf, Malik, Culver, Dominic, Melo, Rui, Corro, Caio, Martins, Andre F. T., Esposito, Fabrizio, Raposo, Vera Lúcia, Morgado, Sofia, Desa, Michael

arXiv.org Artificial Intelligence, Mar-7-2024

In this paper, we introduce SaulLM-7B, a large language model (LLM) tailored for the legal domain. With 7 billion parameters, SaulLM-7B is the first LLM designed explicitly for legal text comprehension and generation. Leveraging the Mistral 7B architecture as its foundation, SaulLM-7B is trained on an English legal corpus of over 30 billion tokens. SaulLM-7B exhibits state-of-the-art proficiency in understanding and processing legal documents. Additionally, we present a novel instructional fine-tuning method that leverages legal datasets to further enhance SaulLM-7B's performance in legal tasks. SaulLM-7B is released under the MIT License.

large language model, machine learning, preprint arxiv

2403.03883

Country:

North America > United States (0.46)
Europe > Portugal > Lisbon > Lisbon (0.14)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report (0.82)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Bag of Tricks for Fully Test-Time Adaptation

Mounsaveng, Saypraseuth, Chiaroni, Florent, Boudiaf, Malik, Pedersoli, Marco, Ayed, Ismail Ben

arXiv.org Artificial Intelligence, Nov-9-2023

Fully Test-Time Adaptation (TTA), which aims at adapting models to data drifts, has recently attracted wide interest. Numerous tricks and techniques have been proposed to ensure robust learning on arbitrary streams of unlabeled data. However, assessing the true impact of each individual technique and obtaining a fair comparison still constitutes a significant challenge. To help consolidate the community's knowledge, we present a categorization of selected orthogonal TTA techniques, including small batch normalization, stream rebalancing, reliable sample selection, and network confidence calibration. We meticulously dissect the effect of each approach on different scenarios of interest. Through our analysis, we shed light on trade-offs induced by those techniques between accuracy, the computational power required, and model complexity. We also uncover the synergy that arises when combining techniques and are able to establish new state-of-the-art results.

artificial intelligence, batch size, machine learning

2310.02416

Country: North America > Canada > Quebec (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.34)

Transductive Learning for Textual Few-Shot Classification in API-based Embedding Models

Colombo, Pierre, Pellegrain, Victor, Boudiaf, Malik, Storchan, Victor, Tami, Myriam, Ayed, Ismail Ben, Hudelot, Celine, Piantanida, Pablo

arXiv.org Artificial Intelligence, Oct-21-2023

Proprietary and closed APIs are becoming increasingly common for processing natural language, and are impacting the practical applications of natural language processing, including few-shot classification. Few-shot classification involves training a model to perform a new classification task with a handful of labeled examples. This paper presents three contributions. First, we introduce a scenario where the embedding of a pre-trained model is served through a gated API with compute-cost and data-privacy constraints. Second, we propose transductive inference, a learning paradigm that has been overlooked by the NLP community. Transductive inference, unlike traditional inductive learning, leverages the statistics of unlabeled data. We also introduce a new parameter-free transductive regularizer based on the Fisher-Rao loss, which can be used on top of the gated API embeddings. This method fully utilizes unlabeled data, does not share any label with the third-party API provider, and could serve as a baseline for future research. Third, we propose an improved experimental setting and compile a benchmark of eight datasets involving multiclass classification in four different languages, with up to 151 classes. We evaluate our methods using eight backbone models, along with an episodic evaluation over 1,000 episodes, which demonstrates the superiority of transductive inference over the standard inductive setting.
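
As a rough illustration of the geometry behind a Fisher-Rao-based regularizer, the sketch below computes the Fisher-Rao distance between two categorical distributions, which has the closed form 2·arccos(Σ √(p_i q_i)). It is not the regularizer from the paper, whose exact form the abstract does not give; the function name `fisher_rao_distance` is an illustrative assumption.

```python
import math


def fisher_rao_distance(p, q):
    """Fisher-Rao geodesic distance between two categorical distributions.

    Hypothetical helper for illustration; p and q are sequences of class
    probabilities that each sum to 1.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of classes")
    # Bhattacharyya coefficient: sum of sqrt(p_i * q_i), which lies in [0, 1].
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    # Clamp to guard against rounding slightly outside [0, 1].
    bc = min(1.0, max(0.0, bc))
    return 2.0 * math.acos(bc)
```

The distance is zero for identical distributions and reaches π when the two distributions put their mass on disjoint classes.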

large language model, machine learning, preprint arxiv

2310.13998

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

In Search for a Generalizable Method for Source Free Domain Adaptation

Boudiaf, Malik, Denton, Tom, van Merriënboer, Bart, Dumoulin, Vincent, Triantafillou, Eleni

arXiv.org Artificial Intelligence, Jun-24-2023

Source-free domain adaptation (SFDA) is compelling because it allows adapting an off-the-shelf model to a new domain using only unlabelled data. In this work, we apply existing SFDA techniques to a challenging set of naturally-occurring distribution shifts in bioacoustics, which are very different from the ones commonly studied in computer vision. We find existing methods perform differently relative to each other than observed in vision benchmarks, and sometimes perform worse than no adaptation at all. We propose a new simple method which outperforms the existing methods on our new shifts while exhibiting strong performance on a range of vision datasets. Our findings suggest that existing SFDA methods are not as generalizable as previously thought and that considering diverse modalities can be a useful avenue for designing more robust models.

artificial intelligence, deep learning, machine learning

2302.06658

Country: North America > United States > Hawaii (0.14)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Open-Set Likelihood Maximization for Few-Shot Learning

Boudiaf, Malik, Bennequin, Etienne, Tami, Myriam, Toubhans, Antoine, Piantanida, Pablo, Hudelot, Céline, Ayed, Ismail Ben

arXiv.org Artificial Intelligence, May-19-2023

We tackle the Few-Shot Open-Set Recognition (FSOSR) problem, i.e. classifying instances among a set of classes for which we only have a few labeled samples, while simultaneously detecting instances that do not belong to any known class. We explore the popular transductive setting, which leverages the unlabelled query instances at inference. Motivated by the observation that existing transductive methods perform poorly in open-set scenarios, we propose a generalization of the maximum likelihood principle, in which latent scores down-weighing the influence of potential outliers are introduced alongside the usual parametric model. Our formulation embeds supervision constraints from the support set and additional penalties discouraging overconfident predictions on the query set. We proceed with a block-coordinate descent, with the latent scores and parametric model co-optimized alternately, thereby benefiting from each other. We call our resulting formulation Open-Set Likelihood Optimization (OSLO). OSLO is interpretable and fully modular; it can be applied on top of any pre-trained model seamlessly. Through extensive experiments, we show that our method surpasses existing inductive and transductive methods on both aspects of open-set recognition, namely inlier classification and outlier detection.

data mining, machine learning, oslo

2301.0839

Country: Europe > Norway > Eastern Norway > Oslo (0.51)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Simplex Clustering via sBeta with Applications to Online Adjustment of Black-Box Predictions

Chiaroni, Florent, Boudiaf, Malik, Mitiche, Amar, Ayed, Ismail Ben

arXiv.org Artificial Intelligence, Oct-8-2022

We explore clustering the softmax predictions of deep neural networks and introduce a novel probabilistic clustering method, referred to as k-sBetas. In the general context of clustering discrete distributions, existing methods have focused on exploring distortion measures tailored to simplex data, such as the KL divergence, as alternatives to the standard Euclidean distance. We provide a general maximum a posteriori (MAP) perspective of clustering distributions, which emphasizes that the statistical models underlying the existing distortion-based methods may not be descriptive enough. Instead, we optimize a mixed-variable objective measuring the conformity of data within each cluster to the introduced sBeta density function, whose parameters are constrained and estimated jointly with binary assignment variables. Our versatile formulation approximates a variety of parametric densities for modeling simplex data, and enables control of the cluster-balance bias. This yields highly competitive performance for unsupervised adjustments of black-box model predictions in a variety of scenarios. Our code and comparisons with the existing simplex-clustering approaches, along with our introduced softmax-prediction benchmarks, are publicly available: https://github.com/fchiaroni/Clustering_Softmax_Predictions.

artificial intelligence, machine learning, prediction

2208.00287

Country:

North America > United States (0.46)
North America > Canada (0.28)
Europe > France (0.28)

Genre: Research Report (1.00)

Industry: Transportation > Air (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

KNIFE: Kernelized-Neural Differential Entropy Estimation

Pichler, Georg, Colombo, Pierre, Boudiaf, Malik, Koliander, Gunther, Piantanida, Pablo

arXiv.org Machine Learning, Feb-14-2022

Mutual Information (MI) has been widely used as a loss regularizer for training neural networks. This has been particularly effective when learning disentangled or compressed representations of high-dimensional data. However, differential entropy (DE), another fundamental measure of information, has not found widespread use in neural network training. Although DE offers a potentially wider range of applications than MI, off-the-shelf DE estimators are either non-differentiable, computationally intractable, or fail to adapt to changes in the underlying distribution. These drawbacks prevent them from being used as regularizers in neural network training. To address shortcomings in previously proposed estimators for DE, here we introduce KNIFE, a fully parameterized, differentiable kernel-based estimator of DE. The flexibility of our approach also allows us to construct KNIFE-based estimators for conditional (on either discrete or continuous variables) DE, as well as MI. We empirically validate our method on high-dimensional synthetic data and further apply it to guide the training of neural networks for real-world tasks. Our experiments on a large variety of tasks, including visual domain adaptation, textual fair classification, and textual fine-tuning, demonstrate the effectiveness of KNIFE-based estimation. Code can be found at https://github.com/g-pichler/knife.
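
To make the idea of a kernel-based differential entropy estimate concrete, here is a minimal sketch of a classical plug-in estimator in one dimension: fit a Gaussian kernel density on one set of samples, then average the negative log-density over a second set. This is not KNIFE itself, which learns its kernel parameters by gradient descent; the fixed bandwidth (Silverman's rule of thumb), the sample sizes, and all function names are assumptions made for illustration.

```python
import math
import random


def kde_log_density(x, centers, bandwidth):
    """Log of a 1-D Gaussian kernel density estimate at point x."""
    norm = len(centers) * bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers)
    return math.log(total / norm)


def plug_in_entropy(eval_points, centers, bandwidth):
    """Estimate differential entropy (in nats) as the mean of -log p_hat(x)."""
    log_densities = [kde_log_density(x, centers, bandwidth) for x in eval_points]
    return -sum(log_densities) / len(eval_points)


random.seed(0)
centers = [random.gauss(0.0, 1.0) for _ in range(500)]      # kernel locations
eval_points = [random.gauss(0.0, 1.0) for _ in range(500)]  # held-out samples

# Silverman's rule of thumb, assuming unit standard deviation (illustrative choice).
bandwidth = 1.06 * len(centers) ** -0.2

estimate = plug_in_entropy(eval_points, centers, bandwidth)
true_entropy = 0.5 * math.log(2.0 * math.pi * math.e)  # entropy of N(0, 1) in nats
```

Comparing `estimate` with `true_entropy` gives a quick sanity check on a distribution whose entropy is known in closed form. The fixed bandwidth is the part a learned estimator would replace with trainable parameters.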

kernelized-neural differential entropy estimation, machine learning, neural network

2202.06618

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Transductive Information Maximization For Few-Shot Learning

Boudiaf, Malik, Masud, Ziko Imtiaz, Rony, Jérôme, Dolz, José, Piantanida, Pablo, Ayed, Ismail Ben

arXiv.org Machine Learning, Oct-23-2020

We introduce Transductive Information Maximization (TIM) for few-shot learning. Our method maximizes the mutual information between the query features and their label predictions for a given few-shot task, in conjunction with a supervision loss based on the support set. Furthermore, we propose a new alternating-direction solver for our mutual-information loss, which substantially speeds up transductive-inference convergence over gradient-based optimization, while yielding similar accuracy. TIM inference is modular: it can be used on top of any base-training feature extractor. Following standard transductive few-shot settings, our comprehensive experiments demonstrate that TIM outperforms state-of-the-art methods significantly across various datasets and networks, while being used on top of a fixed feature extractor trained with simple cross-entropy on the base classes, without resorting to complex meta-learning schemes. It consistently brings between 2% and 5% improvement in accuracy over the best-performing method, not only on all the well-established few-shot benchmarks but also on more challenging scenarios, with domain shifts and larger numbers of classes.
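
The quantity being maximized can be sketched as follows: an empirical mutual information between query samples and their predicted labels, computed as the entropy of the average prediction minus the average entropy of the individual predictions. This is an unweighted illustration only; it omits the support-set supervision loss and the alternating-direction solver described in the abstract, and the function names are assumptions.

```python
import math


def entropy(p):
    """Shannon entropy in nats; zero-probability terms contribute 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def prediction_mutual_information(query_probs):
    """Empirical mutual information between query samples and predicted labels.

    query_probs: list of per-sample class-probability lists (e.g. softmax outputs).
    Returns H(average prediction) - average of H(per-sample prediction).
    """
    n = len(query_probs)
    num_classes = len(query_probs[0])
    # Marginal label distribution over the query set.
    marginal = [sum(p[k] for p in query_probs) / n for k in range(num_classes)]
    # Average conditional entropy of the individual predictions.
    conditional = sum(entropy(p) for p in query_probs) / n
    return entropy(marginal) - conditional
```

The value is largest when each prediction is confident (low conditional entropy) and the predictions are spread evenly across classes (high marginal entropy). It drops to zero when every query sample receives the same prediction.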

artificial intelligence, deep learning, neural network

2008.11297

Country: North America > Canada (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
