AITopics | Aminbeidokhti, Masih

Collaborating Authors

Aminbeidokhti, Masih

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Source-Free Domain Adaptation for YOLO Object Detection

Varailhon, Simon, Aminbeidokhti, Masih, Pedersoli, Marco, Granger, Eric

arXiv.org Artificial IntelligenceSep-24-2024

Source-free domain adaptation (SFDA) is a challenging problem in object detection, where a pre-trained source model is adapted to a new target domain without using any source domain data for privacy and efficiency reasons. Most state-of-the-art SFDA methods for object detection have been proposed for Faster-RCNN, a detector that is known to have high computational complexity. This paper focuses on domain adaptation techniques for real-world vision systems, particularly for the YOLO family of single-shot detectors known for their fast baselines and practical applications. Our proposed SFDA method - Source-Free YOLO (SF-YOLO) - relies on a teacher-student framework in which the student receives images with a learned, target domain-specific augmentation, allowing the model to be trained with only unlabeled target data and without requiring feature alignment. A challenge with self-training using a mean-teacher architecture in the absence of labels is the rapid decline of accuracy due to noisy or drifting pseudo-labels. To address this issue, a teacher-to-student communication mechanism is introduced to help stabilize the training and reduce the reliance on annotated target data for model selection. Despite its simplicity, our approach is competitive with state-of-the-art detectors on several challenging benchmark datasets, even sometimes outperforming methods that use source data for adaptation.

artificial intelligence, detection, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2409.16538

Country:

North America > Canada (0.14)
North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average

Fournier, Louis, Nabli, Adel, Aminbeidokhti, Masih, Pedersoli, Marco, Belilovsky, Eugene, Oyallon, Edouard

arXiv.org Machine LearningMay-27-2024

The performance of deep neural networks is enhanced by ensemble methods, which average the output of several models. However, this comes at an increased cost at inference. Weight averaging methods aim at balancing the generalization of ensembling and the inference speed of a single model by averaging the parameters of an ensemble of models. Yet, naive averaging results in poor performance as models converge to different loss basins, and aligning the models to improve the performance of the average is challenging. Alternatively, inspired by distributed training, methods like DART and PAPA have been proposed to train several models in parallel such that they will end up in the same basin, resulting in good averaging accuracy. However, these methods either compromise ensembling accuracy or demand significant communication between models during training. In this paper, we introduce WASH, a novel distributed method for training model ensembles for weight averaging that achieves state-of-the-art image classification accuracy. WASH maintains models within the same basin by randomly shuffling a small percentage of weights during training, resulting in diverse models and lower communication costs compared to standard parameter averaging methods.

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2405.17517

Country:

North America > United States (0.28)
North America > Canada > Quebec (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Modality Translation for Object Detection Adaptation Without Forgetting Prior Knowledge

Medeiros, Heitor Rapela, Aminbeidokhti, Masih, Pena, Fidel Guerrero, Latortue, David, Granger, Eric, Pedersoli, Marco

arXiv.org Artificial IntelligenceApr-11-2024

A common practice in deep learning consists of training large neural networks on massive datasets to perform accurately for different domains and tasks. While this methodology may work well in numerous application areas, it only applies across modalities due to a larger distribution shift in data captured using different sensors. This paper focuses on the problem of adapting a large object detection model to one or multiple modalities while being efficient. To do so, we propose ModTr as an alternative to the common approach of fine-tuning large models. ModTr consists of adapting the input with a small transformation network trained to minimize the detection loss directly. The original model can therefore work on the translated inputs without any further change or fine-tuning to its parameters. Experimental results on translating from IR to RGB images on two well-known datasets show that this simple ModTr approach provides detectors that can perform comparably or better than the standard fine-tuning without forgetting the original knowledge. This opens the doors to a more flexible and efficient service-based detection pipeline in which, instead of using a different detector for each modality, a unique and unaltered server is constantly running, where multiple modalities with the corresponding translations can query it.

artificial intelligence, detector, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2404.01492

Country:

North America > Canada (0.14)
North America > United States (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Domain Generalization by Rejecting Extreme Augmentations

Aminbeidokhti, Masih, Peña, Fidel A. Guerrero, Medeiros, Heitor Rapela, Dubail, Thomas, Granger, Eric, Pedersoli, Marco

arXiv.org Artificial IntelligenceOct-10-2023

Data augmentation is one of the most effective techniques for regularizing deep learning models and improving their recognition performance in a variety of tasks and domains. However, this holds for standard in-domain settings, in which the training and test data follow the same distribution. For the out-of-domain case, where the test data follow a different and unknown distribution, the best recipe for data augmentation is unclear. In this paper, we show that for out-of-domain and domain generalization settings, data augmentation can provide a conspicuous and robust improvement in performance. To do that, we propose a simple training procedure: (i) use uniform sampling on standard data augmentation transformations; (ii) increase the strength transformations to account for the higher data variance expected when working out-of-domain, and (iii) devise a new reward function to reject extreme transformations that can harm the training. With this procedure, our data augmentation scheme achieves a level of accuracy that is comparable to or better than state-of-the-art methods on benchmark domain generalization datasets. Code: \url{https://github.com/Masseeh/DCAug}

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2310.0667

Country:

North America > United States (0.14)
North America > Canada (0.14)
Europe > Netherlands (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback

HalluciDet: Hallucinating RGB Modality for Person Detection Through Privileged Information

Medeiros, Heitor Rapela, Pena, Fidel A. Guerrero, Aminbeidokhti, Masih, Dubail, Thomas, Granger, Eric, Pedersoli, Marco

arXiv.org Artificial IntelligenceOct-6-2023

A powerful way to adapt a visual recognition model to a new domain is through image translation. However, common image translation approaches only focus on generating data from the same distribution of the target domain. In visual recognition tasks with complex images, such as pedestrian detection on aerial images with a large cross-modal shift in data distribution from Infrared (IR) to RGB images, a translation focused on generation might lead to poor performance as the loss focuses on irrelevant details for the task. In this paper, we propose HalluciDet, an IR-RGB image translation model for object detection that, instead of focusing on reconstructing the original image on the IR modality, is guided directly on reducing the detection loss of an RGB detector, and therefore avoids the need to access RGB data. This model produces a new image representation that enhances the object of interest in the scene and greatly improves detection performance. We empirically compare our approach against state-of-the-art image translation methods as well as with the commonly used fine-tuning on IR, and show that our method improves detection accuracy in most cases, by exploiting the privileged information encoded in a pre-trained RGB detector.

artificial intelligence, detector, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2310.04662

Country: North America > Canada (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (0.48)
Automobiles & Trucks (0.48)
Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Emotion Recognition with Spatial Attention and Temporal Softmax Pooling

Aminbeidokhti, Masih, Pedersoli, Marco, Cardinal, Patrick, Granger, Eric

arXiv.org Machine LearningOct-3-2019

Video-based emotion recognition is a challenging task because it requires to distinguish the small deformations of the human face that represent emotions, while being invariant to stronger visual differences due to different identities. State-of-the-art methods normally use complex deep learning models such as recurrent neural networks (RNNs, LSTMs, GRUs), convolutional neural networks (CNNs, C3D, residual networks) and their combination. In this paper, we propose a simpler approach that combines a CNN pre-trained on a public dataset of facial images with (1) a spatial attention mechanism, to localize the most important regions of the face for a given emotion, and (2) temporal softmax pooling, to select the most important frames of the given video. Results on the challenging EmotiW dataset show that this approach can achieve higher accuracy than more complex approaches.

deep learning, emotion recognition, neural network, (16 more...)

arXiv.org Machine Learning

doi: 10.1007/978-3-030-27202-9_29

1910.01254

Country:

North America > United States (0.15)
North America > Canada (0.14)

Genre: Research Report (0.85)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback