AITopics | saner

Improving Resistance to Noisy Label Fitting by Reweighting Gradient in SAM

Luong, Hoang-Chau, Nguyen-Quang, Thuc, Tran, Minh-Triet

arXiv.org Artificial Intelligence, Nov-26-2024

These authors contributed equally to this work.

Noisy labels pose a substantial challenge in machine learning, often resulting in overfitting and poor generalization. Sharpness-Aware Minimization (SAM), as demonstrated by Foret et al. (2021), improves generalization over traditional Stochastic Gradient Descent (SGD) in classification tasks with noisy labels by implicitly slowing the learning of noisy examples. While SAM's ability to generalize in noisy environments has been studied in several simplified settings, its full potential in more realistic training settings remains underexplored. In this work, we analyze SAM's behavior at each iteration, identifying specific components of the gradient vector that contribute significantly to its robustness against noisy labels. Based on these insights, we propose SANER (Sharpness-Aware Noise-Explicit Reweighting), an effective variant that enhances SAM's ability to manage the noisy fitting rate. Our experiments on CIFAR-10, CIFAR-100, and Mini-WebVision demonstrate that SANER consistently outperforms SAM, achieving up to an 8% increase on CIFAR-100 with 50% label noise.

Noisy labels caused by human annotation errors have been commonly observed in many large-scale datasets such as CIFAR-10N, CIFAR-100N (Wei et al., 2022), Clothing1M (Xiao et al., 2015), and WebVision (Li et al., 2017). Over-parameterized deep neural networks, which have enough capacity to memorize entire large datasets, can easily overfit such noisy label data, leading to poor generalization performance (Zhang et al., 2021). Moreover, the lottery ticket hypothesis (Frankle & Carbin, 2019) indicates that only a subset of the network's parameters is crucial for generalization. This highlights the importance of noise-robust learning, where the goal is to train a robust classifier despite the presence of inaccurate or noisy labels in the training dataset. Sharpness-Aware Minimization (SAM), introduced by Foret et al. (2021), is an optimizer designed to improve generalization by searching for flat minima. It has shown superior performance over SGD in various tasks, especially in classification tasks involving noisy labels (Baek et al., 2024). Understanding the mechanisms behind the success of SAM is crucial for further improvements in handling label noise.
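
For readers unfamiliar with SAM, its two-step update (an ascent step to a nearby high-loss point, then a descent step using the gradient taken there) can be sketched roughly as follows. This is a minimal illustration on a toy quadratic loss and does not implement the SANER reweighting, whose details are not given in this excerpt. The names `sam_step` and `toy_grad`, and the values of `lr` and `rho`, are assumptions chosen for the example.

```python
import math

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM-style update on a list of parameters (illustrative sketch)."""
    g = grad_fn(w)
    # Small constant avoids division by zero when the gradient vanishes.
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12
    # Ascent step: perturb the weights by rho along the normalized gradient.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient from the perturbed point to the
    # original (unperturbed) weights.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

def toy_loss(w):
    # Hypothetical stand-in for a network loss: f(w) = 0.5 * (w0^2 + 10 * w1^2).
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)

def toy_grad(w):
    # Analytic gradient of toy_loss.
    return [w[0], 10.0 * w[1]]

w = [1.0, 1.0]
initial_loss = toy_loss(w)
for _ in range(100):
    w = sam_step(w, toy_grad)
final_loss = toy_loss(w)
```

In a real training loop the gradient function would be a mini-batch gradient of the network loss, so each SAM step costs two forward-backward passes instead of one.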

accuracy, noisy train accuracy, saner

arXiv: 2411.17132

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


Evaluating Continual Learning on a Home Robot

Powers, Sam, Gupta, Abhinav, Paxton, Chris

arXiv.org Artificial Intelligence, Jun-4-2023

Therefore, we split the action prediction problem into two steps: (1) we predict a Most Relevant Point, or MRP, which tells us which region of the world the policy must attend to; and (2) we reactively predict actions that determine where the robot should move in relation to that MRP: for example, how to approach the handle of an oven and when to close the gripper to grasp it. These two operations are performed sequentially using a modified PointNet++ (Qi et al., 2017) model that we refer to as Attention-based PointNet (A-PointNet), shown in Figure 2. The MRP Predictor can then be agnostic to the position of the robot, instead focusing on the features of the object relevant to the overall task, while the Action Predictor can learn to focus on features relevant just to what the next action should be. For example, in Figure 7, the MRP Predictor learns to focus on the handle; the Action Predictor focuses on the angle of the oven door.

Image Pre-Processing. First, we convert the RGB and depth images into a point cloud. We augment the point cloud of the current timestep with our context c, the point cloud from the beginning of the episode. This helps both to combat occlusion and to disambiguate between similar observations that occur during different trajectories. To reduce compute, we crop the working area to 1 m and down-sample using grid pooling, with a resolution of 1 cm for the current timestep and 2.5 cm for the context. Specifically, we select a random point in each voxel to reduce overfitting.
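
The crop and grid-pooling steps described above might look something like the sketch below. It is an illustration under stated assumptions, not the authors' code: the crop is taken to mean "keep points within 1 m of a chosen center", the current and context clouds are merged by plain concatenation, and the names `crop_to_radius` and `grid_pool` and the sample points are hypothetical.

```python
import math
import random

def crop_to_radius(points, center, radius):
    """Keep points within `radius` metres of `center` (assumed crop rule)."""
    return [p for p in points if math.dist(p, center) <= radius]

def grid_pool(points, voxel_size, rng):
    """Down-sample by keeping one randomly chosen point per occupied voxel."""
    voxels = {}
    for p in points:
        # Integer voxel index along each axis.
        key = tuple(math.floor(c / voxel_size) for c in p)
        voxels.setdefault(key, []).append(p)
    # A random representative (rather than the centroid) is picked per voxel.
    return [rng.choice(bucket) for bucket in voxels.values()]

rng = random.Random(0)
origin = (0.0, 0.0, 0.0)

# Hypothetical (x, y, z) points in metres.
current = [(0.001, 0.001, 0.001), (0.004, 0.002, 0.003),
           (0.5, 0.5, 0.5), (2.0, 0.0, 0.0)]
context = [(0.001, 0.001, 0.001), (0.020, 0.020, 0.020), (0.3, 0.3, 0.3)]

# 1 cm voxels for the current timestep, 2.5 cm voxels for the context.
current_ds = grid_pool(crop_to_radius(current, origin, 1.0), 0.01, rng)
context_ds = grid_pool(crop_to_radius(context, origin, 1.0), 0.025, rng)

# Assumed merge: simple concatenation of the two down-sampled clouds.
merged = current_ds + context_ds
```

Because the representative point is drawn at random, repeated passes over the same scene yield slightly different point clouds, which acts as a light form of data augmentation.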

artificial intelligence, learning, machine learning

arXiv: 2306.02413

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report (0.42)

Industry:

Education (1.00)
Health & Medicine (0.74)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
