sam
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- Europe > Switzerland (0.04)
- Asia > China (0.04)
- Oceania > Australia (0.04)
- Europe > Spain > Andalusia > Granada Province > Granada (0.04)
- Europe > Austria (0.04)
- (2 more...)
A Single-Step, Sharpness-Aware Minimization is All You Need to Achieve Efficient and Accurate Sparse Training
Sparse training stands as a landmark approach to addressing the considerable training resource demands imposed by the continuously expanding size of Deep Neural Networks (DNNs). However, training a sparse DNN faces great challenges in achieving optimal generalization, despite the efforts of state-of-the-art sparse training methodologies. To unravel the reason behind the difficulty of sparse training, we connect network sparsity with the structure of the neural loss function, and identify that the cause of the difficulty lies in a chaotic loss surface. In light of this revelation, we propose $S^2$-SAM, characterized by a **S**ingle-step **S**harpness-**A**ware **M**inimization that is tailored for **S**parse training.
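The sharpness-aware update at the heart of this line of work can be illustrated on a toy loss. A minimal numpy sketch of the standard two-gradient SAM step (not the paper's single-step variant; the loss function and hyperparameters are made up for illustration):

```python
import numpy as np

def loss(w):
    # toy non-convex loss standing in for a sparse DNN's training loss
    return np.sum(w ** 2) + 0.5 * np.sin(3 * w).sum()

def grad(w):
    return 2 * w + 1.5 * np.cos(3 * w)

def sam_step(w, lr=0.1, rho=0.05):
    g = grad(w)
    # ascend to the locally sharpest point within a rho-ball ...
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # ... then descend using the gradient at the perturbed weights
    g_adv = grad(w + eps)
    return w - lr * g_adv

w = np.array([1.0, -0.8])
for _ in range(100):
    w = sam_step(w)
```

Note that each `sam_step` needs two gradient evaluations; avoiding that extra cost is exactly what a single-step variant is after.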
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- North America > United States > Oklahoma > Beaver County (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.68)
- Information Technology > Artificial Intelligence > Natural Language (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Segment Anything without Supervision
The Segment Anything Model (SAM) requires labor-intensive data labeling. We present Unsupervised SAM (UnSAM), a model for promptable and automatic whole-image segmentation that does not require human annotations. UnSAM utilizes a divide-and-conquer strategy to "discover" the hierarchical structure of visual scenes. For all pixels within a segment, a bottom-up clustering method iteratively merges them into larger groups, forming a hierarchical structure. These unsupervised multi-granular masks are then used to supervise model training.
Segment Any Change
Visual foundation models have achieved remarkable results in zero-shot image classification and segmentation, but zero-shot change detection remains an open problem. In this paper, we propose the segment any change models (AnyChange), a new type of change detection model that supports zero-shot prediction and generalization on unseen change types and data distributions. AnyChange is built on the segment anything model (SAM) via our training-free adaptation method, bitemporal latent matching. By revealing and exploiting intra-image and inter-image semantic similarities in SAM's latent space, bitemporal latent matching endows SAM with zero-shot change detection capabilities in a training-free way. We also propose a point query mechanism to enable AnyChange's zero-shot object-centric change detection capability. We perform extensive experiments to confirm the effectiveness of AnyChange for zero-shot change detection. AnyChange sets a new record on the SECOND benchmark for unsupervised change detection, exceeding the previous SOTA by up to 4.4% F1 score, and achieving comparable accuracy with negligible manual annotations (1 pixel per image) for supervised change detection.
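At its core, bitemporal latent matching compares per-pixel embeddings across the two acquisition dates. A minimal sketch with synthetic embeddings (in AnyChange the embeddings would come from SAM's image encoder, and the 0.5 threshold here is an arbitrary stand-in for the paper's matching criterion):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-pixel embeddings for two dates (H x W x D).
H, W, D = 4, 4, 8
emb_t1 = rng.normal(size=(H, W, D))
emb_t2 = emb_t1.copy()
emb_t2[1, 2] = -emb_t1[1, 2]          # simulate one changed pixel

def cosine_map(a, b):
    # per-pixel cosine similarity between the two embedding maps
    num = (a * b).sum(-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-12
    return num / den

sim = cosine_map(emb_t1, emb_t2)
change_mask = sim < 0.5               # low bitemporal similarity => change
```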
Changing the Training Data Distribution to Reduce Simplicity Bias Improves In-distribution Generalization
Can we modify the training data distribution to encourage the underlying optimization method toward finding solutions with superior generalization performance on in-distribution data? In this work, we approach this question for the first time by comparing the inductive bias of gradient descent (GD) with that of sharpness-aware minimization (SAM). By studying a two-layer CNN, we rigorously prove that SAM learns different features more uniformly, particularly in early epochs. That is, SAM is less susceptible to simplicity bias compared to GD. We also show that examples containing features that are learned early are separable from the rest based on the model's output.
Sam's Club is adding AI to the shopping experience. Why are privacy advocacy groups worried?
Sam's Club is going register-free and introducing an all-digital, AI-powered shopping experience for its customers, a move that has privacy advocates worried that the new AI tool could be used to unfairly target some customers with higher-priced items based on their shopping habits. The all-digital approach started with the reconstruction of a Sam's Club in Grapevine, a suburb of Dallas, that was severely damaged in 2022 by a tornado. When the retail location opened two years later, it was the first of its kind to ditch its registers for a "Scan and Go" program that allowed customers to scan each item placed in their physical cart and pay through a mobile app. This program has since been piloted in nine Dallas metro locations and one store in Missouri, Retail Dive reported. Instead of handing a receipt to a Sam's Club employee to review before leaving the store, customers walk through an arch that is equipped with AI-powered cameras to capture images of the items in the cart and electronically match them with the items paid for through the app. Sam's Club did not disclose when the AI technology would be coming to California stores, but it has outlets in Torrance, Fountain Valley, El Monte and Riverside.
- North America > United States > California (0.62)
- North America > United States > Missouri (0.25)
- Retail (1.00)
- Information Technology > Security & Privacy (0.36)
Attention-Guided Integration of CLIP and SAM for Precise Object Masking in Robotic Manipulation
Muttaqien, Muhammad A., Motoda, Tomohiro, Hanai, Ryo, Domae, Yukiyasu
Automation Research Team, National Institute of AIST, Tokyo, Japan (muha.muttaqien@aist.go.jp, tomohiro.motoda@aist.go.jp, ryo.hanai@aist.go.jp, domae.yukiyasu@aist.go.jp)
Abstract: This paper introduces a novel pipeline to enhance the precision of object masking for robotic manipulation within the specific domain of masking products in convenience stores. The approach integrates two advanced AI models, CLIP and SAM, focusing on their synergistic combination and the effective use of multimodal data (image and text). Emphasis is placed on utilizing gradient-based attention mechanisms and customized datasets to fine-tune performance. While CLIP, SAM, and Grad-CAM are established components, their integration within this structured pipeline represents a significant contribution to the field. The resulting segmented masks, generated through this combined approach, can be effectively utilized as inputs for robotic systems, enabling more precise and adaptive object manipulation in the context of convenience store products.
Introduction: In recent years, the ability to recognize and manipulate specific objects within well-defined domains, such as products in convenience stores, has become increasingly important in the field of robotic manipulation [1][2][3]. As robots are expected to perform more complex tasks in diverse environments, the need for precise object identification and interaction grows, particularly in domains where a high level of accuracy is crucial.
For instance, in convenience stores (Figure 1), robots must reliably identify and handle a wide variety of products, each with unique visual characteristics, to automate tasks such as stocking, sorting, and customer assistance.
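One piece of such a pipeline, turning a gradient-based attention map into point prompts for a segmenter, can be sketched as follows (hypothetical glue code: `attention_to_point_prompts` is not from the paper, and the heatmap stands in for a Grad-CAM map computed from CLIP's image-text similarity):

```python
import numpy as np

def attention_to_point_prompts(heatmap, k=3):
    """Pick the k highest-attention pixels as positive point prompts.

    The returned (x, y) points could then be fed to a promptable
    segmenter such as SAM's prompt encoder.
    """
    flat = heatmap.ravel()
    idx = np.argsort(flat)[::-1][:k]            # indices of the k largest values
    ys, xs = np.unravel_index(idx, heatmap.shape)
    return list(zip(xs.tolist(), ys.tolist()))

# Toy 5x5 attention map with two hot spots.
heat = np.zeros((5, 5))
heat[2, 3] = 1.0
heat[1, 1] = 0.8
prompts = attention_to_point_prompts(heat, k=2)
```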
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (1.00)
- Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.04)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- North America > Canada (0.04)
Monge SAM: Robust Reparameterization-Invariant Sharpness-Aware Minimization Based on Loss Geometry
Jacobsen, Albert Kjøller, Arvanitidis, Georgios
Recent studies on deep neural networks show that flat minima of the loss landscape correlate with improved generalization. Sharpness-aware minimization (SAM) efficiently finds flat regions by updating the parameters according to the gradient at an adversarial perturbation. The perturbation depends on the Euclidean metric, making SAM non-invariant under reparametrizations, which blurs the relationship between sharpness and generalization. We propose Monge SAM (M-SAM), a reparametrization-invariant version of SAM obtained by considering a Riemannian metric in the parameter space induced naturally by the loss surface. Compared to previous approaches, M-SAM works under any modeling choice and relies only on mild assumptions, while being as computationally efficient as SAM. We theoretically argue that M-SAM varies between SAM and gradient descent (GD), which increases robustness to hyperparameter selection and reduces attraction to suboptimal equilibria like saddle points. We demonstrate this behavior both theoretically and empirically on a multi-modal representation alignment task.
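The interpolation between SAM and GD can be illustrated with the Monge metric G = I + ∇L ∇Lᵀ: measuring the perturbation in this metric shrinks it where the gradient is large. A simplified numpy sketch of that rescaling effect (an illustration under these assumptions, not the authors' implementation):

```python
import numpy as np

def grad(w):
    # toy quadratic bowl: L(w) = 0.5 * ||w||^2, so grad = w
    return w

def sam_eps(g, rho=0.1):
    # standard SAM: Euclidean-normalized perturbation of radius rho
    return rho * g / (np.linalg.norm(g) + 1e-12)

def msam_eps(g, rho=0.1):
    # Monge metric G = I + g g^T; normalizing the perturbation in this
    # metric divides the radius by sqrt(1 + ||g||^2), so steep regions
    # (large gradient) get a near-zero perturbation, i.e. GD-like steps,
    # while flat regions recover the SAM perturbation.
    return sam_eps(g, rho) / np.sqrt(1.0 + g @ g)

w = np.array([3.0, 4.0])   # steep region: ||grad|| = 5
g = grad(w)
```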
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Denmark > Capital Region > Kongens Lyngby (0.14)