Supplementary Material: Segment Anything in High Quality

Neural Information Processing Systems

In this supplementary material, Section 1 presents additional experimental analysis of HQ-SAM, including further zero-shot transfer comparisons to SAM on both image and video benchmarks. SAM vs. HQ-SAM on various backbones: Table 1 provides a comprehensive comparison across backbones. On the YouTube-VIS 2019 validation set and the HQ-YTVIS test set (Table 2), using ViT-L-based SAM, HQ-SAM achieves consistent gains of 1.4 points in Tube Mask AP. Robustness to input box prompts: in Table 4, we compare HQ-SAM to SAM by adding noise of various scales to the ground-truth (GT) box prompts. The single-point prompt is taken as the "center" point of the GT mask, which is located at the maximal value of the distance transform in the mask's interior. Results not obtained in a zero-shot manner (i.e., where the training data includes the target benchmark) are marked accordingly. In challenging cases, HQ-SAM improves over SAM but still cannot achieve a fully correct mask prediction; overall, HQ-SAM produces significantly more accurate boundaries.
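The "center" point prompt described above is the argmax of the mask's interior distance transform. A minimal numpy-only sketch of that selection (brute-force distance computation, adequate for small toy masks; function name is illustrative, not from the paper's code):

```python
import numpy as np

def mask_center_point(mask):
    """Return (row, col) of the interior point farthest from the mask
    boundary, i.e. the argmax of the Euclidean distance transform.
    Assumes the mask contains at least one background pixel."""
    bg = np.argwhere(~mask)   # background pixel coordinates, shape (B, 2)
    fg = np.argwhere(mask)    # foreground pixel coordinates, shape (F, 2)
    # distance from every foreground pixel to its nearest background pixel
    d = np.sqrt(((fg[:, None, :] - bg[None, :, :]) ** 2).sum(-1)).min(1)
    r, c = fg[int(np.argmax(d))]
    return int(r), int(c)

# toy 7x7 square mask: the deepest interior point is the square's middle
mask = np.zeros((7, 7), dtype=bool)
mask[1:6, 1:6] = True
print(mask_center_point(mask))  # → (3, 3)
```

In practice a library routine such as `scipy.ndimage.distance_transform_edt` would replace the brute-force distance computation.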



Segment Anything in High Quality

Neural Information Processing Systems

The recent Segment Anything Model (SAM) represents a big leap in scaling up segmentation models, allowing for powerful zero-shot capabilities and flexible prompting. Despite being trained with 1.1 billion masks, SAM's mask prediction quality falls short in many cases, particularly when dealing with objects that have intricate structures. We propose HQ-SAM, equipping SAM with the ability to accurately segment any object, while maintaining SAM's original promptable design, efficiency, and zero-shot generalizability. Our careful design reuses and preserves the pre-trained model weights of SAM, while only introducing minimal additional parameters and computation. We design a learnable High-Quality Output Token, which is injected into SAM's mask decoder and is responsible for predicting the high-quality mask.
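The core mechanism described here, a single learnable output token appended to SAM's existing decoder tokens, can be sketched in a few lines. This is a shape-level illustration only: the token counts, feature-map size, and the final dot product against upsampled image features are assumptions based on the description above, not SAM's actual decoder code.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256                                    # token embedding dim (SAM uses 256)
num_sam_tokens = 4                         # IoU token + 3 mask tokens, illustrative
sam_output_tokens = rng.standard_normal((num_sam_tokens, d))

# HQ-SAM adds ONE learnable token; SAM's pre-trained weights stay frozen
hq_token = rng.standard_normal((1, d))     # learned during HQ-SAM training

# the mask decoder attends over the concatenated token set
decoder_tokens = np.concatenate([sam_output_tokens, hq_token], axis=0)
assert decoder_tokens.shape == (num_sam_tokens + 1, d)

# the HQ token's output embedding is combined with image features
# to produce the high-quality mask logits (stand-in values here)
hq_embedding = decoder_tokens[-1]                  # stands in for decoder output
image_feats = rng.standard_normal((d, 64, 64))     # hypothetical feature map
hq_mask_logits = np.einsum('c,chw->hw', hq_embedding, image_feats)
assert hq_mask_logits.shape == (64, 64)
```

The design point is that only the extra token (and whatever small layers consume it) is trained, which is why the parameter and compute overhead stays minimal.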








How Much You Ate? Food Portion Estimation on Spoons

Sharma, Aaryam, Czarnecki, Chris, Chen, Yuhao, Xi, Pengcheng, Xu, Linlin, Wong, Alexander

arXiv.org Artificial Intelligence

Monitoring dietary intake is a crucial aspect of promoting healthy living. In recent years, advances in computer vision technology have facilitated dietary intake monitoring through the use of images and depth cameras. However, current state-of-the-art image-based food portion estimation algorithms assume that users photograph their meals only once or twice, which can be inconvenient and fails to capture food items that are not visible from a top-down perspective, such as ingredients submerged in a stew. To address these limitations, we introduce a solution that uses stationary user-facing cameras to track food items on utensils, requiring no change of camera perspective after installation. The shallow depth of utensils provides a more favorable angle for capturing food items, and tracking them on the utensil's surface yields a significantly more accurate estimate of dietary intake without the need for post-meal image capture. The system reliably estimates the nutritional content of liquid-solid heterogeneous mixtures such as soups and stews. Through a series of experiments, we demonstrate the potential of our method as a non-invasive, user-friendly, and highly accurate dietary intake monitoring tool.
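The per-spoonful tracking described above implies a simple aggregation step: each tracked spoonful contributes an estimated volume, which is converted to mass and summed over the meal. The function below is a hypothetical sketch of that bookkeeping (names, units, and the density-lookup step are assumptions, not from the paper):

```python
import numpy as np

def cumulative_intake(spoonful_volumes_ml, densities_g_per_ml):
    """Hypothetical aggregation: given per-spoonful volume estimates from
    the utensil tracker and per-spoonful density estimates from a food
    classifier, return the total mass consumed in grams."""
    v = np.asarray(spoonful_volumes_ml, dtype=float)
    rho = np.asarray(densities_g_per_ml, dtype=float)
    return float((v * rho).sum())

# three tracked spoonfuls of a stew (volumes in ml, densities in g/ml)
print(cumulative_intake([12.0, 10.5, 11.0], [1.05, 1.05, 1.10]))  # → 35.725
```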


Promoting Segment Anything Model towards Highly Accurate Dichotomous Image Segmentation

Liu, Xianjie, Fu, Keren, Zhao, Qijun

arXiv.org Artificial Intelligence

Abstract--Segmenting any object represents a crucial step towards achieving artificial general intelligence, and the "Segment Anything Model" (SAM) has significantly advanced the development of foundation models in computer vision. We have high expectations regarding whether SAM can enhance highly accurate dichotomous image segmentation. In fact, the evidence presented in this article demonstrates that by feeding SAM simple box prompts and using the masks output by SAM as input for IS5Net, we can greatly improve the effectiveness of highly accurate dichotomous image segmentation. Over the last few months, such models have used prompts (points/boxes/masks) to provide information for the decoder, which embeds the extracted image features, prompt outputs, and cue labels together for the final mask prediction. The impressive understanding capabilities of these large models have left users with high expectations.
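The cascaded setup described above, box prompt → SAM coarse mask → refinement network, reduces at the data level to stacking SAM's mask with the image as an extra input channel. A minimal sketch under that assumption (the channel layout and function name are illustrative; the paper's actual interface to its refinement network may differ):

```python
import numpy as np

def make_refiner_input(image_rgb, sam_mask):
    """Stack SAM's coarse mask (from a simple box prompt) with the RGB
    image as a fourth channel, forming the input to a downstream
    refinement network. Expects image_rgb as C x H x W."""
    mask = sam_mask.astype(image_rgb.dtype)[None, ...]    # 1 x H x W
    return np.concatenate([image_rgb, mask], axis=0)      # 4 x H x W

img = np.zeros((3, 8, 8), dtype=np.float32)    # toy 8x8 RGB image
coarse = np.ones((8, 8), dtype=bool)           # toy SAM mask
assert make_refiner_input(img, coarse).shape == (4, 8, 8)
```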