AITopics | vfm

Revisiting Semi-Supervised Learning in the Era of Foundation Models

Neural Information Processing Systems | Jun-17-2026, 09:45:58 GMT

Semi-supervised learning (SSL) enhances model performance by leveraging abundant unlabeled data alongside limited labeled data. As vision foundation models (VFMs) become central to modern vision applications, this paper revisits SSL in the context of these powerful pre-trained models. We conduct a systematic study on tasks where frozen VFMs underperform and reveal several key insights when fine-tuning them. First, parameter-efficient fine-tuning (PEFT) using only labeled data often surpasses traditional SSL methods, even without access to unlabeled data. Second, pseudo-labels generated by PEFT models offer valuable supervisory signals for unlabeled data, and different PEFT techniques yield complementary pseudo-labels. These findings motivate a simple yet effective SSL baseline for the VFM era: ensemble pseudo-labeling across diverse PEFT methods and VFM backbones.
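
The abstract does not say how the ensemble is formed, so the following is only a minimal sketch under one common assumption: each fine-tuned model outputs class probabilities for every unlabeled sample, the probabilities are averaged across models, and the arg-max of the average becomes the pseudo-label. The function name `ensemble_pseudo_labels` and the `min_confidence` threshold are hypothetical and do not come from the paper.

```python
from typing import List, Optional, Sequence, Tuple


def ensemble_pseudo_labels(
    model_probs: Sequence[Sequence[Sequence[float]]],
    min_confidence: float = 0.0,
) -> List[Tuple[Optional[int], float]]:
    """Average class probabilities over models, then take the arg-max.

    model_probs[m][i][c] is the probability that model m assigns to class c
    for unlabeled sample i. One (label, confidence) pair is returned per
    sample; the label is None when the averaged confidence falls below
    min_confidence (an assumed filtering rule, not taken from the paper).
    """
    n_models = len(model_probs)
    n_samples = len(model_probs[0])
    results: List[Tuple[Optional[int], float]] = []
    for i in range(n_samples):
        n_classes = len(model_probs[0][i])
        mean = [
            sum(model_probs[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        best = max(range(n_classes), key=mean.__getitem__)
        confidence = mean[best]
        results.append((best if confidence >= min_confidence else None, confidence))
    return results


# Two hypothetical PEFT models scoring two unlabeled samples over two classes.
probs = [
    [[0.6, 0.4], [0.2, 0.8]],  # model A
    [[0.3, 0.7], [0.1, 0.9]],  # model B
]
pseudo = ensemble_pseudo_labels(probs, min_confidence=0.6)
```

In this toy input the two models disagree on the first sample, so its averaged confidence is low and the sample is left unlabeled, while the second sample receives a confident pseudo-label.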

artificial intelligence, experiment, machine learning, (19 more...)

Country:

North America > United States (0.46)
North America > Canada > Ontario (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.67)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Online Segment Any 3D Thing as Instance Tracking

Neural Information Processing Systems | Jun-12-2026, 00:15:47 GMT

Online, real-time, and fine-grained 3D segmentation constitutes a fundamental capability for embodied intelligent agents to perceive and comprehend their operational environments. Recent advancements employ predefined object queries to aggregate semantic information from Vision Foundation Model (VFM) outputs that are lifted into 3D point clouds, facilitating spatial information propagation through inter-query interactions. Nevertheless, perception, whether human or robotic, is an inherently dynamic process, rendering temporal understanding a critical yet overlooked dimension within these prevailing query-based pipelines. This deficiency in temporal reasoning can exacerbate issues such as the over-segmentation commonly produced by VFMs, necessitating more handcrafted post-processing. Therefore, to further unlock the temporal environmental perception capabilities of embodied agents, our work (AutoSeg3D) reconceptualizes online 3D segmentation as an instance tracking problem.

artificial intelligence, name change, proceedings, (11 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.80)

Learning Frequency-Adapted Vision Foundation Model for Domain Generalized Semantic Segmentation

Neural Information Processing Systems | Mar-22-2026, 00:29:01 GMT

The emerging vision foundation model (VFM) has inherited the ability to generalize to unseen images. Nevertheless, the key challenge of domain-generalized semantic segmentation (DGSS) lies in the domain gap attributed to the cross-domain styles, i.e., the variance of urban landscape and environment dependencies. Hence, maintaining the style-invariant property under varying domain styles becomes the key bottleneck in harnessing VFMs for DGSS. The frequency space after Haar wavelet transformation provides a feasible way to decouple the style information from the domain-invariant content, since the content and style information are retained in the low- and high-frequency components of the space, respectively. To this end, we propose a novel Frequency-Adapted (FADA) learning scheme to advance the frontier. Its overall idea is to separately tackle the content and style information by frequency tokens throughout the learning process. In particular, the proposed FADA consists of two branches, i.e., low- and high-frequency branches. The former stabilizes the scene content, while the latter learns the scene styles and eliminates their impact on DGSS. Experiments conducted on various DGSS settings show the state-of-the-art performance of our FADA and its versatility across a variety of VFMs. Source code is available at https://github.com/BiQiWHU/FADA.
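
To make the low- and high-frequency split concrete, here is a minimal sketch of a single-level 2D Haar transform on a grayscale image stored as nested lists. It only illustrates the decomposition the abstract refers to; how FADA turns these components into frequency tokens is not described in the abstract and is not shown here. The function name `haar_dwt2` and the sub-band names are illustrative choices.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def haar_dwt2(image: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """Single-level orthonormal 2D Haar transform.

    The image must have even height and width. Each 2x2 block
    [[a, b], [d, e]] yields one coefficient in each of four sub-bands:
      low        : block average (scaled), the low-frequency component
      across_cols: left minus right, high-frequency detail
      across_rows: top minus bottom, high-frequency detail
      diagonal   : diagonal difference, high-frequency detail
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    low: Matrix = []
    across_cols: Matrix = []
    across_rows: Matrix = []
    diagonal: Matrix = []
    for r in range(0, height, 2):
        rows = ([], [], [], [])
        for c in range(0, width, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 2.0)
            rows[1].append((a - b + d - e) / 2.0)
            rows[2].append((a + b - d - e) / 2.0)
            rows[3].append((a - b - d + e) / 2.0)
        low.append(rows[0])
        across_cols.append(rows[1])
        across_rows.append(rows[2])
        diagonal.append(rows[3])
    return low, across_cols, across_rows, diagonal


# A flat region has no high-frequency energy: all detail coefficients are zero.
flat = [[3.0] * 4 for _ in range(4)]
low, dc, dr, dd = haar_dwt2(flat)
```

With the 1/2 scaling used above the transform is orthonormal, so the sum of squared coefficients across the four sub-bands equals the sum of squared pixel values.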

artificial intelligence, machine learning, proceedings, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise

Neural Information Processing Systems | Mar-18-2026, 22:36:13 GMT

Deep neural networks have demonstrated remarkable performance in various vision tasks, but their success heavily depends on the quality of the training data. Noisy labels are a critical issue in medical datasets and can significantly degrade model performance. Previous clean sample selection methods have not utilized the pre-trained features of vision foundation models (VFMs) and have assumed that training begins from scratch. In this paper, we propose CUFIT, a curriculum fine-tuning paradigm for VFMs in medical image classification under label noise. Our method is motivated by the fact that linear probing of VFMs is relatively unaffected by noisy samples, as it does not update the feature extractor of the VFM, thus robustly classifying the training samples. Subsequently, curriculum fine-tuning of two adapters is conducted, starting with clean sample selection from the linear probing phase. Our experimental results demonstrate that CUFIT outperforms previous methods across various medical image benchmarks.
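
The abstract does not spell out the selection rule, so the sketch below assumes a simple agreement criterion: a training sample counts as clean when the linear probe's predicted class matches its given (possibly noisy) label. The function name `select_clean_indices` and the optional `min_confidence` threshold are hypothetical, and the later curriculum stages over the two adapters are not shown.

```python
from typing import List, Sequence


def select_clean_indices(
    probe_probs: Sequence[Sequence[float]],
    given_labels: Sequence[int],
    min_confidence: float = 0.0,
) -> List[int]:
    """Return indices of samples whose given label agrees with the probe.

    probe_probs[i][c] is the class-c probability predicted for sample i by a
    linear classifier trained on frozen VFM features. The agreement rule and
    the confidence threshold are assumptions made for illustration.
    """
    clean: List[int] = []
    for index, (probs, label) in enumerate(zip(probe_probs, given_labels)):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted == label and probs[predicted] >= min_confidence:
            clean.append(index)
    return clean


# Three samples all labeled class 0; the probe disagrees on the second one.
probe_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
given_labels = [0, 0, 0]
clean_subset = select_clean_indices(probe_probs, given_labels)
```

The returned indices would then define the subset on which the first adapter is fine-tuned, under the assumption stated above.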

artificial intelligence, machine learning, proceedings, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)

UniDSeg: Unified Cross-Domain 3D Semantic Segmentation via Visual Foundation Models Prior

Yao Wu

Neural Information Processing Systems | Feb-17-2026, 16:53:46 GMT

The essence of simultaneously solving cross-domain tasks is to enhance the generalizability of the encoder.

machine learning, natural language, semantic segmentation, (18 more...)

Country:

North America > United States (0.15)
Asia > China > Fujian Province > Xiamen (0.04)
Asia > China > Chongqing Province > Chongqing (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (1.00)
Transportation > Ground (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Sensing and Signal Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise

Neural Information Processing Systems | Feb-9-2026, 07:37:22 GMT

Our experimental results demonstrate that CUFIT outperforms previous methods across various medical image benchmarks.

artificial intelligence, machine learning, noisy label, (19 more...)

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre:

Instructional Material (0.69)
Research Report > New Finding (0.48)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)

15b780350b302a1bf9a3bd273f5c15a4-Paper-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 10:08:01 GMT

This work has been extended to different geometries [7, 23] and various applications [60, 9, 13, 24].

artificial intelligence, machine learning, xlogpt, (18 more...)

Country: Europe > Greece (0.04)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

EMVP: Embracing Visual Foundation Model for Visual Place Recognition with Centroid-Free Probing

Neural Information Processing Systems | Nov-20-2025, 05:11:28 GMT

Specifically, it achieves 93.9%, 96.5%, and 94.6% Recall@1 on the MSLS Validation, Pitts250k-test, and SPED datasets, respectively, while saving 64.3% of trainable parameters compared with the existing SOTA PEFT method.

descriptor, machine learning, natural language, (20 more...)

Country:

Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Europe > Czechia > Prague (0.04)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Vision4PPG: Emergent PPG Analysis Capability of Vision Foundation Models for Vital Signs like Blood Pressure

Kataria, Saurabh, Ermis, Ayca, Panchumarthi, Lovely Yeswanth, Wang, Minxiao, Hu, Xiao

arXiv.org Artificial Intelligence | Oct-14-2025

Photoplethysmography (PPG) sensors in wearable and clinical devices provide valuable physiological insights in a non-invasive and real-time fashion. Specialized Foundation Models (FMs) or repurposed time-series FMs are used to benchmark physiological tasks. Our experiments with fine-tuning FMs reveal that Vision FMs (VFMs) can also be utilized for this purpose and, in fact, surprisingly lead to state-of-the-art (SOTA) performance on many tasks, notably blood pressure estimation. We leverage VFMs by simply transforming one-dimensional PPG signals into image-like two-dimensional representations, such as the Short-Time Fourier transform (STFT). Using the latest VFMs, such as DINOv3 and SIGLIP-2, we achieve promising performance on other vital signs and blood lab measurement tasks as well. Our proposal, Vision4PPG, unlocks a new class of FMs to achieve SOTA performance with notable generalization to other 2D input representations, including STFT phase and recurrence plots. Our work improves upon prior investigations of vision models for PPG by conducting a comprehensive study, comparing them to state-of-the-art time-series FMs, and demonstrating the general PPG processing ability by reporting results on six additional tasks. Thus, we provide clinician-scientists with a new set of powerful tools that is also computationally efficient, thanks to Parameter-Efficient Fine-Tuning (PEFT) techniques.
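
As an illustration of the 1D-to-2D step, the sketch below computes an STFT magnitude spectrogram from a one-dimensional signal using only the standard library. The frame length, hop size, Hann window, and the function name `stft_magnitude` are assumptions chosen for the example; the abstract does not give the paper's actual STFT settings or its image preprocessing.

```python
import cmath
import math
from typing import List, Sequence


def stft_magnitude(signal: Sequence[float], frame_len: int, hop: int) -> List[List[float]]:
    """Return a magnitude spectrogram with shape (frequency bins, time frames).

    Each frame is multiplied by a periodic Hann window and transformed with a
    direct DFT; only the non-negative frequencies (frame_len // 2 + 1 bins)
    are kept. Direct DFT is slow but keeps the example dependency-free.
    """
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_len) for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames: List[List[float]] = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ]
        frames.append(spectrum)
    # Transpose so rows are frequency bins and columns are time frames.
    return [list(column) for column in zip(*frames)]


# Synthetic stand-in for a PPG trace: a pure tone at DFT bin 4 of a 32-sample frame.
tone = [math.sin(2.0 * math.pi * 4 * n / 32) for n in range(128)]
spectrogram = stft_magnitude(tone, frame_len=32, hop=16)
```

The resulting 2D array can be rescaled and stacked into image channels before being passed to a vision model; that conversion is left out because it depends on choices the abstract does not specify.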

arxiv preprint arxiv, data quality, machine learning, (15 more...)

2510.10366

Genre: Research Report (0.82)

Industry: