



in which the model keeps predicting the class of each unlabeled sample and learns from the feedback of whether each prediction is correct

Neural Information Processing Systems

We thank all reviewers for their valuable comments. The reviewers are generally satisfied with our writing and experiments. The remaining concerns are minor; we respond to them carefully below and will revise the paper accordingly. Q: Can one-bit supervision reduce annotation costs? A1: Please refer to the common question. According to the authors of ILSVRC2012 [Russakovsky et al., IJCV'15], the average time for a full-bit annotation is


Interactive Multi-fidelity Learning for Cost-effective Adaptation of Language Model with Sparse Human Supervision

Neural Information Processing Systems

Large language models (LLMs) have demonstrated remarkable capabilities in various tasks. However, their suitability for domain-specific tasks is limited by their immense scale at deployment, susceptibility to misinformation, and, more importantly, high data annotation costs. We propose a novel Interactive Multi-Fidelity Learning (IMFL) framework for the cost-effective development of small domain-specific LMs under limited annotation budgets. Our approach formulates the domain-specific fine-tuning process as a multi-fidelity learning problem, focusing on identifying the optimal acquisition strategy that balances low-fidelity automatic LLM annotations against high-fidelity human annotations to maximize model performance. We further propose an exploration-exploitation query strategy that enhances annotation diversity and informativeness, incorporating two innovative designs: 1) prompt retrieval, which selects in-context examples from human-annotated samples to improve LLM annotation, and 2) a variable batch size, which controls the order in which each fidelity is chosen to facilitate knowledge distillation, ultimately enhancing annotation quality. Extensive experiments on financial and medical tasks demonstrate that IMFL achieves superior performance compared with single-fidelity annotation. Given a limited budget of human annotation, IMFL significantly outperforms the $\bf 3\times$ human annotation baselines on all four tasks and comes very close to $\bf 5\times$ human annotation on two of them. These promising results suggest that the high human annotation costs of domain-specific tasks can be significantly reduced by employing IMFL, which uses fewer human annotations supplemented with cheaper and faster LLM (e.g., GPT-3.5) annotations.
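The budget-constrained acquisition loop described above can be sketched as follows. All names and numbers here (acquire_annotations, the per-fidelity costs, the alternating batch sizes) are illustrative assumptions for the sketch, not the paper's actual interface:

```python
def acquire_annotations(samples, budget, human_cost=1.0, llm_cost=0.25,
                        human_batch=4, llm_batch=16):
    """Toy multi-fidelity acquisition loop: alternate between a small batch
    of expensive high-fidelity human labels and a larger batch of cheap
    low-fidelity LLM labels until the annotation budget is exhausted."""
    pool = list(samples)
    labeled = []          # (sample, fidelity) pairs
    spent = 0.0
    use_human = True      # variable batch ordering: start with human labels
    while pool and spent < budget:
        if use_human:
            batch, cost, fidelity = pool[:human_batch], human_cost, "human"
        else:
            batch, cost, fidelity = pool[:llm_batch], llm_cost, "llm"
        affordable = int((budget - spent) // cost)
        batch = batch[:affordable]
        if not batch:
            break         # cannot afford even one more label at this fidelity
        labeled.extend((s, fidelity) for s in batch)
        pool = pool[len(batch):]
        spent += cost * len(batch)
        use_human = not use_human
    return labeled, spent
```

With a budget of 10 units the loop buys 4 human labels, then 16 LLM labels, then 2 more human labels before the budget runs out.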


Actively Testing Your Model While It Learns: Realizing Label-Efficient Learning in Practice

Neural Information Processing Systems

In active learning (AL), the focus is on reducing the data annotation cost from the model training perspective. However, "testing", which often refers to the model evaluation process of using empirical risk to estimate the intractable true generalization risk, also requires data annotations. The annotation cost of testing (model evaluation) is under-explored. Even in works that study active model evaluation or active testing (AT), the learning and testing ends are disconnected. In this paper, we propose a novel active testing while learning (ATL) framework that integrates active learning with active testing.
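The role of "testing" here is worth making concrete: the empirical risk on a small annotated test set is a Monte Carlo estimate of the intractable generalization risk. A minimal generic sketch (not the paper's ATL estimator):

```python
def empirical_risk(model, test_samples, loss):
    """Estimate the true generalization risk by the empirical risk
    (average loss) over an annotated test set."""
    return sum(loss(model(x), y) for x, y in test_samples) / len(test_samples)

# Usage with a toy threshold classifier and 0-1 loss.
zero_one = lambda pred, y: 0.0 if pred == y else 1.0
model = lambda x: x >= 0
samples = [(1, True), (-1, True), (2, True), (-3, False)]
```

Every (x, y) pair in `test_samples` costs one annotation, which is exactly the cost the abstract argues is under-explored.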


Are all Frames Equal? Active Sparse Labeling for Video Action Detection

Neural Information Processing Systems

Video action detection requires annotations at every frame, which drastically increases the labeling cost. In this work, we focus on efficient labeling of videos for action detection to minimize this cost. We propose active sparse labeling (ASL), a novel active learning strategy for video action detection. Sparse labeling reduces the annotation cost but poses two main challenges: 1) how to estimate the utility of annotating a single frame when detection is performed at the video level, and 2) how these sparse labels can be used to train an action detector that would otherwise require annotations on all frames. This work addresses these challenges within a simple active learning framework. For the first challenge, we propose a novel frame-level scoring mechanism aimed at selecting the most informative frames in a video. Next, we introduce a novel loss formulation that enables training an action detection model with these sparsely selected frames. We evaluate the proposed approach on two action detection benchmark datasets, UCF-101-24 and J-HMDB-21, and observe that active sparse labeling can be very effective in saving annotation costs. We demonstrate that the proposed approach performs better than random selection, outperforming all other baselines, with performance comparable to the fully supervised approach using merely 10% of the annotations.
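A generic uncertainty-based version of such a frame-level scoring mechanism might look like the sketch below; entropy scoring is a stand-in assumption here, as the paper proposes its own detection-aware score:

```python
import math

def select_frames(frame_probs, k):
    """Pick the k most uncertain frames by prediction entropy.
    frame_probs: list of per-frame class-probability vectors."""
    def entropy(p):
        return -sum(q * math.log(q) for q in p if q > 0)
    ranked = sorted(range(len(frame_probs)),
                    key=lambda i: entropy(frame_probs[i]), reverse=True)
    return sorted(ranked[:k])  # frame indices to send for annotation
```

Frames whose class distribution is close to uniform (highest entropy) are selected; confidently predicted frames stay unlabeled.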


From Passive Perception to Active Memory: A Weakly Supervised Image Manipulation Localization Framework Driven by Coarse-Grained Annotations

Guo, Zhiqing, Xi, Dongdong, Li, Songlin, Yang, Gaobo

arXiv.org Artificial Intelligence

Image manipulation localization (IML) faces a fundamental trade-off between minimizing annotation cost and achieving fine-grained localization accuracy. Existing fully supervised IML methods depend heavily on dense pixel-level mask annotations, which limits scalability to large datasets and real-world deployment. In contrast, most existing weakly supervised IML approaches rely on image-level labels, which greatly reduce annotation effort but typically lack precise spatial localization. To address this dilemma, we propose BoxPromptIML, a novel weakly supervised IML framework that effectively balances annotation cost and localization performance. Specifically, we propose a coarse region annotation strategy that generates relatively accurate manipulation masks at lower cost. To improve model efficiency and facilitate deployment, we further design an efficient lightweight student model, which learns to perform fine-grained localization through knowledge distillation from a fixed teacher model based on the Segment Anything Model (SAM). Moreover, inspired by the human subconscious memory mechanism, our feature fusion module employs a dual-guidance strategy that actively contextualizes recalled prototypical patterns with real-time observational cues derived from the input. Instead of passive feature extraction, this strategy enables a dynamic process of knowledge recollection, in which long-term memory is adapted to the specific context of the current image, significantly enhancing localization accuracy and robustness. Extensive experiments across both in-distribution and out-of-distribution datasets show that BoxPromptIML outperforms or rivals fully supervised models while maintaining strong generalization, low annotation cost, and efficient deployment characteristics.
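The teacher-student distillation step can be illustrated with a pixel-wise soft-label cross-entropy between the student's predicted mask and the SAM teacher's soft mask. This is a minimal sketch over flat probability lists and is an assumption for illustration; the paper's full objective is richer:

```python
import math

def distillation_loss(student_probs, teacher_probs, eps=1e-7):
    """Average pixel-wise binary cross-entropy of the student's mask
    probabilities against the teacher's soft mask targets."""
    assert len(student_probs) == len(teacher_probs)
    total = 0.0
    for s, t in zip(student_probs, teacher_probs):
        s = min(max(s, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(s) + (1 - t) * math.log(1 - s))
    return total / len(student_probs)
```

The loss is minimized when the student's per-pixel probabilities match the teacher's, which is the sense in which the lightweight student inherits SAM's localization behavior.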



8fc687aa152e8199fe9e73304d407bca-AuthorFeedback.pdf

Neural Information Processing Systems

Below, we conduct a point-by-point response to each comment. We will discuss them in the revised manuscript. [ECCV18] apply the SPL process to refine the saliency maps obtained in co-saliency detection. As we mentioned in the paper, these works share a similar spirit with ours. As for the difference, Yan et al. [23] define their problem on sparsely annotated video frames. It requires generating image data for different problem domains.



When Does Supervised Training Pay Off? The Hidden Economics of Object Detection in the Era of Vision-Language Models

Al-Hamadani, Samer

arXiv.org Artificial Intelligence

Object detection constitutes a foundational computer vision capability enabling diverse applications from autonomous vehicles to retail analytics, with modern deep learning approaches achieving remarkable technical performance exceeding 90% mean Average Precision on standardized benchmarks [1, 2]. However, technical accuracy represents only one dimension of deployment viability, as real-world system selection requires evaluating cost-effectiveness--the relationship between detection performance and total economic investment required to achieve that performance [3, 4]. Traditional supervised detectors, exemplified by the YOLO architecture family [2, 5], rely fundamentally on manually annotated training data, with industry reports estimating annotation costs between $0.10 and $0.50 per bounding box [6, 7], translating to $9,000-$45,000 for establishing 100-category detection systems with sufficient training data. Vision-Language Models represent an alternative paradigm achieving object detection through zero-shot inference without task-specific supervision [8-10]. Pre-trained on billions of image-text pairs, VLMs accept natural language object descriptions and generate bounding box predictions through learned visual-linguistic alignment, fundamentally eliminating annotation requirements.
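The quoted budget range follows from simple arithmetic on the per-box rates. The sketch below reproduces it; `boxes_per_category = 900` is an assumption chosen so that the $0.10-$0.50 per-box rates recover the quoted $9,000-$45,000 span for a 100-category system:

```python
def annotation_budget(num_categories, boxes_per_category,
                      cost_low=0.10, cost_high=0.50):
    """Back-of-envelope annotation cost range: total boxes times the
    low and high per-box rates cited in the article."""
    n_boxes = num_categories * boxes_per_category
    return n_boxes * cost_low, n_boxes * cost_high

# 100 categories x 900 boxes = 90,000 boxes -> roughly $9,000 to $45,000
low, high = annotation_budget(100, 900)
```

Any dataset with a different number of boxes per category scales the range linearly, which is why the annotation-free VLM alternative becomes attractive as category counts grow.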