AITopics | thresholding

thresholding

A Data-Centric Perspective on the Influence of Image Data Quality in Machine Learning Models

Chen, Pei-Han, Chung, Szu-Chi

arXiv.org Artificial Intelligence, Sep-30-2025

In machine learning, research has traditionally focused on model development, with relatively less attention paid to training data. As model architectures have matured and marginal gains from further refinements diminish, data quality has emerged as a critical factor. However, systematic studies on evaluating and ensuring dataset quality in the image domain remain limited. This study investigates methods for systematically assessing image dataset quality and examines how various image quality factors influence model performance. Using the publicly available and relatively clean CIFAKE dataset, we identify common quality issues and quantify their impact on training. Building on these findings, we develop a pipeline that integrates two community-developed tools, CleanVision and Fastdup. We analyze their underlying mechanisms and introduce several enhancements, including automatic threshold selection to detect problematic images without manual tuning. Experimental results demonstrate that not all quality issues exert the same level of impact. While convolutional neural networks show resilience to certain distortions, they are particularly vulnerable to degradations that obscure critical visual features, such as blurring and severe downscaling. To assess the performance of existing tools and the effectiveness of our proposed enhancements, we formulate the detection of low-quality images as a binary classification task and use the F1 score as the evaluation metric. Our automatic thresholding method improves the F1 score from 0.6794 to 0.9468 under single perturbations and from 0.7447 to 0.8557 under dual perturbations. For near-duplicate detection, our deduplication strategy increases the F1 score from 0.4576 to 0.7928. These results underscore the effectiveness of our workflow and provide a foundation for advancing data quality assessment in image-based machine learning.
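The evaluation setup described in this abstract, where low-quality image detection is treated as binary classification and scored with F1, can be sketched as follows. This is a minimal illustration only: the quality scores, labels, and threshold values are made up, and the code does not reproduce the paper's automatic threshold selection or the CleanVision and Fastdup internals.

```python
def f1_score(labels, flagged):
    """F1 of binary predictions; label 1 means 'low-quality image'."""
    tp = sum(1 for y, p in zip(labels, flagged) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, flagged) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, flagged) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def flag_low_quality(scores, threshold):
    """Flag an image when its (hypothetical) quality score is below threshold."""
    return [1 if s < threshold else 0 for s in scores]


# Hypothetical sharpness scores (higher = sharper) and ground-truth labels.
scores = [0.9, 0.8, 0.2, 0.1, 0.75, 0.3]
labels = [0, 0, 1, 1, 0, 0]  # 1 = image was deliberately degraded

f1_loose = f1_score(labels, flag_low_quality(scores, 0.5))
f1_tight = f1_score(labels, flag_low_quality(scores, 0.25))
```

On this toy data the looser threshold flags one clean image, so its F1 is lower than that of the tighter threshold. Comparing thresholds in this way is what makes the choice of threshold measurable.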

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.2442

Country: Asia (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

SST: Self-training with Self-adaptive Thresholding for Semi-supervised Learning

Zhao, Shuai, Huang, Heyan, Li, Xinge, Chen, Xiaokang, Wang, Rui

arXiv.org Artificial Intelligence, Jun-3-2025

Neural networks have demonstrated exceptional performance in supervised learning, benefiting from abundant high-quality annotated data. However, obtaining such data in real-world scenarios is costly and labor-intensive. Semi-supervised learning (SSL) offers a solution to this problem. Recent studies, such as Semi-ViT and Noisy Student, which employ consistency regularization or pseudo-labeling, have demonstrated significant achievements. However, they still face challenges, particularly in accurately selecting sufficient high-quality pseudo-labels due to their reliance on fixed thresholds. Recent methods such as FlexMatch and FreeMatch have introduced flexible or self-adaptive thresholding techniques, greatly advancing SSL research. Nonetheless, their process of updating thresholds at each iteration is deemed time-consuming, computationally intensive, and potentially unnecessary. To address these issues, we propose Self-training with Self-adaptive Thresholding (SST), a novel, effective, and efficient SSL framework. SST introduces an innovative Self-Adaptive Thresholding (SAT) mechanism that adaptively adjusts class-specific thresholds based on the model's learning progress. SAT ensures the selection of high-quality pseudo-labeled data, mitigating the risks of inaccurate pseudo-labels and confirmation bias. Extensive experiments demonstrate that SST achieves state-of-the-art performance with remarkable efficiency, generalization, and scalability across various architectures and datasets. Semi-SST-ViT-Huge achieves the best results on competitive ImageNet-1K SSL benchmarks, with 80.7% / 84.9% Top-1 accuracy using only 1% / 10% labeled data. Compared to the fully-supervised DeiT-III-ViT-Huge, which achieves 84.8% Top-1 accuracy using 100% labeled data, our method demonstrates superior performance using only 10% labeled data.
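To make the idea of class-specific thresholds for pseudo-label selection concrete, here is a minimal sketch. The update rule (`update_thresholds`, with its `scale` and `floor` parameters) is a hypothetical stand-in chosen for illustration. It is not the SAT mechanism from the paper, whose exact rule the abstract does not give.

```python
def select_pseudo_labels(probabilities, class_thresholds):
    """Keep unlabeled samples whose top-class confidence clears that class's
    threshold. Returns a list of (sample_index, predicted_class) pairs."""
    selected = []
    for index, probs in enumerate(probabilities):
        predicted = max(range(len(probs)), key=lambda c: probs[c])
        if probs[predicted] >= class_thresholds[predicted]:
            selected.append((index, predicted))
    return selected


def update_thresholds(probabilities, num_classes, scale=0.9, floor=0.5):
    """Hypothetical rule: scale each class's mean top-1 confidence.

    Illustrative only; this is not the SAT rule from the paper.
    """
    sums = [0.0] * num_classes
    counts = [0] * num_classes
    for probs in probabilities:
        predicted = max(range(num_classes), key=lambda c: probs[c])
        sums[predicted] += probs[predicted]
        counts[predicted] += 1
    return [
        max(floor, scale * sums[c] / counts[c]) if counts[c] else floor
        for c in range(num_classes)
    ]


# Made-up softmax outputs for four unlabeled samples and two classes.
probs = [[0.95, 0.05], [0.60, 0.40], [0.30, 0.70], [0.45, 0.55]]
thresholds = update_thresholds(probs, num_classes=2)
pseudo = select_pseudo_labels(probs, thresholds)
```

Because the threshold for class 1 ends up lower than the one for class 0, a class-1 prediction with confidence 0.70 is accepted while a class-0 prediction with confidence 0.60 is rejected. A single fixed threshold cannot express that distinction.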

accuracy, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.ipm.2025.104158

2506.00467

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China > Beijing > Beijing (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China > Fujian Province (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.73)

KEVS: Enhancing Segmentation of Visceral Adipose Tissue in Pre-Cystectomy CT with Gaussian Kernel Density Estimation

Boucher, Thomas, Tetlow, Nicholas, Fung, Annie, Dewar, Amy, Arina, Pietro, Kerneis, Sven, Whittle, John, Mazomenos, Evangelos B.

arXiv.org Artificial Intelligence, May-13-2025

Purpose: The distribution of visceral adipose tissue (VAT) in cystectomy patients is indicative of the incidence of post-operative complications. Existing VAT segmentation methods for computed tomography (CT) employing intensity thresholding have limitations relating to inter-observer variability. Moreover, the difficulty in creating ground-truth masks limits the development of deep learning (DL) models for this task. This paper introduces a novel method for VAT prediction in pre-cystectomy CT, which is fully automated and does not require ground-truth VAT masks for training, overcoming the aforementioned limitations. Methods: We introduce the Kernel density Enhanced VAT Segmentator (KEVS), combining a DL semantic segmentation model, for multi-body feature prediction, with Gaussian kernel density estimation analysis of predicted subcutaneous adipose tissue to achieve accurate scan-specific predictions of VAT in the abdominal cavity. Uniquely for a DL pipeline, KEVS does not require ground-truth VAT masks. Results: We verify the ability of KEVS to accurately segment abdominal organs in unseen CT data and compare KEVS VAT segmentation predictions to existing state-of-the-art (SOTA) approaches in a dataset of 20 pre-cystectomy CT scans, collected from University College London Hospital (UCLH-Cyst), with expert ground-truth annotations. KEVS presents a 4.80% and 6.02% improvement in Dice Coefficient over the second-best DL and thresholding-based VAT segmentation techniques respectively when evaluated on UCLH-Cyst. Conclusion: This research introduces KEVS, an automated, SOTA method for the prediction of VAT in pre-cystectomy CT which eliminates inter-observer variability and is trained entirely on open-source CT datasets which do not contain ground-truth VAT masks.
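As a rough illustration of how Gaussian kernel density estimation can turn sampled fat intensities into a scan-specific intensity range, consider the sketch below. The sample values, the bandwidth, and the "at least half of the peak density" rule are all assumptions made for this example. The abstract does not specify how KEVS derives its predictions from the density estimate.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` with Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def intensity_range(samples, bandwidth, grid, fraction=0.5):
    """Hypothetical rule: keep grid intensities whose density is at least
    `fraction` of the peak density and return their (low, high) bounds."""
    density = gaussian_kde(samples, bandwidth)
    values = [density(x) for x in grid]
    peak = max(values)
    kept = [x for x, v in zip(grid, values) if v >= fraction * peak]
    return min(kept), max(kept)


# Made-up fat-like intensities (Hounsfield units) sampled from one scan.
sat_intensities = [-110, -105, -100, -98, -95, -90, -85]
low, high = intensity_range(sat_intensities, bandwidth=8.0, grid=range(-200, 1))
```

Unlike a fixed intensity window, the resulting bounds move with the samples from each scan, which is the sense in which such a range is scan-specific.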

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s11548-025-03380-7

2503.22592

Country: Europe > United Kingdom (0.14)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Health Care Providers & Services (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Sparse PCA via Covariance Thresholding

Yash Deshpande, Andrea Montanari

Neural Information Processing Systems, Feb-9-2025, 23:00:17 GMT

In sparse principal component analysis we are given noisy observations of a low-rank matrix of dimension n × p and seek to reconstruct it under additional sparsity assumptions.
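The general idea behind covariance thresholding can be sketched as follows: form the sample covariance, shrink its small entries to zero, and take the leading eigenvector of the result. This is a simplified illustration under stated assumptions. The threshold `tau`, the toy data, and the use of plain power iteration are choices made for the example, and the snippet does not reproduce the exact procedure or scaling analyzed in the paper.

```python
def soft_threshold(value, tau):
    """Shrink `value` toward zero by `tau`; values within [-tau, tau] become 0."""
    if value > tau:
        return value - tau
    if value < -tau:
        return value + tau
    return 0.0


def thresholded_covariance(rows, tau):
    """Sample covariance X^T X / n with every entry soft-thresholded at tau."""
    n, p = len(rows), len(rows[0])
    return [
        [soft_threshold(sum(r[i] * r[j] for r in rows) / n, tau) for j in range(p)]
        for i in range(p)
    ]


def leading_eigenvector(matrix, iterations=200):
    """Power iteration; assumes a symmetric matrix with a dominant positive
    eigenvalue and a start vector not orthogonal to its eigenvector."""
    p = len(matrix)
    vector = [1.0 / p ** 0.5] * p
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in product) ** 0.5
        if norm == 0.0:
            break
        vector = [x / norm for x in product]
    return vector


# Toy data: a strong signal on the first two coordinates, weak noise elsewhere.
rows = [
    [2.0, 2.0, 0.1, -0.1],
    [-2.0, -2.0, 0.1, 0.1],
    [2.0, 2.0, -0.1, 0.1],
    [-2.0, -2.0, -0.1, -0.1],
]
component = leading_eigenvector(thresholded_covariance(rows, tau=0.5))
```

On this toy input the thresholding step zeroes every covariance entry that involves the two noise coordinates, so the recovered component is supported on the first two coordinates only.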

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)

SLA Management in Reconfigurable Multi-Agent RAG: A Systems Approach to Question Answering

Iannelli, Michael, Kuchipudi, Sneha, Dvorak, Vera

arXiv.org Artificial Intelligence, Dec-6-2024

Retrieval Augmented Generation (RAG) enables Large Language Models (LLMs) to generalize to new information by decoupling reasoning capabilities from static knowledge bases. Traditional RAG enhancements have explored vertical scaling -- assigning subtasks to specialized modules -- and horizontal scaling -- replicating tasks across multiple agents -- to improve performance. However, real-world applications impose diverse Service Level Agreements (SLAs) and Quality of Service (QoS) requirements, involving trade-offs among objectives such as reducing cost, ensuring answer quality, and adhering to specific operational constraints. In this work, we present a systems-oriented approach to multi-agent RAG tailored for real-world Question Answering (QA) applications. By integrating task-specific non-functional requirements -- such as answer quality, cost, and latency -- into the system, we enable dynamic reconfiguration to meet diverse SLAs. Our method maps these Service Level Objectives (SLOs) to system-level parameters, allowing the generation of optimal results within specified resource constraints. We conduct a case study in the QA domain, demonstrating how dynamic re-orchestration of a multi-agent RAG system can effectively manage the trade-off between answer quality and cost. By adjusting the system based on query intent and operational conditions, we systematically balance performance and resource utilization. This approach allows the system to meet SLOs for various query types, showcasing its practicality for real-world applications.
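One simple way to picture the mapping from Service Level Objectives to system-level parameters is a lookup over candidate configurations: pick the highest-quality configuration that stays within the cost and latency limits. The sketch below is purely hypothetical. The `Configuration` fields, the candidate names, and all numbers are invented for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Configuration:
    name: str
    expected_quality: float   # 0..1, higher is better (hypothetical estimate)
    cost_per_query: float     # arbitrary currency units
    latency_seconds: float


def choose_configuration(candidates, max_cost, max_latency):
    """Return the highest-quality configuration that satisfies both limits,
    or None when no candidate meets the objectives."""
    feasible = [
        c for c in candidates
        if c.cost_per_query <= max_cost and c.latency_seconds <= max_latency
    ]
    return max(feasible, key=lambda c: c.expected_quality, default=None)


# Invented candidates for a multi-agent question-answering system.
CANDIDATES = [
    Configuration("single-agent", 0.70, 0.01, 1.0),
    Configuration("three-agents", 0.82, 0.03, 2.5),
    Configuration("five-agents-with-arbiter", 0.88, 0.06, 4.0),
]

chosen = choose_configuration(CANDIDATES, max_cost=0.04, max_latency=3.0)
```

Tightening either limit shrinks the feasible set and can force a cheaper, lower-quality configuration. That is the quality and cost trade-off the abstract refers to.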

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2412.06832

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.06)
Europe > Greece (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology (0.68)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Enhancing 3D Object Detection by Using Neural Network with Self-adaptive Thresholding

Liu, Houze, Wang, Chongqing, Zhan, Xiaoan, Zheng, Haotian, Che, Chang

arXiv.org Artificial Intelligence, May-13-2024

Robust 3D object detection remains a pivotal concern in the domain of autonomous field robotics. Despite notable enhancements in detection accuracy across standard datasets, real-world urban environments, characterized by their unstructured and dynamic nature, frequently precipitate an elevated incidence of false positives, thereby undermining the reliability of existing detection paradigms. In this context, our study introduces an advanced post-processing algorithm that modulates detection thresholds dynamically relative to the distance from the ego object. Traditional perception systems typically utilize a uniform threshold, which often leads to decreased efficacy in detecting distant objects. In contrast, our proposed methodology employs a Neural Network with a self-adaptive thresholding mechanism that significantly attenuates false negatives while concurrently diminishing false positives, particularly in complex urban settings. Empirical results substantiate that our algorithm not only augments the performance of 3D object detection models in diverse urban and adverse weather scenarios but also establishes a new benchmark for adaptive thresholding techniques in field robotics.
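The contrast between a uniform detection threshold and one that varies with distance from the ego object can be illustrated with a small sketch. Note the substitution: the paper uses a neural network to adapt the threshold, whereas the code below uses a hand-written linear schedule (`distance_threshold`, with invented `near`, `far`, and `max_range_m` values) purely to show the filtering step.

```python
def distance_threshold(distance_m, near=0.6, far=0.3, max_range_m=80.0):
    """Hypothetical schedule: interpolate linearly from `near` at 0 m
    to `far` at `max_range_m`, and hold `far` beyond that range."""
    ratio = min(max(distance_m / max_range_m, 0.0), 1.0)
    return near + (far - near) * ratio


def filter_detections(detections):
    """Keep (distance_m, score) pairs whose score clears the threshold
    computed for their distance from the ego object."""
    return [
        (distance, score)
        for distance, score in detections
        if score >= distance_threshold(distance)
    ]


# Made-up detections as (distance in metres, confidence score).
detections = [(5.0, 0.55), (5.0, 0.90), (70.0, 0.40), (70.0, 0.20)]
kept = filter_detections(detections)
```

With a uniform threshold of 0.5, the distant detection scoring 0.40 would be discarded and the nearby one scoring 0.55 would be kept. The distance-dependent schedule reverses both decisions.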

arxiv preprint arxiv, detection, false positive, (13 more...)

arXiv.org Artificial Intelligence

2405.07479

Country: North America > United States > New York (0.05)

Genre: Research Report (0.40)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Sparse PCA via Covariance Thresholding

Andrea Montanari (Electrical Engineering and Statistics, Stanford University)

Neural Information Processing Systems, Mar-13-2024, 13:37:14 GMT

In sparse principal component analysis we are given noisy observations of a low-rank matrix of dimension n × p and seek to reconstruct it under additional sparsity assumptions.

algorithm, matrix, thresholding, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)

The Analysis and Extraction of Structure from Organizational Charts

Manali, Nikhil, Doermann, David, Desai, Mahesh

arXiv.org Artificial Intelligence, Nov-16-2023

Organizational charts, also known as org charts, are critical representations of an organization's structure and the hierarchical relationships between its components and positions. However, manually extracting information from org charts can be error-prone and time-consuming. To solve this, we present an automated and end-to-end approach that uses computer vision, deep learning, and natural language processing techniques. Additionally, we propose a metric to evaluate the completeness and hierarchical accuracy of the extracted information. This approach has the potential to improve organizational restructuring and resource utilization by providing a clear and concise representation of the organizational structure. Our study lays a foundation for further research on the topic of hierarchical chart analysis.

graph, information, node, (14 more...)

arXiv.org Artificial Intelligence

2311.10234

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Stochastic Natural Thresholding Algorithms

Grotheer, Rachel, Li, Shuang, Ma, Anna, Needell, Deanna, Qin, Jing

arXiv.org Artificial Intelligence, Jun-7-2023

Sparse signal recovery is one of the most fundamental problems in various applications, including medical imaging and remote sensing. Many greedy algorithms based on the family of hard thresholding operators have been developed to solve the sparse signal recovery problem. More recently, Natural Thresholding (NT) has been proposed with improved computational efficiency. This paper proposes stochastic natural thresholding (StoNT) algorithms, and discusses convergence guarantees for them, by extending NT from the deterministic version with linear measurements to the stochastic version with a general objective function. We also conduct various numerical experiments on linear and nonlinear measurements to demonstrate the performance of StoNT.
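For readers unfamiliar with the hard thresholding operators mentioned in this abstract, the sketch below shows the standard operator that keeps the k largest-magnitude entries, together with one stochastic iterative hard thresholding (IHT) step for a single linear measurement. This illustrates the older hard-thresholding family, not Natural Thresholding or its stochastic variant. The function names and the step size are choices made for the example.

```python
def hard_threshold(vector, k):
    """Keep the k largest-magnitude entries of `vector` and zero the rest."""
    order = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vector)]


def stochastic_iht_step(x, row, measurement, k, step_size):
    """One stochastic IHT update from a single linear measurement,
    where `measurement` is approximately the inner product <row, x_true>."""
    residual = measurement - sum(a * xi for a, xi in zip(row, x))
    # Gradient step on 0.5 * residual**2, then projection onto k-sparse vectors.
    moved = [xi + step_size * residual * a for a, xi in zip(row, x)]
    return hard_threshold(moved, k)


sparse = hard_threshold([0.1, -3.0, 2.0, 0.5], k=2)
step = stochastic_iht_step([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], 2.0, k=1, step_size=1.0)
```

Each update touches only one measurement, which is what makes the stochastic variant cheap per iteration compared with using the full measurement matrix.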

artificial intelligence, machine learning, stontp, (17 more...)

arXiv.org Artificial Intelligence

2306.0473

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Kentucky > Fayette County > Lexington (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre: Research Report (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Noisy Symbolic Abstractions for Deep RL: A case study with Reward Machines

Li, Andrew C., Chen, Zizhao, Vaezipoor, Pashootan, Klassen, Toryn Q., Icarte, Rodrigo Toro, McIlraith, Sheila A.

arXiv.org Artificial Intelligence, Nov-23-2022

Natural and formal languages provide an effective mechanism for humans to specify instructions and reward functions. We investigate how to generate policies via RL when reward functions are specified in a symbolic language captured by Reward Machines, an increasingly popular automaton-inspired structure. We are interested in the case where the mapping of environment state to a symbolic (here, Reward Machine) vocabulary -- commonly known as the labelling function -- is uncertain from the perspective of the agent. We formulate the problem of policy learning in Reward Machines with noisy symbolic abstractions as a special class of POMDP optimization problem, and investigate several methods to address the problem, building on existing and new techniques, the latter focused on predicting Reward Machine state, rather than on grounding of individual symbols. We analyze these methods and evaluate them experimentally under varying degrees of uncertainty in the correct interpretation of the symbolic vocabulary. We verify the strength of our approach and the limitation of existing methods via an empirical investigation on both illustrative, toy domains and partially observable, deep RL domains.
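To illustrate what predicting the Reward Machine state under an uncertain labelling function might look like, the sketch below keeps a probability distribution over machine states and pushes it through one step using the probabilities a noisy detector assigns to each symbol. This is a naive illustration under strong assumptions (mutually exclusive symbols whose probabilities sum to one, and independent errors across steps). The task, state names, and numbers are invented, and the code is not one of the methods evaluated in the paper.

```python
def update_belief(belief, transitions, label_probabilities):
    """Propagate a belief over Reward Machine states through one step.

    belief: dict mapping state -> probability
    transitions: dict mapping (state, symbol) -> next state; missing pairs self-loop
    label_probabilities: dict mapping symbol -> probability that it holds now
    """
    updated = {state: 0.0 for state in belief}
    for state, state_prob in belief.items():
        for symbol, symbol_prob in label_probabilities.items():
            target = transitions.get((state, symbol), state)
            updated[target] += state_prob * symbol_prob
    return updated


# Hypothetical two-step task: reach "key", then "door".
TRANSITIONS = {("u0", "key"): "u1", ("u1", "door"): "done"}
belief = {"u0": 1.0, "u1": 0.0, "done": 0.0}

# A noisy detector reports "key" with 80% confidence, otherwise "nothing".
belief = update_belief(belief, TRANSITIONS, {"key": 0.8, "nothing": 0.2})
```

Thresholding the detector output at 0.5 would instead commit fully to state `u1`, discarding the remaining uncertainty that the belief retains.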

machine learning, reinforcement learning, rm state, (15 more...)

arXiv.org Artificial Intelligence

2211.10902

Country:

North America > Canada > Ontario > Toronto (0.29)
Asia > Middle East > Republic of Türkiye > Aksaray Province > Aksaray (0.04)
South America > Chile (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)
