AITopics | Pullman

Pullman

Conformal Margin Risk Minimization: An Envelope Framework for Robust Learning under Label Noise

Shi, Yuanjie, Li, Peihong, Zhang, Zijian, Doppa, Janardhan Rao, Yan, Yan

arXiv.org Machine Learning, Apr-13-2026

Most methods for learning with noisy labels require privileged knowledge such as noise transition matrices, clean subsets or pretrained feature extractors, resources typically unavailable when robustness is most needed. We propose Conformal Margin Risk Minimization (CMRM), a plug-and-play envelope framework that improves any classification loss under label noise by adding a single quantile-calibrated regularization term, with no privileged knowledge or training pipeline modification. CMRM measures the confidence margin between the observed label and competing labels, and thresholds it with a conformal quantile estimated per batch to focus training on high-margin samples while suppressing likely mislabeled ones. We derive a learning bound for CMRM under arbitrary label noise requiring only mild regularity of the margin distribution. Across five base methods and six benchmarks with synthetic and real-world noise, CMRM consistently improves accuracy (up to +3.39%), reduces conformal prediction set size (up to -20.44%) and does not hurt under 0% noise, showing that CMRM captures a method-agnostic uncertainty signal that existing mechanisms did not exploit.
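The margin-and-quantile idea in the abstract above can be illustrated with a minimal sketch. It assumes the margin is the observed label's probability minus the largest competing probability, and that the per-batch threshold is a plain empirical quantile; the function names, the `alpha` level, and the hard keep/suppress mask are illustrative assumptions, not the paper's actual regularization term.

```python
import math

def confidence_margin(probs, label):
    """Observed label's probability minus the best competing label's probability."""
    best_other = max(p for k, p in enumerate(probs) if k != label)
    return probs[label] - best_other

def batch_quantile(values, alpha):
    """Empirical quantile: the ceil(alpha * n)-th smallest value (1-indexed)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(alpha * len(ordered)))
    return ordered[rank - 1]

# Toy batch: predicted class probabilities and (possibly noisy) observed labels.
batch_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.4, 0.5, 0.1],  # observed label 0 disagrees with the model's top class
    [0.2, 0.2, 0.6],
]
batch_labels = [0, 1, 0, 2]

margins = [confidence_margin(p, y) for p, y in zip(batch_probs, batch_labels)]
threshold = batch_quantile(margins, alpha=0.5)  # alpha is a hypothetical choice

# Samples at or above the batch threshold are kept; the rest are treated as
# likely mislabeled in this simplified illustration.
keep = [m >= threshold for m in margins]
```

In this toy batch the third sample has a negative margin, so it falls below the threshold and is the only one suppressed.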

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

2604.06468

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Washington > Whitman County > Pullman (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(2 more...)

Geometric Calibration and Neutral Zones for Uncertainty-Aware Multi-Class Classification

Das, Soumojit, Dasgupta, Nairanjana, Dutta, Prashanta

arXiv.org Machine Learning, Dec-2-2025

Modern artificial intelligence systems make critical decisions yet often fail silently when uncertain: even well-calibrated models provide no mechanism to identify which specific predictions are unreliable. We develop a geometric framework addressing both calibration and instance-level uncertainty quantification for neural network probability outputs. Treating probability vectors as points on the $(c-1)$-dimensional probability simplex equipped with the Fisher-Rao metric, we construct: (i) Additive Log-Ratio (ALR) calibration maps that reduce exactly to Platt scaling for binary problems while extending naturally to multi-class settings, and (ii) geometric reliability scores that translate calibrated probabilities into actionable uncertainty measures, enabling principled deferral of ambiguous predictions to human review. Theoretical contributions include: consistency of the calibration estimator at rate $O_p(n^{-1/2})$ via M-estimation theory (Theorem 1), and tight concentration bounds for reliability scores with explicit sub-Gaussian parameters enabling sample size calculations for validation set design (Theorem 2). We conjecture Neyman-Pearson optimality of our neutral zone construction based on connections to Bhattacharyya coefficients. Empirical validation on Adeno-Associated Virus classification demonstrates that the two-stage framework captures 72.5% of errors while deferring 34.5% of samples, reducing automated decision error rates from 16.8% to 6.9%. Notably, calibration alone yields marginal accuracy gains; the operational benefit arises primarily from the reliability scoring mechanism, which applies to any well-calibrated probability output. This work bridges information geometry and statistical learning, offering formal guarantees for uncertainty-aware classification in applications requiring rigorous validation.
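The claim that ALR calibration reduces to Platt scaling in the binary case can be checked with a short worked derivation. The ALR transform below is the standard one; the affine form of the calibration map (parameters $a$, $b$) is an assumption made for illustration, since the abstract does not spell out the map.

```latex
% ALR transform of p = (p_1, ..., p_c), using the last class as reference:
\[
  z_i = \log\frac{p_i}{p_c}, \qquad i = 1, \dots, c-1 .
\]
% Binary case (c = 2): the single ALR coordinate is the logit of p_1.
\[
  z = \log\frac{p_1}{1 - p_1} = \operatorname{logit}(p_1) .
\]
% Assumed affine calibration map in ALR coordinates, then the inverse ALR:
\[
  z' = a z + b, \qquad
  p_1' = \frac{e^{z'}}{1 + e^{z'}} = \sigma\bigl(a \operatorname{logit}(p_1) + b\bigr) .
\]
% This is Platt scaling applied to the logit score, with slope a and intercept b.
```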

calibration, prediction, reliability score, (15 more...)

arXiv.org Machine Learning

2511.2096

Country:

North America > United States > Washington > Whitman County > Pullman (0.04)
North America > United States > New York (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Diagnostic Medicine (0.68)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Direct Prediction Set Minimization via Bilevel Conformal Classifier Training

Shi, Yuanjie, Shahrokhi, Hooman, Jia, Xuesong, Chen, Xiongzhi, Doppa, Janardhan Rao, Yan, Yan

arXiv.org Machine Learning, Jun-10-2025

Conformal prediction (CP) is a promising uncertainty quantification framework that works as a wrapper around a black-box classifier to construct prediction sets (i.e., subsets of candidate classes) with provable guarantees. However, standard calibration methods for CP tend to produce large prediction sets, which makes them less useful in practice. This paper considers the problem of integrating conformal principles into the training process of deep classifiers to directly minimize the size of prediction sets. We formulate conformal training as a bilevel optimization problem and propose the Direct Prediction Set Minimization (DPSM) algorithm to solve it. The key insight behind DPSM is to minimize a measure of the prediction set size (upper level) that is conditioned on the learned quantile of conformity scores (lower level). We show that DPSM has a learning bound of $O(1/\sqrt{n})$ (with $n$ training samples), while prior conformal training methods based on stochastic approximation of the quantile have a bound of $\Omega(1/s)$ (with batch size $s$ and typically $s \ll \sqrt{n}$). Experiments on various benchmark datasets and deep models show that DPSM significantly outperforms the best prior conformal training baseline, with a 20.46% reduction in prediction set size, and validate our theory.
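For readers unfamiliar with the wrapper that the abstract above builds on, the following is a minimal sketch of standard split conformal prediction for classification, not the DPSM training algorithm. The score function (one minus the true-class probability), the toy calibration values, and all function names are illustrative assumptions.

```python
import math

def conformal_threshold(cal_true_probs, alpha):
    """Calibrate a score threshold from held-out true-class probabilities.

    Nonconformity score: 1 - probability assigned to the true class.
    Uses the usual finite-sample rank ceil((n + 1) * (1 - alpha)).
    """
    scores = sorted(1.0 - p for p in cal_true_probs)
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return math.inf  # too few calibration points: every class is included
    return scores[rank - 1]

def prediction_set(probs, qhat):
    """All classes whose nonconformity score is within the calibrated threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= qhat]

# Toy calibration data: model probability of the true class on 9 held-out samples.
cal_true_probs = [0.9, 0.3, 0.6, 0.7, 0.95, 0.5, 0.85, 0.2, 0.65]
qhat = conformal_threshold(cal_true_probs, alpha=0.2)

ambiguous = prediction_set([0.5, 0.35, 0.15], qhat)   # two plausible classes
confident = prediction_set([0.9, 0.05, 0.05], qhat)   # a single class
```

The set grows when the classifier is unsure and shrinks when it is confident; methods in this line of work aim to make such sets smaller at the same coverage level.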

artificial intelligence, machine learning, prediction, (15 more...)

arXiv.org Machine Learning

2506.06599

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Washington > Whitman County > Pullman (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

AIMI: Leveraging Future Knowledge and Personalization in Sparse Event Forecasting for Treatment Adherence

Mamun, Abdullah, Cook, Diane J., Ghasemzadeh, Hassan

arXiv.org Artificial Intelligence, Mar-20-2025

Adherence to prescribed treatments is crucial for individuals with chronic conditions to avoid costly or adverse health outcomes. For certain patient groups, intensive lifestyle interventions are vital for enhancing medication adherence. Accurate forecasting of treatment adherence can open pathways to developing an on-demand intervention tool, enabling timely and personalized support. With the increasing popularity of smartphones and wearables, it is now easier than ever to develop and deploy smart activity monitoring systems. However, effective forecasting systems for treatment adherence based on wearable sensors are still not widely available. We close this gap by proposing Adherence Forecasting and Intervention with Machine Intelligence (AIMI). AIMI is a knowledge-guided adherence forecasting system that leverages smartphone sensors and previous medication history to estimate the likelihood of forgetting to take a prescribed medication. A user study was conducted with 27 participants who took daily medications to manage their cardiovascular diseases. We designed and developed CNN- and LSTM-based forecasting models with various combinations of input features and found that LSTM models can forecast medication adherence with an accuracy of 0.932 and an F1 score of 0.936. Moreover, through a series of ablation studies involving convolutional and recurrent neural network architectures, we demonstrate that leveraging knowledge about the future and personalized training enhances the accuracy of medication adherence forecasting. Code available: https://github.com/ab9mamun/AIMI.

artificial intelligence, machine learning, participant, (19 more...)

arXiv.org Artificial Intelligence

2503.16091

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)
North America > United States > Washington > Whitman County > Pullman (0.04)
Europe > United Kingdom (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Consumer Health (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Learning Surrogates for Offline Black-Box Optimization via Gradient Matching

Hoang, Minh, Fadhel, Azza, Deshwal, Aryan, Doppa, Janardhan Rao, Hoang, Trong Nghia

arXiv.org Artificial Intelligence, Feb-26-2025

Offline design optimization problems arise in numerous science and engineering applications, including material and chemical design, where expensive online experimentation necessitates the use of in silico surrogate functions to predict and maximize the target objective over candidate designs. Although these surrogates can be learned from offline data, their predictions are often inaccurate outside the offline data regime. This challenge raises a fundamental question about the impact of an imperfect surrogate model on the performance gap between its optima and the true optima, and to what extent the performance loss can be mitigated. Although prior work developed methods to improve the robustness of surrogate models and their associated optimization processes, a provably quantifiable relationship between an imperfect surrogate and the corresponding performance gap, as well as whether prior methods directly address it, remain elusive. To shed light on this important question, we present a theoretical framework to understand offline black-box optimization by explicitly bounding the optimization quality based on how well the surrogate matches the latent gradient field that underlies the offline data. Inspired by our theoretical analysis, we propose a principled black-box gradient matching algorithm to create effective surrogate models for offline optimization, improving over prior approaches on various real-world benchmarks.

algorithm, match-opt, target function, (13 more...)

arXiv.org Artificial Intelligence

2503.01883

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Washington > Whitman County > Pullman (0.04)
North America > United States > New Jersey (0.04)

Genre: Research Report (1.00)

Industry: Transportation > Air (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Provably-Stable Neural Network-Based Control of Nonlinear Systems

Li, Anran, Swensen, John P., Hosseinzadeh, Mehdi

arXiv.org Artificial Intelligence, Jan-31-2025

In recent years, Neural Networks (NNs) have been employed to control nonlinear systems due to their potential capability in dealing with situations that might be difficult for conventional nonlinear control schemes. However, to the best of our knowledge, the current literature on NN-based control lacks theoretical guarantees for stability and tracking performance. This precludes the application of NN-based control schemes to systems where stringent stability and performance guarantees are required. To address this gap, this paper proposes a systematic and comprehensive methodology to design provably-stable NN-based control schemes for affine nonlinear systems. Rigorous analysis is provided to show that the proposed approach guarantees stability of the closed-loop system with the NN in the loop. Also, it is shown that the resulting NN-based control scheme ensures that system states asymptotically converge to a neighborhood around the desired equilibrium point, with a tunable proximity threshold. The proposed methodology is validated and evaluated via simulation studies on an inverted pendulum and experimental studies on a Parrot Bebop 2 drone.

artificial intelligence, control scheme, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.engappai.2024.109252

2502.00248

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Washington > Whitman County > Pullman (0.04)
(12 more...)

Genre: Research Report > Experimental Study (0.48)

Industry: Energy (0.52)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

A Guaranteed-Stable Neural Network Approach for Optimal Control of Nonlinear Systems

Li, Anran, Swensen, John P., Hosseinzadeh, Mehdi

arXiv.org Artificial Intelligence, Jan-28-2025

A promising approach to optimal control of nonlinear systems involves iteratively linearizing the system and solving an optimization problem at each time instant to determine the optimal control input. Since this approach relies on online optimization, it can be computationally expensive, and thus unrealistic for systems with limited computing resources. One potential solution to this issue is to incorporate a Neural Network (NN) into the control loop to emulate the behavior of the optimal control scheme. Ensuring stability and reference tracking in the resulting NN-based closed-loop system requires modifications to the primary optimization problem. These modifications often introduce non-convexity and nonlinearity with respect to the decision variables, which may surpass the capabilities of existing solvers and complicate the generation of the training dataset. To address this issue, this paper develops a Neural Optimization Machine (NOM) to solve the resulting optimization problems. The central concept of a NOM is to transform the optimization challenges into the problem of training a NN. Rigorous proofs demonstrate that when a NN trained on data generated by the NOM is used in the control loop, all signals remain bounded and the system states asymptotically converge to a neighborhood around the desired equilibrium point, with a tunable proximity threshold. Simulation and experimental studies are provided to illustrate the effectiveness of the proposed methodology.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Artificial Intelligence

2501.17333

Country:

North America > United States > Washington > Whitman County > Pullman (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(10 more...)

Genre: Research Report > Promising Solution (0.54)

Industry: Energy (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Safe and Efficient Robot Action Planning in the Presence of Unconcerned Humans

Amiri, Mohsen, Hosseinzadeh, Mehdi

arXiv.org Artificial Intelligence, Jan-22-2025

This paper proposes a robot action planning scheme that provides an efficient and probabilistically safe plan for a robot interacting with an unconcerned human, that is, someone who is either unaware of the robot's presence or unwilling to engage in ensuring safety. The proposed scheme is predictive, meaning that the robot is required to predict human actions over a finite future horizon; such predictions are often inaccurate in real-world scenarios. One possible approach to reducing the uncertainties is to provide the robot with the capability of reasoning about the human's awareness of potential dangers. This paper shows that a binary variable, the so-called danger awareness coefficient, makes it possible to differentiate between concerned and unconcerned humans, and provides a learning algorithm to determine this coefficient by observing human actions. Moreover, this paper argues that humans rely on predictions of other agents' future actions (including those of robots in human-robot interaction) in their decision-making. It also shows that ignoring this aspect when predicting a human's future actions can significantly degrade the efficiency of the interaction, causing agents to deviate from their optimal paths. The proposed robot action planning scheme is verified and validated via extensive simulation and experimental studies on a LoCoBot WidowX-250.

artificial intelligence, machine learning, robot, (18 more...)

arXiv.org Artificial Intelligence

2501.13203

Country:

Europe > Italy (0.04)
Asia > China > Hong Kong (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(16 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (0.46)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.93)

Atleus: Accelerating Transformers on the Edge Enabled by 3D Heterogeneous Manycore Architectures

Dhingra, Pratyush, Doppa, Janardhan Rao, Pande, Partha Pratim

arXiv.org Artificial Intelligence, Jan-16-2025

Transformer architectures have become the standard neural network model for various machine learning applications including natural language processing and computer vision. However, the compute and memory requirements introduced by transformer models make them challenging to adopt for edge applications. Furthermore, fine-tuning pre-trained transformers (e.g., foundation models) is a common task to enhance the model's predictive performance on specific tasks/applications. Existing transformer accelerators are oblivious to complexities introduced by fine-tuning. In this paper, we propose the design of a three-dimensional (3D) heterogeneous architecture referred to as Atleus that incorporates heterogeneous computing resources specifically optimized to accelerate transformer models for the dual purposes of fine-tuning and inference. Specifically, Atleus utilizes non-volatile memory and systolic arrays for accelerating transformer computational kernels using an integrated 3D platform. Moreover, we design a suitable NoC to achieve high performance and energy efficiency. Finally, Atleus adopts an effective quantization scheme to support model compression. Experimental results demonstrate that Atleus outperforms the existing state of the art by up to 56x and 64.5x in terms of performance and energy efficiency, respectively.

architecture, atleus, computation, (15 more...)

arXiv.org Artificial Intelligence

2501.09588

Country:

Asia > Middle East > Oman > Al Wusta Governorate > Haima (0.06)
North America > United States > Washington > Whitman County > Pullman (0.04)
North America > United States > Oregon > Benton County > Corvallis (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Power-Efficient Actuation for Insect-Scale Autonomous Underwater Vehicles

Longwell, Cody R., Trygstad, Conor K., Perez-Arancibia, Nestor O.

arXiv.org Artificial Intelligence, Nov-26-2024

We present a new evolution of the Very Little Eel-Inspired roBot, the VLEIBot++, a 900-mg swimmer driven by two 10-mg bare high-work density (HWD) actuators, whose functionality is based on the use of shape-memory alloy (SMA) wires. An actuator of this type consumes an average power of about 40 mW during in-air operation. We integrated onboard power and computation into the VLEIBot++ using a custom-built printed circuit board (PCB) and an 11-mAh 3.7-V 507-mg single-cell lithium-ion (Li-Ion) battery, which in conjunction enable autonomous swimming for about 20 min on a single charge. This robot can swim at speeds of up to 18.7 mm/s (0.46 Bl/s) and is the first subgram microswimmer with onboard power, actuation, and computation developed to date. Unfortunately, the approach employed to actuate VLEIBot++ prototypes is infeasible for underwater applications because a typical 10-mg bare SMA-based microactuator requires an average power on the order of 800 mW when operating underwater. To address this issue, we introduce a new 13-mg power-efficient high-performance SMA-based microactuator that can function with similar power requirements (approx. 80 mW on average) and actuation performance (approx. 3 mm at low frequencies) in air and water. This design is based on the use of a sealed flexible air-capsule that encloses the SMA wires that drive the microactuator with the purpose of passively controlling the heat-transfer rate of the thermal system. Furthermore, this new power-efficient encapsulated actuator requires low voltages of excitation (3 to 4 V) and simple power electronics to function. The breakthroughs presented in this paper represent a path towards the creation of insect-scale autonomous underwater vehicles (AUVs).

actuator, artificial intelligence, microactuator, (15 more...)

arXiv.org Artificial Intelligence

2411.18001

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
(8 more...)

Genre: Research Report (0.82)

Industry: Energy > Energy Storage (0.34)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.84)
