AITopics | arXiv.org Machine Learning

Plotting

arXiv.org Machine Learning

Reliable algorithm selection for machine learning-guided design

arXiv.org Machine LearningMar-26-2025

Algorithms for machine learning-guided design, or design algorithms, use machine learning-based predictions to propose novel objects with desired property values. Given a new design task -- for example, to design novel proteins with high binding affinity to a therapeutic target -- one must choose a design algorithm and specify any hyperparameters and predictive and/or generative models involved. How can these decisions be made such that the resulting designs are successful? This paper proposes a method for design algorithm selection, which aims to select design algorithms that will produce a distribution of design labels satisfying a user-specified success criterion -- for example, that at least ten percent of designs' labels exceed a threshold. It does so by combining designs' predicted property values with held-out labeled data to reliably forecast characteristics of the label distributions produced by different design algorithms, building upon techniques from prediction-powered inference. The method is guaranteed with high probability to return design algorithms that yield successful label distributions (or the null set if none exist), if the density ratios between the design and labeled data distributions are known. We demonstrate the method's effectiveness in simulated protein and RNA design tasks, in settings with either known or estimated density ratios.

artificial intelligence, configuration, machine learning, (15 more...)

arXiv.org Machine Learning

2503.20767

Country: North America > United States (0.67)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Graph-Based Uncertainty-Aware Self-Training with Stochastic Node Labeling

Liu, Tom, Wu, Anna, Li, Chao

arXiv.org Machine LearningMar-26-2025

Self-training has become a popular semi-supervised learning technique for leveraging unlabeled data. However, the over-confidence of pseudo-labels remains a key challenge. In this paper, we propose a novel \emph{graph-based uncertainty-aware self-training} (GUST) framework to combat over-confidence in node classification. Drawing inspiration from the uncertainty integration idea introduced by Wang \emph{et al.}~\cite{wang2024uncertainty}, our method largely diverges from previous self-training approaches by focusing on \emph{stochastic node labeling} grounded in the graph topology. Specifically, we deploy a Bayesian-inspired module to estimate node-level uncertainty, incorporate these estimates into the pseudo-label generation process via an expectation-maximization (EM)-like step, and iteratively update both node embeddings and adjacency-based transformations. Experimental results on several benchmark graph datasets demonstrate that our GUST framework achieves state-of-the-art performance, especially in settings where labeled data is extremely sparse.

artificial intelligence, machine learning, node, (18 more...)

arXiv.org Machine Learning

2503.22745

Country: Asia (0.29)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

Add feedback

SMT-EX: An Explainable Surrogate Modeling Toolbox for Mixed-Variables Design Exploration

Robani, Mohammad Daffa, Saves, Paul, Palar, Pramudita Satria, Zuhal, Lavi Rizki, Morlier, oseph

arXiv.org Machine LearningMar-25-2025

Surrogate models are of high interest for many engineering applications, serving as cheap-to-evaluate time-efficient approximations of black-box functions to help engineers and practitioners make decisions and understand complex systems. As such, the need for explainability methods is rising and many studies have been performed to facilitate knowledge discovery from surrogate models. To respond to these enquiries, this paper introduces SMT-EX, an enhancement of the open-source Python Surrogate Modeling Toolbox (SMT) that integrates explainability techniques into a state-of-the-art surrogate modelling framework. More precisely, SMT-EX includes three key explainability methods: Shapley Additive Explanations, Partial Dependence Plot, and Individual Conditional Expectations. A peculiar explainability dependency of SMT has been developed for such purpose that can be easily activated once the surrogate model is built, offering a user-friendly and efficient tool for swift insight extraction. The effectiveness of SMT-EX is showcased through two test cases. The first case is a 10-variable wing weight problem with purely continuous variables and the second one is a 3-variable mixed-categorical cantilever beam bending problem. Relying on SMT-EX analyses for these problems, we demonstrate its versatility in addressing a diverse range of problem characteristics. SMT-Explainability is freely available on Github: https://github.com/SMTorg/smt-explainability .

data mining, machine learning, prediction, (18 more...)

arXiv.org Machine Learning

doi: 10.2514/6.2025-0777

2503.19496

Country: North America > United States (0.67)

Genre: Research Report (0.50)

Industry:

Transportation > Air (0.88)
Aerospace & Defense (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Debiasing Kernel-Based Generative Models

Qin, Tian, Huang, Wei-Min

arXiv.org Machine LearningMar-25-2025

We propose a novel two-stage framework of generative models named Debiasing Kernel-Based Generative Models (DKGM) with the insights from kernel density estimation (KDE) and stochastic approximation. In the first stage of DKGM, we employ KDE to bypass the obstacles in estimating the density of data without losing too much image quality. One characteristic of KDE is oversmoothing, which makes the generated image blurry. Therefore, in the second stage, we formulate the process of reducing the blurriness of images as a statistical debiasing problem and develop a novel iterative algorithm to improve image quality, which is inspired by the stochastic approximation. Extensive experiments illustrate that the image quality of DKGM on CIFAR10 is comparable to state-of-the-art models such as diffusion models and GAN models. The performance of DKGM on CelebA 128x128 and LSUN (Church) 128x128 is also competitive. We conduct extra experiments to exploit how the bandwidth in KDE affects the sample diversity and debiasing effect of DKGM. The connections between DKGM and score-based models are also discussed.

dkgm, machine learning, natural language, (15 more...)

arXiv.org Machine Learning

2503.20825

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)

Add feedback

Interpretable Deep Regression Models with Interval-Censored Failure Time Data

Yuan, Changhui, Zhao, Shishun, Li, Shuwei, Song, Xinyuan, Chen, Zhao

arXiv.org Machine LearningMar-25-2025

Deep neural networks (DNNs) have become powerful tools for modeling complex data structures through sequentially integrating simple functions in each hidden layer. In survival analysis, recent advances of DNNs primarily focus on enhancing model capabilities, especially in exploring nonlinear covariate effects under right censoring. However, deep learning methods for interval-censored data, where the unobservable failure time is only known to lie in an interval, remain underexplored and limited to specific data type or model. This work proposes a general regression framework for interval-censored data with a broad class of partially linear transformation models, where key covariate effects are modeled parametrically while nonlinear effects of nuisance multi-modal covariates are approximated via DNNs, balancing interpretability and flexibility. We employ sieve maximum likelihood estimation by leveraging monotone splines to approximate the cumulative baseline hazard function. To ensure reliable and tractable estimation, we develop an EM algorithm incorporating stochastic gradient descent. We establish the asymptotic properties of parameter estimators and show that the DNN estimator achieves minimax-optimal convergence. Extensive simulations demonstrate superior estimation and prediction accuracy over state-of-the-art methods. Applying our method to the Alzheimer's Disease Neuroimaging Initiative dataset yields novel insights and improved predictive performance compared to traditional approaches.

artificial intelligence, interpretable deep regression model, machine learning, (3 more...)

arXiv.org Machine Learning

2503.19763

Genre: Research Report (0.69)

Industry:

Law > Civil Rights & Constitutional Law (0.73)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.53)

Add feedback

Causal Bayesian Optimization with Unknown Graphs

Durand, Jean, Annadani, Yashas, Bauer, Stefan, Parbhoo, Sonali

arXiv.org Machine LearningMar-25-2025

Causal Bayesian Optimization (CBO) is a methodology designed to optimize an outcome variable by leveraging known causal relationships through targeted interventions. Traditional CBO methods require a fully and accurately specified causal graph, which is a limitation in many real-world scenarios where such graphs are unknown. To address this, we propose a new method for the CBO framework that operates without prior knowledge of the causal graph. Consistent with causal bandit theory, we demonstrate through theoretical analysis and that focusing on the direct causal parents of the target variable is sufficient for optimization, and provide empirical validation in the context of CBO. Furthermore we introduce a new method that learns a Bayesian posterior over the direct parents of the target variable. This allows us to optimize the outcome variable while simultaneously learning the causal structure. Our contributions include a derivation of the closed-form posterior distribution for the linear case. In the nonlinear case where the posterior is not tractable, we present a Gaussian Process (GP) approximation that still enables CBO by inferring the parents of the outcome variable. The proposed method performs competitively with existing benchmarks and scales well to larger graphs, making it a practical tool for real-world applications where causal information is incomplete.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

2503.19554

Country: Europe (0.46)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Capacity-Constrained Online Learning with Delays: Scheduling Frameworks and Regret Trade-offs

Ryabchenko, Alexander, Attias, Idan, Roy, Daniel M.

arXiv.org Machine LearningMar-25-2025

We study online learning with oblivious losses and delays under a novel ``capacity constraint'' that limits how many past rounds can be tracked simultaneously for delayed feedback. Under ``clairvoyance'' (i.e., delay durations are revealed upfront each round) and/or ``preemptibility'' (i.e., we have ability to stop tracking previously chosen round feedback), we establish matching upper and lower bounds (up to logarithmic terms) on achievable regret, characterizing the ``optimal capacity'' needed to match the minimax rates of classical delayed online learning, which implicitly assume unlimited capacity. Our algorithms achieve minimax-optimal regret across all capacity levels, with performance gracefully degrading under suboptimal capacity. For $K$ actions and total delay $D$ over $T$ rounds, under clairvoyance and assuming capacity $C = \Omega(\log(T))$, we achieve regret $\widetilde{\Theta}(\sqrt{TK + DK/C + D\log(K)})$ for bandits and $\widetilde{\Theta}(\sqrt{(D+T)\log(K)})$ for full-information feedback. When replacing clairvoyance with preemptibility, we require a known maximum delay bound $d_{\max}$, adding $\smash{\widetilde{O}(d_{\max})}$ to the regret. For fixed delays $d$ (i.e., $D=Td$), the minimax regret is $\Theta\bigl(\sqrt{TK(1+d/C)+Td\log(K)}\bigr)$ and the optimal capacity is $\Theta(\min\{K/\log(K),d\}\bigr)$ in the bandit setting, while in the full-information setting, the minimax regret is $\Theta\bigl(\sqrt{T(d+1)\log(K)}\bigr)$ and the optimal capacity is $\Theta(1)$. For round-dependent and fixed delays, our upper bounds are achieved using novel scheduling policies, based on Pareto-distributed proxy delays and batching techniques. Crucially, our work unifies delayed bandits, label-efficient learning, and online scheduling frameworks, demonstrating that robust online learning under delayed feedback is possible with surprisingly modest tracking capacity.

artificial intelligence, delay scheduling, regime, (16 more...)

arXiv.org Machine Learning

2503.19856

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

Bayesian Optimization of a Lightweight and Accurate Neural Network for Aerodynamic Performance Prediction

Shihua, James M., Saves, Paul, Liem, Rhea P., Morlier, Joseph

arXiv.org Machine LearningMar-25-2025

Ensuring high accuracy and efficiency of predictive models is paramount in the aerospace industry, particularly in the context of multidisciplinary design and optimization processes. These processes often require numerous evaluations of complex objective functions, which can be computationally expensive and time-consuming. To build efficient and accurate predictive models, we propose a new approach that leverages Bayesian Optimization (BO) to optimize the hyper-parameters of a lightweight and accurate Neural Network (NN) for aerodynamic performance prediction. To clearly describe the interplay between design variables, hierarchical and categorical kernels are used in the BO formulation. We demonstrate the efficiency of our approach through two comprehensive case studies, where the optimized NN significantly outperforms baseline models and other publicly available NNs in terms of accuracy and parameter efficiency. For the drag coefficient prediction task, the Mean Absolute Percentage Error (MAPE) of our optimized model drops from 0.1433\% to 0.0163\%, which is nearly an order of magnitude improvement over the baseline model. Additionally, our model achieves a MAPE of 0.82\% on a benchmark aircraft self-noise prediction problem, significantly outperforming existing models (where their MAPE values are around 2 to 3\%) while requiring less computational resources. The results highlight the potential of our framework to enhance the scalability and performance of NNs in large-scale MDO problems, offering a promising solution for the aerospace industry.

data mining, machine learning, optimization, (20 more...)

arXiv.org Machine Learning

2503.19479

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (1.00)

Industry:

Aerospace & Defense (1.00)
Transportation > Air (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(2 more...)

Add feedback

Causal invariant geographic network representations with feature and structural distribution shifts

Wang, Yuhan, He, Silu, Luo, Qinyao, Yuan, Hongyuan, Zhao, Ling, Zhu, Jiawei, Li, Haifeng

arXiv.org Machine LearningMar-25-2025

The existing methods learn geographic network representations through deep graph neural networks (GNNs) based on the i.i.d. assumption. However, the spatial heterogeneity and temporal dynamics of geographic data make the out-of-distribution (OOD) generalisation problem particularly salient. The latter are particularly sensitive to distribution shifts (feature and structural shifts) between testing and training data and are the main causes of the OOD generalisation problem. Spurious correlations are present between invariant and background representations due to selection biases and environmental effects, resulting in the model extremes being more likely to learn background representations. The existing approaches focus on background representation changes that are determined by shifts in the feature distributions of nodes in the training and test data while ignoring changes in the proportional distributions of heterogeneous and homogeneous neighbour nodes, which we refer to as structural distribution shifts. We propose a feature-structure mixed invariant representation learning (FSM-IRL) model that accounts for both feature distribution shifts and structural distribution shifts. To address structural distribution shifts, we introduce a sampling method based on causal attention, encouraging the model to identify nodes possessing strong causal relationships with labels or nodes that are more similar to the target node. Inspired by the Hilbert-Schmidt independence criterion, we implement a reweighting strategy to maximise the orthogonality of the node representations, thereby mitigating the spurious correlations among the node representations and suppressing the learning of background representations. Our experiments demonstrate that FSM-IRL exhibits strong learning capabilities on both geographic and social network datasets in OOD scenarios.

artificial intelligence, machine learning, representation, (17 more...)

arXiv.org Machine Learning

doi: 10.1016/j.future.2025.107814

2503.19382

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology (0.49)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Stop Walking in Circles! Bailing Out Early in Projected Gradient Descent

Doldo, Philip, Everett, Derek, Khanna, Amol, Nguyen, Andre T, Raff, Edward

arXiv.org Machine LearningMar-25-2025

Projected Gradient Descent (PGD) under the $L_\infty$ ball has become one of the defacto methods used in adversarial robustness evaluation for computer vision (CV) due to its reliability and efficacy, making a strong and easy-to-implement iterative baseline. However, PGD is computationally demanding to apply, especially when using thousands of iterations is the current best-practice recommendation to generate an adversarial example for a single image. In this work, we introduce a simple novel method for early termination of PGD based on cycle detection by exploiting the geometry of how PGD is implemented in practice and show that it can produce large speedup factors while providing the \emph{exact} same estimate of model robustness as standard PGD. This method substantially speeds up PGD without sacrificing any attack strength, enabling evaluations of robustness that were previously computationally intractable.

artificial intelligence, iteration, machine learning, (18 more...)

arXiv.org Machine Learning

2503.19347

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (0.95)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback