AITopics

2507.16999

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)

arXiv.org Machine LearningJul-24-2025

Sequential Bayesian Design for Efficient Surrogate Construction in the Inversion of Darcy Flows

Wang, Hongji, Wang, Hongqiao, Ying, Jinyong, Zhou, Qingping

Inverse problems governed by partial differential equations (PDEs) play a crucial role in various fields, including computational science, image processing, and engineering. Particularly, Darcy flow equation is a fundamental equation in fluid mechanics, which plays a crucial role in understanding fluid flow through porous media. Bayesian methods provide an effective approach for solving PDEs inverse problems, while their numerical implementation requires numerous evaluations of computationally expensive forward solvers. Therefore, the adoption of surrogate models with lower computational costs is essential. However, constructing a globally accurate surrogate model for high-dimensional complex problems demands high model capacity and large amounts of data. To address this challenge, this study proposes an efficient locally accurate surrogate that focuses on the high-probability regions of the true likelihood in inverse problems, with relatively low model complexity and few training data requirements. Additionally, we introduce a sequential Bayesian design strategy to acquire the proposed surrogate since the high-probability region of the likelihood is unknown. The strategy treats the posterior evolution process of sequential Bayesian design as a Gaussian process, enabling algorithmic acceleration through one-step ahead prior. The complete algorithmic framework is referred to as Sequential Bayesian design for locally accurate surrogate (SBD-LAS). Finally, three experiments based the Darcy flow equation demonstrate the advantages of the proposed method in terms of both inversion accuracy and computational speed.

artificial intelligence, machine learning, surrogate model, (18 more...)

2507.17713

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Hayakawa, Takashi, Asai, Satoshi

Debiased maximum-likelihood estimators for hazard ratios under machine-learning adjustment

arXiv.org Machine LearningJul-24-2025

Previous studies have shown that hazard ratios between treatment groups estimated with the Cox model are uninterpretable because the indefinite baseline hazard of the model fails to identify temporal change in the risk set composition due to treatment assignment and unobserved factors among multiple, contradictory scenarios. To alleviate this problem, especially in studies based on observational data with uncontrolled dynamic treatment and real-time measurement of many covariates, we propose abandoning the baseline hazard and using machine learning to explicitly model the change in the risk set with or without latent variables. For this framework, we clarify the context in which hazard ratios can be causally interpreted, and then develop a method based on Neyman orthogonality to compute debiased maximum-likelihood estimators of hazard ratios. Computing the constructed estimators is more efficient than computing those based on weighted regression with marginal structural Cox models. Numerical simulations confirm that the proposed method identifies the ground truth with minimal bias. These results lay the foundation for developing a useful, alternative method for causal inference with uncontrolled, observational data in modern epidemiology.

artificial intelligence, estimator, machine learning, (18 more...)

2507.17686

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial IntelligenceJul-24-2025

Should Bias Always be Eliminated? A Principled Framework to Use Data Bias for OOD Generation

Li, Yan, Chen, Guangyi, Deng, Yunlong, Li, Zijian, Tang, Zeyu, Wu, Anpeng, Zhang, Kun

Most existing methods for adapting models to out-of-distribution (OOD) domains rely on invariant representation learning to eliminate the influence of biased features. However, should bias always be eliminated -- and if not, when should it be retained, and how can it be leveraged? To address these questions, we first present a theoretical analysis that explores the conditions under which biased features can be identified and effectively utilized. Building on this theoretical foundation, we introduce a novel framework that strategically leverages bias to complement invariant representations during inference. The framework comprises two key components that leverage bias in both direct and indirect ways: (1) using invariance as guidance to extract predictive ingredients from bias, and (2) exploiting identified bias to estimate the environmental condition and then use it to explore appropriate bias-aware predictors to alleviate environment gaps. We validate our approach through experiments on both synthetic datasets and standard domain generalization benchmarks. Results consistently demonstrate that our method outperforms existing approaches, underscoring its robustness and adaptability.

artificial intelligence, machine learning, prediction, (13 more...)

2507.17001

Country: Europe > Netherlands (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Pérez, Aritz, Echegoyen, Carlos, Santafé, Guzmán

Decentralized Federated Learning of Probabilistic Generative Classifiers

arXiv.org Artificial IntelligenceJul-24-2025

--Federated learning is a paradigm of increasing relevance in real world applications, aimed at building a global model across a network of heterogeneous users without requiring the sharing of private data. We focus on model learning over decentralized architectures, where users collaborate directly to update the global model without relying on a central server . In this context, the current paper proposes a novel approach to collaboratively learn probabilistic generative classifiers with a parametric form. The framework is composed by a communication network over a set of local nodes, each of one having its own local data, and a local updating rule. The proposal involves sharing local statistics with neighboring nodes, where each node aggregates the neighbors' information and iteratively learns its own local classifier, which progressively converges to a global model. Extensive experiments demonstrate that the algorithm consistently converges to a globally competitive model across a wide range of network topologies, network sizes, local dataset sizes, and extreme non-i.i.d. In recent years, federated learning (FL) [1], [2] has gained increasing attention from both the research community [3], [4] and private companies [5], [6], as it enables the development of machine learning models across multiple users without requiring data centralization. This design inherently offers a fundamental layer of privacy while reducing the costs associated with massive data storage. FL traditionally achieves this by using a user-server architecture, where users train local models and share updates with a central server that aggregates them to build a global model [7], [8]. In contrast, decentralized FL [4], [9], [10] eliminates the need for a central server by enabling users to communicate directly and collaboratively train machine learning models.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2507.17285

Country: Europe > Spain (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Education (0.94)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Maasch, Jacqueline, Neiswanger, Willie, Ermon, Stefano, Kuleshov, Volodymyr

Probabilistic Graphical Models: A Concise Tutorial

arXiv.org Artificial IntelligenceJul-24-2025

Probabilistic graphical modeling is a branch of machine learning that uses probability distributions to describe the world, make predictions, and support decision-making under uncertainty. Underlying this modeling framework is an elegant body of theory that bridges two mathematical traditions: probability and graph theory. This framework provides compact yet expressive representations of joint probability distributions, yielding powerful generative models for probabilistic reasoning. This tutorial provides a concise introduction to the formalisms, methods, and applications of this modeling framework. After a review of basic probability and graph theory, we explore three dominant themes: (1) the representation of multivariate distributions in the intuitive visual language of graphs, (2) algorithms for learning model parameters and graphical structures from data, and (3) algorithms for inference, both exact and approximate.

artificial intelligence, machine learning, variational inference, (19 more...)

2507.17116

Country: North America > United States > California > Los Angeles County > Los Angeles (0.27)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education (1.00)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(5 more...)

arXiv.org Machine LearningJul-23-2025

A Distributional View of High Dimensional Optimization

Benning, Felix

This PhD thesis presents a distributional view of optimization in place of a worst-case perspective. We motivate this view with an investigation of the failure point of classical optimization. Subsequently we consider the optimization of a randomly drawn objective function. This is the setting of Bayesian Optimization. After a review of Bayesian optimization we outline how such a distributional view may explain predictable progress of optimization in high dimension. It further turns out that this distributional view provides insights into optimal step size control of gradient descent. To enable these results, we develop mathematical tools to deal with random input to random functions and a characterization of non-stationary isotropic covariance kernels. Finally, we outline how assumptions about the data, specifically exchangability, can lead to random objective functions in machine learning and analyze their landscape.

artificial intelligence, high-dimensional non-convex optimization, machine learning, (20 more...)

2507.16315

Genre:

Workflow (0.92)
Research Report > New Finding (0.45)
Instructional Material > Course Syllabus & Notes (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(3 more...)

arXiv.org Machine LearningJul-23-2025

The Joys of Categorical Conformal Prediction

Caprio, Michele

Conformal prediction (CP) is an Uncertainty Representation technique that delivers finite-sample calibrated prediction regions for any underlying Machine Learning model. Its status as an Uncertainty Quantification (UQ) tool, though, has remained conceptually opaque: While Conformal Prediction Regions (CPRs) give an ordinal representation of uncertainty (larger regions typically indicate higher uncertainty), they lack the capability to cardinally quantify it (twice as large regions do not imply twice the uncertainty). We adopt a category-theoretic approach to CP -- framing it as a morphism, embedded in a commuting diagram, of two newly-defined categories -- that brings us three joys. First, we show that -- under minimal assumptions -- CP is intrinsically a UQ mechanism, that is, its cardinal UQ capabilities are a structural feature of the method. Second, we demonstrate that CP bridges (and perhaps subsumes) the Bayesian, frequentist, and imprecise probabilistic approaches to predictive statistical reasoning. Finally, we show that a CPR is the image of a covariant functor. This observation is relevant to AI privacy: It implies that privacy noise added locally does not break the global coverage guarantee.

artificial intelligence, category, machine learning, (18 more...)

2507.04441

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
(2 more...)

Genre:

Research Report (0.81)
Instructional Material > Course Syllabus & Notes (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

arXiv.org Machine LearningJul-23-2025

Structural Effect and Spectral Enhancement of High-Dimensional Regularized Linear Discriminant Analysis

Zhang, Yonghan, Pu, Zhangni, Yan, Lu, Hu, Jiang

Regularized linear discriminant analysis (RLDA) is a widely used tool for classification and dimensionality reduction, but its performance in high-dimensional scenarios is inconsistent. Existing theoretical analyses of RLDA often lack clear insight into how data structure affects classification performance. To address this issue, we derive a non-asymptotic approximation of the misclassification rate and thus analyze the structural effect and structural adjustment strategies of RLDA. Based on this, we propose the Spectral Enhanced Discriminant Analysis (SEDA) algorithm, which optimizes the data structure by adjusting the spiked eigenvalues of the population covariance matrix. By developing a new theoretical result on eigenvectors in random matrix theory, we derive an asymptotic approximation on the misclassification rate of SEDA. The bias correction algorithm and parameter selection strategy are then obtained. Experiments on synthetic and real datasets show that SEDA achieves higher classification accuracy and dimensionality reduction compared to existing LDA methods.

artificial intelligence, machine learning, misclassification rate, (18 more...)

2507.16682

Country:

Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceJul-23-2025

Learning Text Styles: A Study on Transfer, Attribution, and Verification

Hu, Zhiqiang

This thesis advances the computational understanding and manipulation of text styles through three interconnected pillars: (1) Text Style Transfer (TST), which alters stylistic properties (e.g., sentiment, formality) while preserving content; (2)Authorship Attribution (AA), identifying the author of a text via stylistic fingerprints; and (3) Authorship V erification (A V), determining whether two texts share the same authorship. We address critical challenges in these areas by leveraging parameter-efficient adaptation of large language models (LLMs), contrastive disentanglement of stylistic features, and instruction-based fine-tuning for explainable verification. First, for TST, we conduct a comprehensive survey and reproducibility study of 19 state-of-the-art algorithms, establishing benchmarks across diverse datasets. Building on these insights, we introduce LLM-Adapters, a unified framework for parameter-efficient fine-tuning (PEFT) that enables cost-effective adaptation of LLMs for style-centric tasks. This culminates in Adapter-TST, a novel architecture that models multiple stylistic attributes (e.g., sentiment, tense) using lightweight neural adapters. Adapter-TST achieves superior performance in multi-attribute transfer and compositional editing while reducing computational costs by 80% compared to full fine-tuning. For AA, we propose ContrastDistAA, a contrastive learning framework that disentangles content and style features to address performance degradation under topic shifts. Our method advances both individual-level attribution and regional linguistic analysis, achieving state-of-the-art accuracy by isolating culturally influenced stylistic patterns.

large language model, machine learning, natural language, (20 more...)

2507.1653

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.27)

Genre:

Summary/Review (1.00)
Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
Education (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)