AITopics

2411.14679

Country: Asia (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Jang, Minguk, Chung, Hye Won

Label Distribution Shift-Aware Prediction Refinement for Test-Time Adaptation

arXiv.org Artificial IntelligenceNov-20-2024

Test-time adaptation (TTA) is an effective approach to mitigate performance degradation of trained models when encountering input distribution shifts at test time. However, existing TTA methods often suffer significant performance drops when facing additional class distribution shifts. We first analyze TTA methods under label distribution shifts and identify the presence of class-wise confusion patterns commonly observed across different covariate shifts. Based on this observation, we introduce label Distribution shift-Aware prediction Refinement for Test-time adaptation (DART), a novel TTA method that refines the predictions by focusing on class-wise confusion patterns. DART trains a prediction refinement module during an intermediate time by exposing it to several batches with diverse class distributions using the training dataset. This module is then used during test time to detect and correct class distribution shifts, significantly improving pseudo-label accuracy for test data. Our method exhibits 5-18% gains in accuracy under label distribution shifts on CIFAR-10C, without any performance degradation when there is no label distribution shift. Extensive experiments on CIFAR, PACS, OfficeHome, and ImageNet benchmarks demonstrate DART's ability to correct inaccurate predictions caused by test-time distribution shifts. This improvement leads to enhanced performance in existing TTA methods, making DART a valuable plug-in tool.

artificial intelligence, distribution shift, machine learning, (14 more...)

2411.15204

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Italy > Veneto > Venice (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

arXiv.org Machine LearningNov-20-2024

Nonlinear Assimilation with Score-based Sequential Langevin Sampling

Ding, Zhao, Duan, Chenguang, Jiao, Yuling, Yang, Jerry Zhijian, Yuan, Cheng, Zhang, Pingwen

This paper presents a novel approach for nonlinear assimilation called score-based sequential Langevin sampling (SSLS) within a recursive Bayesian framework. SSLS decomposes the assimilation process into a sequence of prediction and update steps, utilizing dynamic models for prediction and observation data for updating via score-based Langevin Monte Carlo. An annealing strategy is incorporated to enhance convergence and facilitate multi-modal sampling. The convergence of SSLS in TV-distance is analyzed under certain conditions, providing insights into error behavior related to hyper-parameters. Numerical examples demonstrate its outstanding performance in high-dimensional and nonlinear scenarios, as well as in situations with sparse or partial measurements. Furthermore, SSLS effectively quantifies the uncertainty associated with the estimated states, highlighting its potential for error calibration.

assimilation, inequality, nonlinear assimilation, (13 more...)

2411.13443

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Why you don't overfit, and don't need Bayes if you only train for one epoch

Aitchison, Laurence

Here, we show that in the data-rich setting where you only train on each datapoint once (or equivalently, you only train for one epoch), standard "maximum likelihood" training optimizes the true data generating process (DGP) loss, which is equivalent to the test loss. Further, we show that the Bayesian model average optimizes the same objective, albeit while taking the expectation over uncertainty induced by finite data. As standard maximum likelihood training in the single-epoch setting optimizes the same objective as Bayesian inference, we argue that we do not expect Bayesian inference to offer any advantages in terms of overfitting or calibration in these settings. This explains the diminishing importance of Bayes in areas such as LLMs, which are often trained with one (or very few) epochs.

artificial intelligence, arxiv preprint arxiv, machine learning, (11 more...)

2411.14478

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

The Game-Theoretic Symbiosis of Trust and AI in Networked Systems

Ge, Yunfei, Zhu, Quanyan

This chapter explores the symbiotic relationship between Artificial Intelligence (AI) and trust in networked systems, focusing on how these two elements reinforce each other in strategic cybersecurity contexts. Using a game-theoretic framework, this chapter presents approaches to trust evaluation, the strategic role of AI in cybersecurity, and governance frameworks that ensure responsible AI deployment. We investigate how trust, when dynamically managed through AI, can form a resilient security ecosystem. By examining trust as both an AI output and an AI requirement, this chapter sets the foundation for a positive feedback loop where AI enhances network security and the trust placed in AI systems fosters their adoption. The rapid development of network systems has been a catalyst for innovations such as 5G communications, edge computing, and network slicing [6], driving the transformation of Industry 4.0 [18] and introducing new services for critical infrastructures.

artificial intelligence, machine learning, real time system, (19 more...)

2411.12859

Country:

North America > United States > New York (0.04)
North America > United States > Washington > King County > Seattle (0.04)

Genre:

Summary/Review (0.54)
Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(4 more...)

C$^{2}$INet: Realizing Incremental Trajectory Prediction with Prior-Aware Continual Causal Intervention

Li, Xiaohe, Huang, Feilong, Fan, Zide, Mou, Fangli, Lin, Leilei, Hou, Yingyan, Wen, Lijie

Trajectory prediction for multi-agents in complex scenarios is crucial for applications like autonomous driving. However, existing methods often overlook environmental biases, which leads to poor generalization. Additionally, hardware constraints limit the use of large-scale data across environments, and continual learning settings exacerbate the challenge of catastrophic forgetting. To address these issues, we propose the Continual Causal Intervention (C$^{2}$INet) method for generalizable multi-agent trajectory prediction within a continual learning framework. Using variational inference, we align environment-related prior with posterior estimator of confounding factors in the latent space, thereby intervening in causal correlations that affect trajectory representation. Furthermore, we store optimal variational priors across various scenarios using a memory queue, ensuring continuous debiasing during incremental task training. The proposed C$^{2}$INet enhances adaptability to diverse tasks while preserving previous task information to prevent catastrophic forgetting. It also incorporates pruning strategies to mitigate overfitting. Comparative evaluations on three real and synthetic complex datasets against state-of-the-art methods demonstrate that our proposed method consistently achieves reliable prediction performance, effectively mitigating confounding factors unique to different scenarios. This highlights the practical value of our method for real-world applications.

dataset, prediction, trajectory prediction, (12 more...)

2411.12313

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Information Technology (0.48)
Transportation > Ground > Road (0.34)
Automobiles & Trucks (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
(2 more...)

Zhang, Honghua, Wang, Benjie, Arenas, Marcelo, Broeck, Guy Van den

Restructuring Tractable Probabilistic Circuits

Probabilistic circuits (PCs) is a unifying representation for probabilistic models that support tractable inference. Numerous applications of PCs like controllable text generation depend on the ability to efficiently multiply two circuits. Existing multiplication algorithms require that the circuits respect the same structure, i.e. variable scopes decomposes according to the same vtree. In this work, we propose and study the task of restructuring structured(-decomposable) PCs, that is, transforming a structured PC such that it conforms to a target vtree. We propose a generic approach for this problem and show that it leads to novel polynomial-time algorithms for multiplying circuits respecting different vtrees, as well as a practical depth-reduction algorithm that preserves structured decomposibility. Our work opens up new avenues for tractable PC inference, suggesting the possibility of training with less restrictive PC structures while enabling efficient inference by changing their structures at inference time.

algorithm, node, van den broeck, (15 more...)

2411.12256

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
South America > Chile (0.04)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)

arXiv.org Machine LearningNov-19-2024

Eliminating Ratio Bias for Gradient-based Simulated Parameter Estimation

Li, Zehao, Peng, Yijie

This article addresses the challenge of parameter calibration in stochastic models where the likelihood function is not analytically available. We propose a gradient-based simulated parameter estimation framework, leveraging a multi-time scale algorithm that tackles the issue of ratio bias in both maximum likelihood estimation and posterior density estimation problems. Additionally, we introduce a nested simulation optimization structure, providing theoretical analyses including strong convergence, asymptotic normality, convergence rate, and budget allocation strategies for the proposed algorithm. The framework is further extended to neural network training, offering a novel perspective on stochastic approximation in machine learning. Numerical experiments show that our algorithm can improve the estimation accuracy and save computational costs.

algorithm, li and peng, ratio bias, (10 more...)

2411.12995

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)

Cao, Lianghao, Chen, Joshua, Brennan, Michael, O'Leary-Roseberry, Thomas, Marzouk, Youssef, Ghattas, Omar

LazyDINO: Fast, scalable, and efficiently amortized Bayesian inversion via structure-exploiting and surrogate-driven measure transport

arXiv.org Machine LearningNov-19-2024

We present LazyDINO, a transport map variational inference method for fast, scalable, and efficiently amortized solutions of high-dimensional nonlinear Bayesian inverse problems with expensive parameter-to-observable (PtO) maps. Our method consists of an offline phase in which we construct a derivative-informed neural surrogate of the PtO map using joint samples of the PtO map and its Jacobian. During the online phase, when given observational data, we seek rapid posterior approximation using surrogate-driven training of a lazy map [Brennan et al., NeurIPS, (2020)], i.e., a structure-exploiting transport map with low-dimensional nonlinearity. The trained lazy map then produces approximate posterior samples or density evaluations. Our surrogate construction is optimized for amortized Bayesian inversion using lazy map variational inference. We show that (i) the derivative-based reduced basis architecture [O'Leary-Roseberry et al., Comput. Methods Appl. Mech. Eng., 388 (2022)] minimizes the upper bound on the expected error in surrogate posterior approximation, and (ii) the derivative-informed training formulation [O'Leary-Roseberry et al., J. Comput. Phys., 496 (2024)] minimizes the expected error due to surrogate-driven transport map optimization. Our numerical results demonstrate that LazyDINO is highly efficient in cost amortization for Bayesian inversion. We observe one to two orders of magnitude reduction of offline cost for accurate posterior approximation, compared to simulation-based amortized inference via conditional transport and conventional surrogate-driven transport. In particular, LazyDINO outperforms Laplace approximation consistently using fewer than 1000 offline samples, while other amortized inference methods struggle and sometimes fail at 16,000 offline samples.

approximation, lazydino, training sample, (17 more...)

2411.12726

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(3 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Energy (0.92)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Hickey, Jimmy, Williams, Jonathan P., Reich, Brian J., Hector, Emily C.

Multivariate and Online Transfer Learning with Uncertainty Quantification

arXiv.org Machine LearningNov-19-2024

Untreated periodontitis causes inflammation within the supporting tissue of the teeth and can ultimately lead to tooth loss. Modeling periodontal outcomes is beneficial as they are difficult and time consuming to measure, but disparities in representation between demographic groups must be considered. There may not be enough participants to build group specific models and it can be ineffective, and even dangerous, to apply a model to participants in an underrepresented group if demographic differences were not considered during training. We propose an extension to RECaST Bayesian transfer learning framework. Our method jointly models multivariate outcomes, exhibiting significant improvement over the previous univariate RECaST method. Further, we introduce an online approach to model sequential data sets. Negative transfer is mitigated to ensure that the information shared from the other demographic groups does not negatively impact the modeling of the underrepresented participants. The Bayesian framework naturally provides uncertainty quantification on predictions. Especially important in medical applications, our method does not share data between domains. We demonstrate the effectiveness of our method in both predictive performance and uncertainty quantification on simulated data and on a database of dental records from the HealthPartners Institute.

mahalanobis distance, posterior distribution, target data, (15 more...)

2411.12555

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Alaska (0.04)
North America > United States > Virginia (0.04)
(3 more...)

Genre:

Research Report (0.81)
Instructional Material > Online (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.68)
Health & Medicine > Therapeutic Area > Dental and Oral Health (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)