AITopics

2310.0057

Country:

North America > Canada > Ontario > Essex County > Windsor (0.04)
Europe > Russia > Northwestern Federal District > Leningrad Oblast > Saint Petersburg (0.04)
Asia > Russia (0.04)

Genre: Research Report > Promising Solution (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine LearningOct-1-2023

Performance-guaranteed regularization in maximum likelihood method: Gauge symmetry in Kullback -- Leibler divergence

Ichiki, Akihisa

The maximum likelihood method is the best-known method for estimating the probabilities behind the data. However, the conventional method obtains the probability model closest to the empirical distribution, resulting in overfitting. Then regularization methods prevent the model from being excessively close to the wrong probability, but little is known systematically about their performance. The idea of regularization is similar to error-correcting codes, which obtain optimal decoding by mixing suboptimal solutions with an incorrectly received code. The optimal decoding in error-correcting codes is achieved based on gauge symmetry. We propose a theoretically guaranteed regularization in the maximum likelihood method by focusing on a gauge symmetry in Kullback -- Leibler divergence. In our approach, we obtain the optimal model without the need to search for hyperparameters frequently appearing in regularization.

artificial intelligence, bayesian inference, machine learning, (15 more...)

2303.16721

Country:

Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)

Bhatt, Shrey, Gupta, Aishwarya, Rai, Piyush

Federated Learning with Uncertainty via Distilled Predictive Distributions

arXiv.org Machine LearningOct-1-2023

Most existing federated learning methods are unable to estimate model/predictive uncertainty since the client models are trained using the standard loss function minimization approach which ignores such uncertainties. In many situations, however, especially in limited data settings, it is beneficial to take into account the uncertainty in the model parameters at each client as it leads to more accurate predictions and also because reliable estimates of uncertainty can be used for tasks, such as out-of-distribution (OOD) detection, and sequential decision-making tasks, such as active learning. We present a framework for federated learning with uncertainty where, in each round, each client infers the posterior distribution over its parameters as well as the posterior predictive distribution (PPD), distills the PPD into a single deep neural network, and sends this network to the server. Unlike some of the recent Bayesian approaches to federated learning, our approach does not require sending the whole posterior distribution of the parameters from each client to the server but only the PPD in the distilled form as a deep neural network. In addition, when making predictions at test time, it does not require computationally expensive Monte-Carlo averaging over the posterior distribution because our approach always maintains the PPD in the form of a single deep neural network. Moreover, our approach does not make any restrictive assumptions, such as the form of the clients' posterior distributions, or of their PPDs. We evaluate our approach on classification in federated setting, as well as active learning and OOD detection in federated settings, on which our approach outperforms various existing federated learning baselines.

artificial intelligence, bayesian inference, machine learning, (20 more...)

2206.07562

Country:

North America > United States > Virginia (0.04)
Asia > India (0.04)
Asia > China > Ningxia Hui Autonomous Region > Yinchuan (0.04)

Genre: Research Report (1.00)

Industry: Education (0.73)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Islam, Upala Junaida, Paynabar, Kamran, Runger, George, Iquebal, Ashif Sikandar

Dynamic Exploration-Exploitation Trade-Off in Active Learning Regression with Bayesian Hierarchical Modeling

arXiv.org Machine LearningSep-30-2023

Active learning provides a framework to adaptively query the most informative experiments towards learning an unknown black-box function. Various approaches of active learning have been proposed in the literature, however, they either focus on exploration or exploitation in the design space. Methods that do consider exploration-exploitation simultaneously employ fixed or ad-hoc measures to control the trade-off that may not be optimal. In this paper, we develop a Bayesian hierarchical approach, referred as BHEEM, to dynamically balance the exploration-exploitation trade-off as more data points are queried. To sample from the posterior distribution of the trade-off parameter, We subsequently formulate an approximate Bayesian computation approach based on the linear dependence of queried data in the feature space. Simulated and real-world examples show the proposed approach achieves at least 21% and 11% average improvement when compared to pure exploration and exploitation strategies respectively. More importantly, we note that by optimally balancing the trade-off between exploration and exploitation, BHEEM performs better or at least as well as either pure exploration or pure exploitation.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2304.07665

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Qu, Zhaonan, Galichon, Alfred, Ugander, Johan

On Sinkhorn's Algorithm and Choice Modeling

arXiv.org Artificial IntelligenceSep-30-2023

For a broad class of choice and ranking models based on Luce's choice axiom, including the Bradley--Terry--Luce and Plackett--Luce models, we show that the associated maximum likelihood estimation problems are equivalent to a classic matrix balancing problem with target row and column sums. This perspective opens doors between two seemingly unrelated research areas, and allows us to unify existing algorithms in the choice modeling literature as special instances or analogs of Sinkhorn's celebrated algorithm for matrix balancing. We draw inspirations from these connections and resolve important open problems on the study of Sinkhorn's algorithm. We first prove the global linear convergence of Sinkhorn's algorithm for non-negative matrices whenever finite solutions to the matrix balancing problem exist. We characterize this global rate of convergence in terms of the algebraic connectivity of the bipartite graph constructed from data. Next, we also derive the sharp asymptotic rate of linear convergence, which generalizes a classic result of Knight (2008), but with a more explicit analysis that exploits an intrinsic orthogonality structure. To our knowledge, these are the first quantitative linear convergence results for Sinkhorn's algorithm for general non-negative matrices and positive marginals. The connections we establish in this paper between matrix balancing and choice modeling could help motivate further transmission of ideas and interesting results in both directions.

algorithm, matrix, sinkhorn, (14 more...)

2310.0026

Country:

Europe > Ireland (0.04)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry:

Transportation (0.67)
Government (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

arXiv.org Artificial IntelligenceSep-30-2023

Enhancing Representation Generalization in Authorship Identification

Wang, Haining

Authorship identification ascertains the authorship of texts whose origins remain undisclosed. That authorship identification techniques work as reliably as they do has been attributed to the fact that authorial style is properly captured and represented. Although modern authorship identification methods have evolved significantly over the years and have proven effective in distinguishing authorial styles, the generalization of stylistic features across domains has not been systematically reviewed. The presented work addresses the challenge of enhancing the generalization of stylistic representations in authorship identification, particularly when there are discrepancies between training and testing samples. A comprehensive review of empirical studies was conducted, focusing on various stylistic features and their effectiveness in representing an author's style. The influencing factors such as topic, genre, and register on writing style were also explored, along with strategies to mitigate their impact. While some stylistic features, like character n-grams and function words, have proven to be robust and discriminative, others, such as content words, can introduce biases and hinder cross-domain generalization. Representations learned using deep learning models, especially those incorporating character n-grams and syntactic information, show promise in enhancing representation generalization. The findings underscore the importance of selecting appropriate stylistic features for authorship identification, especially in cross-domain scenarios. The recognition of the strengths and weaknesses of various linguistic features paves the way for more accurate authorship identification in diverse contexts.

authorship attribution, computational linguistic, proceedings, (10 more...)

2310.00436

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
(30 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Media (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Schlichting, Marc R., Boord, Nina V., Corso, Anthony L., Kochenderfer, Mykel J.

SAVME: Efficient Safety Validation for Autonomous Systems Using Meta-Learning

arXiv.org Artificial IntelligenceSep-30-2023

Discovering potential failures of an autonomous system is important prior to deployment. Falsification-based methods are often used to assess the safety of such systems, but the cost of running many accurate simulation can be high. The validation can be accelerated by identifying critical failure scenarios for the system under test and by reducing the simulation runtime. We propose a Bayesian approach that integrates meta-learning strategies with a multi-armed bandit framework. Our method involves learning distributions over scenario parameters that are prone to triggering failures in the system under test, as well as a distribution over fidelity settings that enable fast and accurate simulations. In the spirit of meta-learning, we also assess whether the learned fidelity settings distribution facilitates faster learning of the scenario parameter distributions for new scenarios. We showcase our methodology using a cutting-edge 3D driving simulator, incorporating 16 fidelity settings for an autonomous vehicle stack that includes camera and lidar sensors. We evaluate various scenarios based on an autonomous vehicle pre-crash typology. As a result, our approach achieves a significant speedup, up to 18 times faster compared to traditional methods that solely rely on a high-fidelity simulator.

fidelity, scenario, simulator, (16 more...)

2309.12474

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.82)

Industry:

Transportation > Ground > Road (0.46)
Automobiles & Trucks (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.61)
(2 more...)

Lin, Alexander, Ba, Demba

An Efficient Algorithm for Clustered Multi-Task Compressive Sensing

arXiv.org Machine LearningSep-30-2023

This paper considers clustered multi-task compressive sensing, a hierarchical model that solves multiple compressive sensing tasks by finding clusters of tasks that leverage shared information to mutually improve signal reconstruction. The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions. The main bottleneck involves repeated matrix inversion and log-determinant computation for multiple large covariance matrices. We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices. Our approach combines Monte Carlo sampling with iterative linear solvers. Our experiments reveal that compared to the existing baseline, our algorithm can be up to thousands of times faster and an order of magnitude more memory-efficient.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2310.0042

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

arXiv.org Machine LearningSep-29-2023

Gradient and Uncertainty Enhanced Sequential Sampling for Global Fit

Lämmle, Sven, Bogoclu, Can, Cremanns, Kevin, Roos, Dirk

Most of these applications require computationally demanding simulation models. Machine learning (ML) models have emerged to mitigate the computational cost and are used as surrogates. For this task, ML models learn the relationship between the response of the expensive simulations from a dataset containing past observations. Various ML methods were proposed as surrogate models (sometimes referred as response surface methods), e.g. as solvers for partial differential equations (PDEs) describing the behavior of high-dimensional vector spaces or fields [1-3], or probabilistic approximators of parametric response functions [4]. Among these types of surrogate models, Gaussian Process Regression (GP) [5] is a popular choice [6, 7] because of its flexibility and ability to predict the model uncertainty. However, drawbacks are the selection of a suitable covariance (kernel) function that is application dependent, and the computational burden for larger data sets. Various extensions to GPs such as the deep GPs[8], sparse GPs based on variational inference [9, 10], and efficient matrix decomposition [11] were developed to overcome some of these limitations. One particularly promising approach is the combination of GPs with Artificial Neural Networks (ANNs) to Deep Gaussian Covariance Networks (DGCNs) [12, 13] to learn the non-stationary hyperparameters of the GP together with combinations of different covariance functions.

artificial intelligence, machine learning, surrogate model, (17 more...)

doi: 10.1016/j.cma.2023.116226

2310.0011

Country:

Europe > Germany (0.28)
Asia > Middle East > Israel > Mediterranean Sea (0.24)
North America > United States > New York (0.14)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

arXiv.org Artificial IntelligenceSep-29-2023

Tackling Non-Stationarity in Reinforcement Learning via Causal-Origin Representation

Zhang, Wanpeng, Li, Yilin, Yang, Boyu, Lu, Zongqing

In real-world scenarios, the application of reinforcement learning is significantly challenged by complex non-stationarity. Most existing methods attempt to model changes in the environment explicitly, often requiring impractical prior knowledge. In this paper, we propose a new perspective, positing that non-stationarity can propagate and accumulate through complex causal relationships during state transitions, thereby compounding its sophistication and affecting policy learning. We believe that this challenge can be more effectively addressed by tracing the causal origin of non-stationarity. To this end, we introduce the Causal-Origin REPresentation (COREP) algorithm. COREP primarily employs a guided updating mechanism to learn a stable graph representation for states termed as causal-origin representation. By leveraging this representation, the learned policy exhibits impressive resilience to non-stationarity. We supplement our approach with a theoretical analysis grounded in the causal interpretation for non-stationary reinforcement learning, advocating for the validity of the causal-origin representation. Experimental results further demonstrate the superior performance of COREP over existing methods in tackling non-stationarity.

algorithm, causal-origin representation, representation, (14 more...)

2306.02747

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Austria (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)