AITopics

2210.17299

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.50)

Industry:

Energy > Energy Storage (1.00)
Electrical Industrial Apparatus (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial Intelligence, Apr-4-2023

Geometric constraints improve inference of sparsely observed stochastic dynamics

Maoutsa, Dimitra

The dynamics of systems with many degrees of freedom evolving on multiple scales are often modeled in terms of stochastic differential equations. Usually, the structural form of these equations is unknown, and the only manifestations of the system's dynamics are observations at discrete points in time. Despite the widespread use of such models, accurately inferring these systems from sparse-in-time observations remains challenging. Conventional inference methods either focus on the temporal structure of observations, neglecting the geometry of the system's invariant density, or use geometric approximations of the invariant density, which are limited to conservative driving forces. To address these limitations, we introduce a novel approach that reconciles these two perspectives. We propose a path augmentation scheme that employs data-driven control to account for the geometry of the system's invariant density. Non-parametric inference on the augmented paths enables efficient identification of the underlying deterministic forces of systems observed at low sampling rates.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2304.00423

Country:

North America > United States > New York (0.04)
North America > United States > Indiana (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Clerico, Eugenio, Farghly, Tyler, Deligiannidis, George, Guedj, Benjamin, Doucet, Arnaud

Generalisation under gradient descent via deterministic PAC-Bayes

arXiv.org Artificial Intelligence, Apr-4-2023

We establish disintegrated PAC-Bayesian generalisation bounds for models trained with gradient descent methods or continuous gradient flows. Contrary to standard practice in the PAC-Bayesian setting, our result applies to optimisation algorithms that are deterministic, without requiring any de-randomisation step. Our bounds are fully computable, depending on the density of the initial distribution and the Hessian of the training objective over the trajectory. We show that our framework can be applied to a variety of iterative optimisation algorithms, including stochastic gradient descent (SGD), momentum-based schemes, and damped Hamiltonian dynamics.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2209.02525

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Khorashadizadeh, AmirEhsan, Kothari, Konik, Salsi, Leonardo, Harandi, Ali Aghababaei, de Hoop, Maarten, Dokmanić, Ivan

Conditional Injective Flows for Bayesian Imaging

Most deep learning models for computational imaging regress a single reconstructed image. In practice, however, ill-posedness, nonlinearity, model mismatch, and noise often conspire to make such point estimates misleading or insufficient. The Bayesian approach models images and (noisy) measurements as jointly distributed random vectors and aims to approximate the posterior distribution of unknowns. Recent variational inference methods based on conditional normalizing flows are a promising alternative to traditional MCMC methods, but they come with drawbacks: excessive memory and compute demands for moderate to high resolution images and underwhelming performance on hard nonlinear problems. In this work, we propose C-Trumpets -- conditional injective flows specifically designed for imaging problems, which greatly diminish these challenges. Injectivity reduces memory footprint and training time, while a low-dimensional latent space, together with architectural innovations like fixed-volume-change layers and skip-connection revnet layers, allows C-Trumpets to outperform regular conditional flow models on a variety of imaging and image restoration tasks, including limited-view CT and nonlinear inverse scattering, with a lower compute and memory budget. C-Trumpets enable fast approximation of point estimates like MMSE or MAP as well as physically-meaningful uncertainty quantification.

artificial intelligence, c-trumpet, machine learning, (18 more...)

doi: 10.1109/TCI.2023.3248949

2204.07664

Country:

Europe (0.93)
North America > United States (0.68)

Genre: Research Report (0.63)

Industry:

Energy > Oil & Gas > Upstream (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Xu, Zhao, Rubio, Daniel Onoro, Serra, Giuseppe, Niepert, Mathias

Learning Sparsity of Representations with Discrete Latent Variables

Deep latent generative models have attracted increasing attention for their capacity to combine the strengths of deep learning and probabilistic models in an elegant way. The data representations learned with these models are often continuous and dense. However, in many applications, sparse representations are expected, such as learning sparse high-dimensional embeddings of data in an unsupervised setting, and learning multi-labels from thousands of candidate tags in a supervised setting. In some scenarios, there could be a further restriction on the degree of sparsity: the number of non-zero features of a representation cannot be larger than a pre-defined threshold $L_0$. In this paper we propose a sparse deep latent generative model (SDLGM) that explicitly models the degree of sparsity and thus enables learning the sparse structure of the data under the quantified sparsity constraint. The resulting sparsity of a representation is not fixed, but adapts to the observation itself under the pre-defined restriction. In particular, we introduce for each observation $i$ an auxiliary random variable $L_i$, which models the sparsity of its representation. The sparse representations are then generated with a two-step sampling process via two Gumbel-Softmax distributions. For inference and learning, we develop an amortized variational method based on a Monte Carlo (MC) gradient estimator. The resulting sparse representations are differentiable with backpropagation. The experimental evaluation on multiple datasets for unsupervised and supervised learning problems shows the benefits of the proposed method.
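As a rough illustration of the Gumbel-Softmax relaxation mentioned in the abstract above, the sketch below draws one relaxed categorical sample using only the Python standard library. It is not the paper's two-step scheme: the function name `gumbel_softmax_sample`, the example probabilities, and the temperature value are assumptions chosen for illustration.

```python
import math
import random
from typing import List, Sequence


def gumbel_softmax_sample(
    probabilities: Sequence[float],
    temperature: float,
    rng: random.Random,
) -> List[float]:
    """Draw one relaxed one-hot sample: softmax((log p_i + g_i) / temperature)."""
    scores = []
    for p in probabilities:
        # Clamp away from 0 so that log(u) is always defined.
        u = max(rng.random(), 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # standard Gumbel(0, 1) draw
        scores.append((math.log(p) + gumbel_noise) / temperature)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: three categories, fixed seed for repeatability.
sample = gumbel_softmax_sample([0.2, 0.5, 0.3], temperature=0.5, rng=random.Random(0))
```

Lower temperatures push the sample toward a one-hot vector, while higher temperatures make it closer to uniform; the sample stays a smooth function of the log-probabilities for a fixed noise draw, which is what makes gradient-based training possible.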

artificial intelligence, machine learning, representation, (19 more...)

doi: 10.1109/IJCNN52387.2021.9533762

2304.00935

Country:

Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry: Telecommunications (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Yoshida, Naoki, Watanabe, Sumio

Upper Bound of Real Log Canonical Threshold of Tensor Decomposition and its Application to Bayesian Inference

Tensor decomposition is widely used in data science and machine learning [1]. For instance, it plays a central role in signal processing by contribution analysis [2], data compression by converting tensor data to matrix data [3], and data recovery by working backwards from the matrices to the original tensor data [4]. In many cases, tensor decomposition itself is known to be NP-hard [5]. For this reason, tensor decomposition is often computed approximately by Bayesian inference. However, its mathematical properties are not yet completely clarified because it is a singular statistical model. In this paper, we derive its generalization performance in Bayesian inference. There are two main types of tensor decomposition: Tucker decomposition and CP decomposition.

artificial intelligence, bayesian inference, machine learning, (13 more...)

2303.05731

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Muttenthaler, Lukas, Dippel, Jonas, Linhardt, Lorenz, Vandermeulen, Robert A., Kornblith, Simon

Human alignment of neural network representations

Today's computer vision models achieve human or near-human level performance across a wide variety of vision tasks. However, their architectures, data, and learning algorithms differ in numerous ways from those that give rise to human vision. In this paper, we investigate the factors that affect the alignment between the representations learned by neural networks and human mental representations inferred from behavioral responses. We find that model scale and architecture have essentially no effect on the alignment with human behavioral responses, whereas the training dataset and objective function both have a much larger impact. These findings are consistent across three datasets of human similarity judgments collected using two different tasks. Linear transformations of neural network representations learned from behavioral responses from one dataset substantially improve alignment with human similarity judgments on the other two datasets. In addition, we find that some human concepts such as food and animals are well-represented by neural networks whereas others such as royal or sports-related objects are not. Overall, although models trained on larger, more diverse datasets achieve better alignment with humans than models trained on ImageNet alone, our results indicate that scaling alone is unlikely to be sufficient to train neural networks with conceptual representations that match those used by humans.
Representation learning is a fundamental part of modern computer vision systems, but the paradigm has its roots in cognitive science. When Rumelhart et al. (1986) developed backpropagation, their goal was to find a method that could learn representations of concepts that are distributed across neurons, similarly to the human brain. The discovery that representations learned by backpropagation could replicate nontrivial aspects of human concept learning was a key factor in its rise to popularity in the late 1980s (Sutherland, 1986; Ng & Hinton, 2017).
A string of empirical successes has since shifted the primary focus of representation learning research away from its similarities to human cognition and toward practical applications. This shift has been fruitful. By some metrics, the best computer vision models now outperform the best individual humans on benchmarks such as ImageNet (Shankar et al., 2020; Beyer et al., 2020; Vasudevan et al., 2022). As computer vision systems become increasingly widely used outside of research, we would like to know if they see the world in the same way that humans do.

artificial intelligence, machine learning, supervised, (19 more...)

2211.01201

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(7 more...)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
(2 more...)

arXiv.org Machine Learning, Apr-3-2023

Artificial neural networks and time series of counts: A class of nonlinear INGARCH models

Jahn, Malte

Time series of counts are frequently analyzed using generalized integer-valued autoregressive models with conditional heteroskedasticity (INGARCH). These models employ response functions to map a vector of past observations and past conditional expectations to the conditional expectation of the present observation. In this paper, it is shown how INGARCH models can be combined with artificial neural network (ANN) response functions to obtain a class of nonlinear INGARCH models. The ANN framework allows for the interpretation of many existing INGARCH models as a degenerate version of a corresponding neural model. Details on maximum likelihood estimation, marginal effects and confidence intervals are given. The empirical analysis of time series of bounded and unbounded counts reveals that the neural INGARCH models are able to outperform reasonable degenerate competitor models in terms of the information loss.
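The role of the response function described above can be sketched as follows. The example assumes the simplest textbook case, a linear INGARCH(1,1) response of the form omega + alpha * y_{t-1} + beta * lambda_{t-1}; the neural response functions proposed in the paper are not reproduced, and all names and parameter values are hypothetical.

```python
from typing import Callable, List, Sequence

# A response function maps (previous count, previous conditional mean)
# to the conditional mean of the present observation.
Response = Callable[[float, float], float]


def linear_response(omega: float, alpha: float, beta: float) -> Response:
    """Build the linear INGARCH(1,1) response (assumed form, for illustration)."""
    def response(previous_count: float, previous_mean: float) -> float:
        return omega + alpha * previous_count + beta * previous_mean
    return response


def conditional_means(
    counts: Sequence[int],
    response: Response,
    initial_mean: float,
) -> List[float]:
    """Return one conditional mean per observation, each given only the past."""
    means = [initial_mean]
    for previous_count in counts[:-1]:
        means.append(response(previous_count, means[-1]))
    return means


# Hypothetical count series and parameters.
counts = [3, 0, 5, 2]
means = conditional_means(counts, linear_response(1.0, 0.5, 0.25), initial_mean=2.0)
```

Because `conditional_means` only sees the response as a callable, a nonlinear response (for example a small neural network) could be swapped in without changing the recursion.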

artificial intelligence, ingarch model, machine learning, (14 more...)

arXiv.org Machine Learning

2304.01025

Country:

Europe > Poland (0.04)
North America > United States (0.04)
Europe > Germany > Hamburg (0.04)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry: Banking & Finance > Economy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Xu, Zhao, Lawrence, Carolin, Shaker, Ammar, Siarheyeu, Raman

Uncertainty Propagation in Node Classification

Quantifying the predictive uncertainty of neural networks has recently attracted increasing attention. In this work, we focus on measuring the uncertainty of graph neural networks (GNNs) for the task of node classification. Most existing GNNs model message passing among nodes. The messages are often deterministic. Questions naturally arise: Is there uncertainty in the messages? How could we propagate such uncertainty over a graph together with the messages? To address these issues, we propose a Bayesian uncertainty propagation (BUP) method, which embeds GNNs in a Bayesian modeling framework, and models the predictive uncertainty of node classification with Bayesian confidence of predictive probability and uncertainty of messages. Our method proposes a novel uncertainty propagation mechanism inspired by Gaussian models. Moreover, we present an uncertainty-oriented loss for node classification that allows the GNNs to explicitly integrate predictive uncertainty in the learning procedure. Consequently, training examples with large predictive uncertainty are penalized. We evaluate BUP with respect to prediction reliability and out-of-distribution (OOD) predictions. The learned uncertainty is also analyzed in depth. The relations between uncertainty and graph topology, as well as predictive uncertainty in the OOD cases, are investigated with extensive experiments. The empirical results on popular benchmark datasets demonstrate the superior performance of the proposed method.

artificial intelligence, machine learning, node, (19 more...)

doi: 10.1109/ICDM54844.2022.00167

2304.00918

Country: Europe (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Jun, Sung Jae, Lee, Sokbae

Average Adjusted Association: Efficient Estimation with High Dimensional Confounders

arXiv.org Artificial Intelligence, Apr-2-2023

The log odds ratio is a well-established metric for evaluating the association between binary outcome and exposure variables. Despite its widespread use, there has been limited discussion on how to summarize the log odds ratio as a function of confounders through averaging. To address this issue, we propose the Average Adjusted Association (AAA), which is a summary measure of association in a heterogeneous population, adjusted for observed confounders. To facilitate its use, we also develop efficient double/debiased machine learning (DML) estimators of the AAA. Our DML estimators use two equivalent forms of the efficient influence function, and are applicable in various sampling scenarios, including random sampling, outcome-based sampling, and exposure-based sampling. Through real data and simulations, we demonstrate the practicality and effectiveness of our proposed estimators in measuring the AAA.
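To make the idea of averaging the log odds ratio over confounders concrete, the sketch below computes the log odds ratio within each level of a single discrete confounder and averages the results, weighted by stratum size. This is a naive plug-in illustration under assumed data, not the DML estimator developed in the paper; the function names and the counts are hypothetical.

```python
import math
from typing import Dict, Tuple

# (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases)
Table = Tuple[int, int, int, int]


def log_odds_ratio(table: Table) -> float:
    """Log odds ratio of a 2x2 table; every cell must be non-zero."""
    a, b, c, d = table
    return math.log((a * d) / (b * c))


def averaged_log_odds_ratio(strata: Dict[str, Table]) -> float:
    """Average the per-stratum log odds ratios, weighted by stratum size."""
    total = sum(sum(table) for table in strata.values())
    return sum(
        (sum(table) / total) * log_odds_ratio(table)
        for table in strata.values()
    )


# Hypothetical counts for two levels of one discrete confounder.
strata = {
    "stratum_a": (20, 10, 10, 20),  # odds ratio 4
    "stratum_b": (10, 10, 10, 10),  # odds ratio 1
}
average = averaged_log_odds_ratio(strata)
```

With high-dimensional or continuous confounders, stratification of this kind is no longer feasible, which is the setting the machine learning estimators in the abstract target.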

artificial intelligence, estimator, machine learning, (17 more...)

2205.14048

Country:

North America > Greenland (0.04)
North America > United States > Pennsylvania > Centre County > University Park (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)