AITopics

2404.10179

Country:

North America (0.14)
Asia > Middle East (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

arXiv.org Artificial Intelligence, Jan-13-2023

Laser: Latent Set Representations for 3D Generative Modeling

Moreno, Pol, Kosiorek, Adam R., Strathmann, Heiko, Zoran, Daniel, Schneider, Rosalia G., Winckler, Björn, Markeeva, Larisa, Weber, Théophane, Rezende, Danilo J.

NeRF provides unparalleled fidelity of novel view synthesis: rendering a 3D scene from an arbitrary viewpoint. However, NeRF requires training on a large number of views that fully cover a scene, which limits its applicability. While these issues can be addressed by learning a prior over scenes in various forms, previous approaches have either been applied to overly simple scenes or struggled to render unobserved parts. We introduce Laser-NV: a generative model which achieves high modelling capacity, and which is based on a set-valued latent representation modelled by normalizing flows. Similarly to previous amortized approaches, Laser-NV learns structure from multiple scenes and is capable of fast, feed-forward inference from few views. To encourage higher rendering fidelity and consistency with observed views, Laser-NV further incorporates a geometry-informed attention mechanism over the observed views. Laser-NV also produces diverse and plausible completions of occluded parts of a scene while remaining consistent with observations. Laser-NV shows state-of-the-art novel-view synthesis quality when evaluated on ShapeNet and on a novel simulated City dataset, which features high uncertainty in the unobserved regions of the scene.

artificial intelligence, laser-nv, machine learning

2301.05747

Country:

North America > United States (0.28)
Asia (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Machine Learning, Jul-29-2021

Neural Variational Gradient Descent

di Langosco, Lauro Langosco, Fortuin, Vincent, Strathmann, Heiko

Particle-based approximate Bayesian inference approaches such as Stein Variational Gradient Descent (SVGD) combine the flexibility and convergence guarantees of sampling methods with the computational benefits of variational inference. In practice, SVGD relies on the choice of an appropriate kernel function, which impacts its ability to model the target distribution -- a challenging problem with only heuristic solutions. We propose Neural Variational Gradient Descent (NVGD), which is based on parameterizing the witness function of the Stein discrepancy by a deep neural network whose parameters are learned in parallel to the inference, removing the need to choose a kernel at all. We empirically evaluate our method on popular synthetic inference problems, real-world Bayesian linear regression, and Bayesian neural network inference.
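For orientation, the kernel dependence mentioned in the abstract can be seen in a minimal sketch of a plain SVGD update. This is standard SVGD, not NVGD and not the authors' code. It assumes one-dimensional particles, a fixed-bandwidth Gaussian kernel, and a Gaussian target; the names `svgd_step`, `grad_log_p`, `step_size`, and `bandwidth` are illustrative only.

```python
import math
import random


def svgd_step(particles, grad_log_p, step_size=0.1, bandwidth=1.0):
    """One plain SVGD update for 1D particles with a Gaussian (RBF) kernel.

    Assumption: the kernel and its bandwidth are fixed by hand, which is
    exactly the choice the abstract describes as heuristic.
    """
    n = len(particles)
    scores = [grad_log_p(x) for x in particles]
    updated = []
    for x_i in particles:
        phi = 0.0
        for x_j, s_j in zip(particles, scores):
            k = math.exp(-((x_j - x_i) ** 2) / (2.0 * bandwidth ** 2))
            # Driving term: kernel-weighted score pulls toward high density.
            drive = k * s_j
            # Repulsive term: gradient of k with respect to x_j, which
            # pushes x_i away from x_j and keeps the particles spread out.
            repulse = k * (x_i - x_j) / bandwidth ** 2
            phi += drive + repulse
        updated.append(x_i + step_size * phi / n)
    return updated


# Hypothetical usage: move particles from N(0, 1) toward a N(3, 1) target.
rng = random.Random(0)
particles = [rng.gauss(0.0, 1.0) for _ in range(30)]
for _ in range(300):
    particles = svgd_step(particles, lambda x: -(x - 3.0))
print(sum(particles) / len(particles))
```

The `bandwidth` argument is the hand-picked quantity here; per the abstract, NVGD replaces this kernel-based update direction with a learned neural witness function.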

bayesian inference, neural network, stein discrepancy

2107.10731

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.52)

arXiv.org Artificial Intelligence, Mar-1-2021

Persistent Message Passing

Strathmann, Heiko, Barekatain, Mohammadamin, Blundell, Charles, Veličković, Petar

Graph neural networks (GNNs) are a powerful inductive bias for modelling algorithmic reasoning procedures and data structures. Their prowess was mainly demonstrated on tasks featuring Markovian dynamics, where querying any associated data structure depends only on its latest state. For many tasks of interest, however, it may be highly beneficial to support efficient data structure queries dependent on previous states. This requires tracking the data structure's evolution through time, placing significant pressure on the GNN's latent representations. We introduce Persistent Message Passing (PMP), a mechanism which endows GNNs with the capability to query past states by explicitly persisting them: rather than overwriting node representations, it creates new nodes whenever required. PMP generalises out-of-distribution to more than 2x larger test inputs on dynamic temporal range queries, significantly outperforming GNNs which overwrite states.
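The contrast between overwriting and persisting state can be illustrated with a toy versioned store. This is only an analogy at the data-structure level, not the PMP mechanism or a GNN; the class `PersistentStore` and its methods are hypothetical names introduced for this sketch.

```python
class PersistentStore:
    """Toy versioned key-value store: updates append, they never overwrite."""

    def __init__(self):
        # Maps key -> list of (timestep, value) in increasing timestep order.
        self._history = {}
        self._time = 0

    def update(self, key, value):
        """Record a new version of key and return the timestep assigned to it."""
        self._time += 1
        self._history.setdefault(key, []).append((self._time, value))
        return self._time

    def query(self, key, at_time=None):
        """Return the value of key as of at_time (latest version if None)."""
        if at_time is None:
            at_time = self._time
        result = None
        for timestep, value in self._history.get(key, []):
            if timestep > at_time:
                break
            result = value
        return result


# Hypothetical usage: the old value stays queryable after an update.
store = PersistentStore()
t1 = store.update("a", 1)
t2 = store.update("a", 5)
print(store.query("a", at_time=t1), store.query("a"))
```

A store that overwrote `"a"` in place could only answer the second query, which is the Markovian setting the abstract contrasts against.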

artificial intelligence, arxiv preprint arxiv, neural network

2103.01043

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Software > Programming Languages (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.48)

arXiv.org Machine Learning, Nov-22-2018

Learning deep kernels for exponential family densities

Wenliang, Li, Sutherland, Dougal, Strathmann, Heiko, Gretton, Arthur

The kernel exponential family is a rich class of distributions, which can be fit efficiently and with statistical guarantees by score matching. Being required to choose a priori a simple kernel such as the Gaussian, however, limits its practical applicability. We provide a scheme for learning a kernel parameterized by a deep network, which can find complex location-dependent local features of the data geometry. This gives a very rich class of density models, capable of fitting complex structures on moderate-dimensional problems. Compared to deep density models fit via maximum likelihood, our approach provides a complementary set of strengths and tradeoffs: in empirical studies, the former can yield higher likelihoods, whereas the latter gives better estimates of the gradient of the log density, the score, which describes the distribution's shape.

kernel, neural network, optimization problem

1811.08357

Country: North America > Canada > Quebec (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence, Oct-24-2018

Scalable Gaussian Processes on Discrete Domains

Fortuin, Vincent, Dresdner, Gideon, Strathmann, Heiko, Rätsch, Gunnar

Kernel methods on discrete domains have shown great promise for many challenging tasks, e.g., on biological sequence data as well as on molecular structures. Scalable kernel methods like support vector machines offer good predictive performances but they often do not provide uncertainty estimates. In contrast, probabilistic kernel methods like Gaussian Processes offer uncertainty estimates in addition to good predictive performance but fall short in terms of scalability. We present the first sparse Gaussian Process approximation framework on discrete input domains. Our framework achieves good predictive performance as well as uncertainty estimates using different discrete optimization techniques. We present competitive results comparing our framework to support vector machine and full Gaussian Process baselines on synthetic data as well as on challenging real-world DNA sequence data.

artificial intelligence, health & medicine, likelihood

1810.10368

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.96)

arXiv.org Machine Learning, Jun-6-2018

Deep Self-Organization: Interpretable Discrete Representation Learning on Time Series

Fortuin, Vincent, Hüser, Matthias, Locatello, Francesco, Strathmann, Heiko, Rätsch, Gunnar

Human professionals are often required to make decisions based on complex multivariate time series measurements in an online setting, e.g. in health care. Since human cognition is not optimized to work well in high-dimensional spaces, these decisions benefit from interpretable low-dimensional representations. However, many representation learning algorithms for time series data are difficult to interpret. This is due to non-intuitive mappings from data features to salient properties of the representation and non-smoothness over time. To address this problem, we propose to couple a variational autoencoder to a discrete latent space and introduce a topological structure through the use of self-organizing maps. This allows us to learn discrete representations of time series, which give rise to smooth and interpretable embeddings with superior clustering performance. Furthermore, to allow for a probabilistic interpretation of our method, we integrate a Markov model in the latent space. This model uncovers the temporal transition structure, improves clustering performance even further and provides additional explanatory insights as well as a natural representation of uncertainty. We evaluate our model on static (Fashion-)MNIST data, a time series of linearly interpolated (Fashion-)MNIST images, a chaotic Lorenz attractor system with two macro states, as well as on a challenging real world medical time series application. In the latter experiment, our representation uncovers meaningful structure in the acute physiological state of a patient.

deep learning, neural network, representation

1806.02199

Country:

Europe > United Kingdom (0.28)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Health Care Providers & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

arXiv.org Machine Learning, Mar-13-2018

Efficient and principled score estimation with Nyström kernel exponential families

Sutherland, Dougal J., Strathmann, Heiko, Arbel, Michael, Gretton, Arthur

We propose a fast method with statistical guarantees for learning an exponential family density model where the natural parameter is in a reproducing kernel Hilbert space, and may be infinite-dimensional. The model is learned by fitting the derivative of the log density, the score, thus avoiding the need to compute a normalization constant. Our approach improves the computational efficiency of an earlier solution by using a low-rank, Nyström-like solution. The new solution retains the consistency and convergence rates of the full-rank solution (exactly in Fisher distance, and nearly in other distances), with guarantees on the degree of cost and storage reduction. We evaluate the method in experiments on density estimation and in the construction of an adaptive Hamiltonian Monte Carlo sampler. Compared to an existing score learning approach using a denoising autoencoder, our estimator is empirically more data-efficient when estimating the score, runs faster, and has fewer parameters (which can be tuned in a principled and interpretable way), in addition to providing statistical guarantees.

artificial intelligence, efficient and principled score estimation, machine learning

1705.0836

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)

arXiv.org Machine Learning, Jul-25-2017

Kernel Sequential Monte Carlo

Schuster, Ingmar, Strathmann, Heiko, Paige, Brooks, Sejdinovic, Dino

We propose kernel sequential Monte Carlo (KSMC), a framework for sampling from static target densities. KSMC is a family of sequential Monte Carlo algorithms that are based on building emulator models of the current particle system in a reproducing kernel Hilbert space. Here we focus on modelling nonlinear covariance structure and gradients of the target. The emulator's geometry is adaptively updated and subsequently used to inform local proposals. Unlike in adaptive Markov chain Monte Carlo, continuous adaptation does not compromise convergence of the sampler. KSMC combines the strengths of sequential Monte Carlo and kernel methods: superior performance for multimodal targets and the ability to estimate model evidence as compared to Markov chain Monte Carlo, and the emulator's ability to represent targets that exhibit high degrees of nonlinearity. As KSMC does not require access to target gradients, it is particularly applicable to targets whose gradients are unknown or prohibitively expensive. We describe necessary tuning details and demonstrate the benefits of the proposed methodology on a series of challenging synthetic and real-world examples.

algorithm, artificial intelligence, survey article

1510.03105

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.87)

arXiv.org Machine Learning, Feb-10-2017

Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy

Sutherland, Dougal J., Tung, Hsiao-Yu, Strathmann, Heiko, De, Soumyajit, Ramdas, Aaditya, Smola, Alex, Gretton, Arthur

We propose a method to optimize the representation and distinguishability of samples from two probability distributions, by maximizing the estimated power of a statistical test based on the maximum mean discrepancy (MMD). This optimized MMD is applied to the setting of unsupervised learning by generative adversarial networks (GAN), in which a model attempts to generate realistic samples, and a discriminator attempts to tell these apart from data samples. In this context, the MMD may be used in two roles: first, as a discriminator, either directly on the samples, or on features of the samples. Second, the MMD can be used to evaluate the performance of a generative model, by testing the model's samples against a reference data set. In the latter role, the optimized MMD is particularly helpful, as it gives an interpretable indication of how the model and data distributions differ, even in cases where individual model samples are not easily distinguished either by eye or by classifier.
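As a rough illustration of the quantity involved, the following sketch computes the standard unbiased estimate of the squared MMD between two samples. It assumes a Gaussian kernel with a hand-fixed `bandwidth`, whereas the abstract describes optimizing the kernel to maximize test power, which is not shown here. The function names `gaussian_kernel` and `mmd2_unbiased` are illustrative.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of squared MMD between samples xs and ys.

    Assumption: the bandwidth is fixed by hand rather than optimized.
    Requires at least two points in each sample.
    """
    m, n = len(xs), len(ys)
    # Within-sample averages skip the diagonal terms (i == j).
    k_xx = sum(gaussian_kernel(xs[i], xs[j], bandwidth)
               for i in range(m) for j in range(m) if i != j) / (m * (m - 1))
    k_yy = sum(gaussian_kernel(ys[i], ys[j], bandwidth)
               for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    # Cross-sample average uses every pair.
    k_xy = sum(gaussian_kernel(x, y, bandwidth)
               for x in xs for y in ys) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy


# Hypothetical usage: "model" samples shifted away from the "data" samples.
rng = random.Random(0)
data = [[rng.gauss(0.0, 1.0)] for _ in range(100)]
model = [[rng.gauss(2.0, 1.0)] for _ in range(100)]
print(mmd2_unbiased(data, model))
```

The estimate is close to zero when both samples come from the same distribution and grows as they differ; because it is unbiased, it can be slightly negative for matching samples.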

artificial intelligence, kernel, neural network

1611.04488

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)