Learning to Approximate Particle Smoothing Trajectories via Diffusion Generative Models
Tamir, Ella, Solin, Arno
Learning dynamical systems from sparse observations is critical in numerous fields, including biology, finance, and physics. While tackling such problems is standard in general information fusion, it remains challenging for contemporary machine learning models, such as diffusion models. We introduce a method that integrates conditional particle filtering with ancestral sampling and diffusion models, enabling the generation of realistic trajectories that align with observed data. Our approach first uses a smoother, based on iterating a conditional particle filter with ancestral sampling, to generate plausible trajectories matching the observed marginals, and then learns the corresponding diffusion model. This provides both a generative method for high-quality, smoothed trajectories under complex constraints and an efficient approximation of the particle smoothing distribution for classical tracking problems. We demonstrate the approach on time-series generation and interpolation tasks, including vehicle tracking and single-cell RNA sequencing data.
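As a rough illustration of the filtering machinery involved (not the paper's conditional particle filter with ancestral sampling, and without the learned diffusion model), the sketch below runs a bootstrap particle filter with systematic resampling on a toy random-walk state with sparse Gaussian observations; the model, noise levels, and observation times are placeholders.

```python
# Illustrative sketch only: a bootstrap particle filter with systematic
# resampling that produces trajectories consistent with sparse observations.
import numpy as np

def systematic_resample(weights, rng):
    """Return ancestor indices drawn by systematic resampling."""
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    idx = np.searchsorted(np.cumsum(weights), positions)
    return np.minimum(idx, n - 1)  # guard against floating-point round-off

def bootstrap_filter(obs, obs_times, n_steps, n_particles=500,
                     proc_std=0.1, obs_std=0.2, seed=0):
    """Simulate particle trajectories that match sparse observations."""
    rng = np.random.default_rng(seed)
    x = np.zeros((n_particles, n_steps))
    obs_at = dict(zip(obs_times, obs))
    for t in range(1, n_steps):
        # Propagate particles with the random-walk transition model.
        x[:, t] = x[:, t - 1] + proc_std * rng.standard_normal(n_particles)
        if t in obs_at:
            # Reweight by the Gaussian observation likelihood and resample.
            logw = -0.5 * ((obs_at[t] - x[:, t]) / obs_std) ** 2
            w = np.exp(logw - logw.max())
            w /= w.sum()
            x = x[systematic_resample(w, rng)]  # keeps full ancestral paths
    return x

trajectories = bootstrap_filter(obs=[1.0, -0.5], obs_times=[30, 70], n_steps=100)
print(trajectories.shape)  # (500, 100)
```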
Improving Discrete Diffusion Models via Structured Preferential Generation
Rissanen, Severi, Heinonen, Markus, Solin, Arno
In the image and audio domains, diffusion models have shown impressive performance. However, their application to discrete data types, such as language, has often lagged behind autoregressive generative models. This paper tackles the challenge of improving discrete diffusion models by introducing a structured forward process that leverages the inherent information hierarchy in discrete categories, such as words in text. Our approach biases the generative process to produce certain categories before others, resulting in a notable improvement in log-likelihood scores on the text8 dataset. This work paves the way for further advances in discrete diffusion models, with potentially significant gains in performance.
Alignment is Key for Applying Diffusion Models to Retrosynthesis
Laabid, Najwa, Rissanen, Severi, Heinonen, Markus, Solin, Arno, Garg, Vikas
Retrosynthesis, the task of identifying precursors for a given molecule, can be naturally framed as a conditional graph generation task. Diffusion models are a particularly promising modelling approach, enabling post-hoc conditioning and trading off quality for speed during generation. We show mathematically that permutation-equivariant denoisers severely limit the expressiveness of graph diffusion models and thus their adaptation to retrosynthesis. To address this limitation, we relax the equivariance requirement so that it only applies to aligned permutations of the conditioning and generated graphs, obtained through atom mapping. Our new denoiser achieves the highest top-1 accuracy (54.7%) across template-free and template-based methods on USPTO-50k. We also demonstrate flexible post-training conditioning and good sample quality with small diffusion step counts, highlighting the potential for interactive applications and additional controls for multi-step planning.
Flatness Improves Backbone Generalisation in Few-shot Classification
Li, Rui, Trapp, Martin, Klasson, Marcus, Solin, Arno
Deployment of deep neural networks in real-world settings typically requires adaptation to new tasks with few examples. Few-shot classification (FSC) addresses this problem by leveraging pre-trained backbones for fast adaptation to new classes. Surprisingly, most efforts have focused only on developing architectures that ease adaptation to the target domain, without considering the importance of backbone training for good generalisation. We show that flatness-aware backbone training with vanilla fine-tuning results in a simpler yet competitive baseline compared to the state-of-the-art. Our results indicate that for in- and cross-domain FSC, backbone training is crucial to achieving good generalisation across different adaptation methods. We advocate that more care should be taken when training these models.
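For readers unfamiliar with flatness-aware training, the sketch below shows one common instantiation, a sharpness-aware minimization (SAM) training step; the paper's exact backbone-training recipe may differ, and the model, loss function, and hyperparameters here are placeholders.

```python
# Illustrative sketch of a single SAM (sharpness-aware minimization) step,
# one common flatness-aware training scheme; not the paper's exact recipe.
import torch

def sam_step(model, loss_fn, x, y, base_opt, rho=0.05):
    # First pass: gradients at the current weights.
    loss = loss_fn(model(x), y)
    loss.backward()
    params = [p for p in model.parameters() if p.grad is not None]
    with torch.no_grad():
        grad_norm = torch.norm(torch.stack([p.grad.norm() for p in params]))
        eps = [rho * p.grad / (grad_norm + 1e-12) for p in params]
        for p, e in zip(params, eps):
            p.add_(e)  # climb to the locally worst-case weights
    # Second pass: the gradient at the perturbed point defines the update.
    model.zero_grad()
    loss_fn(model(x), y).backward()
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)  # restore the original weights
    base_opt.step()      # apply the SAM gradient with the base optimizer
    base_opt.zero_grad()
    return loss.item()
```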
Function-space Parameterization of Neural Networks for Sequential Learning
Scannell, Aidan, Mereu, Riccardo, Chang, Paul, Tamir, Ella, Pajarinen, Joni, Solin, Arno
Sequential learning paradigms pose challenges for gradient-based deep learning due to difficulties in incorporating new data and retaining prior knowledge. While Gaussian processes elegantly tackle these problems, they struggle with scalability and with handling rich inputs, such as images. To address these issues, we introduce a technique that converts neural networks from weight space to function space through a dual parameterization. Our parameterization offers: (i) a way to scale function-space methods to large data sets via sparsification, (ii) retention of prior knowledge when access to past data is limited, and (iii) a mechanism to incorporate new data without retraining. Our experiments demonstrate that we can retain knowledge in continual learning and incorporate new data efficiently. We further show its strengths in uncertainty quantification and in guiding exploration in model-based RL. Further information and code are available on the project website.
Fixing Overconfidence in Dynamic Neural Networks
Meronen, Lassi, Trapp, Martin, Pilzer, Andrea, Yang, Le, Solin, Arno
Dynamic neural networks are a recent technique that promises a remedy for the increasing size of modern deep learning models by dynamically adapting their computational cost to the difficulty of the inputs. In this way, the model can adjust to a limited computational budget. However, the poor quality of uncertainty estimates in deep learning models makes it difficult to distinguish between hard and easy samples. To address this challenge, we present a computationally efficient approach for post-hoc uncertainty quantification in dynamic neural networks. We show that adequately quantifying and accounting for both aleatoric and epistemic uncertainty through a probabilistic treatment of the last layers improves the predictive performance and aids decision-making when determining the computational budget. In the experiments, we show improvements on CIFAR-100, ImageNet, and Caltech-256 in terms of accuracy, uncertainty quantification, and calibration error.
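As a toy illustration of how such uncertainty estimates can steer computation (not the paper's method), the sketch below gates an early exit of a dynamic network on the predictive entropy computed from samples of a probabilistic last layer; the threshold and sample values are made up.

```python
# Illustrative sketch: stop computation at an early exit only when the
# sample-averaged predictive distribution is confident enough.
import numpy as np

def predictive_entropy(probs):
    """Entropy of the (sample-averaged) class probabilities."""
    p = np.clip(probs, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

def should_exit(last_layer_samples, threshold=0.5):
    """Exit early if the predictive entropy falls below a threshold.

    last_layer_samples: array of shape (num_samples, num_classes) with
    softmax outputs for one input under sampled last-layer weights.
    """
    mean_probs = last_layer_samples.mean(axis=0)  # marginal predictive
    return predictive_entropy(mean_probs) < threshold

# Toy usage: three posterior samples over four classes.
samples = np.array([[0.90, 0.05, 0.03, 0.02],
                    [0.85, 0.08, 0.04, 0.03],
                    [0.92, 0.04, 0.02, 0.02]])
print(should_exit(samples, threshold=0.5))  # True: entropy ~0.46 < 0.5
```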
Transport with Support: Data-Conditional Diffusion Bridges
Tamir, Ella, Trapp, Martin, Solin, Arno
The dynamic Schrödinger bridge problem provides an appealing setting for solving constrained time-series data generation tasks posed as optimal transport problems. It consists of learning non-linear diffusion processes using efficient iterative solvers. Recent works have demonstrated state-of-the-art results (e.g., in modelling single-cell embryo RNA sequences or sampling from complex posteriors) but are limited to learning bridges with only initial and terminal constraints. Our work extends this paradigm by proposing the Iterative Smoothing Bridge (ISB). We integrate Bayesian filtering and optimal control into learning the diffusion process, enabling the generation of constrained stochastic processes governed by sparse observations at intermediate stages and terminal constraints. We assess the effectiveness of our method on synthetic and real-world data generation tasks, and we show that the ISB generalises well to high-dimensional data, is computationally efficient, and provides accurate estimates of the marginals at intermediate and terminal times.
PriorCVAE: scalable MCMC parameter inference with Bayesian deep generative modelling
Semenova, Elizaveta, Verma, Prakhar, Cairney-Leeming, Max, Solin, Arno, Bhatt, Samir, Flaxman, Seth
Recent advances have shown that GP priors, or their finite realisations, can be encoded using deep generative models such as variational autoencoders (VAEs). These learned generators can serve as drop-in replacements for the original priors during MCMC inference. While this approach enables efficient inference, it loses information about the hyperparameters of the original models, which makes inference over hyperparameters impossible and leaves the learned priors indistinct. To overcome this limitation, we condition the VAE on stochastic process hyperparameters. This allows the joint encoding of hyperparameters with GP realisations and their subsequent estimation during inference. Further, we demonstrate that our proposed method, PriorCVAE, is agnostic to the nature of the models it approximates and can be used, for instance, to encode solutions of ODEs. It provides a practical tool for approximate inference and shows potential in real-life spatial and spatiotemporal applications.
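To make the idea concrete, the sketch below shows a minimal conditional VAE whose encoder and decoder receive a GP lengthscale as an extra input, so the hyperparameter stays available for later inference; this is an assumption-laden toy, not the released PriorCVAE implementation, and all sizes and architectures are placeholders.

```python
# Illustrative sketch, not the PriorCVAE code: a conditional VAE whose
# encoder and decoder are conditioned on a GP hyperparameter (lengthscale).
import torch
import torch.nn as nn

class ConditionalVAE(nn.Module):
    def __init__(self, dim_x=100, dim_z=16, dim_h=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim_x + 1, dim_h), nn.ReLU(),
                                     nn.Linear(dim_h, 2 * dim_z))
        self.decoder = nn.Sequential(nn.Linear(dim_z + 1, dim_h), nn.ReLU(),
                                     nn.Linear(dim_h, dim_x))

    def forward(self, f, lengthscale):
        # Condition both networks on the hyperparameter.
        h = self.encoder(torch.cat([f, lengthscale], dim=-1))
        mu, logvar = h.chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        f_hat = self.decoder(torch.cat([z, lengthscale], dim=-1))
        # Negative ELBO: reconstruction + KL to a standard normal prior.
        rec = ((f - f_hat) ** 2).sum(dim=-1)
        kl = 0.5 * (torch.exp(logvar) + mu ** 2 - 1.0 - logvar).sum(dim=-1)
        return (rec + kl).mean()

# Toy usage: a batch of 8 GP draws on 100 grid points with their lengthscales.
model = ConditionalVAE()
loss = model(torch.randn(8, 100), torch.rand(8, 1))
loss.backward()
```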
Variational Gaussian Process Diffusion Processes
Verma, Prakhar, Adam, Vincent, Solin, Arno
Diffusion processes are a class of stochastic differential equations (SDEs) providing a rich family of expressive models that arise naturally in dynamic modelling tasks. Probabilistic inference and learning under generative models whose latent processes are endowed with a non-linear diffusion process prior are intractable. We build upon work in variational inference that approximates the posterior process as a linear diffusion process, and point out pathologies in this approach. We propose an alternative parameterization of the Gaussian variational process using a site-based exponential family description. This allows us to trade a slow inference algorithm based on fixed-point iterations for a fast convex-optimisation algorithm akin to natural gradient descent, which also provides a better objective for learning model parameters.
MixupE: Understanding and Improving Mixup from Directional Derivative Perspective
Zou, Yingtian, Verma, Vikas, Mittal, Sarthak, Tang, Wai Hoh, Pham, Hieu, Kannala, Juho, Bengio, Yoshua, Solin, Arno, Kawaguchi, Kenji
Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. Based on this new insight, we propose an improved version of Mixup, theoretically justified to deliver better generalization performance than the vanilla Mixup. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across multiple datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy.
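For reference, the sketch below implements vanilla Mixup on a batch, the baseline that MixupE improves; the directional-derivative regularizer of MixupE itself is not shown, and the Beta parameter and toy data are placeholders.

```python
# Illustrative sketch of vanilla Mixup data augmentation (not MixupE):
# convexly combine a batch with a shuffled copy of itself, inputs and labels.
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Return a Mixup-augmented batch of inputs and one-hot labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)                # mixing coefficient
    perm = rng.permutation(len(x))              # pair each sample with another
    x_mix = lam * x + (1.0 - lam) * x[perm]     # interpolate inputs
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]  # and labels
    return x_mix, y_mix

# Toy usage: four 2-D inputs with three classes.
x = np.random.randn(4, 2)
y = np.eye(3)[[0, 1, 2, 0]]
x_mix, y_mix = mixup_batch(x, y, alpha=0.2)
```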