
Collaborating Authors: Biloš, Marin


Variational Schrödinger Diffusion Models

arXiv.org Artificial Intelligence

The Schrödinger bridge (SB) has emerged as the go-to method for optimizing transportation plans in diffusion models. However, SB requires estimating intractable forward score functions, inevitably resulting in a costly implicit training loss based on simulated trajectories. To improve scalability while preserving efficient transportation plans, we leverage variational inference to linearize the forward score functions (variational scores) of SB and restore simulation-free properties in training backward scores. We propose the variational Schrödinger diffusion model (VSDM), in which the forward process is a multivariate diffusion and the variational scores are adaptively optimized for efficient transport. Theoretically, we use stochastic approximation to prove the convergence of the variational scores and show the convergence of the adaptively generated samples based on the optimal variational scores. Empirically, we test the algorithm on simulated examples and observe that VSDM is efficient in generating anisotropic shapes and yields straighter sample trajectories than a single-variate diffusion. We also verify the scalability of the algorithm on real-world data, achieving competitive unconditional generation performance on CIFAR-10 and conditional generation in time series modeling. Notably, VSDM no longer depends on warm-up initializations and is tuning-friendly in large-scale experiments.
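The simulation-free property hinges on the forward transition kernel being Gaussian in closed form once the forward score is linear. Below is a minimal sketch of that idea, assuming a simple diagonal linear drift; the function name and the fixed coefficient `a` are illustrative, not the paper's adaptive variational scheme.

```python
import numpy as np

def diag_ou_marginal(x0, a, t, rng):
    """Sample x_t | x_0 for the linear SDE dx = -a * x dt + dW (a > 0, diagonal).

    Because the drift is linear, the transition kernel is Gaussian in closed
    form, so backward-score targets can be computed without simulating paths.
    """
    mean = np.exp(-a * t) * x0
    var = (1.0 - np.exp(-2.0 * a * t)) / (2.0 * a)
    x_t = mean + np.sqrt(var) * rng.standard_normal(x0.shape)
    # Conditional score of the Gaussian kernel: the denoising regression target.
    score_target = -(x_t - mean) / var
    return x_t, score_target

rng = np.random.default_rng(0)
x_t, target = diag_ou_marginal(np.ones(2), a=np.array([0.5, 2.0]), t=1.0, rng=rng)
```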


Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

arXiv.org Artificial Intelligence

Over the past years, foundation models have caused a paradigm shift in machine learning due to their unprecedented capabilities for zero-shot and few-shot generalization. However, despite their success in modalities such as natural language processing and computer vision, the development of foundation models for time series forecasting has lagged behind. We present Lag-Llama, a general-purpose foundation model for univariate probabilistic time series forecasting based on a decoder-only transformer architecture that uses lags as covariates. Lag-Llama is pretrained on a large corpus of diverse time series data from several domains and demonstrates strong zero-shot generalization compared to a wide range of forecasting models on downstream datasets across domains. Moreover, when fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance, outperforming prior deep learning approaches and emerging as the best general-purpose model on average. Lag-Llama is a strong contender to the current state of the art in time series forecasting and paves the way for future foundation models tailored to time series data.
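The core input construction, lags as covariates, fits in a few lines. A sketch under assumptions: the lag set below is illustrative only (the actual model uses a large lag set spanning multiple time-series frequencies), and `build_lag_features` is a hypothetical helper, not the released code.

```python
import numpy as np

def build_lag_features(series, lags=(1, 2, 3, 7, 14)):
    """Stack lagged copies of a univariate series as covariates: row t holds
    the value at time t together with the values at t - l for each lag l."""
    max_lag = max(lags)
    cols = [series[max_lag:]]
    cols += [series[max_lag - l : len(series) - l] for l in lags]
    return np.stack(cols, axis=-1)   # shape: (T - max_lag, 1 + len(lags))

x = np.arange(30.0)       # toy univariate series
feats = build_lag_features(x)
```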


Add and Thin: Diffusion for Temporal Point Processes

arXiv.org Machine Learning

Autoregressive neural networks within the temporal point process (TPP) framework have become the standard for modeling continuous-time event data. Even though these models can expressively capture event sequences in a one-step-ahead fashion, they are inherently limited for long-term forecasting due to the accumulation of errors caused by their sequential nature. To overcome these limitations, we derive ADD-THIN, a principled probabilistic denoising diffusion model for TPPs that operates on entire event sequences. Unlike existing diffusion approaches, ADD-THIN naturally handles data with discrete and continuous components. In experiments on synthetic and real-world datasets, our model matches state-of-the-art TPP models in density estimation and strongly outperforms them in forecasting.
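The name points at the two natural corruption operations on event sequences: thinning existing points and adding (superposing) noise points. A schematic of one forward noising step, assuming a homogeneous noise process and the hypothetical parameters `keep_prob` and `noise_rate` (the paper's actual schedules differ):

```python
import numpy as np

def add_thin_forward(events, keep_prob, noise_rate, t_max, rng):
    """One forward corruption step on an event sequence over [0, t_max]:
    drop each data event independently ("thin"), then superpose events
    from a homogeneous Poisson process ("add")."""
    kept = events[rng.random(len(events)) < keep_prob]   # thinning
    n_noise = rng.poisson(noise_rate * t_max)            # adding
    noise = rng.uniform(0.0, t_max, size=n_noise)
    return np.sort(np.concatenate([kept, noise]))

rng = np.random.default_rng(0)
seq = np.sort(rng.uniform(0.0, 10.0, size=20))
noisy = add_thin_forward(seq, keep_prob=0.6, noise_rate=1.0, t_max=10.0, rng=rng)
```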


Modeling Temporal Data as Continuous Functions with Stochastic Process Diffusion

arXiv.org Artificial Intelligence

Temporal data such as time series can be viewed as discretized measurements of an underlying function. To build a generative model for such data, we must model the stochastic process that governs it. We propose a solution by defining the denoising diffusion model in function space, which also allows us to naturally handle irregularly sampled observations. The forward process gradually adds noise to functions while preserving their continuity, and the learned reverse process removes the noise and returns functions as new samples. To this end, we define suitable noise sources and introduce novel denoising and score-matching models. We show how our method can be used for multivariate probabilistic forecasting and imputation, and how our model can be interpreted as a neural process.
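One way to add noise that preserves continuity is to draw it from a Gaussian process evaluated at the (possibly irregular) observation times, so nearby measurements receive correlated noise. A minimal sketch with an RBF kernel; the specific kernel choice here is an assumption, not necessarily the paper's.

```python
import numpy as np

def gp_noise(ts, length_scale, rng):
    """Draw correlated Gaussian noise at irregular time points from an
    RBF-kernel GP, so the corrupted function stays continuous (unlike
    i.i.d. per-observation noise)."""
    diff = ts[:, None] - ts[None, :]
    K = np.exp(-0.5 * (diff / length_scale) ** 2)
    K += 1e-6 * np.eye(len(ts))            # jitter for a stable Cholesky
    return np.linalg.cholesky(K) @ rng.standard_normal(len(ts))

rng = np.random.default_rng(0)
ts = np.sort(rng.uniform(0.0, 5.0, size=50))   # irregular observation times
noise = gp_noise(ts, length_scale=0.5, rng=rng)
```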


Fast and Flexible Temporal Point Processes with Triangular Maps

arXiv.org Machine Learning

Temporal point process (TPP) models combined with recurrent neural networks provide a powerful framework for modeling continuous-time event data. While such models are flexible, they are inherently sequential and therefore cannot benefit from the parallelism of modern hardware. By exploiting recent developments in the field of normalizing flows, we design TriTPP -- a new class of non-recurrent TPP models in which both sampling and likelihood computation can be done in parallel. TriTPP matches the flexibility of RNN-based methods but permits orders-of-magnitude faster sampling. This enables us to use the new model for variational inference in continuous-time discrete-state systems. We demonstrate the advantages of the proposed framework on synthetic and real-world datasets.
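The parallelism comes from the triangular structure: the Jacobian of an increasing triangular map is itself triangular, so the log-determinant is a sum of log-diagonals and neither sampling nor density evaluation needs a sequential recurrence. A toy affine instance of this property (TriTPP itself composes learnable nonlinear triangular blocks; the names below are illustrative):

```python
import numpy as np

def triangular_affine(z, W, b):
    """Increasing lower-triangular affine map x = W z + b: the Jacobian is W,
    so log|det| is just the sum of log-diagonals -- no sequential pass."""
    assert np.all(np.diag(W) > 0), "positive diagonal => increasing map"
    x = np.tril(W) @ z + b
    log_det = np.log(np.diag(W)).sum()
    return x, log_det

rng = np.random.default_rng(0)
d = 4
W = np.tril(rng.random((d, d))) + np.eye(d)   # positive diagonal
x, ld = triangular_affine(rng.random(d), W, np.zeros(d))
```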


Equivariant Normalizing Flows for Point Processes and Sets

arXiv.org Machine Learning

A point process describes how random sets of exchangeable points are generated. The points usually influence each other's positions via attractive and repulsive forces. To model this behavior, it suffices to transform samples from a uniform process with a sufficiently complex equivariant function. However, learning the parameters of the resulting process is challenging, since the likelihood is hard to estimate and often intractable. This leads us to our proposed model, CONFET. Based on continuous normalizing flows, it allows arbitrary interactions between points while retaining a tractable likelihood. Experiments on various real and synthetic datasets show the improved performance of our new scalable approach.
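Equivariance here means the transformation commutes with permutations of the points: relabeling the inputs relabels the outputs identically. A single discrete layer illustrates the property (CONFET itself builds on continuous normalizing flows; the layer and weight names below are illustrative assumptions):

```python
import numpy as np

def equivariant_layer(x, w_self, w_pool, b):
    """Permutation-equivariant update: each point is transformed from its own
    features plus an order-free pooled summary of all points, so permuting
    the input rows permutes the output rows identically."""
    pooled = x.mean(axis=0, keepdims=True)   # (1, d): invariant summary
    return np.tanh(x @ w_self + pooled @ w_pool + b)

rng = np.random.default_rng(0)
pts = rng.standard_normal((5, 2))            # 5 points in 2-D
w1, w2 = rng.standard_normal((2, 3)), rng.standard_normal((2, 3))
out = equivariant_layer(pts, w1, w2, np.zeros(3))
```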


Deep Representation Learning and Clustering of Traffic Scenarios

arXiv.org Machine Learning

Determining the traffic scenario space is a major challenge for the homologation and coverage assessment of automated driving functions. In contrast to current approaches, which are mainly scenario-based and rely on expert knowledge, we introduce two data-driven autoencoding models that learn a latent representation of traffic scenes. The first is a CNN-based spatio-temporal model that autoencodes a grid of traffic participants' positions. The second is a purely temporal RNN-based model that autoencodes a sequence of sets; to handle the unordered set data, we incorporate permutation invariance, as sketched below. Finally, we show how the latent scenario embeddings can be used for clustering traffic scenarios and for similarity retrieval.
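A standard way to obtain permutation invariance is per-point feature extraction followed by a symmetric pooling, DeepSets-style. A minimal sketch under that assumption (hypothetical names and shapes, not the paper's architecture):

```python
import numpy as np

def set_encoder(points, w_feat, w_out):
    """Permutation-invariant set embedding: per-element features followed by
    sum pooling, so any ordering of the traffic participants yields the
    same latent code."""
    h = np.tanh(points @ w_feat)              # per-element features
    return np.tanh(h.sum(axis=0) @ w_out)     # sum pooling -> order-invariant

rng = np.random.default_rng(0)
scene = rng.standard_normal((7, 4))           # 7 participants, 4 features each
code = set_encoder(scene, rng.standard_normal((4, 8)), rng.standard_normal((8, 3)))
```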


Intensity-Free Learning of Temporal Point Processes

arXiv.org Machine Learning

Temporal point processes are the dominant paradigm for modeling sequences of events happening at irregular intervals. The standard way of learning in such models is by estimating the conditional intensity function. However, parameterizing the intensity function usually incurs several trade-offs. We show how to overcome the limitations of intensity-based approaches by directly modeling the conditional distribution of inter-event times. We draw on the literature on normalizing flows to design models that are flexible and efficient. We additionally propose a simple mixture model that matches the flexibility of flow-based models, but also permits sampling and computing moments in closed form. The proposed models achieve state-of-the-art performance in standard prediction tasks and are suitable for novel applications, such as learning sequence embeddings and imputing missing data.
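For the mixture variant, both the log-likelihood and sampling are available in closed form. A sketch assuming a log-normal mixture over inter-event times, with fixed illustrative parameters rather than network-predicted ones:

```python
import numpy as np

def lognormal_mixture_logpdf(tau, weights, means, scales):
    """Closed-form log-density of a log-normal mixture over inter-event
    times tau > 0."""
    z = (np.log(tau)[..., None] - means) / scales
    comp = -0.5 * z**2 - np.log(scales * tau[..., None] * np.sqrt(2.0 * np.pi))
    return np.log((weights * np.exp(comp)).sum(axis=-1))

def lognormal_mixture_sample(weights, means, scales, rng):
    """Draw one inter-event time: pick a component, then sample it directly."""
    k = rng.choice(len(weights), p=weights)
    return np.exp(means[k] + scales[k] * rng.standard_normal())

rng = np.random.default_rng(0)
w, m, s = np.array([0.3, 0.7]), np.array([-1.0, 0.5]), np.array([0.4, 0.8])
print(lognormal_mixture_logpdf(np.array([0.5, 2.0]), w, m, s))
print(lognormal_mixture_sample(w, m, s, rng))
```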