
Collaborating Authors

Hamburg


Critiquing DER SPIEGEL: The Four Dilemmas Facing Quality Journalism

Der Spiegel International

Not only that, but information is suddenly everywhere, people are losing trust in news outlets and there is a growing exhaustion with crisis reporting. Serious journalism is under greater pressure than ever before. How is DER SPIEGEL reacting? Quite some time ago, an email landed in my inbox from a former DER SPIEGEL editor. He wanted to pitch me a story and, as I quickly realized, stir things up a bit. Then, a couple of months ago, he approached me personally on the sidelines of an event in Hamburg, perhaps because I still hadn't shown much interest. He said we should meet up for a tea, "or something harder." He is plagued, he told me, each and every week by the wrenching, agonizing decision as to whether he should cancel his subscription to DER SPIEGEL - dismayed by what he described as an "incipient decline under the dictates of late-capitalist sales imperatives" he had observed at his former employer.


Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech

de Oliveira, Danilo, Richter, Julius, Lemercier, Jean-Marie, Welker, Simon, Gerkmann, Timo

arXiv.org Artificial Intelligence

Diffusion models have found great success in generating high quality, natural samples of speech, but their potential for density estimation for speech has so far remained largely unexplored. In this work, we leverage an unconditional diffusion model trained only on clean speech for the assessment of speech quality. We show that the quality of a speech utterance can be assessed by estimating the likelihood of a corresponding sample in the terminating Gaussian distribution, obtained via a deterministic noising process. The resulting method is purely unsupervised, trained only on clean speech, and therefore does not rely on annotations. Our diffusion-based approach leverages clean speech priors to assess quality based on how the input relates to the learned distribution of clean data. Our proposed log-likelihoods show promising results, correlating well with intrusive speech quality metrics such as POLQA and SI-SDR.
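The core idea — scoring quality by the likelihood of an input under a model trained only on clean data — can be sketched in a toy form. The sketch below replaces the paper's unconditional diffusion model and deterministic noising ODE with a simple Gaussian density fitted to "clean" samples; everything here (the data, the density, the scoring function) is an illustrative assumption, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Clean speech" stand-in: samples from a known clean distribution.
clean = rng.normal(0.0, 1.0, size=(1000, 8))

# Fit a simple Gaussian density to the clean data. This Gaussian plays the
# role of the learned prior; the actual paper uses an unconditional
# diffusion model and estimates likelihoods via a deterministic noising
# process ending in a Gaussian.
mu, sigma = clean.mean(axis=0), clean.std(axis=0)

def avg_log_likelihood(x):
    """Mean per-dimension log-density under the clean-data Gaussian."""
    z = (x - mu) / sigma
    return float(np.mean(-0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)))

# A degraded input (clean + additive noise) should score lower, giving an
# unsupervised, annotation-free quality proxy.
test_clean = rng.normal(0.0, 1.0, size=(100, 8))
test_noisy = test_clean + rng.normal(0.0, 1.0, size=(100, 8))

score_clean = avg_log_likelihood(test_clean)
score_noisy = avg_log_likelihood(test_noisy)
```

The same ordering — degraded inputs receiving lower likelihood than clean ones — is what allows the scores to correlate with intrusive metrics without any labels.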


Learning phase-space flows using time-discrete implicit Runge-Kutta PINNs

Corral, Álvaro Fernández, Mendoza, Nicolás, Iske, Armin, Yachmenev, Andrey, Küpper, Jochen

arXiv.org Artificial Intelligence

We present a computational framework for obtaining multidimensional phase-space solutions of systems of non-linear coupled differential equations, using high-order implicit Runge-Kutta Physics- Informed Neural Networks (IRK-PINNs) schemes. Building upon foundational work originally solving differential equations for fields depending on coordinates [J. Comput. Phys. 378, 686 (2019)], we adapt the scheme to a context where the coordinates are treated as functions. This modification enables us to efficiently solve equations of motion for a particle in an external field. Our scheme is particularly useful for explicitly time-independent and periodic fields. We apply this approach to successfully solve the equations of motion for a mass particle placed in a central force field and a charged particle in a periodic electric field.
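For readers unfamiliar with implicit Runge-Kutta schemes, the simplest member of the family the paper builds on is the implicit midpoint rule. The sketch below applies it to a harmonic-oscillator phase-space flow — a plain numerical integrator, with no neural network involved; the step size, iteration count, and fixed-point solver are illustrative choices.

```python
import numpy as np

def f(y):
    """Harmonic-oscillator vector field in phase space: y = (q, p)."""
    q, p = y
    return np.array([p, -q])

def implicit_midpoint_step(y, h, iters=50):
    """One implicit Runge-Kutta (midpoint) step, solved by fixed-point iteration:
    y_next = y + h * f((y + y_next) / 2)."""
    y_next = y + h * f(y)            # explicit Euler as initial guess
    for _ in range(iters):
        y_next = y + h * f(0.5 * (y + y_next))
    return y_next

y = np.array([1.0, 0.0])             # start at q = 1, p = 0
h, n_steps = 0.1, 200
for _ in range(n_steps):
    y = implicit_midpoint_step(y, h)

# The midpoint rule conserves quadratic invariants, so the oscillator's
# energy stays at its initial value 0.5 over the whole trajectory.
energy = 0.5 * (y[0]**2 + y[1]**2)
```

This conservation property is one reason implicit schemes are attractive for long phase-space integrations of the kind the IRK-PINN framework targets.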


CrashFormer: A Multimodal Architecture to Predict the Risk of Crash

Monsefi, Amin Karimi, Shiri, Pouya, Mohammadshirazi, Ahmad, Monsefi, Nastaran Karimi, Davies, Ron, Moosavi, Sobhan, Ramnath, Rajiv

arXiv.org Artificial Intelligence

Reducing traffic accidents is a crucial global public safety concern. Accident prediction is key to improving traffic safety, enabling proactive measures to be taken before a crash occurs, and informing safety policies, regulations, and targeted interventions. Despite numerous studies on accident prediction over the past decades, many have limitations in terms of generalizability, reproducibility, or feasibility for practical use due to input data or problem formulation. To address existing shortcomings, we propose CrashFormer, a multi-modal architecture that utilizes comprehensive (but relatively easy to obtain) inputs such as the history of accidents, weather information, map images, and demographic information. The model predicts the future risk of accidents on a reasonably acceptable cadence (i.e., every six hours) for a geographical location of 5.161 square kilometers. CrashFormer is composed of five components: a sequential encoder to utilize historical accidents and weather data, an image encoder to use map imagery data, a raw data encoder to utilize demographic information, a feature fusion module for aggregating the encoded features, and a classifier that accepts the aggregated data and makes predictions accordingly. Results from extensive real-world experiments in 10 major US cities show that CrashFormer outperforms state-of-the-art sequential and non-sequential models by 1.8% in F1-score on average when using ``sparse'' input data.
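The five-component design — per-modality encoders, concatenation-based fusion, then a classifier — can be illustrated with a minimal numpy sketch. All names, dimensions, and the choice of random linear encoders below are stand-ins; the real CrashFormer uses learned sequential, image, and raw-data encoders.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in for a learned encoder: a linear map plus nonlinearity.
def encode(x, W):
    return np.tanh(x @ W)

x_seq  = rng.normal(size=(4, 16))   # e.g. accident/weather history features
x_img  = rng.normal(size=(4, 32))   # e.g. flattened map-image features
x_demo = rng.normal(size=(4, 8))    # e.g. demographic features

W_seq, W_img, W_demo = (rng.normal(size=(d, 8)) for d in (16, 32, 8))
W_out = rng.normal(size=(24, 1))

# Feature fusion by concatenating the encoded modalities, followed by a
# linear classifier head producing a crash-risk probability per region.
fused = np.concatenate(
    [encode(x_seq, W_seq), encode(x_img, W_img), encode(x_demo, W_demo)],
    axis=1,
)
risk = 1.0 / (1.0 + np.exp(-(fused @ W_out)))
```

Concatenation fusion keeps each modality's contribution inspectable before the classifier, which is one common motivation for this aggregation style.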


Explainable Trajectory Representation through Dictionary Learning

Tang, Yuanbo, Peng, Zhiyuan, Li, Yang

arXiv.org Artificial Intelligence

Trajectory representation learning on a network enhances our understanding of vehicular traffic patterns and benefits numerous downstream applications. Existing approaches using classic machine learning or deep learning embed trajectories as dense vectors, which lack interpretability and are inefficient to store and analyze in downstream tasks. In this paper, an explainable trajectory representation learning framework through dictionary learning is proposed. Given a collection of trajectories on a network, it extracts a compact dictionary of commonly used subpaths called "pathlets", which optimally reconstruct each trajectory by simple concatenations. The resulting representation is naturally sparse and encodes strong spatial semantics. Theoretical analysis of our proposed algorithm is conducted to provide a probabilistic bound on the estimation error of the optimal dictionary. A hierarchical dictionary learning scheme is also proposed to ensure the algorithm's scalability on large networks, leading to a multi-scale trajectory representation. Our framework is evaluated on two large-scale real-world taxi datasets. Compared to previous work, the dictionary learned by our method is more compact and achieves a better reconstruction rate for new trajectories. We also demonstrate the promising performance of this method in downstream tasks, including trip time prediction and data compression.
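The reconstruction step — expressing a trajectory as a concatenation of dictionary pathlets — can be sketched with a greedy matcher. This is only an illustration of the representation, not the paper's optimization algorithm (which learns the dictionary itself and uses a different reconstruction objective); the dictionary and trajectory below are made up.

```python
def reconstruct(trajectory, dictionary):
    """Greedily cover a node sequence with dictionary 'pathlets'
    (contiguous subpaths); consecutive pathlets overlap by one shared
    endpoint node. Returns the pathlets used, or None if no cover exists."""
    i, used = 0, []
    while i < len(trajectory) - 1:
        # Try the longest pathlet that matches at position i.
        for length in range(len(trajectory) - i, 1, -1):
            candidate = tuple(trajectory[i:i + length])
            if candidate in dictionary:
                used.append(candidate)
                i += length - 1
                break
        else:
            return None   # no pathlet starts the remaining segment
    return used

# Hypothetical pathlet dictionary over a small road network.
dictionary = {(1, 2, 3), (3, 4), (4, 5, 6), (1, 2), (2, 3)}
trajectory = [1, 2, 3, 4, 5, 6]
cover = reconstruct(trajectory, dictionary)   # sparse, human-readable code
```

The list of pathlet IDs in `cover` is exactly the kind of sparse, interpretable code the abstract contrasts with opaque dense embeddings.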


Choose A Table: Tensor Dirichlet Process Multinomial Mixture Model with Graphs for Passenger Trajectory Clustering

Li, Ziyue, Yan, Hao, Zhang, Chen, Sun, Lijun, Ketter, Wolfgang, Tsung, Fugee

arXiv.org Machine Learning

Passenger clustering based on trajectory records is essential for transportation operators. However, existing methods cannot easily cluster the passengers due to the hierarchical structure of the passenger trip information, including multiple trips within each passenger and multi-dimensional information about each trip. Furthermore, existing approaches rely on an accurate specification of the clustering number to start. Finally, existing methods do not consider spatial semantic graphs such as geographical proximity and functional similarity between the locations. In this paper, we propose a novel tensor Dirichlet Process Multinomial Mixture model with graphs, which can preserve the hierarchical structure of the multi-dimensional trip information and cluster them in a unified one-step manner with the ability to determine the number of clusters automatically. The spatial graphs are utilized in community detection to link the semantic neighbors. We further propose a tensor version of the Collapsed Gibbs Sampling method with a minimum cluster size requirement. A case study based on Hong Kong metro passenger data demonstrates how the number of clusters evolves automatically and shows improved cluster quality, measured by within-cluster compactness and cross-cluster separation. The code is available at https://github.com/bonaldli/TensorDPMM-G.
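The "choose a table" metaphor in the title refers to the Chinese restaurant process, the seating scheme behind Dirichlet-process mixtures: each new passenger joins an existing cluster with probability proportional to its size, or opens a new one with probability proportional to a concentration parameter. The sketch below simulates only this prior (no trip data, no graphs, no Gibbs sampling), to show how the number of clusters grows with the data rather than being fixed in advance.

```python
import random

def crp(n_customers, alpha, seed=0):
    """Chinese restaurant process: customer i sits at table k with
    probability size_k / (i + alpha), or at a new table with
    probability alpha / (i + alpha)."""
    rng = random.Random(seed)
    tables = []        # tables[k] = number of customers at table k
    assignments = []
    for _ in range(n_customers):
        r = rng.uniform(0, sum(tables) + alpha)
        acc = 0.0
        for k, size in enumerate(tables):
            acc += size
            if r < acc:
                tables[k] += 1       # join an existing table
                assignments.append(k)
                break
        else:
            tables.append(1)         # open a new table
            assignments.append(len(tables) - 1)
    return assignments, tables

assignments, tables = crp(200, alpha=1.0)
```

Because popular tables attract more customers ("rich get richer"), the expected number of occupied tables grows only logarithmically in the number of passengers, which is what lets the model infer the cluster count automatically.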


SRAI: Towards Standardization of Geospatial AI

Gramacki, Piotr, Leśniara, Kacper, Raczycki, Kamil, Woźniak, Szymon, Przymus, Marcin, Szymański, Piotr

arXiv.org Artificial Intelligence

Spatial Representations for Artificial Intelligence (srai) is a Python library for working with geospatial data. The library can download geospatial data, split a given area into micro-regions using multiple algorithms, and train an embedding model using various architectures. It includes baseline models as well as more complex methods from published works. These capabilities make it possible to use srai in a complete pipeline for solving geospatial tasks. The proposed library is a first step toward standardizing the geospatial AI toolset. It is fully open-source and published under the Apache 2.0 license.


DiffSTG: Probabilistic Spatio-Temporal Graph Forecasting with Denoising Diffusion Models

Wen, Haomin, Lin, Youfang, Xia, Yutong, Wan, Huaiyu, Wen, Qingsong, Zimmermann, Roger, Liang, Yuxuan

arXiv.org Artificial Intelligence

Spatio-temporal graph neural networks (STGNN) have emerged as the dominant model for spatio-temporal graph (STG) forecasting. Despite their success, they fail to model intrinsic uncertainties within STG data, which cripples their practicality in downstream tasks for decision-making. To this end, this paper focuses on probabilistic STG forecasting, which is challenging due to the difficulty in modeling uncertainties and complex ST dependencies. In this study, we present the first attempt to generalize the popular denoising diffusion probabilistic models to STGs, leading to a novel non-autoregressive framework called DiffSTG, along with the first denoising network UGnet for STG in the framework. Our approach combines the spatio-temporal learning capabilities of STGNNs with the uncertainty measurements of diffusion models. Extensive experiments validate that DiffSTG reduces the Continuous Ranked Probability Score (CRPS) by 4%-14%, and Root Mean Squared Error (RMSE) by 2%-7% over existing methods on three real-world datasets.
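The CRPS metric used to evaluate DiffSTG rewards probabilistic forecasts that are both sharp and well calibrated. For an ensemble of samples it has a simple empirical form, sketched below with made-up numbers (this is just the metric, not the DiffSTG model).

```python
import numpy as np

def crps_ensemble(samples, y):
    """Empirical CRPS of an ensemble forecast against observation y:
    CRPS ~= E|X - y| - 0.5 * E|X - X'|, lower is better."""
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2

# A sharp, well-centred ensemble scores far lower than a biased one.
obs = 1.0
good = crps_ensemble([0.9, 1.0, 1.1], obs)   # centred on the observation
bad  = crps_ensemble([2.9, 3.0, 3.1], obs)   # same spread, biased by +2
```

Because diffusion models produce forecasts by sampling, CRPS over the generated ensemble is a natural headline metric alongside RMSE on the ensemble mean.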


Earth Virtualization Engines -- A Technical Perspective

Hoefler, Torsten, Stevens, Bjorn, Prein, Andreas F., Baehr, Johanna, Schulthess, Thomas, Stocker, Thomas F., Taylor, John, Klocke, Daniel, Manninen, Pekka, Forster, Piers M., Kölling, Tobias, Gruber, Nicolas, Anzt, Hartwig, Frauen, Claudia, Ziemen, Florian, Klöwer, Milan, Kashinath, Karthik, Schär, Christoph, Fuhrer, Oliver, Lawrence, Bryan N.

arXiv.org Artificial Intelligence

Participants of the Berlin Summit on Earth Virtualization Engines (EVEs) discussed ideas and concepts to improve our ability to cope with climate change. EVEs aim to provide interactive and accessible climate simulations and data for a wide range of users. They combine high-resolution physics-based models with machine learning techniques to improve the fidelity, efficiency, and interpretability of climate projections. At their core, EVEs offer a federated data layer that enables simple and fast access to exabyte-sized climate data through simple interfaces. In this article, we summarize the technical challenges and opportunities for developing EVEs, and argue that they are essential for addressing the consequences of climate change. We are all witnessing the effects of climate change. Hotter summers, prolonged droughts, massive flooding, and ocean heat waves are examples of extreme weather and climate events that are growing in frequency and intensity. Many agree that addressing climate mitigation and adaptation is the biggest problem humanity faces today. A large group of scientists and practitioners from different climate-related domains, including some computer scientists, got together for a week in Berlin this July to discuss the concept of "Earth Virtualization Engines" (EVEs). The summit kicked off with the question: "If climate change is the most critical problem today, why are we not using the largest computers to help solve it?"


Computing excited states of molecules using normalizing flows

Saleh, Yahya, Corral, Álvaro Fernández, Iske, Armin, Küpper, Jochen, Yachmenev, Andrey

arXiv.org Artificial Intelligence

We present a new nonlinear variational framework for simultaneously computing ground and excited states of quantum systems. Our approach is based on approximating wavefunctions in the linear span of basis functions that are augmented and optimized via composition with normalizing flows. The accuracy and efficiency of our approach are demonstrated in the calculations of a large number of vibrational states of the triatomic H$_2$S molecule as well as ground and several excited electronic states of prototypical one-electron systems including the hydrogen atom, the molecular hydrogen ion, and a carbon atom in a single-active-electron approximation. The results demonstrate significant improvements in the accuracy of energy predictions and accelerated basis-set convergence even when using normalizing flows with a small number of parameters. The present approach can also be seen as the optimization of a set of intrinsic coordinates that best capture the underlying physics within the given basis set.
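A key property behind composing a basis with a normalizing flow is the change of variables: if {phi_n} is orthonormal and g is invertible, then psi_n(x) = phi_n(g(x)) * sqrt(g'(x)) is orthonormal too, so the augmented basis stays a valid basis while the flow reshapes where it resolves the wavefunction. The 1-D check below uses a fixed, hand-picked map as a stand-in for a trained flow; the paper optimizes the map's parameters variationally.

```python
import numpy as np

x = np.linspace(-12.0, 12.0, 20001)
dx = x[1] - x[0]

def flow(x):
    """A simple monotone (hence invertible) 1-D map standing in for a
    normalizing flow; 0.3 * tanh is an arbitrary illustrative choice."""
    return x + 0.3 * np.tanh(x)

def flow_jac(x):
    return 1.0 + 0.3 / np.cosh(x)**2

# Two orthonormal Hermite functions composed with the flow, including the
# sqrt-Jacobian factor that preserves orthonormality under u = flow(x).
g, dg = flow(x), flow_jac(x)
h0 = np.pi**-0.25 * np.exp(-g**2 / 2)
psi0 = h0 * np.sqrt(dg)
psi1 = np.sqrt(2.0) * g * h0 * np.sqrt(dg)

def inner(a, b):
    """Inner product by quadrature on the uniform grid (tails are ~0)."""
    return float(np.sum(a * b) * dx)

overlap = np.array([[inner(a, b) for b in (psi0, psi1)] for a in (psi0, psi1)])
```

Since the overlap matrix stays the identity for any invertible map, the flow parameters can be optimized freely without re-orthogonalizing the basis — the property the variational framework exploits.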