
Collaborating Authors: Grötschla, Florian


High-Fidelity Music Vocoder using Neural Audio Codecs

arXiv.org Artificial Intelligence

While neural vocoders have made significant progress in high-fidelity speech synthesis, their application to polyphonic music has remained underexplored. In this work, we propose DisCoder, a neural vocoder that leverages a generative adversarial encoder-decoder architecture informed by a neural audio codec to reconstruct high-fidelity 44.1 kHz audio from mel spectrograms. Our approach first transforms the mel spectrogram into a lower-dimensional representation aligned with the Descript Audio Codec (DAC) latent space before reconstructing it to an audio signal using a fine-tuned DAC decoder. DisCoder achieves state-of-the-art performance in music synthesis on several objective metrics and in a MUSHRA listening study. Our approach also shows competitive performance in speech synthesis, highlighting its potential as a universal vocoder.
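
To make the two-stage design concrete, here is a minimal PyTorch sketch of the pipeline's shape: a convolutional projection from mel frames into a codec-aligned latent, which a fine-tuned DAC decoder would then turn into audio. The module name, layer sizes, N_MELS, and LATENT_DIM are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of a DisCoder-style pipeline (hypothetical module and sizes;
# the real model uses a GAN objective and a fine-tuned DAC decoder).
import torch
import torch.nn as nn

N_MELS = 128        # mel bins (assumption)
LATENT_DIM = 1024   # dimensionality of the DAC latent space (assumption)

class MelToLatent(nn.Module):
    """Projects mel-spectrogram frames into a codec-aligned latent space."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(N_MELS, 512, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv1d(512, LATENT_DIM, kernel_size=3, padding=1),
        )

    def forward(self, mel):              # mel: (batch, N_MELS, frames)
        return self.net(mel)             # (batch, LATENT_DIM, frames)

mel = torch.randn(1, N_MELS, 200)
latent = MelToLatent()(mel)
# A fine-tuned DAC decoder would map `latent` back to a 44.1 kHz waveform:
# audio = dac_decoder(latent)
print(latent.shape)                      # torch.Size([1, 1024, 200])
```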


Audio Atlas: Visualizing and Exploring Audio Datasets

arXiv.org Artificial Intelligence

We introduce Audio Atlas, an interactive web application for visualizing audio data using text-audio embeddings. Audio Atlas is designed to facilitate the exploration and analysis of audio datasets using a contrastive embedding model and a vector database for efficient data management and semantic search. The system maps audio embeddings into a two-dimensional space and leverages DeepScatter for dynamic visualization. Designed for extensibility, Audio Atlas allows easy integration of new datasets, enabling users to better understand their audio data and identify both patterns and outliers. We open-source the codebase of Audio Atlas, and provide an initial implementation containing various audio and music datasets.
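
A rough sketch of the underlying mechanics: clips and text queries live in a shared embedding space, points are projected to 2D for plotting, and search ranks clips by cosine similarity. The embeddings below are random stand-ins for the output of a contrastive text-audio model, and the PCA projection is an assumption; the actual system uses a vector database and DeepScatter for rendering.

```python
# Sketch of the Audio Atlas idea: shared text-audio embeddings projected to
# 2D for plotting, plus cosine-similarity search over the collection.
import numpy as np

rng = np.random.default_rng(0)
audio_emb = rng.normal(size=(1000, 512))             # one row per audio clip
audio_emb /= np.linalg.norm(audio_emb, axis=1, keepdims=True)

# 2-D projection via PCA (DeepScatter then renders the points interactively).
centered = audio_emb - audio_emb.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
coords_2d = centered @ vt[:2].T                      # (1000, 2) positions

# Semantic search: embed a text query with the same model, rank by cosine
# similarity (rows are unit-normalized, so a dot product suffices).
query_emb = rng.normal(size=512)
query_emb /= np.linalg.norm(query_emb)
top5 = np.argsort(audio_emb @ query_emb)[::-1][:5]
print(coords_2d.shape, top5)
```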


Benchmarking Positional Encodings for GNNs and Graph Transformers

arXiv.org Artificial Intelligence

Recent advances in Graph Neural Networks (GNNs) and Graph Transformers (GTs) have been driven by innovations in architectures and Positional Encodings (PEs), which are critical for augmenting node features and capturing graph topology. PEs are essential for GTs, which would otherwise lose topological information in the absence of message passing. However, PEs are often tested alongside novel architectures, making it difficult to isolate their effect on established models. To address this, we present a comprehensive benchmark of PEs in a unified framework that includes both message-passing GNNs and GTs. We also establish theoretical connections between MPNNs and GTs and introduce a sparsified GRIT attention mechanism to examine the influence of global connectivity. Our findings demonstrate that previously untested combinations of GNN architectures and PEs can outperform existing methods and offer a more comprehensive picture of the state of the art. To support future research and experimentation in our framework, we make the code publicly available.

Graph machine learning has traditionally relied on message-passing neural networks (MPNNs), which work through iterative rounds of neighborhood aggregation (Kipf & Welling, 2016). In each round, nodes update their states by incorporating information from their neighbors along with their own current states. While effective in capturing local graph structures, this approach can struggle with modeling long-range dependencies. Graph Transformer (GT) architectures use full attention mechanisms to circumvent this, but require new methods to integrate graph topology information (Dwivedi & Bresson, 2020). This is similar to how positional encodings (PEs) in Natural Language Processing (NLP) represent token positions within sequences (Vaswani et al., 2017). However, encoding positional information in graphs is more complex than in sequences. Ideally, positional encodings should allow the reconstruction of the graph's topology from node features and provide useful inductive biases to improve performance (Black et al., 2024).
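
As an illustration of what a PE is, the sketch below computes one of the most widely benchmarked choices, Laplacian eigenvector encodings, and concatenates them to the node features; this is the generic recipe, not the paper's specific framework, and the graph and feature sizes are arbitrary.

```python
# Laplacian eigenvector positional encodings, appended to node features.
import numpy as np

def laplacian_pe(adj, k):
    """k smallest non-trivial eigenvectors of the normalized graph Laplacian."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(lap)
    return eigvecs[:, 1:k + 1]      # skip the trivial constant eigenvector

# 4-cycle example; each node gets a k-dimensional positional feature.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
pe = laplacian_pe(adj, k=2)
node_features = np.ones((4, 3))                  # stand-in input features
augmented = np.concatenate([node_features, pe], axis=1)
print(augmented.shape)                           # (4, 5)
```

Note that eigenvectors are only defined up to sign, which is why PE-aware models often randomize or canonicalize the sign during training.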


SNAC: Multi-Scale Neural Audio Codec

arXiv.org Artificial Intelligence

Neural audio codecs have recently gained popularity because they can represent audio signals with high fidelity at very low bitrates, making it feasible to use language modeling approaches for audio generation and understanding. Residual Vector Quantization (RVQ) has become the standard technique for neural audio compression using a cascade of VQ codebooks. This paper proposes the Multi-Scale Neural Audio Codec, a simple extension of RVQ where the quantizers can operate at different temporal resolutions. By applying a hierarchy of quantizers at variable frame rates, the codec adapts to the audio structure across multiple timescales. This leads to more efficient compression, as demonstrated by extensive objective and subjective evaluations. The code and model weights are open-sourced at https://github.com/hubertsiuzdak/snac.
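
The core idea can be sketched in a few lines: standard RVQ quantizes the residual repeatedly at a single frame rate, while the multi-scale variant lets each level quantize a downsampled (or full-rate) view of the residual. The codebooks, strides, and shapes below are illustrative stand-ins, not SNAC's actual configuration; see the linked repository for the real implementation.

```python
# Sketch of multi-scale residual vector quantization: each VQ level sees the
# residual at its own temporal resolution.
import torch
import torch.nn.functional as F

def vq(x, codebook):
    """Nearest-codeword quantization of (batch, dim, time) features."""
    flat = x.permute(0, 2, 1).reshape(-1, x.shape[1])          # (B*T, dim)
    idx = torch.cdist(flat, codebook).argmin(dim=1)
    q = codebook[idx].reshape(x.shape[0], x.shape[2], x.shape[1])
    return q.permute(0, 2, 1)

torch.manual_seed(0)
z = torch.randn(1, 8, 16)                  # encoder output: (B, dim, frames)
strides = [4, 2, 1]                        # each level runs at its own rate
codebooks = [torch.randn(32, 8) for _ in strides]

residual, reconstruction = z, torch.zeros_like(z)
for s, cb in zip(strides, codebooks):
    coarse = F.avg_pool1d(residual, kernel_size=s) if s > 1 else residual
    q = vq(coarse, cb)                     # quantize at the coarse rate
    q = F.interpolate(q, size=z.shape[-1], mode="nearest") if s > 1 else q
    reconstruction = reconstruction + q    # accumulate across levels
    residual = residual - q                # next level refines what is left
print(reconstruction.shape)                # torch.Size([1, 8, 16])
```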


Benchmarking GNNs Using Lightning Network Data

arXiv.org Artificial Intelligence

The Bitcoin Lightning Network is a layer 2 protocol designed to facilitate fast and inexpensive Bitcoin transactions. It operates by establishing channels between users, where Bitcoin is locked and transactions are conducted off-chain until the channels are closed, with only the initial and final transactions recorded on the blockchain. Routing transactions through intermediary nodes is crucial for users without direct channels, allowing these routing nodes to collect fees for their services. Nodes announce their channels to the network, forming a graph with channels as edges. In this paper, we analyze the graph structure of the Lightning Network and investigate the statistical relationships between node properties using machine learning, particularly Graph Neural Networks (GNNs). We formulate a series of tasks to explore these relationships and provide benchmarks for GNN architectures, demonstrating how topological and neighbor information enhances performance. Our evaluation of several models reveals the effectiveness of GNNs in these tasks and highlights the insights gained from their application.
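
To illustrate the task setup, the sketch below predicts a node property from a node's own features combined with mean-aggregated neighbor features, the basic message-passing step a GCN performs on such a channel graph. The graph, features, and weights are random stand-ins rather than Lightning Network data.

```python
# One GCN-style round on a channel graph: aggregate neighbor statistics,
# then apply a linear readout to predict a node property (e.g. fees).
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_feats = 5, 4
adj = (rng.random((n_nodes, n_nodes)) < 0.4).astype(float)
adj = np.maximum(adj, adj.T)             # channels are undirected edges
np.fill_diagonal(adj, 1.0)               # self-loops keep a node's own state

x = rng.normal(size=(n_nodes, n_feats))  # per-node channel statistics
deg = adj.sum(axis=1, keepdims=True)
w = rng.normal(size=(n_feats, 1))

h = (adj @ x) / deg                      # mean over neighborhood (incl. self)
pred = h @ w                             # predicted node property
print(pred.ravel())
```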


CoRe-GD: A Hierarchical Framework for Scalable Graph Visualization with GNNs

arXiv.org Artificial Intelligence

Graph Visualization, also known as Graph Drawing, aims to find geometric embeddings of graphs that optimize certain criteria. Stress is a widely used metric that is minimized when every pair of nodes is positioned at its shortest-path distance. However, stress optimization presents computational challenges due to its inherent complexity and is usually solved using heuristics in practice. We introduce a scalable Graph Neural Network (GNN) based Graph Drawing framework with sub-quadratic runtime that can learn to optimize stress. Inspired by classical stress optimization techniques and force-directed layout algorithms, we create a coarsening hierarchy for the input graph. Beginning at the coarsest level, we iteratively refine and un-coarsen the layout until we generate an embedding for the original graph. To enhance information propagation within the network, we propose a novel positional rewiring technique based on intermediate node positions. Our empirical evaluation demonstrates that the framework achieves state-of-the-art performance while remaining scalable.
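
For reference, the stress objective itself is easy to state in code: a layout is penalized whenever the Euclidean distance between two nodes deviates from their shortest-path distance, conventionally weighted by d_ij^-2. The sketch below uses that standard formulation; the paper's exact weighting may differ.

```python
# Stress of a 2-D layout against shortest-path distances.
import numpy as np

def stress(pos, d):
    """pos: (n, 2) node coordinates; d: (n, n) shortest-path distances."""
    total = 0.0
    n = len(pos)
    for i in range(n):
        for j in range(i + 1, n):
            eucl = np.linalg.norm(pos[i] - pos[j])
            total += (eucl - d[i, j]) ** 2 / d[i, j] ** 2
    return total

# Path graph on 3 nodes: the straight-line layout has zero stress.
d = np.array([[0., 1., 2.],
              [1., 0., 1.],
              [2., 1., 0.]])
print(stress(np.array([[0., 0.], [1., 0.], [2., 0.]]), d))  # 0.0
print(stress(np.array([[0., 0.], [1., 1.], [0., 2.]]), d))  # > 0
```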


Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence

arXiv.org Artificial Intelligence

In Federated Reinforcement Learning (FRL), agents aim to collaboratively learn a common task, while each agent is acting in its local environment without exchanging raw trajectories. Existing approaches for FRL either (a) do not provide any fault-tolerance guarantees (against misbehaving agents), or (b) rely on a trusted central agent (a single point of failure) for aggregating updates. We provide the first decentralized Byzantine fault-tolerant FRL method. Towards this end, we first propose a new centralized Byzantine fault-tolerant policy gradient (PG) algorithm that improves over existing methods by relying only on assumptions standard for non-fault-tolerant PG. Then, as our main contribution, we show how a combination of robust aggregation and Byzantine-resilient agreement methods can be leveraged in order to eliminate the need for a trusted central entity. Since our results represent the first sample complexity analysis for Byzantine fault-tolerant decentralized federated non-convex optimization, our technical contributions may be of independent interest. Finally, we corroborate our theoretical results experimentally for common RL environments, demonstrating the speed-up of decentralized federations w.r.t. the number of participating agents and resilience against various Byzantine attacks.
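
A minimal sketch of the robust-aggregation ingredient, assuming a coordinate-wise trimmed mean as the aggregator (the paper's precise rule and the Byzantine-resilient agreement protocol are not reproduced here): with up to f corrupted submissions, trimming the extremes per coordinate keeps the aggregate close to the honest gradients.

```python
# Coordinate-wise trimmed mean over policy-gradient submissions; tolerates
# up to f Byzantine agents. Gradients are stand-in numbers.
import numpy as np

def trimmed_mean(grads, f):
    """grads: (n_agents, dim); drop the f largest and f smallest per coord."""
    sorted_g = np.sort(grads, axis=0)
    return sorted_g[f:len(grads) - f].mean(axis=0)

rng = np.random.default_rng(0)
honest = rng.normal(loc=1.0, scale=0.1, size=(8, 3))   # agree on ~1.0
byzantine = np.full((2, 3), -100.0)                    # adversarial updates
grads = np.vstack([honest, byzantine])

print(grads.mean(axis=0))        # plain averaging is wrecked by the attack
print(trimmed_mean(grads, f=2))  # robust estimate stays near 1.0
```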


SALSA-CLRS: A Sparse and Scalable Benchmark for Algorithmic Reasoning

arXiv.org Artificial Intelligence

We introduce an extension to the CLRS algorithmic learning benchmark, prioritizing scalability and the utilization of sparse representations. Many algorithms in CLRS require global memory or information exchange, mirrored in its execution model, which constructs fully connected rather than sparse graphs based on the underlying problem. Although CLRS aims to assess how effectively learned algorithms generalize to larger instances, the existing execution model becomes a significant constraint: its demanding memory requirements and runtime make it hard to scale. However, many important algorithms do not demand a fully connected graph; these algorithms, primarily distributed in nature, align closely with the message-passing paradigm employed by Graph Neural Networks. Hence, we propose SALSA-CLRS, an extension of the current CLRS benchmark designed specifically with scalability and sparseness in mind. Our approach includes adapted algorithms from the original CLRS benchmark and introduces new problems from distributed and randomized algorithms. Moreover, we perform a thorough empirical evaluation of our benchmark.
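
To see why such algorithms fit sparse message passing, consider BFS: in each synchronous round, every node updates its distance from its neighbors' current values, exactly the computation pattern of one GNN layer on a sparse graph. The plain-Python sketch below is illustrative, not SALSA-CLRS code.

```python
# BFS as synchronous message-passing rounds on a sparse adjacency list.
import math

def bfs_rounds(adj, source):
    """adj: {node: [neighbors]}; returns hop distances via synchronous rounds."""
    dist = {v: math.inf for v in adj}
    dist[source] = 0
    while True:                            # one iteration = one GNN layer
        new_dist = {
            v: min(dist[v],
                   min((dist[u] + 1 for u in adj[v]), default=math.inf))
            for v in adj
        }
        if new_dist == dist:               # fixed point reached
            return dist
        dist = new_dist

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(bfs_rounds(adj, source=0))           # {0: 0, 1: 1, 2: 2, 3: 3}
```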


SURF: A Generalization Benchmark for GNNs Predicting Fluid Dynamics

arXiv.org Artificial Intelligence

Simulating fluid dynamics is crucial for the design and development process, ranging from simple valves to complex turbomachinery. Accurately solving the underlying physical equations is computationally expensive. Therefore, learning-based solvers that model interactions on meshes have gained interest due to their promising speed-ups. However, it is unknown to what extent these models truly understand the underlying physical principles and can generalize rather than interpolate. Generalization is a key requirement for a general-purpose fluid simulator, which should adapt to different topologies, resolutions, or thermodynamic ranges. We propose SURF, a benchmark designed to test the generalization of learned graph-based fluid simulators. SURF comprises individual datasets and provides specific performance and generalization metrics for evaluating and comparing different models. We empirically demonstrate the applicability of SURF by thoroughly investigating the two state-of-the-art graph-based models, yielding new insights into their generalization.
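
One simple way to quantify generalization in this spirit, sketched below with assumed stand-in numbers rather than SURF's actual metrics, is to compare a model's rollout error on a shifted split (new topology, resolution, or thermodynamic range) against its in-distribution error.

```python
# Hedged sketch of a generalization-gap metric for learned simulators:
# ratio of out-of-distribution rollout error to in-distribution error.
import numpy as np

def rollout_mse(pred, target):
    """Mean squared error over a trajectory: (steps, nodes, fields)."""
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
target = rng.normal(size=(10, 50, 3))
pred_in = target + rng.normal(scale=0.05, size=target.shape)   # in-dist.
pred_ood = target + rng.normal(scale=0.15, size=target.shape)  # shifted split

gap = rollout_mse(pred_ood, target) / rollout_mse(pred_in, target)
print(f"generalization gap ratio: {gap:.1f}x")   # roughly 9x here
```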


Flood and Echo: Algorithmic Alignment of GNNs with Distributed Computing

arXiv.org Artificial Intelligence

Graph Neural Networks are a natural fit for learning algorithms. They can directly represent tasks through an abstract but versatile graph structure and handle inputs of different sizes. This opens up the possibility for scaling and extrapolation to larger graphs, one of the most important advantages of an algorithm. However, this raises two core questions: i) How can we enable nodes to gather the required information in a given graph (information exchange), even if it is far away, and ii) how can we design an execution framework which enables this information exchange for extrapolation to larger graph sizes (algorithmic alignment for extrapolation)? We answer these questions with the Flood and Echo Net, an execution framework inspired by the flooding and echo mechanisms of distributed computing. Through its sparse but parallel activations it is provably more efficient in terms of message complexity. We study the proposed model and provide both empirical evidence and theoretical insights in terms of its expressiveness, efficiency, information exchange and ability to extrapolate.

We study the problem of algorithm learning using Graph Neural Networks. The concept of an algorithm is best understood as a sequence of instructions which can be applied to compute a desired output given the respective input. Algorithms have the advantage that they work correctly across their entire domain. If we want to multiply two numbers, we can easily illustrate and explain the multiplication algorithm using small numbers. However, the same procedure generalizes: the algorithm can be used to extrapolate and multiply much larger numbers using the same algorithmic steps. Algorithm learning aims to grasp these underlying algorithmic principles and incorporate them into machine learning architectures. Therefore, the ability to process different input sizes and to extrapolate is at the core of our study. Graphs, and by extension GNNs, naturally lend themselves to the study of algorithms, as many algorithmic problems can be represented as graphs. Moreover, they can inherently capture instances of different sizes, which allows us to study extrapolation. GNNs follow the message-passing paradigm, which closely corresponds to computation models studied in distributed computing.
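
The flooding/echo pattern that motivates the execution framework is easy to sketch: activity expands outward from a source node in waves and then contracts back, so each phase touches only the current wavefront rather than all nodes. The sketch below is a plain-Python illustration of that distributed-computing pattern, not the Flood and Echo Net itself.

```python
# Flood-and-echo activation pattern: an outward BFS wave from the source
# (flood) followed by a wave returning toward it (echo).
from collections import deque

def flood_and_echo(adj, source):
    """adj: {node: [neighbors]}. Returns the node activation order."""
    # Flood: BFS wavefronts expand outward from the source.
    dist, frontier, order = {source: 0}, deque([source]), []
    while frontier:
        v = frontier.popleft()
        order.append(("flood", v))
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                frontier.append(u)
    # Echo: wavefronts contract back toward the source.
    for v in sorted(dist, key=dist.get, reverse=True):
        order.append(("echo", v))
    return order

adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(flood_and_echo(adj, source=0))
```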