AITopics | Stober, Sebastian

Collaborating Authors

Stober, Sebastian

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Generative AI Training and Copyright Law

Dornis, Tim W., Stober, Sebastian

arXiv.org Artificial IntelligenceFeb-21-2025

Training generative AI models requires extensive amounts of data. A common practice is to collect such data through web scraping. Yet, much of what has been and is collected is copyright protected. Its use may be copyright infringement. In the USA, AI developers rely on "fair use" and in Europe, the prevailing view is that the exception for "Text and Data Mining" (TDM) applies. In a recent interdisciplinary tandem-study, we have argued in detail that this is actually not the case because generative AI training fundamentally differs from TDM. In this article, we share our main findings and the implications for both public and corporate research on generative models. We further discuss how the phenomenon of training data memorization leads to copyright issues independently from the "fair use" and TDM exceptions. Finally, we outline how the ISMIR could contribute to the ongoing discussion about fair practices with respect to generative AI that satisfy all stakeholders.

machine learning, natural language, training data, (19 more...)

arXiv.org Artificial Intelligence

2502.15858

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report (0.50)

Industry: Law > Intellectual Property & Technology Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Add feedback

TransferLight: Zero-Shot Traffic Signal Control on any Road-Network

Schmidt, Johann, Dreyer, Frank, Hashimi, Sayed Abid, Stober, Sebastian

arXiv.org Artificial IntelligenceDec-23-2024

Traffic signal control plays a crucial role in urban mobility. However, existing methods often struggle to generalize beyond their training environments to unseen scenarios with varying traffic dynamics. We present TransferLight, a novel framework designed for robust generalization across road-networks, diverse traffic conditions and intersection geometries. At its core, we propose a log-distance reward function, offering spatially-aware signal prioritization while remaining adaptable to varied lane configurations - overcoming the limitations of traditional pressure-based rewards. Our hierarchical, heterogeneous, and directed graph neural network architecture effectively captures granular traffic dynamics, enabling transferability to arbitrary intersection layouts. Using a decentralized multi-agent approach, global rewards, and novel state transition priors, we develop a single, weight-tied policy that scales zero-shot to any road network without re-training. Through domain randomization during training, we additionally enhance generalization capabilities. Experimental results validate TransferLight's superior performance in unseen scenarios, advancing practical, generalizable intelligent transportation systems to meet evolving urban traffic demands.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2412.09719

Genre: Research Report (0.82)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)

Add feedback

An Interpretable X-ray Style Transfer via Trainable Local Laplacian Filter

Eckert, Dominik, Ritschl, Ludwig, Syben, Christopher, Hümmer, Christian, Wicklein, Julia, Beister, Marcel, Kappler, Steffen, Stober, Sebastian

arXiv.org Artificial IntelligenceNov-11-2024

Radiologists have preferred visual impressions or 'styles' of X-ray images that are manually adjusted to their needs to support their diagnostic performance. In this work, we propose an automatic and interpretable X-ray style transfer by introducing a trainable version of the Local Laplacian Filter (LLF). From the shape of the LLF's optimized remap function, the characteristics of the style transfer can be inferred and reliability of the algorithm can be ensured. Moreover, we enable the LLF to capture complex X-ray style features by replacing the remap function with a Multi-Layer Perceptron (MLP) and adding a trainable normalization layer. We demonstrate the effectiveness of the proposed method by transforming unprocessed mammographic X-ray images into images that match the style of target mammograms and achieve a Structural Similarity Index (SSIM) of 0.94 compared to 0.82 of the baseline LLF style transfer method from Aubry et al.

artificial intelligence, llf, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2411.07072

Country: Europe (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Add feedback

Improving Voice Quality in Speech Anonymization With Just Perception-Informed Losses

Ghosh, Suhita, Thiele, Tim, Lorbeer, Frederic, Dreyer, Frank, Stober, Sebastian

arXiv.org Artificial IntelligenceOct-20-2024

The increasing use of cloud-based speech assistants has heightened the need for effective speech anonymization, which aims to obscure a speaker's identity while retaining critical information for subsequent tasks. One approach to achieving this is through voice conversion. While existing methods often emphasize complex architectures and training techniques, our research underscores the importance of loss functions inspired by the human auditory system. Our proposed loss functions are model-agnostic, incorporating handcrafted and deep learning-based features to effectively capture quality representations. Through objective and subjective evaluations, we demonstrate that a VQVAE-based model, enhanced with our perception-driven losses, surpasses the vanilla model in terms of naturalness, intelligibility, and prosody while maintaining speaker anonymity. These improvements are consistently observed across various datasets, languages, target speakers, and genders.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.15499

Genre: Research Report (0.83)

Industry:

Health & Medicine (0.69)
Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

Anonymising Elderly and Pathological Speech: Voice Conversion Using DDSP and Query-by-Example

Ghosh, Suhita, Jouaiti, Melanie, Das, Arnab, Sinha, Yamini, Polzehl, Tim, Siegert, Ingo, Stober, Sebastian

arXiv.org Artificial IntelligenceOct-20-2024

Speech anonymisation aims to protect speaker identity by changing personal identifiers in speech while retaining linguistic content. Current methods fail to retain prosody and unique speech patterns found in elderly and pathological speech domains, which is essential for remote health monitoring. To address this gap, we propose a voice conversion-based method (DDSP-QbE) using differentiable digital signal processing and query-by-example. The proposed method, trained with novel losses, aids in disentangling linguistic, prosodic, and domain representations, enabling the model to adapt to uncommon speech patterns. Objective and subjective evaluations show that DDSP-QbE significantly outperforms the voice conversion state-of-the-art concerning intelligibility, prosody, and domain preservation across diverse datasets, pathologies, and speakers while maintaining quality and speaker anonymity. Experts validate domain preservation by analysing twelve clinically pertinent domain attributes.

artificial intelligence, information management, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.21437/Interspeech.2024-328

2410.155

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.94)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Information Management > Search (0.70)

Add feedback

Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers

Schmidt, Johann, Stober, Sebastian

arXiv.org Artificial IntelligenceMay-27-2024

Deep neural networks are applied in more and more areas of everyday life. However, they still lack essential abilities, such as robustly dealing with spatially transformed input signals. Approaches to mitigate this severe robustness issue are limited to two pathways: Either models are implicitly regularised by increased sample variability (data augmentation) or explicitly constrained by hard-coded inductive biases. The limiting factor of the former is the size of the data space, which renders sufficient sample coverage intractable. The latter is limited by the engineering effort required to develop such inductive biases for every possible scenario. Instead, we take inspiration from human behaviour, where percepts are modified by mental or physical actions during inference. We propose a novel technique to emulate such an inference process for neural nets. This is achieved by traversing a sparsified inverse transformation tree during inference using parallel energy-based evaluations. Our proposed inference algorithm, called Inverse Transformation Search (ITS), is model-agnostic and equips the model with zero-shot pseudo-invariance to spatially transformed inputs. We evaluated our method on several benchmark datasets, including a synthesised ImageNet test set. ITS outperforms the utilised baselines on all zero-shot test scenarios.

artificial intelligence, hypothesis, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2405.0373

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

StyleX: A Trainable Metric for X-ray Style Distances

Eckert, Dominik, Syben, Christopher, Hümmer, Christian, Ritschl, Ludwig, Kappler, Steffen, Stober, Sebastian

arXiv.org Artificial IntelligenceMay-23-2024

The progression of X-ray technology introduces diverse image styles that need to be adapted to the preferences of radiologists. To support this task, we introduce a novel deep learning-based metric that quantifies style differences of non-matching image pairs. At the heart of our metric is an encoder capable of generating X-ray image style representations. This encoder is trained without any explicit knowledge of style distances by exploiting Simple Siamese learning. During inference, the style representations produced by the encoder are used to calculate a distance metric for non-matching image pairs. Our experiments investigate the proposed concept for a disclosed reproducible and a proprietary image processing pipeline along two dimensions: First, we use a t-distributed stochastic neighbor embedding (t-SNE) analysis to illustrate that the encoder outputs provide meaningful and discriminative style representations. Second, the proposed metric calculated from the encoder outputs is shown to quantify style distances for non-matching pairs in good alignment with the human perception. These results confirm that our proposed method is a promising technique to quantify style differences, which can be used for guided style selection as well as automatic optimization of image pipeline parameters.

artificial intelligence, machine learning, representation, (19 more...)

arXiv.org Artificial Intelligence

2405.14718

Country: Europe > Sweden > Skåne County (0.15)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)

Add feedback

Visualizing Deep Neural Networks with Topographic Activation Maps

Krug, Valerie, Ratul, Raihan Kabir, Olson, Christopher, Stober, Sebastian

arXiv.org Artificial IntelligenceJun-14-2023

Machine Learning with Deep Neural Networks (DNNs) has become a successful tool in solving tasks across various fields of application. However, the complexity of DNNs makes it difficult to understand how they solve their learned task. To improve the explainability of DNNs, we adapt methods from neuroscience that analyze complex and opaque systems. Here, we draw inspiration from how neuroscience uses topographic maps to visualize brain activity. To also visualize activations of neurons in DNNs as topographic maps, we research techniques to layout the neurons in a two-dimensional space such that neurons of similar activity are in the vicinity of each other. In this work, we introduce and compare methods to obtain a topographic layout of neurons in a DNN layer. Moreover, we demonstrate how to use topographic activation maps to identify errors or encoded biases and to visualize training processes. Our novel visualization technique improves the transparency of DNN-based decision-making systems and is interpretable without expert knowledge in Machine Learning.

artificial intelligence, machine learning, topographic map, (16 more...)

arXiv.org Artificial Intelligence

2204.03528

Country: Europe > Germany (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Learning Continuous Rotation Canonicalization with Radial Beam Sampling

Schmidt, Johann, Stober, Sebastian

arXiv.org Artificial IntelligenceFeb-7-2023

Nearly all state of the art vision models are sensitive to image rotations. Existing methods often compensate for missing inductive biases by using augmented training data to learn pseudo-invariances. Alongside the resource demanding data inflation process, predictions often poorly generalize. The inductive biases inherent to convolutional neural networks allow for translation equivariance through kernels acting parallely to the horizontal and vertical axes of the pixel grid. This inductive bias, however, does not allow for rotation equivariance. We propose a radial beam sampling strategy along with radial kernels operating on these beams to inherently incorporate center-rotation covariance. Together with an angle distance loss, we present a radial beam-based image canonicalization model, short BIC. Our model allows for maximal continuous angle regression and canonicalizes arbitrary center-rotated input images. As a pre-processing model, this enables rotation-invariant vision pipelines with model-agnostic rotation-sensitive downstream predictions. We show that our end-to-end trained angle regressor is able to predict continuous rotation angles on several vision datasets, i.e. FashionMNIST, CIFAR10, COIL100, and LFW.

artificial intelligence, machine learning, rotation, (17 more...)

arXiv.org Artificial Intelligence

2206.1069

Country: North America > United States (0.93)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Differentiable Generalised Predictive Coding

Ofner, André, Stober, Sebastian

arXiv.org Artificial IntelligenceDec-8-2021

This paper deals with differentiable dynamical models congruent with neural process theories that cast brain function as the hierarchical refinement of an internal generative model explaining observations. Our work extends existing implementations of gradient-based predictive coding with automatic differentiation and allows to integrate deep neural networks for non-linear state parameterization. Gradient-based predictive coding optimises inferred states and weights locally in for each layer by optimising precision-weighted prediction errors that propagate from stimuli towards latent states. Predictions flow backwards, from latent states towards lower layers. The model suggested here optimises hierarchical and dynamical predictions of latent states. Hierarchical predictions encode expected content and hierarchical structure. Dynamical predictions capture changes in the encoded content along with higher order derivatives. Hierarchical and dynamical predictions interact and address different aspects of the same latent states. We apply the model to various perception and planning tasks on sequential data and show their mutual dependence. In particular, we demonstrate how learning sampling distances in parallel address meaningful locations data sampled at discrete time steps. We discuss possibilities to relax the assumption of linear hierarchies in favor of more flexible graph structure with emergent properties. We compare the granular structure of the model with canonical microcircuits describing predictive coding in biological networks and review the connection to Markov Blankets as a tool to characterize modularity. A final section sketches out ideas for efficient perception and planning in nested spatio-temporal hierarchies.

artificial intelligence, machine learning, predictive coding, (21 more...)

arXiv.org Artificial Intelligence

2112.03378

Country: Europe > Germany (0.14)

Genre: Research Report (0.40)

Industry:

Law > Litigation (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback