AITopics | Sperl, Philip

Sperl, Philip

DeePen: Penetration Testing for Audio Deepfake Detection

Müller, Nicolas, Kawa, Piotr, Stan, Adriana, Doan, Thien-Phuc, Jung, Souhwan, Choong, Wei Herng, Sperl, Philip, Böttinger, Konstantin

arXiv.org Artificial Intelligence, Mar-5-2025

Deepfakes, i.e., manipulated or forged audio and video media, pose significant security risks to individuals, organizations, and society at large. To address these challenges, machine learning-based classifiers are commonly employed to detect deepfake content. In this paper, we assess the robustness of such classifiers through a systematic penetration testing methodology, which we introduce as DeePen. Our approach operates without prior knowledge of or access to the target deepfake detection models. Instead, it leverages a set of carefully selected signal processing modifications, referred to as attacks, to evaluate model vulnerabilities. Using DeePen, we analyze both real-world production systems and publicly available academic model checkpoints, demonstrating that all tested systems exhibit weaknesses and can be reliably deceived by simple manipulations such as time-stretching or echo addition. Furthermore, our findings reveal that while some attacks can be mitigated by retraining detection systems with knowledge of the specific attack, others remain persistently effective. We release all associated code.
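
The two manipulations named in the abstract, echo addition and time-stretching, can be illustrated with a minimal Python sketch. This is not the released DeePen code: the function names, parameters, and default values are assumptions made for illustration, and the naive resampler below changes pitch along with duration, which a dedicated time-stretching algorithm would avoid.

```python
def add_echo(samples, sample_rate, delay_s=0.25, decay=0.4):
    """Mix a delayed, attenuated copy of the signal into itself.

    delay_s and decay are illustrative defaults, not values from the paper.
    """
    delay = int(delay_s * sample_rate)
    out = list(samples) + [0.0] * delay  # leave room for the echo tail
    for i, s in enumerate(samples):
        out[i + delay] += decay * s
    return out


def time_stretch_naive(samples, rate):
    """Change duration by linear-interpolation resampling.

    rate > 1 shortens the clip, rate < 1 lengthens it. This simple
    approach also shifts the pitch.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not samples:
        return []
    n_out = max(1, int(len(samples) / rate))
    out = []
    for j in range(n_out):
        pos = j * rate                      # fractional read position
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append((1.0 - frac) * samples[i] + frac * nxt)
    return out
```

A black-box test in this spirit would pass the modified samples to a detector and compare its score with the score for the unmodified clip; the detector interface itself is outside the scope of this sketch.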

artificial intelligence, detection, machine learning, (14 more...)

arXiv: 2502.20427

Country:

Europe (1.00)
Asia (0.67)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Harder or Different? Understanding Generalization of Audio Deepfake Detection

Müller, Nicolas M., Evans, Nicholas, Tak, Hemlata, Sperl, Philip, Böttinger, Konstantin

arXiv.org Artificial Intelligence, Jun-12-2024

Recent research has highlighted a key issue in speech deepfake detection: models trained on one set of deepfakes perform poorly on others. The question arises: is this due to the continuously improving quality of Text-to-Speech (TTS) models, i.e., are newer deepfakes just 'harder' to detect? Or is it because deepfakes generated with one model are fundamentally different from those generated using another model? We answer this question by decomposing the performance gap between in-domain and out-of-domain test data into 'hardness' and 'difference' components. Experiments performed using ASVspoof databases indicate that the hardness component is practically negligible, with the performance gap being attributed primarily to the difference component. This has direct implications for real-world deepfake detection, highlighting that merely increasing model capacity, the currently dominant research trend, may not effectively address the generalization challenge.

artificial intelligence, detection, machine learning, (16 more...)

arXiv: 2406.03512

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

A New Approach to Voice Authenticity

Müller, Nicolas M., Kawa, Piotr, Hu, Shen, Neu, Matthias, Williams, Jennifer, Sperl, Philip, Böttinger, Konstantin

arXiv.org Artificial Intelligence, Feb-9-2024

Voice faking, driven primarily by recent advances in text-to-speech (TTS) synthesis technology, poses significant societal challenges. Currently, the prevailing assumption is that unaltered human speech can be considered genuine, while fake speech comes from TTS synthesis. We argue that this binary distinction is oversimplified. For instance, altered playback speeds can be used for malicious purposes, as in the 'Drunken Nancy Pelosi' incident. Similarly, editing of audio clips can be done ethically, e.g., for brevity or summarization in news reporting or podcasts, but editing can also create misleading narratives. In this paper, we propose a conceptual shift away from the binary paradigm of audio being either 'fake' or 'real'. Instead, our focus is on pinpointing 'voice edits', which encompass traditional modifications like filters and cuts, as well as TTS synthesis and voice conversion (VC) systems. We delineate six categories and curate a new challenge dataset rooted in the M-AILABS corpus, for which we present baseline detection systems. Most importantly, we argue that merely categorizing audio as fake or real is a dangerous oversimplification that will fail to move the field of speech technology forward.

artificial intelligence, detection, machine learning, (17 more...)

arXiv: 2402.06304

Country: North America > United States (0.68)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)
Media > News (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)
Information Technology > Artificial Intelligence > Speech (0.92)

Physical Adversarial Examples for Multi-Camera Systems

Răduţoiu, Ana, Schulze, Jan-Philipp, Sperl, Philip, Böttinger, Konstantin

arXiv.org Artificial Intelligence, Nov-14-2023

Neural networks form the foundation of several intelligent systems, which, however, are known to be easily fooled by adversarial examples. Recent advances have made these attacks possible even in air-gapped scenarios, where the autonomous system observes its surroundings through, e.g., a camera. We extend these ideas in our research and evaluate the robustness of multi-camera setups against such physical adversarial examples. This scenario becomes ever more important with the rise in popularity of autonomous vehicles, which fuse the information of several cameras for their driving decisions. While we find that multi-camera setups provide some robustness against past attack methods, we see that this advantage diminishes when optimizing on multiple perspectives at once. We propose a novel attack method that we call Transcender-MC, where we incorporate online 3D renderings and perspective projections in the training process. Moreover, we argue that certain data augmentation techniques can facilitate the generation of successful adversarial examples even further. Transcender-MC is 11% more effective in successfully attacking multi-camera setups than state-of-the-art methods. Our findings offer valuable insights regarding the resilience of object detection in a setup with multiple cameras and motivate the need to develop adequate defense mechanisms against such attacks.

artificial intelligence, machine learning, robustness, (17 more...)

arXiv: 2311.08539

Country: Europe > Germany (0.15)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.88)

Protecting Publicly Available Data With Machine Learning Shortcuts

Müller, Nicolas M., Burgert, Maximilian, Debus, Pascal, Williams, Jennifer, Sperl, Philip, Böttinger, Konstantin

arXiv.org Artificial Intelligence, Oct-30-2023

Machine-learning (ML) shortcuts or spurious correlations are artifacts in datasets that lead to very good training and test performance but severely limit the model's generalization capability. Such shortcuts are insidious because they go unnoticed due to good in-domain test performance. In this paper, we explore the influence of different shortcuts and show that even simple shortcuts are difficult to detect by explainable AI methods. We then exploit this fact and design an approach to defend online databases against crawlers: providers such as dating platforms, clothing manufacturers, or used car dealers have to deal with a professionalized crawling industry that grabs and resells data points on a large scale. We show that a deterrent can be created by deliberately adding ML shortcuts. Such augmented datasets are then unusable for ML use cases, which deters crawlers and the unauthorized use of data from the internet. Using real-world data from three use cases, we show that the proposed approach renders such collected data unusable, while the shortcut is at the same time difficult to notice in human perception. Thus, our proposed approach can serve as a proactive protection against illegitimate data crawling.

artificial intelligence, machine learning, shortcut, (16 more...)

arXiv: 2310.19381

Country:

Europe (0.28)
North America > United States (0.28)

Genre: Research Report (0.40)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Complex-valued neural networks for voice anti-spoofing

Müller, Nicolas M., Sperl, Philip, Böttinger, Konstantin

arXiv.org Artificial Intelligence, Aug-22-2023

Current anti-spoofing and audio deepfake detection systems use either magnitude spectrogram-based features (such as CQT or mel spectrograms) or raw audio processed through convolution or sinc layers. Both methods have drawbacks: magnitude spectrograms discard phase information, which affects audio naturalness, and raw-feature-based models cannot use traditional explainable AI methods. This paper proposes a new approach that combines the benefits of both methods by using complex-valued neural networks to process the complex-valued CQT frequency-domain representation of the input audio. This method retains phase information and allows for explainable AI methods. Results show that this approach outperforms previous methods on the "In-the-Wild" anti-spoofing dataset and enables interpretation of the results through explainable AI. Ablation studies confirm that the model has learned to use phase information to detect voice spoofing.
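
To make the point about phase concrete, here is a minimal Python sketch of a single complex-valued layer. It is not the architecture from the paper: the layer, the split ReLU activation (one common choice for complex networks), the weights, and the two toy inputs are all assumptions chosen for illustration. The two inputs have identical magnitudes, so a magnitude spectrogram could not tell them apart, yet the complex layer produces different outputs.

```python
import cmath


def complex_dense(x, weights, bias):
    """Complex affine map: y[k] = sum_j weights[k][j] * x[j] + bias[k]."""
    return [
        sum((w * xi for w, xi in zip(row, x)), 0j) + b
        for row, b in zip(weights, bias)
    ]


def split_relu(z):
    """Apply ReLU to the real and imaginary parts separately."""
    return complex(max(z.real, 0.0), max(z.imag, 0.0))


# Two hypothetical frequency bins: same magnitudes, different phase.
in_phase = [cmath.rect(1.0, 0.0), cmath.rect(1.0, 0.0)]
anti_phase = [cmath.rect(1.0, 0.0), cmath.rect(1.0, cmath.pi)]

weights = [[1 + 0j, 1 + 0j]]  # one output unit, illustrative values
bias = [0j]

out_in = [split_relu(z) for z in complex_dense(in_phase, weights, bias)]
out_anti = [split_relu(z) for z in complex_dense(anti_phase, weights, bias)]

print([abs(v) for v in in_phase], [abs(v) for v in anti_phase])  # equal magnitudes
print(out_in, out_anti)  # outputs differ because the phases differ
```

With these values the in-phase bins add up while the anti-phase bins cancel, which is exactly the information a magnitude-only feature throws away.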

artificial intelligence, machine learning, natural language, (19 more...)

arXiv: 2308.118

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Shortcut Detection with Variational Autoencoders

Müller, Nicolas M., Roschmann, Simon, Khan, Shahbaz, Sperl, Philip, Böttinger, Konstantin

arXiv.org Artificial Intelligence, Jul-21-2023

For real-world applications of machine learning (ML), it is essential that models make predictions based on well-generalizing features rather than spurious correlations in the data. The identification of such spurious correlations, also known as shortcuts, is a challenging problem and has so far been scarcely addressed. In this work, we present a novel approach to detect shortcuts in image and audio datasets by leveraging variational autoencoders (VAEs). The disentanglement of features in the latent space of VAEs allows us to discover feature-target correlations in datasets and semi-automatically evaluate them for ML shortcuts. We demonstrate the applicability of our method on several real-world datasets and identify shortcuts that have not been discovered before.

artificial intelligence, dataset, machine learning, (15 more...)

arXiv: 2302.04246

Country:

Europe (0.46)
North America > United States > Hawaii (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Therapeutic Area (0.50)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

R2-AD2: Detecting Anomalies by Analysing the Raw Gradient

Schulze, Jan-Philipp, Sperl, Philip, Răduţoiu, Ana, Sagebiel, Carla, Böttinger, Konstantin

arXiv.org Artificial Intelligence, Jun-21-2022

Neural networks follow a gradient-based learning scheme, adapting their mapping parameters by back-propagating the output loss. Samples unlike the ones seen during training cause a different gradient distribution. Based on this intuition, we design a novel semi-supervised anomaly detection method called R2-AD2. By analysing the temporal distribution of the gradient over multiple training steps, we reliably detect point anomalies in strict semi-supervised settings. Instead of domain-dependent features, we input the raw gradient caused by the sample under test to an end-to-end recurrent neural network architecture. R2-AD2 works in a purely data-driven way and is thus readily applicable in a variety of important use cases of anomaly detection.

artificial intelligence, data mining, machine learning, (16 more...)

doi: 10.1007/978-3-031-26387-3_13

arXiv: 2206.10259

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Visualizing Automatic Speech Recognition -- Means for a Better Understanding?

Markert, Karla, Parracone, Romain, Kulakov, Mykhailo, Sperl, Philip, Kao, Ching-Yu, Böttinger, Konstantin

arXiv.org Artificial Intelligence, Feb-1-2022

Automatic speech recognition (ASR) is becoming ever better at mimicking human speech processing. The functioning of ASR, however, remains to a large extent obscured by the complex structure of the deep neural networks (DNNs) it is based on. In this paper, we show how so-called attribution methods, which we import from image recognition and suitably adapt to handle audio data, can help to clarify the workings of ASR. Taking DeepSpeech, an end-to-end model for ASR, as a case study, we show how these techniques help to visualize which features of the input are the most influential in determining the output. We focus on three visualization techniques: Layer-wise Relevance Propagation (LRP), Saliency Maps, and Shapley Additive Explanations (SHAP). We compare these methods and discuss potential further applications, such as in the detection of adversarial examples.
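
As a rough illustration of one of the three techniques, a saliency map scores each input feature by the magnitude of the output's gradient with respect to that feature. The Python sketch below estimates that gradient by central finite differences on a toy function. Both `saliency` and `toy_model` are hypothetical names introduced here; the toy function stands in for a real network such as DeepSpeech, where the gradient would normally come from automatic differentiation rather than finite differences.

```python
def saliency(model, x, eps=1e-4):
    """Estimate |d model(x) / d x[i]| for each input feature i."""
    scores = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        scores.append(abs(model(hi) - model(lo)) / (2 * eps))
    return scores


def toy_model(x):
    """Hypothetical scalar 'model': feature 0 matters most, feature 1 not at all."""
    return 3.0 * x[0] + 0.0 * x[1] - 1.0 * x[2]


print(saliency(toy_model, [0.5, 0.5, 0.5]))  # approximately [3.0, 0.0, 1.0]
```

For audio, the input features would be spectrogram bins or raw samples, and the resulting scores can be plotted over time and frequency.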

artificial intelligence, attribution, machine learning, (17 more...)

doi: 10.21437/SPSC.2021-4

arXiv: 2202.00673

Country:

Europe > Italy (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Gradient Masking and the Underestimated Robustness Threats of Differential Privacy in Deep Learning

Boenisch, Franziska, Sperl, Philip, Böttinger, Konstantin

arXiv.org Artificial Intelligence, May-17-2021

An important problem in deep learning is the privacy and security of neural networks (NNs). Both aspects have long been considered separately. To date, it is still poorly understood how privacy-enhancing training affects the robustness of NNs. This paper experimentally evaluates the impact of training with Differential Privacy (DP), a standard method for privacy preservation, on model vulnerability against a broad range of adversarial attacks. The results suggest that private models are less robust than their non-private counterparts, and that adversarial examples transfer better among DP models than between non-private and private ones. Furthermore, detailed analyses of DP and non-DP models suggest significant differences between their gradients. Additionally, this work is the first to observe that an unfavorable choice of parameters in DP training can lead to gradient masking and thereby result in a false sense of security.
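
The abstract does not name the training algorithm, so the Python sketch below assumes the widely used DP-SGD recipe: clip each per-sample gradient to a maximum L2 norm, sum the clipped gradients, add Gaussian noise scaled to the clipping bound, and average. The function name and parameter names are hypothetical. The clipping bound and noise multiplier are the kind of DP training parameters whose choice shapes the gradients a model is trained with.

```python
import math
import random


def dp_sgd_gradient(per_sample_grads, clip_norm, noise_multiplier, rng=random):
    """Return a noisy, clipped average gradient (one DP-SGD step, no privacy accounting)."""
    if not per_sample_grads:
        raise ValueError("need at least one per-sample gradient")
    dim = len(per_sample_grads[0])
    total = [0.0] * dim
    for g in per_sample_grads:
        norm = math.sqrt(sum(v * v for v in g))
        # Scale down only gradients whose norm exceeds the clipping bound.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for k, v in enumerate(g):
            total[k] += scale * v
    sigma = noise_multiplier * clip_norm  # noise std is tied to the clipping bound
    n = len(per_sample_grads)
    return [(t + rng.gauss(0.0, sigma)) / n for t in total]


# Illustrative values only: the first gradient (norm 5) is clipped to norm 1.
grads = [[3.0, 4.0], [0.3, 0.4]]
print(dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=0.0))  # no noise
print(dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=1.1))  # with noise
```

This sketch covers only the gradient computation; a real DP training setup also needs a privacy accountant to track the resulting privacy budget.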

deep learning, gradient, neural network, (19 more...)

arXiv: 2105.07985

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Santa Clara County (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)
