Fake Audio


I Can Hear You: Selective Robust Training for Deepfake Audio Detection

Zhang, Zirui, Hao, Wei, Sankoh, Aroon, Lin, William, Mendiola-Ortiz, Emanuel, Yang, Junfeng, Mao, Chengzhi

arXiv.org Artificial Intelligence

Recent advances in AI-generated voices have intensified the challenge of detecting deepfake audio, posing risks for scams and the spread of disinformation. To tackle this issue, we establish the largest public voice dataset to date, named DeepFakeVox-HQ, comprising 1.3 million samples, including 270,000 high-quality deepfake samples from 14 diverse sources. Despite previously reported high accuracy, existing deepfake voice detectors struggle with our diversely collected dataset, and their detection success rates drop even further under realistic corruptions and adversarial attacks. We conduct a holistic investigation into factors that enhance model robustness and show that incorporating a diversified set of voice augmentations is beneficial. Moreover, we find that the best detection models often rely on high-frequency features, which are imperceptible to humans and can be easily manipulated by an attacker. To address this, we propose the F-SAT: Frequency-Selective Adversarial Training method focusing on high-frequency components. Empirical results demonstrate that using our training dataset boosts baseline model performance (without robust training) by 33%, and our robust training further improves accuracy by 7.7% on clean samples and by 29.3% on corrupted and attacked samples, over the state-of-the-art RawNet3 model.
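
To make the frequency-selective idea concrete, here is a minimal sketch of an adversarial training step whose perturbation is projected onto high frequencies before being applied. It assumes a PyTorch waveform classifier `model`; the cutoff frequency, step sizes, and bound are illustrative choices, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def high_pass(delta, sample_rate=16000, cutoff_hz=4000):
    """Keep only the high-frequency content of a perturbation via rFFT masking."""
    spec = torch.fft.rfft(delta, dim=-1)
    freqs = torch.fft.rfftfreq(delta.shape[-1], d=1.0 / sample_rate)
    spec = spec * (freqs >= cutoff_hz).to(device=spec.device)
    return torch.fft.irfft(spec, n=delta.shape[-1], dim=-1)

def fsat_loss(model, x, y, eps=0.002, alpha=0.0005, steps=5):
    """Craft a high-frequency-only adversarial example, then return the loss on it."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += alpha * grad.sign()                    # PGD-style ascent step
            delta.copy_(high_pass(delta).clamp(-eps, eps))  # project: high-freq only, bounded
    return F.cross_entropy(model(x + delta.detach()), y)    # train on the adversarial example
```

Restricting the perturbation to high frequencies in this way targets exactly the imperceptible features the abstract says the best detectors tend to over-rely on.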


Human Brain Exhibits Distinct Patterns When Listening to Fake Versus Real Audio: Preliminary Evidence

Salehi, Mahsa, Stefanov, Kalin, Shareghi, Ehsan

arXiv.org Artificial Intelligence

In this paper we study the variations in human brain activity when listening to real and fake audio. Our preliminary results suggest that the representations learned by a state-of-the-art deepfake audio detection algorithm do not exhibit clearly distinct patterns between real and fake audio. In contrast, human brain activity, as measured by EEG, displays distinct patterns when individuals are exposed to fake versus real audio. This preliminary evidence opens future research directions in areas such as deepfake audio detection.
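
As a rough illustration of that kind of comparison (not the paper's protocol), one can quantify how cleanly two sets of per-trial features separate real from fake. The feature matrices below are random stand-ins, and the t-SNE-plus-silhouette choice is an illustrative measure of separability.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score

def separability(features, labels):
    """Project to 2-D with t-SNE and score how distinctly real/fake cluster."""
    proj = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)
    return silhouette_score(proj, labels)

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)                    # 0 = real, 1 = fake
det_emb = rng.normal(size=(200, 128))                    # stand-in detector embeddings
eeg_feat = rng.normal(size=(200, 64)) + labels[:, None]  # stand-in EEG features with signal

print("detector embeddings:", separability(det_emb, labels))
print("EEG features:       ", separability(eeg_feat, labels))
```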


An Alleged Deepfake of UK Opposition Leader Keir Starmer Shows the Dangers of Fake Audio

WIRED

As members of the UK's largest opposition party gathered in Liverpool for their party conference--probably their last before the UK holds a general election--a potentially explosive audio file started circulating on X, formerly known as Twitter. The 25-second recording was posted by an X account with the handle "@Leo_Hutz" that was set up in January 2023. In the clip, Sir Keir Starmer, the Labour Party leader, is apparently heard swearing repeatedly at a staffer. "I have obtained audio of Keir Starmer verbally abusing his staffers at [the Labour Party] conference," the X account posted. "This disgusting bully is about to become our next PM."


Adaptive Fake Audio Detection with Low-Rank Model Squeezing

Zhang, Xiaohui, Yi, Jiangyan, Tao, Jianhua, Wang, Chenlong, Xu, Le, Fu, Ruibo

arXiv.org Artificial Intelligence

Novel audio spoofing algorithms emerge continually, and detection models must adapt to the fake audio types they produce. Traditional approaches, such as finetuning on new datasets containing these novel spoofing algorithms, are computationally intensive and pose a risk of impairing the knowledge already acquired about known fake audio types. To address these challenges, this paper proposes an innovative approach that mitigates the limitations associated with finetuning. We introduce the concept of training low-rank adaptation matrices tailored specifically to newly emerging fake audio types. During the inference stage, these adaptation matrices are combined with the existing model to generate the final prediction output. Extensive experimentation is conducted to evaluate the efficacy of the proposed method. The results demonstrate that our approach effectively preserves the prediction accuracy of the existing model on known fake audio types. Furthermore, our approach offers several advantages, including reduced storage requirements and lower equal error rates than conventional finetuning, particularly on specific spoofing algorithms.
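
A minimal sketch of the low-rank adaptation idea, assuming a frozen linear layer from an existing PyTorch detector; the rank, scaling, and placement of the matrices are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # preserve knowledge of known fake types
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no-op at start
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

layer = LoRALinear(nn.Linear(256, 2))      # only ~2k trainable params per new spoof type
out = layer(torch.randn(8, 256))
```

Because only the small A and B matrices need to be stored per newly emerging fake audio type, the frozen base model keeps its accuracy on the known types.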


Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion

Liu, Rui, Zhang, Jinhua, Gao, Guanglai, Li, Haizhou

arXiv.org Artificial Intelligence

Audio Deepfake Detection (ADD), an emerging topic, aims to detect fake audio generated by text-to-speech (TTS), voice conversion (VC), replay, and similar techniques. Traditional work takes the mono signal as input and focuses on robust feature extraction and effective classifier design. However, the dual-channel stereo information in the audio signal also carries important cues for deepfakes, which prior work has not studied. In this paper, we propose a novel ADD model, termed M2S-ADD, that attempts to discover audio authenticity cues during the mono-to-stereo conversion process. We first project the mono signal to stereo using a pretrained stereo synthesizer, then employ a dual-branch neural architecture to process the left and right channel signals, respectively. In this way, we effectively reveal the artifacts in fake audio and thus improve ADD performance. Experiments on the ASVspoof2019 database show that M2S-ADD outperforms all baselines that take mono input. We release the source code at \url{https://github.com/AI-S2-Lab/M2S-ADD}.
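
A minimal sketch of the dual-branch idea, with a placeholder standing in for the pretrained stereo synthesizer; the encoder and fusion head are illustrative, not the released M2S-ADD architecture (see the repository above for the real model).

```python
import torch
import torch.nn as nn

def mono_to_stereo(mono):                        # placeholder for the pretrained synthesizer
    return torch.stack([mono, mono.roll(1, dims=-1)], dim=1)   # (B, 2, T)

class DualBranchADD(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv1d(1, hidden, kernel_size=16, stride=8), nn.ReLU(),
                nn.AdaptiveAvgPool1d(1), nn.Flatten())
        self.left, self.right = branch(), branch()
        self.head = nn.Linear(2 * hidden, 2)     # bona fide vs. fake

    def forward(self, mono):
        stereo = mono_to_stereo(mono)
        l = self.left(stereo[:, 0:1])            # left-channel branch
        r = self.right(stereo[:, 1:2])           # right-channel branch
        return self.head(torch.cat([l, r], dim=-1))

logits = DualBranchADD()(torch.randn(4, 16000))  # 1 s of 16 kHz mono audio
```

Processing the two synthesized channels separately lets inconsistencies introduced by the mono-to-stereo projection surface as authenticity cues.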


I Cloned My Voice and My Mother Couldn't Tell the Difference

Slate

This article is from Understanding AI, a newsletter that explores how A.I. works and how it's changing our world. A couple of weeks ago, I used A.I. software to clone my voice. The resulting audio sounded pretty convincing to me, but I wanted to see what others thought. So I created a test audio file based on the first 12 paragraphs of this article. Seven randomly chosen paragraphs were my real voice, while the other five were generated by A.I. Then I asked members of my family whether they could tell the difference.


An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio

Yan, Xinrui, Yi, Jiangyan, Tao, Jianhua, Wang, Chenglong, Ma, Haoxin, Wang, Tao, Wang, Shiming, Fu, Ruibo

arXiv.org Artificial Intelligence

Many effective attempts have been made at fake audio detection. However, they only provide detection results and offer no countermeasures to curb the harm. Many related practical applications also need to know which model or algorithm generated the fake audio. We therefore propose a new problem: detecting the vocoder fingerprints of fake audio. Experiments are conducted on datasets synthesized by eight state-of-the-art vocoders, and we preliminarily explore features and model architectures. The t-SNE visualization shows that different vocoders generate distinct vocoder fingerprints.
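
A minimal sketch of the t-SNE inspection described above, run on synthetic stand-in embeddings in place of real vocoder features; the per-vocoder offsets simply mimic the distinct "fingerprints" the paper reports.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
n_vocoders, per = 8, 50
vocoder_ids = np.repeat(np.arange(n_vocoders), per)
# stand-in: each vocoder leaves a slightly different offset ("fingerprint")
embeddings = rng.normal(size=(n_vocoders * per, 128)) + vocoder_ids[:, None] * 0.5

proj = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(embeddings)
plt.scatter(proj[:, 0], proj[:, 1], c=vocoder_ids, cmap="tab10", s=8)
plt.title("t-SNE of vocoder fingerprints (synthetic stand-in)")
plt.show()
```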


Half-Truth: A Partially Fake Audio Detection Dataset

Yi, Jiangyan, Bai, Ye, Tao, Jianhua, Tian, Zhengkun, Wang, Chenglong, Wang, Tao, Fu, Ruibo

arXiv.org Artificial Intelligence

Diverse promising datasets, such as the ASVspoof databases, have been designed to advance the development of fake audio detection. However, previous datasets ignore an attack scenario in which the attacker hides a few small fake clips inside real speech audio. This poses a serious threat, since it is difficult to distinguish a small fake clip from the whole speech utterance. This paper therefore develops such a dataset for half-truth audio detection (HAD). Partially fake audio in the HAD dataset involves changing only a few words in an utterance; the audio of those words is generated with the latest state-of-the-art speech synthesis technology. With this dataset we can not only detect fake utterances but also localize the manipulated regions within a speech signal. Some benchmark results are presented on this dataset, and they show that partially fake audio is much more challenging for fake audio detection than fully fake audio.
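
A minimal sketch of how a half-truth sample could be assembled, assuming a real utterance, a synthesized clip for the replaced words, and a chosen insertion point; the cross-fade and frame-level labels are illustrative, not the HAD construction pipeline.

```python
import numpy as np

def splice_fake(real_wav, fake_clip, start_s, sr=16000, xfade_ms=10):
    """Replace a segment of real speech with a synthesized clip, cross-fading the edges."""
    start = int(start_s * sr)
    end = start + len(fake_clip)
    out = real_wav.copy()
    out[start:end] = fake_clip
    n = int(sr * xfade_ms / 1000)                 # short cross-fade to hide the seams
    ramp = np.linspace(0.0, 1.0, n)
    out[start:start + n] = (1 - ramp) * real_wav[start:start + n] + ramp * fake_clip[:n]
    out[end - n:end] = ramp[::-1] * fake_clip[-n:] + (1 - ramp[::-1]) * real_wav[end - n:end]
    labels = np.zeros(len(out), dtype=np.int64)   # frame-level labels: 1 where audio is fake
    labels[start:end] = 1
    return out, labels

real = np.random.randn(3 * 16000).astype(np.float32)    # 3 s of real speech (stand-in)
fake = np.random.randn(16000).astype(np.float32) * 0.1  # 1 s of synthesized words (stand-in)
mixed, frame_labels = splice_fake(real, fake, start_s=1.0)
```

Frame-level labels of this kind are what make it possible to localize manipulated regions rather than only classify whole utterances.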


It's not just phishing emails, now we have to worry about fake calls, too

USATODAY - Tech Top Stories

When your boss calls and tells you to wire $100,000 to a supplier, be on your toes. It could be a fake call. As if phony "phishing" emails weren't enough, "deep fake" audio is now on the rise: voices can be cloned with near perfection, and the clones are easy for hackers to create. "It's on the rise, and something to watch out for," says Vijay Balasubramaniyan, the CEO of Pindrop, a company that offers biometric authentication for enterprises. Balasubramaniyan demonstrated during a security conference how easy it is to take audio from the internet and use machine learning to stitch recorded phrases into sentences the speaker probably never said.