Towards Automated Fact-Checking of Real-World Claims: Exploring Task Formulation and Assessment with LLMs
Sahitaj, Premtim, Maab, Iffat, Yamagishi, Junichi, Kolanowski, Jawan, Möller, Sebastian, Schmitt, Vera
Fact-checking is necessary to address the increasing volume of misinformation. Traditional fact-checking relies on manual analysis to verify claims, but it is slow and resource-intensive. This study establishes baseline comparisons for Automated Fact-Checking (AFC) using Large Language Models (LLMs) across multiple labeling schemes (binary, three-class, five-class) and extends traditional claim verification by incorporating analysis, verdict classification, and explanation in a structured setup to provide comprehensive justifications for real-world claims. We evaluate Llama-3 models of varying sizes (3B, 8B, 70B) on 17,856 claims collected from PolitiFact (2007-2024) using evidence retrieved via restricted web searches. We utilize TIGERScore as a reference-free evaluation metric to score the justifications. Our results show that larger LLMs consistently outperform smaller LLMs in classification accuracy and justification quality without fine-tuning. We find that smaller LLMs in a one-shot scenario provide task performance comparable to fine-tuned Small Language Models (SLMs) with large context sizes, while larger LLMs consistently surpass them. Evidence integration improves performance across all models, with larger LLMs benefiting most. Distinguishing between nuanced labels remains challenging, emphasizing the need for further exploration of labeling schemes and alignment with evidence. Our findings demonstrate the potential of retrieval-augmented AFC with LLMs.
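As a concrete illustration of how verdict granularity differs across the labeling schemes mentioned above, one plausible collapse of PolitiFact's six-point scale into binary and three-class verdicts is sketched below. The thresholds are assumptions for illustration only; the paper's exact mapping is not specified here, and a five-class scheme would be built analogously.

```python
# Hypothetical collapse of PolitiFact's six-point Truth-O-Meter scale
# into coarser labeling schemes (thresholds are illustrative assumptions).
POLITIFACT = ["pants-fire", "false", "barely-true",
              "half-true", "mostly-true", "true"]

def to_binary(label):
    """Binary split: ratings up to half-true count as false."""
    return "true" if POLITIFACT.index(label) >= 4 else "false"

def to_three_class(label):
    """Three-class split: false / mixed / true."""
    i = POLITIFACT.index(label)
    if i <= 1:
        return "false"
    if i <= 3:
        return "mixed"
    return "true"

print(to_binary("half-true"))        # false
print(to_three_class("barely-true"))  # mixed
```

Coarser schemes trade nuance for easier class separation, which is one way to probe the difficulty the abstract notes in distinguishing nuanced labels.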
The First VoicePrivacy Attacker Challenge Evaluation Plan
Tomashenko, Natalia, Miao, Xiaoxiao, Vincent, Emmanuel, Yamagishi, Junichi
The First VoicePrivacy Attacker Challenge is a new kind of challenge organized as part of the VoicePrivacy initiative and supported by ICASSP 2025 as the SP Grand Challenge. It focuses on developing attacker systems against voice anonymization, which will be evaluated against a set of anonymization systems submitted to the VoicePrivacy 2024 Challenge. Training, development, and evaluation datasets are provided along with a baseline attacker system. Participants shall develop their attacker systems in the form of automatic speaker verification systems and submit their scores on the development and evaluation data to the organizers. To do so, they can use any additional training data and models, provided that they are openly available and declared before the specified deadline. The evaluation metric is the equal error rate (EER). Results will be presented at the ICASSP 2025 special session, to which the five selected top-ranked participants will be invited to submit and present their challenge systems.
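The EER metric named above can be sketched as a threshold sweep over verification scores: it is the operating point where the false-rejection rate on target trials equals the false-acceptance rate on non-target trials. This toy function is illustrative only and is not the challenge's official scoring script.

```python
def eer(target_scores, nontarget_scores):
    """Equal error rate via a simple threshold sweep.

    For each candidate threshold, compute the false-rejection rate
    (targets scored below it) and false-acceptance rate (non-targets
    scored at or above it), and return the midpoint of the two rates
    at the threshold where they are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, best_eer = 1.0, 0.5
    for t in thresholds:
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        if abs(frr - far) < best_gap:
            best_gap, best_eer = abs(frr - far), (frr + far) / 2
    return best_eer

# Perfectly separated scores give 0% EER; full overlap approaches 50%.
print(eer([0.9, 0.8, 0.7], [0.1, 0.2, 0.3]))  # 0.0
```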
AfriHuBERT: A self-supervised speech representation model for African languages
Alabi, Jesujoba O., Liu, Xuechen, Klakow, Dietrich, Yamagishi, Junichi
In this work, we present AfriHuBERT, an extension of mHuBERT-147, a state-of-the-art (SOTA) and compact self-supervised learning (SSL) model originally pretrained on 147 languages. While mHuBERT-147 was pretrained on 16 African languages, we expand this to cover 39 African languages through continued pretraining on 6,500+ hours of speech data aggregated from diverse sources, including 23 newly added languages. We evaluate AfriHuBERT on two key speech tasks, Language Identification (LID) and Automatic Speech Recognition (ASR), using the FLEURS dataset. Our results show a +4% F1 score improvement on average for LID and an average Word Error Rate (WER) reduction of 1.2% for ASR. Further analysis shows that ASR models trained on AfriHuBERT exhibit improved cross-corpus generalization. Additionally, the analysis indicates that FLEURS has data quality limitations that may affect its suitability for evaluating low-resource African languages, suggesting the need for better evaluation benchmarks for these languages.
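The WER figures above come from the standard word-level edit-distance metric, which can be sketched as follows. This is a minimal illustration of the metric, not the paper's evaluation code.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance (substitutions,
    insertions, deletions) divided by the reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[-1][-1] / len(ref)

print(wer("the cat sat", "the cat sat down"))  # one insertion over three words
```

Note that WER can exceed 100% when the hypothesis contains many insertions, which is why absolute reductions (like the 1.2% above) are commonly reported alongside it.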
To what extent can ASV systems naturally defend against spoofing attacks?
Jung, Jee-weon, Wang, Xin, Evans, Nicholas, Watanabe, Shinji, Shim, Hye-jin, Tak, Hemlata, Arora, Sidhhant, Yamagishi, Junichi, Chung, Joon Son
The current automatic speaker verification (ASV) task involves making binary decisions on two types of trials: target and nontarget. However, emerging advancements in speech generation technology pose significant threats to the reliability of ASV systems. This study investigates whether ASV effortlessly acquires robustness against spoofing attacks (i.e., zero-shot capability) by systematically exploring diverse ASV systems and spoofing attacks, ranging from traditional to cutting-edge techniques. Through extensive analyses conducted on eight distinct ASV systems and 29 spoofing attack systems, we demonstrate that the evolution of ASV inherently incorporates defense mechanisms against spoofing attacks. Nevertheless, our findings also underscore that the advancement of spoofing attacks far outpaces that of ASV systems, hence necessitating further research on spoofing-robust ASV methodologies.
[Figure 1: Average Spoof Equal Error Rates (SPF-EERs) on 29 different spoofing attacks, chronologically displayed, using eight automatic speaker verification (ASV) systems. The SPF-EER adopts spoof trials in place of conventional non-target trials.]
An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios
Gong, Cheng, Cooper, Erica, Wang, Xin, Qiang, Chunyu, Geng, Mengzhe, Wells, Dan, Wang, Longbiao, Dang, Jianwu, Tessier, Marc, Pine, Aidan, Richmond, Korin, Yamagishi, Junichi
Self-supervised learning (SSL) representations from massively multilingual models offer a promising solution for low-resource language speech tasks. Despite advancements, language adaptation in TTS systems remains an open problem. This paper explores the language adaptation capability of ZMM-TTS, a recent SSL-based multilingual TTS system proposed in our previous work. We conducted experiments on 12 languages using limited data with various fine-tuning configurations. We demonstrate that the similarity in phonetics between the pre-training and target languages, as well as the language category, affects the target language's adaptation performance. Additionally, we find that the fine-tuning dataset size and number of speakers influence adaptability. Surprisingly, we also observed that using paired data for fine-tuning is not always optimal compared to audio-only data. Beyond speech intelligibility, our analysis covers speaker similarity, language identification, and predicted MOS.
The VoicePrivacy 2024 Challenge Evaluation Plan
Tomashenko, Natalia, Miao, Xiaoxiao, Champion, Pierre, Meyer, Sarina, Wang, Xin, Vincent, Emmanuel, Panariello, Michele, Evans, Nicholas, Yamagishi, Junichi, Todisco, Massimiliano
The task of the challenge is to develop a voice anonymization system for speech data that conceals the speaker's voice identity while protecting linguistic content and emotional states. The organizers provide development and evaluation datasets and evaluation scripts, as well as baseline anonymization systems and a list of training resources formed on the basis of the participants' requests. Participants apply their developed anonymization systems, run the evaluation scripts, and submit evaluation results and anonymized speech data to the organizers. Results will be presented at a workshop held in conjunction with Interspeech 2024, where all participants are invited to present their challenge systems and to submit additional workshop papers.
Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio
Zhang, Lin, Wang, Xin, Cooper, Erica, Diez, Mireia, Landini, Federico, Evans, Nicholas, Yamagishi, Junichi
This paper defines Spoof Diarization as a novel task in the Partial Spoof (PS) scenario. It aims to determine what spoofed when, which includes not only locating spoof regions but also clustering them according to different spoofing methods. As a pioneering study in spoof diarization, we focus on defining the task, establishing evaluation metrics, and proposing a benchmark model, namely the Countermeasure-Condition Clustering (3C) model. Utilizing this model, we first explore how to effectively train countermeasures to support spoof diarization using three labeling schemes. We then utilize spoof localization predictions to enhance the diarization performance. This first study reveals the high complexity of the task, even in restricted scenarios where only a single speaker per audio file and an oracle number of spoofing methods are considered. Our code is available at https://github.com/nii-yamagishilab/PartialSpoof.
Bridging Textual and Tabular Worlds for Fact Verification: A Lightweight, Attention-Based Model
Varnosfaderani, Shirin Dabbaghi, Kruengkrai, Canasai, Yahyapour, Ramin, Yamagishi, Junichi
FEVEROUS is a benchmark and research initiative focused on fact extraction and verification tasks involving unstructured text and structured tabular data. In FEVEROUS, existing works often rely on extensive preprocessing and rule-based transformations of data, leading to potential context loss or misleading encodings. This paper introduces a simple yet powerful model that removes the need for modality conversion, thereby preserving the original evidence's context. By leveraging pre-trained models on diverse text and tabular datasets and by incorporating a lightweight attention-based mechanism, our approach efficiently exploits latent connections between different data types, yielding comprehensive and reliable verdict predictions. The model's modular structure adeptly manages multi-modal information, ensuring the integrity and authenticity of the original evidence are uncompromised. Comparative analyses reveal that our approach exhibits competitive performance, aligning closely with top-tier models on the FEVEROUS benchmark.
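The kind of lightweight attention fusion described above can be sketched as a claim vector attending over evidence vectors, some encoded from text and some from tables, without converting either modality. The vectors, dimensions, and names below are made up for illustration and are not the paper's architecture.

```python
import math

def attend(query, evidence):
    """Scaled dot-product attention: a claim query attends over a list
    of evidence vectors and returns their weighted combination."""
    scores = [sum(q * e for q, e in zip(query, vec)) / math.sqrt(len(query))
              for vec in evidence]
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]   # softmax over evidence items
    # Weighted sum of evidence vectors -> fused representation.
    return [sum(w * vec[d] for w, vec in zip(weights, evidence))
            for d in range(len(query))]

claim = [1.0, 0.0]
evidence = [[1.0, 0.0],   # toy embedding of a textual sentence
            [0.0, 1.0]]   # toy embedding of a table cell
print(attend(claim, evidence))
```

Because both modalities contribute through the same attention weights, neither has to be flattened into the other's format, which is the context-preservation point the abstract makes.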
Uncertainty as a Predictor: Leveraging Self-Supervised Learning for Zero-Shot MOS Prediction
Ravuri, Aditya, Cooper, Erica, Yamagishi, Junichi
This paper addresses the gap in efficient audio quality prediction, especially in low-resource settings where extensive MOS data from large-scale listening tests may be unavailable. We demonstrate that uncertainty measures derived from out-of-the-box pretrained self-supervised learning (SSL) models, such as wav2vec, correlate with MOS scores. These findings are based on data from the 2022 and 2023 VoiceMOS challenges. We explore the extent of this correlation across different models and language contexts, revealing insights into how inherent uncertainties in SSL models can serve as effective proxies for audio quality. We are particularly inspired by approaches in biology where efficient zero-shot prediction is possible using a model's uncertainty estimates, where uncertainties act as proxies for downstream tasks [4]. Our main hypotheses are that, 1. uncertainty estimates can be derived from the outputs of SSL models such as wav2vec, and that, 2. these uncertainties can be used as proxies for MOS scores, as high model uncertainty about the contents of an audio sequence should correspond to low audio quality.
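The two hypotheses above can be sketched with a toy uncertainty measure: take the entropy of each frame's output distribution and average it over the utterance, so that flat (uncertain) distributions score high and peaky (confident) ones score low. The numbers below are invented for illustration, not actual wav2vec outputs, and the paper's exact uncertainty estimator may differ.

```python
import math

def entropy(probs):
    """Shannon entropy of one frame's output distribution; higher
    values mean the model is less certain about the frame's content."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def utterance_uncertainty(frame_distributions):
    """Average per-frame entropy as a crude utterance-level uncertainty.
    The hypothesis above predicts this is negatively correlated with MOS."""
    return sum(entropy(f) for f in frame_distributions) / len(frame_distributions)

# Toy example: a "clean" utterance with peaky frame distributions vs. a
# "noisy" one with near-uniform distributions (hypothetical numbers).
clean = [[0.9, 0.05, 0.05]] * 4
noisy = [[0.34, 0.33, 0.33]] * 4
print(utterance_uncertainty(clean) < utterance_uncertainty(noisy))  # True
```

Under the stated hypothesis, an utterance whose contents the SSL model finds hard to pin down would receive a high uncertainty score and, by proxy, a low predicted MOS.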
XFEVER: Exploring Fact Verification across Languages
Chang, Yi-Chen, Kruengkrai, Canasai, Yamagishi, Junichi
This paper introduces the Cross-lingual Fact Extraction and VERification (XFEVER) dataset, designed for benchmarking fact verification models across different languages. We constructed it by translating the claim and evidence texts of the Fact Extraction and VERification (FEVER) dataset into six languages. The training and development sets were translated using machine translation, whereas the test set includes texts translated by professional translators as well as machine-translated texts. Using the XFEVER dataset, two cross-lingual fact verification scenarios, zero-shot learning and translate-train learning, are defined, and baseline models for each scenario are also proposed in this paper. Experimental results show that a multilingual language model can be used to build fact verification models in different languages efficiently. However, performance varies by language and is somewhat inferior to the English case. We also found that we can effectively mitigate model miscalibration by considering the prediction similarity between the English and target languages. The XFEVER dataset, code, and model checkpoints are available at https://github.com/nii-yamagishilab/xfever.
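One way to picture the calibration idea above is to flatten a target-language prediction toward the uniform distribution whenever it disagrees with the English-input prediction, so overconfident but unsupported verdicts lose probability mass. This is a hypothetical illustration of the intuition only; the similarity measure and exact mechanism in the paper may differ.

```python
def smooth_by_agreement(target_probs, english_probs):
    """Temper target-language confidence by cross-lingual agreement.

    Similarity is the overlap (sum of element-wise minima) of the two
    class distributions; low overlap pulls the output toward uniform.
    """
    sim = sum(min(p, q) for p, q in zip(target_probs, english_probs))
    n = len(target_probs)
    return [sim * p + (1 - sim) / n for p in target_probs]

confident = [0.9, 0.05, 0.05]   # target-language model, possibly overconfident
disagreeing = [0.1, 0.8, 0.1]   # English-side prediction for the same claim
print(smooth_by_agreement(confident, disagreeing))
```

When the two predictions agree closely the output is nearly unchanged, and when they diverge the smoothed distribution carries less (mis)calibrated confidence.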