AITopics | Peri, Raghuveer

Collaborating Authors

Peri, Raghuveer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models

Peri, Raghuveer, Jayanthi, Sai Muralidhar, Ronanki, Srikanth, Bhatia, Anshu, Mundnich, Karel, Dingliwal, Saket, Das, Nilaksh, Hou, Zejiang, Huybrechts, Goeric, Vishnubhotla, Srikanth, Garcia-Romero, Daniel, Srinivasan, Sundararajan, Han, Kyu J, Kirchhoff, Katrin

arXiv.org Artificial IntelligenceMay-14-2024

Integrated Speech and Large Language Models (SLMs) that can follow speech instructions and generate relevant text responses have gained popularity lately. However, the safety and robustness of these models remains largely unclear. In this work, we investigate the potential vulnerabilities of such instruction-following speech-language models to adversarial attacks and jailbreaking. Specifically, we design algorithms that can generate adversarial examples to jailbreak SLMs in both white-box and black-box attack settings without human involvement. Additionally, we propose countermeasures to thwart such jailbreaking attacks. Our models, trained on dialog data with speech instructions, achieve state-of-the-art performance on spoken question-answering task, scoring over 80% on both safety and helpfulness metrics. Despite safety guardrails, experiments on jailbreaking demonstrate the vulnerability of SLMs to adversarial perturbations and transfer attacks, with average attack success rates of 90% and 10% respectively when evaluated on a dataset of carefully designed harmful questions spanning 12 different toxic categories. However, we demonstrate that our proposed countermeasures reduce the attack success significantly.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2405.08317

Country: Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.89)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

VoxWatch: An open-set speaker recognition benchmark on VoxCeleb

Peri, Raghuveer, Sadjadi, Seyed Omid, Garcia-Romero, Daniel

arXiv.org Artificial IntelligenceJun-30-2023

Despite its broad practical applications such as in fraud prevention, open-set speaker identification (OSI) has received less attention in the speaker recognition community compared to speaker verification (SV). OSI deals with determining if a test speech sample belongs to a speaker from a set of pre-enrolled individuals (in-set) or if it is from an out-of-set speaker. In addition to the typical challenges associated with speech variability, OSI is prone to the "false-alarm problem"; as the size of the in-set speaker population (a.k.a watchlist) grows, the out-of-set scores become larger, leading to increased false alarm rates. This is in particular challenging for applications in financial institutions and border security where the watchlist size is typically of the order of several thousand speakers. Therefore, it is important to systematically quantify the false-alarm problem, and develop techniques that alleviate the impact of watchlist size on detection performance. Prior studies on this problem are sparse, and lack a common benchmark for systematic evaluations. In this paper, we present the first public benchmark for OSI, developed using the VoxCeleb dataset. We quantify the effect of the watchlist size and speech duration on the watchlist-based speaker detection task using three strong neural network based systems. In contrast to the findings from prior research, we show that the commonly adopted adaptive score normalization is not guaranteed to improve the performance for this task. On the other hand, we show that score calibration and score fusion, two other commonly used techniques in SV, result in significant improvements in OSI performance.

artificial intelligence, machine learning, pattern recognition, (17 more...)

arXiv.org Artificial Intelligence

2307.00169

Genre: Research Report (1.00)

Industry:

Government (0.34)
Banking & Finance (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Acoustic Processing (0.71)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Speech Recognition (0.64)

Add feedback