 bark


Who Gets the Mic? Investigating Gender Bias in the Speaker Assignment of a Speech-LLM

Puhach, Dariia, Payberah, Amir H., Székely, Éva

arXiv.org Artificial Intelligence

However, whether these similarities extend to gender bias remains an open question. This study proposes a methodology leveraging speaker assignment as an analytic tool for bias investigation. Unlike text-based models, which encode gendered associations implicitly, Speech-LLMs must produce a gendered voice, making speaker selection an explicit bias cue. We evaluate Bark, a Text-to-Speech (TTS) model, analyzing its default speaker assignments for textual prompts. If Bark's speaker selection systematically aligns with gendered associations, it may reveal patterns in its training data or model design. To test this, we construct two datasets: (i) Professions, containing gender-stereotyped occupations, and (ii) Gender-Colored Words, featuring gendered connotations. While Bark does not exhibit systematic bias, it demonstrates gender awareness and has some gender inclinations.


BARK: A Fully Bayesian Tree Kernel for Black-box Optimization

Boyne, Toby, Folch, Jose Pablo, Lee, Robert M, Shafei, Behrang, Misener, Ruth

arXiv.org Machine Learning

We perform Bayesian optimization using a Gaussian process perspective on Bayesian Additive Regression Trees (BART). Our BART Kernel (BARK) uses tree agreement to define a posterior over piecewise-constant functions, and we explore the space of tree kernels using a Markov chain Monte Carlo approach. Where BART only samples functions, the resulting BARK model obtains samples of Gaussian processes defining distributions over functions, which allow us to build acquisition functions for Bayesian optimization. Our tree-based approach enables global optimization over the surrogate, even for mixed-feature spaces. Moreover, where many previous tree-based kernels provide uncertainty quantification over function values, our sampling scheme captures uncertainty over the tree structure itself. Our experiments show the strong performance of BARK on both synthetic and applied benchmarks, due to the combination of our fully Bayesian surrogate and the optimization procedure.
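
To make the phrase "tree agreement" concrete, here is a minimal sketch of one common reading: the kernel value for two inputs is the fraction of trees in an ensemble that place both inputs in the same leaf. This is an illustration under that assumption, not the authors' implementation; the names `Node`, `leaf_of`, and `tree_agreement_kernel` are hypothetical, and the paper's kernel may scale or weight trees differently.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """A binary tree node; a node with feature=None is a leaf (hypothetical structure)."""
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    leaf_id: int = -1


def leaf_of(tree: Node, x: Sequence[float]) -> int:
    """Follow the splits from the root and return the id of the leaf that x lands in."""
    node = tree
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.leaf_id


def tree_agreement_kernel(trees: Sequence[Node], x: Sequence[float], y: Sequence[float]) -> float:
    """Fraction of trees in which x and y fall into the same leaf (assumed kernel form)."""
    same = sum(leaf_of(t, x) == leaf_of(t, y) for t in trees)
    return same / len(trees)


# Two decision stumps: one splits on feature 0, the other on feature 1.
stump_a = Node(feature=0, threshold=0.5, left=Node(leaf_id=0), right=Node(leaf_id=1))
stump_b = Node(feature=1, threshold=0.5, left=Node(leaf_id=0), right=Node(leaf_id=1))
trees = [stump_a, stump_b]

k_xy = tree_agreement_kernel(trees, (0.2, 0.2), (0.3, 0.9))  # agree in stump_a only
k_xx = tree_agreement_kernel(trees, (0.2, 0.2), (0.2, 0.2))  # a point always agrees with itself
```

Because every tree is piecewise constant, so is the resulting kernel, which is consistent with the abstract's description of a posterior over piecewise-constant functions.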


Order-Sorted Intensional Logic: Expressing Subtyping Polymorphism with Typing Assertions and Quantification over Concepts

Marković, Đorđe, Denecker, Marc

arXiv.org Artificial Intelligence

Subtyping, also known as subtype polymorphism, is a concept extensively studied in programming language theory, delineating the substitutability relation among datatypes. This property ensures that programs designed for supertype objects remain compatible with their subtypes. In this paper, we explore the capability of order-sorted logic for utilizing these ideas in the context of Knowledge Representation. We recognize two fundamental limitations: First, the inability of this logic to address the concept rather than the value of non-logical symbols, and second, the lack of language constructs for constraining the type of terms. Consequently, we propose guarded order-sorted intensional logic, where guards are language constructs for annotating typing information and intensional logic provides support for quantification over concepts.



Kallini et al. (2024) do not compare impossible languages with constituency-based ones

Hunter, Tim

arXiv.org Artificial Intelligence

A central goal of linguistic theory is to find a precise characterization of the notion "possible human language", in the form of a computational device that is capable of describing all and only the languages that can be acquired by a typically developing human child. The success of recent large language models (LLMs) in NLP applications arguably raises the possibility that LLMs might be computational devices that meet this goal. This would only be the case if, in addition to succeeding in learning human languages, LLMs struggle to learn "impossible" human languages. Kallini et al. (2024; "Mission: Impossible Language Models", Proc. ACL) conducted experiments aiming to test this by training GPT-2 on a variety of synthetic languages, and found that it learns some more successfully than others. They present these asymmetries as support for the idea that LLMs' inductive biases align with what is regarded as "possible" for human languages, but the most significant comparison has a confound that makes this conclusion unwarranted. In this paper I explain the confound and suggest some ways forward towards constructing a comparison that appropriately tests the underlying issue.


Towards Dog Bark Decoding: Leveraging Human Speech Processing for Automated Bark Classification

Abzaliev, Artem, Espinosa, Humberto Pérez, Mihalcea, Rada

arXiv.org Artificial Intelligence

Similar to humans, animals make extensive use of verbal and non-verbal forms of communication, including a large range of audio signals. In this paper, we address dog vocalizations and explore the use of self-supervised speech representation models pre-trained on human speech to address dog bark classification tasks that find parallels in human-centered tasks in speech recognition. We specifically address four tasks: dog recognition, breed identification, gender classification, and context grounding. We show that using speech embedding representations significantly improves over simpler classification baselines. Further, we also find that models pre-trained on large human speech acoustics can provide additional performance boosts on several tasks.


We trialed this dystopian helmet that monitors your BRAINWAVES while you drive - and barks at you if you're not paying attention!

Daily Mail - Science & tech

A new helmet that monitors your brain while you drive aims to prevent accidents caused by fatigue or lapses in concentration. The dystopian invention is from Japanese company Macnica, and DailyMail.com trialed it. The helmet uses a series of electrodes and sensors that monitor activity in the important regions of your brain. 'By measuring your brain activity, we can measure millisecond-by-millisecond your state, from simple measures like how drowsy you are to more sophisticated concentration,' Leon Deouell, chief science officer of Inner Eye, told DailyMail.com. By tapping into your brainwaves and analyzing them with AI, Macnica generates a readout that shows how well you're concentrating, whether you're distracted, and how drowsy you are.


End to end Hindi to English speech conversion using Bark, mBART and a finetuned XLSR Wav2Vec2

Tathe, Aniket, Kamble, Anand, Kumbharkar, Suyash, Bhandare, Atharva, Mitra, Anirban C.

arXiv.org Artificial Intelligence

Speech has long been a barrier to effective communication and connection, persisting as a challenge in our increasingly interconnected world. This research paper introduces a transformative solution to this persistent obstacle - an end-to-end speech conversion framework tailored for Hindi-to-English translation, culminating in the synthesis of English audio. By integrating cutting-edge technologies such as XLSR Wav2Vec2 for automatic speech recognition (ASR), mBART for neural machine translation (NMT), and a Text-to-Speech (TTS) synthesis component, this framework offers a unified and seamless approach to cross-lingual communication. We delve into the intricate details of each component, elucidating their individual contributions and exploring the synergies that enable a fluid transition from spoken Hindi to synthesized English audio.


Generative AI can help bring tomorrow's gaming NPCs to life

Engadget

Elves and Argonians clipping through walls and stepping through tables, blacksmiths who won't acknowledge your existence until you take a single step to the left, Draugers that drop into rag-doll seizures the moment you put an arrow through their eye -- Bethesda's long-running Elder Scrolls RPG series is beloved for many reasons, but the realism of its non-playable characters (NPCs) is not among them. But the days of hearing the same rote quotes and watching the same half-hearted search patterns perpetually repeated from NPCs are quickly coming to an end. It's all thanks to the emergence of generative chatbots that are helping game developers craft more lifelike, realistic characters and in-game action. "Game AI is seldom about any deep intelligence but rather about the illusion of intelligence," Steve Rabin, Principal Software Engineer at Electronic Arts, wrote in the 2017 essay, The Illusion of Intelligence. "Often we are trying to create believable human behavior, but the actual intelligence that we are able to program is fairly constrained and painfully brittle."


Ubisoft Introduces "Ghostwriter" AI-Powered Video Game Dialogue Generator - Open Data Science - Your News Source for AI, Machine Learning & more

#artificialintelligence

The popularity of open-world games has grown over the last few years. Much of this is due to their immersive worlds, which tend to be so vast that players could spend hours just walking around enjoying the slightest details. Examples include The Witcher 3: Wild Hunt, Elden Ring, Elder Scrolls V: Skyrim, and many others. But a massive, immersive world needs non-playable characters, or NPCs, which are a valuable aspect of open-world games, and this is where Ghostwriter comes in. For many outside the gaming industry, the chatter is just that: background noise that helps your mind believe it's in another world.