 Tombouctou


Whispering Context: Distilling Syntax and Semantics for Long Speech Transcripts

Altinok, Duygu

arXiv.org Artificial Intelligence

ASR systems often struggle to maintain syntactic and semantic accuracy in long audio transcripts, impacting tasks like Named Entity Recognition (NER), capitalization, and punctuation. We propose a novel approach that enhances ASR by distilling contextual knowledge from LLaMA models into Whisper. Our method uses two strategies: (1) token-level distillation with optimal transport to align dimensions and sequence lengths, and (2) representation loss minimization between sentence embeddings of Whisper and LLaMA, blending syntax and semantics. Evaluations on the Spoken Wikipedia dataset, a benchmark with long audio recordings and rich entities, demonstrate significant improvements in Word Error Rate (WER), NER, capitalization, and punctuation success. By introducing novel NER metrics and exploring semantics-aware ASR, our work highlights the value of integrating linguistic context into transcription, setting a foundation for robust, context-aware ASR in long-form speech.


Towards Robust Knowledge Representations in Multilingual LLMs for Equivalence and Inheritance based Consistent Reasoning

Arora, Gaurav, Merugu, Srujana, Jain, Shreya, Saxena, Vaibhav

arXiv.org Artificial Intelligence

Reasoning and linguistic skills form the cornerstone of human intelligence, facilitating problem-solving and decision-making. Recent advances in Large Language Models (LLMs) have led to impressive linguistic capabilities and emergent reasoning behaviors, fueling widespread adoption across application domains. However, LLMs still struggle with complex reasoning tasks, highlighting their systemic limitations. In this work, we focus on evaluating whether LLMs have the requisite representations to reason using two foundational relationships: "equivalence" and "inheritance". We introduce novel tasks and benchmarks spanning six languages and observe that current SOTA LLMs produce conflicting answers to the same questions across languages in 17.3-57.5% of cases and violate inheritance constraints in up to 37.2% of cases. To enhance consistency across languages, we propose novel "Compositional Representations" where tokens are represented as compositions of equivalent tokens across languages, with the resulting conflict reduction (up to -4.7%) indicating the benefits of shared LLM representations.


Effects of Human Adversarial and Affable Samples on BERT Generalization

Elangovan, Aparna, He, Jiayuan, Li, Yuan, Verspoor, Karin

arXiv.org Artificial Intelligence

BERT-based models have had strong performance on leaderboards, yet have been demonstrably worse in real-world settings requiring generalization. A limited quantity of training data is considered a key impediment to achieving generalizability in machine learning. In this paper, we examine the impact of training data quality, not quantity, on a model's generalizability. We consider two characteristics of training data: the portion of human-adversarial (h-adversarial) samples, i.e., sample pairs with seemingly minor differences but different ground-truth labels, and human-affable (h-affable) training samples, i.e., sample pairs with minor differences but the same ground-truth label. We find that for a fixed size of training samples, as a rule of thumb, having 10-30% h-adversarial instances improves the precision, and therefore F1, by up to 20 points in the tasks of text classification and relation extraction. Increasing h-adversarials beyond this range can result in performance plateaus or even degradation. In contrast, h-affables may not contribute to a model's generalizability and may even degrade generalization performance.


AI Is Steeped in Big Tech's 'Digital Colonialism'

WIRED

It has been said that algorithms are "opinions embedded in code." Few people understand the implications of that better than Abeba Birhane. Born and raised in Bahir Dar, Ethiopia, Birhane moved to Ireland to study: first psychology, then philosophy, then a PhD in cognitive science at University College Dublin. During her doctorate, she found herself surrounded by software developers and data science students--immersed in the models they were building and the data sets they were using. But she started to realize that no one was really asking questions about what was actually in those data sets.


Acoustic scene classification using auditory datasets

Kumpawat, Jayesh, Dey, Shubhajit

arXiv.org Artificial Intelligence

The approach not only challenges some of the fundamental mathematical techniques used in earlier experiments of this kind but also opens new scope for interesting results. The physics governing spectrograms has been optimized in the project, along with an exploration of how it handles the intense requirements of the problem at hand. The major contributions of this project involve using better mathematical techniques and problem-specific machine learning methods. Improvised data analysis and data augmentation techniques for audio datasets, such as frequency masking and random frequency-time stretching, are used in the project and are explained in this paper. Audio transform principles were also tried and explored in the methodology, and the insights gained were used constructively in the later stages of the project. Using a deep learning principle is one of them. The paper also presents potential scope and upcoming research openings in both the short and long term. Although many of the results are domain-specific as of now, they are potent enough to produce novel solutions in a variety of other domains.


Forty fighters 'neutralised' in drone strikes in Niger

Al Jazeera

French drone strikes have killed nearly 40 fighters travelling on motorcycles near Niger's border with Burkina Faso, France's military said on Thursday. In a statement, the French military called the strikes a "new tactical success" for France's counterterrorism efforts in Africa's Sahel region, named Operation Barkhane. "Intelligence obtained from Nigerien units in contact with the column confirmed that the motorcycles belonged to an armed terrorist group moving between Burkina Faso and Niger," Barkhane said in the statement. "In close coordination with Niger's Armed Forces, the Barkhane force conducted several strikes against the column. Nearly 40 terrorists were neutralised."

What is Synthetic Intelligence and What Does It Mean for Humanity?

#artificialintelligence

A merger between humans and machines is coming, and it's not what you may have thought. Something mysterious flickered into reality when our ancestors first learned to extract knowledge from their heads and embed it in tools. Now, millions of years later, our tools are fusing with us and, in so doing, bringing about something that is part biological and part technological. We are incubating this new intelligence in our organizations, but it is also true that it represents an extension of ourselves. Humanity is like a seed in an enigmatic womb made up of artificial intelligence and automation.


IoT Inventory Management Uses AI for Supply Chains

#artificialintelligence

Famous American philosopher George Carlin did a brilliant riff on stuff. He rightly contended that we just spend our lives moving and acquiring our stuff, and a house is just a pile of your stuff with a cover on it. Companies that are in the business of making and selling stuff have to be able to track what could amount to millions of widgets until they can unload their stuff onto someone else. There are plenty of inventory management software systems out there using artificial intelligence to help retailers be more efficient peddling stuff. For example, a company called Blue Yonder uses machine learning algorithms to crunch data, including sales and local weather forecasts, to help stores reduce out-of-stock rates of stuff while boosting profits so they can sell more stuff.


Accelerating AI: Past...

#artificialintelligence

SiFive does a quarterly series of tech talks, not necessarily directly to do with SiFive or even RISC-V. For example, last quarter it was Paul Kocher (and if you don't know that name, you need to go and read my post about that talk Paul Kocher: Differential Power Analysis and Spectre). This quarter it was Krste Asanović on Accelerating AI: Past, Present, and Future. This post will cover the past. The present and future have to wait (good title for a movie?).