 ambient


CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following

Ma, Yinghao, Li, Siyou, Yu, Juntao, Benetos, Emmanouil, Maezawa, Akira

arXiv.org Artificial Intelligence

Recent advances in audio-text large language models (LLMs) have opened new possibilities for music understanding and generation. However, existing benchmarks are limited in scope, often relying on simplified tasks or multiple-choice evaluations that fail to reflect the complexity of real-world music analysis. We reinterpret a broad range of traditional MIR annotations as instruction-following formats and introduce CMI-Bench, a comprehensive music instruction following benchmark designed to evaluate audio-text LLMs on a diverse set of music information retrieval (MIR) tasks. These include genre classification, emotion regression, emotion tagging, instrument classification, pitch estimation, key detection, lyrics transcription, melody extraction, vocal technique recognition, instrument performance technique detection, music tagging, music captioning, and (down)beat tracking, reflecting core challenges in MIR research. Unlike previous benchmarks, CMI-Bench adopts standardized evaluation metrics consistent with previous state-of-the-art MIR models, ensuring direct comparability with supervised approaches. We provide an evaluation toolkit supporting all open-source audio-text LLMs, including LTU, Qwen-audio, SALMONN, and MusiLingo. Experimental results reveal significant performance gaps between LLMs and supervised models, along with cultural, chronological, and gender biases, highlighting the potential and limitations of current models in addressing MIR tasks. CMI-Bench establishes a unified foundation for evaluating music instruction following, driving progress in music-aware LLMs.
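The idea of reinterpreting a classification annotation as an instruction-following example can be sketched roughly as follows. This is a minimal illustration, not the CMI-Bench toolkit: the `InstructionExample` type, the `TEMPLATES` prompts, the function names, and the file path are all hypothetical, and exact-match accuracy stands in for whichever task-specific metric the benchmark actually uses.

```python
from dataclasses import dataclass


@dataclass
class InstructionExample:
    """One instruction-following item built from a labelled audio clip (hypothetical type)."""
    audio_path: str
    instruction: str
    expected_answer: str


# Hypothetical prompt templates, one per task; not the benchmark's actual wording.
TEMPLATES = {
    "genre": "What is the genre of this music clip? Choose one of: {options}.",
    "key": "Which musical key is this clip in? Choose one of: {options}.",
}


def to_instruction(task, audio_path, label, options):
    """Turn a (clip, label) annotation into a prompt plus its reference answer."""
    if label not in options:
        raise ValueError(f"label {label!r} is not among the allowed options")
    prompt = TEMPLATES[task].format(options=", ".join(options))
    return InstructionExample(audio_path, prompt, label)


def exact_match_accuracy(predictions, examples):
    """Fraction of model answers equal to the reference, ignoring case and outer whitespace."""
    if not examples or len(predictions) != len(examples):
        raise ValueError("predictions and examples must be non-empty and equally long")
    hits = sum(
        pred.strip().lower() == ex.expected_answer.strip().lower()
        for pred, ex in zip(predictions, examples)
    )
    return hits / len(examples)


# Usage with a made-up annotation.
example = to_instruction("genre", "clips/0001.wav", "jazz", ["rock", "jazz", "classical"])
print(example.instruction)
```

Scoring free-text answers with the same metric a supervised classifier would receive is what makes the two kinds of model comparable under this framing.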


A novel approach to data generation in generative model

Kim, JaeHong, Shim, Jaewon

arXiv.org Artificial Intelligence

Variational Autoencoders (VAEs) and other generative models are widely employed in artificial intelligence to synthesize new data. However, current approaches rely on Euclidean geometric assumptions and statistical approximations that fail to capture the structured and emergent nature of data generation. This paper introduces the Convergent Fusion Paradigm (CFP) theory, a novel geometric framework that redefines data generation by integrating dimensional expansion accompanied by qualitative transformation. By modifying the latent space geometry to interact with emergent high-dimensional structures, CFP theory addresses key challenges such as identifiability issues and unintended artifacts like hallucinations in Large Language Models (LLMs). CFP theory is based on two key conceptual hypotheses that redefine how generative models structure relationships between data and algorithms. Through the lens of CFP theory, we critically examine existing metric-learning approaches. CFP theory advances this perspective by introducing time-reversed metric embeddings and structural convergence mechanisms, leading to a novel geometric approach that better accounts for data generation as a structured epistemic process. Beyond its computational implications, CFP theory provides philosophical insights into the ontological underpinnings of data generation. By offering a systematic framework for high-dimensional learning dynamics, CFP theory contributes to establishing a theoretical foundation for understanding the data-relationship structures in AI. Finally, future research on CFP theory will turn to its implications for fully realizing qualitative transformations, introducing the potential of Hilbert space in generative modeling.


Predicting concentration levels of air pollutants by transfer learning and recurrent neural network

Fong, Iat Hang, Li, Tengyue, Fong, Simon, Wong, Raymond K., Tallón-Ballesteros, Antonio J.

arXiv.org Artificial Intelligence

Air pollution (AP) poses a great threat to human health, and people are paying more attention than ever to its prediction. Accurate prediction of AP helps people plan their outdoor activities and helps protect human health. In this paper, long short-term memory (LSTM) recurrent neural networks (RNNs) are used to predict the future concentration of air pollutants (APs) in Macau, using meteorological data together with data on AP concentrations. In Macau, some air quality monitoring stations (AQMSs) have fewer observations overall, and some AQMSs have recorded fewer observations of certain types of APs. Therefore, transfer learning and pre-trained neural networks are employed to help AQMSs with less observed data build a neural network with high prediction accuracy. The experimental sample covers a period of more than 12 years and includes daily measurements of several APs as well as more classical meteorological values. Records from five stations, four of which are AQMSs and one of which is an automatic weather station, were prepared for this period and then processed with computational intelligence techniques to build a knowledge-based prediction system. The experiments show that LSTM RNNs initialized with transfer learning have higher prediction accuracy and require shorter training time than randomly initialized recurrent neural networks.


The A.I. Surveillance Companies That Say They Can Thwart Mass Shootings and Suicides

Slate

Our world has long been filled with cameras peering out over streets, malls, and schools. Many have been recording for years. But for the most part, no one ever looks at the footage. These little devices, perched on shelves and poles, exist primarily to create a record. If something happens and someone wants to learn more, they can go back.


Software Engineer, Front-end (Illinois) at Ambient.ai

#artificialintelligence

Ambient.ai is an AI company headquartered in Palo Alto on a mission to prevent as many security incidents as possible. Our breakthrough technology combines cutting-edge deep learning with a contextual knowledge model to achieve human-like perception ability. Ambient's flagship product has been deployed by multiple Fortune 100 companies to solve a mission-critical problem in a way that has never been possible. The company was founded in 2017 by Shikhar Shrestha and Vikesh Khanna, artificial intelligence experts from Stanford University who previously built iconic products at Apple, Google, Microsoft, and Dropbox. We are a Series-B company backed by Andreessen Horowitz (a16z), SV Angel, Y Combinator, and visionary angels like Jyoti Bansal, Mark Leslie, and Elad Gil.


DHL and IBM report cites benefits and potential of AI in logistics

#artificialintelligence

The advent of Artificial Intelligence (AI) technology has been making inroads over the years in various sectors. But a joint report issued this week by global express and logistics services provider DHL and technology powerhouse IBM takes an in-depth look into the impact of AI within logistics. The report, entitled "Artificial Intelligence in Logistics: A collaborative report by DHL and IBM on implications and use cases for the logistics industry," examines different ways in which AI can be used for augmenting logistics operations, especially now at a time when leveraging AI is more accessible and affordable than it has been in the past. "Everything can be enhanced through modern technology, and I think AI is at the beginning of really big usefulness," said Ken Allen, CEO of DHL Express, in an interview. "We already have big data and IoT and this is another part of that. This type of digitalization proposes the next 'S-curve' after globalization that is really driving our business in this fast-growing world of e-commerce. Now…for the first time through broadband and mobile devices everyone can be connected, and the possibilities are endless. It creates massive opportunities, as well as massive complexities in that everyone in the world is a potential customer now. And we need to use AI, big data, and other forms of digital marketing to reach all of our customers."


The First Workshop on Artificial Intelligence Techniques for Ambient Intelligence (AITAmI '06)

AI Magazine

The first annual workshop on the role of AI in ambient intelligence was held in Riva del Garda, Italy, on August 29, 2006. The workshop was colocated with the European Conference on Artificial Intelligence (ECAI 2006). It provided an opportunity for researchers in a variety of AI subfields, together with representatives of commercial interests, to explore ambient intelligence technology and applications. Ambient intelligence is an AI-based paradigm with a high potential to affect daily life in the near future. The broad idea is to enrich a space (such as a room, house, building, bus station, or a critical area in a hospital) with sensors tied to intelligent software, so that the people using the space can benefit from a responsive, even wise environment.


Mossberg: The Disappearing Computer

#artificialintelligence

The biggest hardware and software arrival since the iPad in 2010 has been Amazon's Echo voice-controlled intelligent speaker, powered by its Alexa software assistant. But just because you're not seeing amazing new consumer tech products on Amazon, in the app stores, or at the Apple Store or Best Buy, that doesn't mean the tech revolution is stuck or stopped. They are: Artificial intelligence / machine learning, augmented reality, virtual reality, robotics and drones, smart homes, self-driving cars, and digital health / wearables. Google has changed its entire corporate mission to be "AI first" and, with Google Home and Google Assistant, to perform tasks via voice commands and eventually hold real, unstructured conversations.


Intelligent Agents Things

#artificialintelligence

Artificial Intelligence is all the rage amongst founders and investors. And for good reason: regardless of where you think we are in the hype cycle, it's increasingly clear AI is eventually going to touch everything. The questions now turn to when and how it will impact specific markets and categories. I've been particularly excited about the consumerization of AI and the impact on everyday products and platforms for consumers and professionals. There's a tendency to reduce AI to machine learning (ML), the subfield primarily responsible for AI's resurgence, but ML is just one part of a broader story.


Cinematic, Ambient, Inhabitable Narrative Environments: Story Systems in Search of an Artificial Intelligence Engine

Wingate, Steven Nicholas (South Dakota State University)

AAAI Conferences

Cinematic, Ambient, Inhabitable Narrative Environments (CAINEs) are conceptual AI-driven interactive story systems combining text, audio, and visual imagery that are scalable and adaptable to a wide range of storytelling needs and interactor inputs. Conceived by an artist outside the AI community, they represent an opportunity to use AI in a nontraditional and immersive narrative fashion that relies not on the goal-based arrangement of story elements, but on the accretion and association of those elements in the minds of interactors. This paper represents the initial phase of the project’s development.