Instrumentation


Taming Uncertainty via Automation: Observing, Analyzing, and Optimizing Agentic AI Systems

Moshkovich, Dany, Zeltyn, Sergey

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly deployed within agentic systems - collections of interacting, LLM-powered agents that execute complex, adaptive workflows using memory, tools, and dynamic planning. While enabling powerful new capabilities, these systems also introduce unique forms of uncertainty stemming from probabilistic reasoning, evolving memory states, and fluid execution paths. Traditional software observability and operations practices fall short in addressing these challenges. This paper presents our vision of AgentOps: a comprehensive framework for observing, analyzing, optimizing, and automating the operation of agentic AI systems. We identify distinct needs across four key roles - developers, testers, site reliability engineers (SREs), and business users - each of whom engages with the system at different points in its lifecycle. We present the AgentOps Automation Pipeline, a six-stage process encompassing behavior observation, metric collection, issue detection, root cause analysis, optimized recommendations, and runtime automation. Throughout, we emphasize the critical role of automation in managing uncertainty and enabling self-improving AI systems - not by eliminating uncertainty, but by taming it to ensure safe, adaptive, and effective operation.


Architecting software monitors for control-flow anomaly detection through large language models and conformance checking

Vitale, Francesco, Flammini, Francesco, Caporuscio, Mauro, Mazzocca, Nicola

arXiv.org Artificial Intelligence

Context: Ensuring high levels of dependability in modern computer-based systems has become increasingly challenging due to their complexity. Although systems are validated at design time, their behavior can differ at run-time, possibly showing control-flow anomalies due to "unknown unknowns". Objective: We aim to detect control-flow anomalies through software monitoring, which verifies run-time behavior by logging software execution and detecting deviations from expected control flow. Methods: We propose a methodology to develop software monitors for control-flow anomaly detection through Large Language Models (LLMs) and conformance checking. The methodology builds on existing software development practices to maintain traditional V&V while providing an additional level of robustness and trustworthiness. It leverages LLMs to link design-time models and implementation code, automating source-code instrumentation. The resulting event logs are analyzed via conformance checking, an explainable and effective technique for control-flow anomaly detection. Results: We test the methodology on a case-study scenario from the European Railway Traffic Management System / European Train Control System (ERTMS/ETCS), which is a railway standard for modern interoperable railways. The results obtained from the ERTMS/ETCS case study demonstrate that LLM-based source-code instrumentation can achieve up to 84.775% control-flow coverage of the reference design-time process model, while the subsequent conformance checking-based anomaly detection reaches a peak performance of 96.610% F1-score and 93.515% AUC. Conclusion: Incorporating domain-specific knowledge to guide LLMs in source-code instrumentation substantially improved the reliability and quality of the software logs and enabled effective control-flow anomaly detection through conformance checking.
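The conformance-checking idea in this abstract can be illustrated with a minimal sketch: replay a logged trace against the directly-follows relation of a design-time process model and score how many observed transitions the model permits. The model, event names, and fitness measure below are illustrative simplifications, not taken from the ERTMS/ETCS case study or any specific conformance-checking library.

```python
# Simplified directly-follows conformance check: score a logged trace
# against a set of (event, next_event) pairs allowed by a process model.

def conformance_fitness(trace, allowed_pairs):
    """Fraction of consecutive event pairs permitted by the model."""
    pairs = list(zip(trace, trace[1:]))
    if not pairs:
        return 1.0
    valid = sum(1 for pair in pairs if pair in allowed_pairs)
    return valid / len(pairs)

# Hypothetical process model for illustration only.
MODEL = {("start", "authenticate"), ("authenticate", "move"), ("move", "stop")}

normal = ["start", "authenticate", "move", "stop"]
anomalous = ["start", "move", "stop"]  # skips the authentication step

print(conformance_fitness(normal, MODEL))     # 1.0
print(conformance_fitness(anomalous, MODEL))  # 0.5
```

Real conformance checking (e.g., token-based replay or alignments) is considerably richer, but the principle is the same: low fitness flags a control-flow anomaly.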


That New Hit Song on Spotify? It Was Made by A.I.

The New Yorker

That New Hit Song on Spotify? Aspiring musicians are churning out tracks using generative artificial intelligence. Some are topping the charts. Nick Arter, a thirty-five-year-old in Washington, D.C., never quite managed to become a professional musician the old-fashioned way. He grew up in Harrisburg, Pennsylvania, in a music-loving family.


TrackCore-F: Deploying Transformer-Based Subatomic Particle Tracking on FPGAs

Blankestijn, Arjan, Odyurt, Uraz, Yousefzadeh, Amirreza

arXiv.org Artificial Intelligence

The Transformer Machine Learning (ML) architecture has been gaining considerable momentum in recent years. In particular, computational High-Energy Physics tasks such as jet tagging and particle track reconstruction (tracking) have either achieved proper solutions or reached considerable milestones using Transformers. On the other hand, the use of specialised hardware accelerators, especially FPGAs, is an effective method to achieve online or pseudo-online latencies. The development and integration of Transformer-based ML on FPGAs is still ongoing, and support from current tools ranges from very limited to non-existent. Additionally, FPGA resources present a significant constraint. Considering model size alone, smaller models can be deployed directly, while larger models must be partitioned in a meaningful and, ideally, automated way. We aim to develop methodologies and tools for monolithic or partitioned Transformer synthesis, specifically targeting inference. Our primary use-case involves two machine learning model designs for tracking, derived from the TrackFormers project. We elaborate our development approach, present preliminary results, and provide comparisons.


TuneGenie: Reasoning-based LLM agents for preferential music generation

Pandey, Amitesh, Arifdjanov, Jafarbek, Tiwari, Ansh

arXiv.org Artificial Intelligence

Recently, large language models (LLMs) have shown great promise across a wide range of tasks, from generating images to reasoning spatially. Considering their remarkable (and growing) textual reasoning capabilities, we investigate LLMs' ability to analyze an individual's music preferences (based on playlist metadata, personal write-ups, etc.) and to produce effective prompts (based on these analyses) to be passed to Suno AI (a generative AI tool for music production). Our proposition of a novel LLM-based textual representation to music model (which we call TuneGenie) and the various methods we develop to evaluate and benchmark similar models add to the increasing (and increasingly controversial) corpus of research on the use of AI in generating art.


Instrumentation for Better Demonstrations: A Case Study

Proesmans, Remko, Lips, Thomas, wyffels, Francis

arXiv.org Artificial Intelligence

Learning from demonstrations is a powerful paradigm for robot manipulation, but its effectiveness hinges on both the quantity and quality of the collected data. In this work, we present a case study of how instrumentation, i.e., the integration of sensors, can improve the quality of demonstrations and automate data collection. We instrument a squeeze bottle with a pressure sensor to learn a liquid dispensing task, enabling automated data collection via a PI controller. Transformer-based policies trained on automated demonstrations outperform those trained on human data in 78% of cases. Our findings indicate that instrumentation not only facilitates scalable data collection but also leads to better-performing policies, highlighting its potential in the pursuit of generalist robotic agents.
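The PI controller mentioned in this abstract is a standard feedback loop; a toy sketch of regulating bottle pressure toward a target follows. The first-order plant model, gains, and leak constant are invented for illustration and do not reflect the paper's actual hardware setup.

```python
# Toy PI control loop: regulate a simulated bottle pressure to a setpoint.

class PIController:
    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0  # accumulated error for the integral term

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral

# First-order "squeeze bottle" plant: pressure rises with control effort
# and leaks back toward zero when unsqueezed.
dt, leak = 0.01, 0.5
pressure = 0.0
pi = PIController(kp=2.0, ki=1.0, dt=dt)
for _ in range(2000):                   # 20 s of simulated time
    u = pi.update(1.0, pressure)        # target pressure: 1.0
    pressure += (u - leak * pressure) * dt

print(round(pressure, 3))  # settles near the 1.0 setpoint
```

The integral term is what drives steady-state error to zero despite the leak, which is why a plain proportional controller would undershoot here.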


NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms

Wang, Yashan, Wu, Shangda, Hu, Jianhuai, Du, Xingjian, Peng, Yueqi, Huang, Yongxin, Fan, Shuai, Li, Xiaobing, Yu, Feng, Sun, Maosong

arXiv.org Artificial Intelligence

We introduce NotaGen, a symbolic music generation model aiming to explore the potential of producing high-quality classical sheet music. Inspired by the success of Large Language Models (LLMs), NotaGen adopts pre-training, fine-tuning, and reinforcement learning paradigms (henceforth referred to as the LLM training paradigms). It is pre-trained on 1.6M pieces of music in ABC notation, and then fine-tuned on approximately 9K high-quality classical compositions conditioned on "period-composer-instrumentation" prompts. For reinforcement learning, we propose the CLaMP-DPO method, which further enhances generation quality and controllability without requiring human annotations or predefined rewards. Our experiments demonstrate the efficacy of CLaMP-DPO in symbolic music generation models with different architectures and encoding schemes. Furthermore, subjective A/B tests show that NotaGen outperforms baseline models against human compositions, greatly advancing musical aesthetics in symbolic music generation.
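The abstract does not spell out CLaMP-DPO, but it names Direct Preference Optimization as the base technique. For reference, the standard DPO objective (which CLaMP-DPO presumably adapts, with CLaMP-derived preference pairs replacing human annotations) optimizes a policy $\pi_\theta$ against a frozen reference $\pi_{\mathrm{ref}}$ over preferred/dispreferred completions $(y_w, y_l)$:

\[
\mathcal{L}_{\mathrm{DPO}}(\theta) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\!\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
\]

where $\sigma$ is the logistic function and $\beta$ controls deviation from the reference model. This formula is standard DPO background, not a claim about NotaGen's exact loss.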


Analyzing Byte-Pair Encoding on Monophonic and Polyphonic Symbolic Music: A Focus on Musical Phrase Segmentation

Le, Dinh-Viet-Toan, Bigo, Louis, Keller, Mikaela

arXiv.org Artificial Intelligence

Byte-Pair Encoding (BPE) is an algorithm commonly used in Natural Language Processing to build a vocabulary of subwords, which has been recently applied to symbolic music. Given that symbolic music can differ significantly from text, particularly with polyphony, we investigate how BPE behaves with different types of musical content. This study provides a qualitative analysis of BPE's behavior across various instrumentations and evaluates its impact on a musical phrase segmentation task for both monophonic and polyphonic music. Our findings show that the BPE training process is highly dependent on the instrumentation and that BPE "supertokens" succeed in capturing abstract musical content. In a musical phrase segmentation task, BPE notably improves performance in a polyphonic setting, but enhances performance in monophonic tunes only within a specific range of BPE merges.
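The core BPE operation discussed above, merging the most frequent adjacent token pair into a "supertoken", can be sketched in a few lines. The note-name tokens are illustrative; real symbolic-music encodings carry richer information (pitch, duration, velocity, voice).

```python
# One Byte-Pair Encoding merge step on a symbolic-music token sequence.
from collections import Counter

def bpe_merge_step(tokens):
    """Merge the most frequent adjacent token pair into one supertoken."""
    counts = Counter(zip(tokens, tokens[1:]))
    if not counts:
        return tokens
    (a, b), _ = counts.most_common(1)[0]  # ties broken by first occurrence
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
            merged.append(a + "_" + b)    # new "supertoken"
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

melody = ["C4", "E4", "G4", "C4", "E4", "G4", "A4"]
print(bpe_merge_step(melody))  # ['C4_E4', 'G4', 'C4_E4', 'G4', 'A4']
```

Training a full BPE vocabulary just repeats this step for a chosen number of merges, which is the merge-count hyperparameter the segmentation results above are sensitive to.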


A five-bar mechanism to assist finger flexion-extension movement: system implementation

Zapatero-Gutiérrez, Araceli, Castillo-Castañeda, Eduardo, Laribi, Med Amine

arXiv.org Artificial Intelligence

The lack of specialized personnel and assistive technology for rehabilitation therapies is one of the challenges facing the health sector today, and this shortage is projected to grow. For researchers and engineers, it represents an opportunity to innovate and develop devices that improve and optimize rehabilitation services for the benefit of society. Among the different types of injuries, hand injuries occur most frequently. These injuries require a rehabilitation process in order for the hand to regain its functionality. This article presents the fabrication and instrumentation of an end-effector prototype, based on a five-bar configuration, for finger rehabilitation that executes a natural flexion-extension movement. The dimensions were obtained through gradient-based optimization and evaluated in MATLAB. Experimental tests were carried out to demonstrate the prototype's functionality and the effectiveness of a five-bar mechanism acting in a vertical plane, where gravity influences the mechanism's performance. Position control using fifth-order polynomials with via points was implemented in the joint space. The design of the end-effector was also evaluated through a theoretical comparison, calculated as a function of a real flexion-extension trajectory of the fingers and the angle of rotation obtained through an IMU. As a result, controlling the two degrees of freedom of the mechanism at several points of the trajectory ensures that the end-effector follows the desired trajectory, and therefore the fingers' range of motion, supporting full patient recovery.


Emotion Manipulation Through Music -- A Deep Learning Interactive Visual Approach

Abdalla, Adel N., Osborne, Jared, Andonie, Razvan

arXiv.org Artificial Intelligence

In recent years, the fields of Music Information Retrieval (MIR) and Music Emotion Recognition (MER) have received significant attention, leading to multiple advances in how music is analyzed [1, 2]. These developments have increased the accuracy in determining what emotions are present in a given music sample, but the current state of the art is only now passing 75% through the use of Random Forest and Support Vector Machine models [3]. This is in contrast to the field of speech recognition, where current models are approaching 100% accuracy across hundreds of languages for word identification [4] and 85% for standard speech emotion recognition [5]. The additional challenges in music recognition come from the nature of music itself as the lyrical and emotional content of a vocalist's contribution are only one part of the whole. Tempo, rhythm, timbre, instrumentation choice, perceived genre, and other factors contribute together to shape the emotional and tonal landscape of any given work into a unique blend that is interpreted subjectively by individual listeners [6]. The goal of our paper is to show that by changing the underlying structure of a small subset of musical features of any given musical piece, we can adjust the perceived emotional content of the work towards a specific desired emotion.