Goto

Collaborating Authors

 Markov Models


A Time-Series Data Augmentation Model through Diffusion and Transformer Integration

arXiv.org Artificial Intelligence

IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS 1 A Time-Series Data Augmentation Model through Diffusion and Transformer Integration Y uren Zhang ID, Zhongnan Pu ID, Lei jing ID Member,IEEE Abstract --With the development of Artificial Intelligence, numerous real-world tasks have been accomplished using technology integrated with deep learning. T o achieve optimal performance, deep neural networks typically require large volumes of data for training. Although advances in data augmentation have facilitated the acquisition of vast datasets, most of this data is concentrated in domains like images and speech. However, there has been relatively less focus on augmenting time-series data. T o address this gap and generate a substantial amount of time-series data, we propose a simple and effective method that combines the Diffusion and Transformer models. By utilizing an adjusted diffusion denoising model to generate a large volume of initial time-step action data, followed by employing a Transformer model to predict subsequent actions, and incorporating a weighted loss function to achieve convergence, the method demonstrates its effectiveness. Using the performance improvement of the model after applying augmented data as a benchmark, and comparing the results with those obtained without data augmentation or using traditional data augmentation methods, this approach shows its capability to produce high-quality augmented data. I NTRODUCTION W ITH the development of artificial intelligence (AI), numerous tasks in the real world have been accomplished through technologies combined with deep learning. Typically, a neural network that exhibits excellent performance requires a substantial amount of data for training. V arious types of multi-modal data, such as images, speech, and audio, can now be easily obtained from the Internet. The acquisition of these types of data is no longer an issue. However, due to privacy concerns, costs, and other factors, not all types of data can reach the scale of image or other types data. For instance, the data scale of rare diseases often remains relatively small [1], [2].


Deep Learning Innovations for Energy Efficiency: Advances in Non-Intrusive Load Monitoring and EV Charging Optimization for a Sustainable Grid

arXiv.org Artificial Intelligence

The global energy landscape is undergoing a profound transformation, often referred to as the energy transition, driven by the urgent need to mitigate climate change, reduce greenhouse gas emissions, and ensure sustainable energy supplies. However, the undoubted complexity of new investments in renewables, as well as the phase out of high CO2-emission energy sources, hampers the pace of the energy transition and raises doubts as to whether new renewable energy sources are capable of solely meeting the climate target goals. This highlights the need to investigate alternative pathways to accelerate the energy transition, by identifying human activity domains with higher/excessive energy demands. Two notable examples where there is room for improvement, in the sense of reducing energy consumption and consequently CO2 emissions, are residential energy consumption and road transport. This dissertation investigates the development of novel Deep Learning techniques to create tools which solve limitations in these two key energy domains. Reduction of residential energy consumption can be achieved by empowering end-users with the user of Non-Intrusive Load Monitoring, whereas optimization of EV charging with Deep Reinforcement Learning can tackle road transport decarbonization.


Hierarchical Task Decomposition for Execution Monitoring and Error Recovery: Understanding the Rationale Behind Task Demonstrations

arXiv.org Artificial Intelligence

Multi-step manipulation tasks where robots interact with their environment and must apply process forces based on the perceived situation remain challenging to learn and prone to execution errors. Accurately simulating these tasks is also difficult. Hence, it is crucial for robust task performance to learn how to coordinate end-effector pose and applied force, monitor execution, and react to deviations. To address these challenges, we propose a learning approach that directly infers both low- and high-level task representations from user demonstrations on the real system. We developed an unsupervised task segmentation algorithm that combines intention recognition and feature clustering to infer the skills of a task. We leverage the inferred characteristic features of each skill in a novel unsupervised anomaly detection approach to identify deviations from the intended task execution. Together, these components form a comprehensive framework capable of incrementally learning task decisions and new behaviors as new situations arise. Compared to state-of-the-art learning techniques, our approach significantly reduces the required amount of training data and computational complexity while efficiently learning complex in-contact behaviors and recovery strategies. Our proposed task segmentation and anomaly detection approaches outperform state-of-the-art methods on force-based tasks evaluated on two different robotic systems.


Qualitative Analysis of $ω$-Regular Objectives on Robust MDPs

arXiv.org Artificial Intelligence

Robust Markov Decision Processes (RMDPs) generalize classical MDPs that consider uncertainties in transition probabilities by defining a set of possible transition functions. An objective is a set of runs (or infinite trajectories) of the RMDP, and the value for an objective is the maximal probability that the agent can guarantee against the adversarial environment. We consider (a) reachability objectives, where given a target set of states, the goal is to eventually arrive at one of them; and (b) parity objectives, which are a canonical representation for $ω$-regular objectives. The qualitative analysis problem asks whether the objective can be ensured with probability 1. In this work, we study the qualitative problem for reachability and parity objectives on RMDPs without making any assumption over the structures of the RMDPs, e.g., unichain or aperiodic. Our contributions are twofold. We first present efficient algorithms with oracle access to uncertainty sets that solve qualitative problems of reachability and parity objectives. We then report experimental results demonstrating the effectiveness of our oracle-based approach on classical RMDP examples from the literature scaling up to thousands of states.


A Two-Timescale Primal-Dual Framework for Reinforcement Learning via Online Dual Variable Guidance

arXiv.org Artificial Intelligence

We study reinforcement learning by combining recent advances in regularized linear programming formulations with the classical theory of stochastic approximation. Motivated by the challenge of designing algorithms that leverage off-policy data while maintaining on-policy exploration, we propose PGDA-RL, a novel primal-dual Projected Gradient Descent-Ascent algorithm for solving regularized Markov Decision Processes (MDPs). PGDA-RL integrates experience replay-based gradient estimation with a two-timescale decomposition of the underlying nested optimization problem. The algorithm operates asynchronously, interacts with the environment through a single trajectory of correlated data, and updates its policy online in response to the dual variable associated with the occupation measure of the underlying MDP. We prove that PGDA-RL converges almost surely to the optimal value function and policy of the regularized MDP. Our convergence analysis relies on tools from stochastic approximation theory and holds under weaker assumptions than those required by existing primal-dual RL approaches, notably removing the need for a simulator or a fixed behavioral policy.


Model-Based AI planning and Execution Systems for Robotics

arXiv.org Artificial Intelligence

Model-based planning and execution systems offer a principled approach to building flexible autonomous robots that can perform diverse tasks by automatically combining a host of basic skills. This idea is almost as old as modern robotics. Yet, while diverse general-purpose reasoning architectures have been proposed since, general-purpose systems that are integrated with modern robotic platforms have emerged only recently, starting with the influential ROSPlan system. Since then, a growing number of model-based systems for robot task-level control have emerged. In this paper, we consider the diverse design choices and issues existing systems attempt to address, the different solutions proposed so far, and suggest avenues for future development.


Automatic Music Transcription using Convolutional Neural Networks and Constant-Q transform

arXiv.org Artificial Intelligence

Automatic music transcription (AMT) is the problem of analyzing an audio recording of a musical piece and detecting notes that are being played. AMT is a challenging problem, particularly when it comes to polyphonic music. The goal of AMT is to produce a score representation of a music piece, by analyzing a sound signal containing multiple notes played simultaneously. In this work, we design a processing pipeline that can transform classical piano audio files in .wav format into a music score representation. The features from the audio signals are extracted using the constant-Q transform, and the resulting coefficients are used as an input to the convolutional neural network (CNN) model.


Uncertain Machine Ethics Planning

arXiv.org Artificial Intelligence

Machine Ethics decisions should consider the implications of uncertainty over decisions. Decisions should be made over sequences of actions to reach preferable outcomes long term. The evaluation of outcomes, however, may invoke one or more moral theories, which might have conflicting judgements. Each theory will require differing representations of the ethical situation. For example, Utilitarianism measures numerical values, Deontology analyses duties, and Virtue Ethics emphasises moral character. While balancing potentially conflicting moral considerations, decisions may need to be made, for example, to achieve morally neutral goals with minimal costs. In this paper, we formalise the problem as a Multi-Moral Markov Decision Process and a Multi-Moral Stochastic Shortest Path Problem. We develop a heuristic algorithm based on Multi-Objective AO*, utilising Sven-Ove Hansson's Hypothetical Retrospection procedure for ethical reasoning under uncertainty. Our approach is validated by a case study from Machine Ethics literature: the problem of whether to steal insulin for someone who needs it.


Learning Symbolic Persistent Macro-Actions for POMDP Solving Over Time

arXiv.org Artificial Intelligence

Most popular and effective approaches to online solving Partially Observable Markov Decision Processes (POMDPs, Kaelbling et al. (1998)), e.g., Partially Observable Monte Carlo Planning (POMCP) by Silver and Veness (2010) and Determinized Sparse Partially Observable Tree (DESPOT) by Ye et al. (2017), rely on Monte Carlo Tree Search (MCTS). These approaches are based on online simulations performed in a simulation environment (i.e. a black-box twin of the real POMDP environment) and estimate the value of actions. However, they require domain-specific policy heuristics, suggesting best actions at each state, for efficient exploration. Macro-actions (He et al. (2011); Bertolucci et al. (2021)) are popular policy heuristics that are particularly efficient for long planning horizons. A macro-action is essentially a sequence of suggested actions from a given state that can effectively guide the simulation phase towards actions with high utilities. However, such heuristics are heavily dependent on domain features and are typically handcrafted for each specific domain. Defining these heuristics is an arduous process that requires significant domain knowledge, especially in complex domains. An alternative approach, like the one by Cai and Hsu (2022), is to learn such heuristics via neural networks, which are, however, uninterpretable and data-inefficient. This paper extends the methodology proposed by Meli et al. (2024) to the learning, via Inductive Logic Programming (ILP, Muggleton (1991)), of Event Calculus (EC) theories C. Veronese, D. Meli & A. Farinelli.


An Active Inference perspective on Neurofeedback Training

arXiv.org Artificial Intelligence

Neurofeedback training (NFT) aims to teach self-regulation of brain activity through real-time feedback, but suffers from highly variable outcomes and poorly understood mechanisms, hampering its validation. To address these issues, we propose a formal computational model of the NFT closed loop. Using Active Inference, a Bayesian framework modelling perception, action, and learning, we simulate agents interacting with an NFT environment. This enables us to test the impact of design choices (e.g., feedback quality, biomarker validity) and subject factors (e.g., prior beliefs) on training. Simulations show that training effectiveness is sensitive to feedback noise or bias, and to prior beliefs (highlighting the importance of guiding instructions), but also reveal that perfect feedback is insufficient to guarantee high performance. This approach provides a tool for assessing and predicting NFT variability, interpret empirical data, and potentially develop personalized training protocols.