Markov Models
Post-processing of EEG-based Auditory Attention Decoding Decisions via Hidden Markov Models
Heintz, Nicolas, Francart, Tom, Bertrand, Alexander
Auditory attention decoding (AAD) algorithms exploit brain signals, such as electroencephalography (EEG), to identify which speaker a listener is focusing on in a multi-speaker environment. While state-of-the-art AAD algorithms can identify the attended speaker on short time windows, their predictions are often too inaccurate for practical use. In this work, we propose augmenting AAD with a hidden Markov model (HMM) that models the temporal structure of attention. More specifically, the HMM relies on the fact that a subject is much less likely to switch attention than to keep attending the same speaker at any moment in time. We show how an HMM can significantly improve existing AAD algorithms in both causal (real-time) and non-causal (offline) settings. We further demonstrate that HMMs outperform existing postprocessing approaches in both accuracy and responsiveness, and explore how various factors such as window length, switching frequency, and AAD accuracy influence overall performance. The proposed method is computationally efficient, intuitive to use and applicable in both real-time and offline settings. Accurately detecting to whom someone wishes to listen is of crucial importance for a wide array of applications. For example, this would allow a hearing aid to determine which speakers should be enhanced or suppressed [1]-[4]. This problem can potentially be solved by decoding the auditory attention from brain signals using electroencephalography (EEG) [5]-[9]. The most common and reliable method to decode attention from the neural response is based on stimulus reconstruction [3], [5]-[7], [10]. This method is based on the observation that the brain tracks attended speech more than unattended speech [11], [12]. The goal is to train a decoder that reconstructs the temporal variations in the attended speech signal (e.g., its amplitude envelope) from the EEG data.
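To make the idea of HMM post-processing concrete, the following is a minimal sketch of causal (forward) filtering over noisy per-window AAD decisions for two speakers. It is an illustration under stated assumptions, not the authors' implementation: the function name `hmm_forward_filter`, the switching probability `p_switch = 0.05`, and the per-window accuracy of 0.7 are all hypothetical values chosen for the example.

```python
import numpy as np

def hmm_forward_filter(likelihoods, p_switch=0.05):
    """Causal posterior P(attended speaker at t | observations up to t).

    likelihoods: shape (T, 2); row t holds p(observation_t | attended speaker k).
    p_switch: assumed probability of an attention switch between two windows.
    """
    likelihoods = np.asarray(likelihoods, dtype=float)
    # Symmetric transition matrix: staying is far more likely than switching.
    transition = np.array([[1 - p_switch, p_switch],
                           [p_switch, 1 - p_switch]])
    belief = np.array([0.5, 0.5])  # uniform prior over the two speakers
    posteriors = np.empty_like(likelihoods)
    for t, lik in enumerate(likelihoods):
        predicted = transition.T @ belief   # propagate belief one window ahead
        unnormalized = predicted * lik      # weigh by the new observation
        belief = unnormalized / unnormalized.sum()
        posteriors[t] = belief
    return posteriors

# Hypothetical raw AAD decisions (0 or 1) with one isolated error at index 2.
decisions = np.array([0, 0, 1, 0, 0, 0])
accuracy = 0.7  # assumed per-window accuracy of the underlying AAD algorithm
likelihoods = np.where(decisions[:, None] == np.arange(2), accuracy, 1 - accuracy)

posteriors = hmm_forward_filter(likelihoods)
smoothed = posteriors.argmax(axis=1)
print(smoothed)
```

In this toy run the isolated flipped decision does not change the filtered output, because a single contrary window is not enough evidence to outweigh the low switching probability. A non-causal (offline) variant would add a backward pass over the same sequence.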
Deep neural networks can provably solve Bellman equations for Markov decision processes without the curse of dimensionality
Jentzen, Arnulf, Kleinberg, Konrad, Kruse, Thomas
Discrete time stochastic optimal control problems and Markov decision processes (MDPs) are fundamental models for sequential decision-making under uncertainty and as such provide the mathematical framework underlying reinforcement learning theory. A central tool for solving MDPs is the Bellman equation and its solution, the so-called $Q$-function. In this article, we construct deep neural network (DNN) approximations for $Q$-functions associated to MDPs with infinite time horizon and finite control set $A$. More specifically, we show that if the payoff function and the random transition dynamics of the MDP can be suitably approximated by DNNs with leaky rectified linear unit (ReLU) activation, then the solutions $Q_d\colon \mathbb R^d\to \mathbb R^{|A|}$, $d\in \mathbb{N}$, of the associated Bellman equations can also be approximated in the $L^2$-sense by DNNs with leaky ReLU activation whose numbers of parameters grow at most polynomially in both the dimension $d\in \mathbb{N}$ of the state space and the reciprocal $1/\varepsilon$ of the prescribed error $\varepsilon\in (0,1)$. Our proof relies on the recently introduced full-history recursive multilevel fixed-point (MLFP) approximation scheme.
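As a point of reference for what a $Q$-function is, the sketch below solves the Bellman equation $Q(s,a) = r(s,a) + \gamma \sum_{s'} P(s' \mid s,a) \max_{a'} Q(s',a')$ by plain fixed-point iteration on a tiny tabular MDP. This is only an illustration of the fixed-point structure: it is not the MLFP scheme or the DNN construction of the article, which concern continuous state spaces $\mathbb R^d$. The two-state MDP, the discount factor `gamma = 0.9`, and the function name `q_value_iteration` are assumptions made for the example.

```python
import numpy as np

def q_value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Iterate the Bellman operator until the Q-table stops changing.

    P: shape (A, S, S), P[a, s, t] = probability of moving from s to t under a.
    R: shape (S, A), R[s, a] = immediate payoff for taking a in s.
    """
    n_actions, n_states, _ = P.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(max_iter):
        V = Q.max(axis=1)  # greedy value of each state
        Q_new = R + gamma * np.einsum("ast,t->sa", P, V)
        if np.max(np.abs(Q_new - Q)) < tol:
            return Q_new
        Q = Q_new
    return Q

# Hypothetical MDP: action 0 stays in place, action 1 moves to the other state.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],   # action 0: stay
              [[0.0, 1.0], [1.0, 0.0]]])  # action 1: swap
R = np.array([[0.0, 0.0],                 # state 0 pays nothing
              [1.0, 1.0]])                # state 1 pays 1 per step

Q = q_value_iteration(P, R)
print(np.round(Q, 3))
```

Because the Bellman operator is a contraction for `gamma < 1`, the iteration converges from any starting table. The cost of storing such a table grows exponentially with the state dimension, which is the curse of dimensionality that the article's polynomial-size DNN approximations are shown to avoid.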
Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning
Fei, Zhaoye, Ji, Li, Wang, Siyin, Shi, Junhao, Gong, Jingjing, Qiu, Xipeng
Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks, yet they face significant challenges in embodied task planning scenarios that require continuous environmental understanding and action generation. Existing approaches generate open-loop action scripts based on static knowledge, making it difficult to learn causal relationships between actions and environmental feedback, particularly in partially observable environments. We introduce Embodied Planner-R1, a novel outcome-driven reinforcement learning framework that enables LLMs to develop interactive capabilities through autonomous exploration with minimal supervision. Our framework incorporates three key innovations: (1) pure reinforcement learning with group rollout, without human annotations, incorporating in-environment interaction through parallel exploration; (2) a completion-driven sparse reward; and (3) Interactive Policy Optimization (IPO) for efficient learning from grouped trajectories. Across two challenging text-based embodied planning benchmarks, Embodied Planner-R1 achieves completion rates of 97.78% on ALFWorld and 79.92% on ScienceWorld, surpassing prior methods by a large margin, and suffers a drop of only 3.66% in previously unseen environments, evidencing strong generalization.
Learning Motion Skills with Adaptive Assistive Curriculum Force in Humanoid Robots
Cao, Zhanxiang, Zhang, Yang, Nie, Buqing, Lin, Huangxuan, Li, Haoyang, Gao, Yue
A key challenge in this domain is the balance between exploration and exploitation, which often results in slow learning and suboptimal performance [10], [11]. These limitations highlight the need for more effective learning strategies that can improve both the speed and performance of skill acquisition, especially for high-dimensional humanoid control tasks. During human development, external assistance plays a crucial role in learning motion skills [12]. Infants, for example, often rely on parental support during their first steps, with walkers or direct physical assistance to help them gain the confidence and balance needed for independent locomotion [13], [14]. Similarly, in the case of highly complex movements like backflips, experienced coaches provide physical guidance, supporting the learner's back and applying upward forces to prevent falls and promote proper technique [15]. Studies indicate that such external aids not only expedite the learning process but also help prevent learners from adopting ineffective or unsafe strategies [16].
Ad-Hoc Human-AI Coordination Challenge
Dizdarević, Tin, Hammond, Ravi, Gessler, Tobias, Calinescu, Anisoara, Cook, Jonathan, Gallici, Matteo, Lupu, Andrei, Muglich, Darius, Forkel, Johannes, Foerster, Jakob Nicolaus
Achieving seamless coordination between AI agents and humans is crucial for real-world applications, yet it remains a significant open challenge. Hanabi is a cooperative card game featuring imperfect information, constrained communication, theory of mind requirements, and coordinated action, making it an ideal testbed for human-AI coordination. However, its use for human-AI interaction has been limited by the challenges of human evaluation. In this work, we introduce the Ad-Hoc Human-AI Coordination Challenge (AH2AC2) to overcome the constraints of costly and difficult-to-reproduce human evaluations. We develop human proxy agents on a large-scale human dataset that serve as robust, cheap, and reproducible human-like evaluation partners in AH2AC2. To encourage the development of data-efficient methods, we open-source a dataset of 3,079 games, deliberately limiting the amount of available human gameplay data. We present baseline results for both two- and three-player Hanabi scenarios. To ensure fair evaluation, we host the proxy agents through a controlled evaluation system rather than releasing them publicly. The code is available at https://github.com/FLAIROx/ah2ac2.
Probing Quantum Spin Systems with Kolmogorov-Arnold Neural Network Quantum States
Shamim, Mahmud Ashraf, Reinhardt, Eric A F, Chowdhury, Talal Ahmed, Gleyzer, Sergei, Araujo, Paulo T
Neural Quantum States (NQS) are a class of variational wave functions parametrized by neural networks (NNs) to study quantum many-body systems. In this work, we propose SineKAN, an NQS ansatz based on Kolmogorov-Arnold Networks (KANs), to represent quantum mechanical wave functions as nested univariate functions. We show that the SineKAN wave function with learnable sinusoidal activation functions can capture the ground state energies, fidelities and various correlation functions of the one-dimensional Transverse-Field Ising model, Anisotropic Heisenberg model, and Antiferromagnetic $J_{1}-J_{2}$ model with different chain lengths. In our study of the $J_1-J_2$ model with $L=100$ sites, we find that the SineKAN model outperforms several previously explored neural quantum state ansätze, including Restricted Boltzmann Machines (RBMs), Long Short-Term Memory models (LSTMs), and Multi-layer Perceptrons (MLPs), a.k.a. Feed Forward Neural Networks, when compared to the results obtained from the Density Matrix Renormalization Group (DMRG) algorithm. We find that SineKAN models can be trained to high precision and accuracy with minimal computational cost.
Quantum computing and artificial intelligence: status and perspectives
Acampora, Giovanni, Ambainis, Andris, Ares, Natalia, Banchi, Leonardo, Bhardwaj, Pallavi, Binosi, Daniele, Briggs, G. Andrew D., Calarco, Tommaso, Dunjko, Vedran, Eisert, Jens, Ezratty, Olivier, Erker, Paul, Fedele, Federico, Gil-Fuster, Elies, Gärttner, Martin, Granath, Mats, Heyl, Markus, Kerenidis, Iordanis, Klusch, Matthias, Kockum, Anton Frisk, Kueng, Richard, Krenn, Mario, Lässig, Jörg, Macaluso, Antonio, Maniscalco, Sabrina, Marquardt, Florian, Michielsen, Kristel, Muñoz-Gil, Gorka, Müssig, Daniel, Nautrup, Hendrik Poulsen, Neubauer, Sophie A., van Nieuwenburg, Evert, Orus, Roman, Schmiedmayer, Jörg, Schmitt, Markus, Slusallek, Philipp, Vicentini, Filippo, Weitenberg, Christof, Wilhelm, Frank K.
This white paper discusses and explores the various points of intersection between quantum computing and artificial intelligence (AI). It describes how quantum computing could support the development of innovative AI solutions. It also examines use cases of classical AI that can empower research and development in quantum technologies, with a focus on quantum computing and quantum sensing. The purpose of this white paper is to provide a long-term research agenda aimed at addressing foundational questions about how AI and quantum computing interact and benefit one another. It concludes with a set of recommendations and challenges, including how to orchestrate the proposed theoretical work, align quantum AI developments with quantum hardware roadmaps, estimate both classical and quantum resources - especially with the goal of mitigating and optimizing energy consumption - advance this emerging hybrid software engineering discipline, and enhance European industrial competitiveness while considering societal implications.
Super-Resolution Generative Adversarial Networks based Video Enhancement
Çetin, Kağan, Akça, Hacer, Gerek, Ömer Nezih
This study introduces an enhanced approach to video super-resolution by extending the ordinary Single-Image Super-Resolution (SISR) Super-Resolution Generative Adversarial Network (SRGAN) structure to handle spatio-temporal data. While SRGAN has proven effective for single-image enhancement, its design does not account for the temporal continuity required in video processing. To address this, a modified framework that incorporates 3D Non-Local Blocks is proposed, enabling the model to capture relationships across both spatial and temporal dimensions. An experimental training pipeline is developed, based on patch-wise learning and advanced data degradation techniques, to simulate real-world video conditions and learn from both local and global structures and details. This helps the model generalize better and remain stable across varying video content while preserving the general structure in addition to pixel-wise correctness. Two model variants, one larger and one more lightweight, are presented to explore the trade-offs between performance and efficiency. The results demonstrate improved temporal coherence, sharper textures, and fewer visual artifacts compared to traditional single-image methods. This work contributes to the development of practical, learning-based solutions for video enhancement tasks, with potential applications in streaming, gaming, and digital restoration.
A New Perspective On AI Safety Through Control Theory Methodologies
Ullrich, Lars, Zimmer, Walter, Greer, Ross, Graichen, Knut, Knoll, Alois C., Trivedi, Mohan
While artificial intelligence (AI) is advancing rapidly and mastering increasingly complex problems with astonishing performance, the safety assurance of such systems is a major concern. Particularly in the context of safety-critical, real-world cyber-physical systems, AI promises to achieve a new level of autonomy but is hampered by a lack of safety assurance. While data-driven control takes up recent developments in AI to improve control systems, control theory in general could be leveraged to improve AI safety. Therefore, this article outlines a new perspective on AI safety based on an interdisciplinary interpretation of the underlying data-generation process and the respective abstraction by AI systems in a system theory-inspired and system analysis-driven manner. In this context, the new perspective, also referred to as data control, aims to stimulate AI engineering to take advantage of existing safety analysis and assurance in an interdisciplinary way to drive the paradigm of data control. Following a top-down approach, a generic foundation for safety analysis and assurance is outlined at an abstract level that can be refined for specific AI systems and applications and is prepared for future innovation.
Reinforcement Learning with Physics-Informed Symbolic Program Priors for Zero-Shot Wireless Indoor Navigation
Li, Tao, Lei, Haozhe, Yin, Mingsheng, Hu, Yaqi
When using reinforcement learning (RL) to tackle physical control tasks, inductive biases that encode physics priors can help improve sample efficiency during training and enhance generalization in testing. However, the current practice of incorporating these helpful physics-informed inductive biases inevitably requires significant manual labor and domain expertise, making it prohibitive for general users. This work explores a symbolic approach to distill physics-informed inductive biases into RL agents, where the physics priors are expressed in a domain-specific language (DSL) that is human-readable and naturally explainable. Yet, the DSL priors do not translate directly into an implementable policy due to partial and noisy observations and additional physical constraints in navigation tasks. To address this gap, we develop a physics-informed program-guided RL (PiPRL) framework with applications to indoor navigation. PiPRL adopts a hierarchical and modularized neuro-symbolic integration, where a meta symbolic program receives semantically meaningful features from a neural perception module, which form the bases for symbolic programming that encodes physics priors and guides the RL process of a low-level neural controller. Extensive experiments demonstrate that PiPRL consistently outperforms purely symbolic or neural policies and reduces training time by over 26% with the help of the program-based inductive biases.