bosse
Optimal and Adaptive Non-Stationary Dueling Bandits Under a Generalized Borda Criterion
In dueling bandits, the learner receives preference feedback between arms, and the regret of an arm is defined in terms of its suboptimality to a winner arm. The more challenging and practically motivated non-stationary variant of dueling bandits, where preferences change over time, has been the focus of several recent works (Saha and Gupta, 2022; Buening and Saha, 2023; Suk and Agarwal, 2023). The goal is to design algorithms without foreknowledge of the amount of change. The bulk of known results here studies the Condorcet winner setting, where an arm preferred over any other exists at all times. Yet, such a winner may not exist and, to contrast, the Borda version of this problem (which is always well-defined) has received little attention. In this work, we establish the first optimal and adaptive Borda dynamic regret upper bound, which highlights fundamental differences in the learnability of severe non-stationarity between Condorcet vs. Borda regret objectives in dueling bandits. Surprisingly, our techniques for non-stationary Borda dueling bandits also yield improved rates within the Condorcet winner setting, and reveal new preference models where tighter notions of non-stationarity are adaptively learnable. This is accomplished through a novel generalized Borda score framework which unites the Borda and Condorcet problems, thus allowing reduction of Condorcet regret to a Borda-like task. Such a generalization was not previously known and is likely to be of independent interest.
Virtualization of Tiny Embedded Systems with a robust real-time capable and extensible Stack Virtual Machine REXAVM supporting Material-integrated Intelligent Systems and Tiny Machine Learning
Bosse, Stefan, Bornemann, Sarah, Lüssem, Björn
In the past decades, there has been a significant increase in sensor density and sensor deployment, driven by a significant miniaturization and decrease in size down to the chip level, addressing ubiquitous computing, edge computing, as well as distributed sensor networks. Material-integrated and intelligent systems (MIIS) provide the next integration and application level, but they create new challenges and introduce hard constraints (resources, energy supply, communication, resilience, and security). Commonly, low-resource systems are statically programmed processors with application-specific software or application-specific hardware (FPGA). This work demonstrates the need for and solution to virtualization in such low-resource and constrained systems towards resilient distributed sensor and cyber-physical networks using a unified low-resource, customizable, and real-time capable embedded and extensible stack virtual machine (REXAVM) that can be implemented and cooperate in both software and hardware. In a holistic architecture approach, the VM specifically addresses digital signal processing and tiny machine learning. The REXAVM is highly customizable through the use of VM program code generators at compile time and incremental code processing at run time. The VM uses an integrated, highly efficient just-in-time compiler to create Bytecode from text code. This paper shows and evaluates the suitability of the proposed VM architecture for operationally equivalent software and hardware (FPGA) implementations. Specific components supporting tiny ML and DSP using fixed-point arithmetic with respect to efficiency and accuracy are discussed. An extended use-case section demonstrates the usability of the introduced VM architecture for a broad range of applications.