reference state
Consensus Tracking Control of Multi-agent Systems with A Time-varying Reference State under Binary-valued Communication
Wang, Ting, Qiu, Zhuangzhuang, Lu, Xiaodong, Zhao, Yanlong
This paper investigates the problem of consensus tracking control of discrete-time multi-agent systems under binary-valued communication. Unlike most existing studies on consensus tracking, the information transmitted between agents is binary-valued. Parameter identification with binary-valued observations is applied to estimate the neighbors' states, and the tracking control is designed based on these estimates. Two Lyapunov functions are constructed to deal with the strong coupling between estimation and control. Compared with consensus problems under binary-valued communication, consensus tracking control additionally requires a reference state. Two scenarios of the time-varying reference state are studied. (1) The reference state is asymptotically convergent. An online algorithm that performs estimation and control simultaneously is proposed, in which the estimation step size and the control gain decrease with time. With this algorithm, the multi-agent system is proved to achieve consensus tracking with convergence rate O(1/k^{\epsilon}) under certain conditions. (2) The reference state is bounded, which is a less conservative assumption than in the first case. In this case, the estimation step size and the control gain are designed to be constant. With this algorithm, all the followers reach a neighborhood of the leader at an exponential rate. Finally, simulations are given to demonstrate the theoretical results.
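The estimation ingredient mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single constant scalar state, standard Gaussian noise, a fixed threshold, and a projected recursive update with step size `beta / k`. The function names, the parameter values, and the noise model are all hypothetical choices made for illustration.

```python
import math
import random


def normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def estimate_from_binary(theta, threshold=0.0, beta=10.0, steps=20000,
                         bound=5.0, seed=0):
    """Estimate a constant scalar `theta` from one-bit observations.

    Assumptions (not from the abstract): Gaussian noise with unit variance,
    a known threshold, and a known bound on the magnitude of `theta`.
    """
    rng = random.Random(seed)
    theta_hat = 0.0
    for k in range(1, steps + 1):
        # The receiver only learns whether the noisy value is below the threshold.
        s = 1.0 if theta + rng.gauss(0.0, 1.0) <= threshold else 0.0
        # Probability of observing s == 1 if the current estimate were correct.
        predicted = normal_cdf(threshold - theta_hat)
        # Decreasing step size beta / k, followed by projection onto [-bound, bound].
        theta_hat += (beta / k) * (predicted - s)
        theta_hat = max(-bound, min(bound, theta_hat))
    return theta_hat


estimate = estimate_from_binary(theta=1.0)
print(f"estimate after 20000 one-bit observations: {estimate:.3f}")
```

The update pushes the estimate up when a one is observed less often than the current estimate predicts, and down otherwise, so the true value is the only point where the expected correction vanishes.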
- Asia > China > Beijing > Beijing (0.05)
- Asia > China > Tianjin Province > Tianjin (0.04)
- Asia > China > Shandong Province > Jinan (0.04)
On Qualitative Preference in Alternating-time Temporal Logic with Strategy Contexts
We show how to add and eliminate binary preference on plays in Alternating-time Temporal Logic (ATL) with strategy contexts on Concurrent Game Models (CGMs) by means of a translation which preserves satisfaction in models where preference-indiscernibility between plays is an equivalence relation of finite index. The elimination technique also works for a companion second-order path quantifier, which makes quantified path variables range over sets of plays that are closed under preference-indiscernibility. We argue that the preference operator and the specialized quantifier facilitate formulating interesting solution concepts such as Nash equilibrium and secure equilibrium in a straightforward way. We also present a novel translation from ATL with strategy contexts to Quantified Computation Tree Logic (QCTL). Together with the translation which eliminates preference and the specialized form of quantification, this translation allows reasoning about infinite multiplayer synchronous games on CGMs to be translated from the proposed extension of ATL with strategy contexts into QCTL. The setting is related to that of ordered objectives in the works of Bouyer, Brenguier, Markey and Ummels, except that our focus is on the use of the temporal logic languages mentioned above, and we rely on translations into QCTL for the algorithmic solutions.
- Oceania > Australia > Western Australia (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Online Fault Tolerance Strategy for Abrupt Reachability Constraint Changes
When a system's constraints change abruptly, its reachability safety no longer holds, and the system can reach a forbidden or dangerous value. The conventional remedy is online controller redesign (OCR), which re-establishes compliance of the reachable set with the new constraints but is usually too slow. There is a need for an online strategy capable of managing runtime changes in reachability constraints. However, to the best of the authors' knowledge, this topic has not been addressed in the existing literature. In this paper, we propose a fast fault tolerance strategy to recover the system's reachability safety at runtime. Instead of redesigning the system's controller, we propose to change the system's reference state so that the system's reachability complies with the new constraints. We frame the reference state search as an optimization problem and employ the Karush-Kuhn-Tucker (KKT) method, with the Interior Point Method (IPM) based Newton's method as a fallback, for fast solution derivation. The optimization also allows more future fault tolerance. Numerical simulations demonstrate that our method outperforms the conventional OCR method in terms of computational efficiency and success rate. Specifically, the results show that the proposed method finds a solution $10^{2}$ (with the IPM based Newton's method) to $10^{4}$ (with the KKT method) times faster than the OCR method. Additionally, the improvement in success rate of our method over the OCR method is $40.81\%$ when no run-time deadline is considered. When a deadline of $1.5$ seconds is imposed, the success rate remains at $49.44\%$ for the proposed method, while it drops to $0\%$ for the OCR method.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > India (0.04)
Receding Hamiltonian-Informed Optimal Neural Control and State Estimation for Closed-Loop Dynamical Systems
Rivera, Josue N., Sun, Dengfeng
This paper formalizes Hamiltonian-Informed Optimal Neural (Hion) controllers, a novel class of neural network-based controllers for dynamical systems and explicit non-linear model predictive control. Hion controllers estimate future states and compute optimal control inputs using Pontryagin's Maximum Principle. The proposed framework allows for customization of transient behavior, addressing limitations of existing methods. The Taylored Multi-Faceted Approach for Neural ODE and Optimal Control (T-mano) architecture facilitates training and ensures accurate state estimation. Optimal control strategies are demonstrated for both linear and non-linear dynamical systems.
- Research Report (0.82)
- Instructional Material > Course Syllabus & Notes (0.46)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
The role of data embedding in quantum autoencoders for improved anomaly detection
Araz, Jack Y., Spannowsky, Michael
The performance of Quantum Autoencoders (QAEs) in anomaly detection tasks is critically dependent on the choice of data embedding and ansatz design. This study explores the effects of three data embedding techniques, data re-uploading, parallel embedding, and alternate embedding, on the representability and effectiveness of QAEs in detecting anomalies. Our findings reveal that even with relatively simple variational circuits, enhanced data embedding strategies can substantially improve anomaly detection accuracy and the representability of underlying data across different datasets. Starting with toy examples featuring low-dimensional data, we visually demonstrate the effect of different embedding techniques on the representability of the model. We then extend our analysis to complex, higher-dimensional datasets, highlighting the significant impact of embedding methods on QAE performance.
- Europe > United Kingdom (0.14)
- North America > United States > Virginia > Newport News (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- Government (0.68)
- Information Technology (0.47)
Training Efficient Controllers via Analytic Policy Gradient
Wiedemann, Nina, Wüest, Valentin, Loquercio, Antonio, Müller, Matthias, Floreano, Dario, Scaramuzza, Davide
Control design for robotic systems is complex and often requires solving an optimization to follow a trajectory accurately. Online optimization approaches like Model Predictive Control (MPC) have been shown to achieve great tracking performance, but require high computing power. Conversely, learning-based offline optimization approaches, such as Reinforcement Learning (RL), allow fast and efficient execution on the robot but hardly match the accuracy of MPC in trajectory tracking tasks. In systems with limited compute, such as aerial vehicles, an accurate controller that is efficient at execution time is imperative. We propose an Analytic Policy Gradient (APG) method to tackle this problem. APG exploits the availability of differentiable simulators by training a controller offline with gradient descent on the tracking error. We address training instabilities that frequently occur with APG through curriculum learning and experiment on a widely used controls benchmark, the CartPole, and two common aerial robots, a quadrotor and a fixed-wing drone. Our proposed method outperforms both model-based and model-free RL methods in terms of tracking error. Concurrently, it achieves similar performance to MPC while requiring more than an order of magnitude less computation time. Our work provides insights into the potential of APG as a promising control method for robotics. To facilitate the exploration of APG, we open-source our code and make it available at https://github.com/lis-epfl/apg_trajectory_tracking.
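The core idea of training a controller by gradient descent on the tracking error through a differentiable simulator can be sketched on a toy problem. The sketch below is an illustration under strong assumptions, not the released APG code: the plant is a hypothetical single integrator, the policy is a single proportional gain, the reference is a constant step, and the gradient is propagated by a hand-written forward sensitivity recursion instead of an automatic differentiation library.

```python
def rollout_with_sensitivity(gain, reference, dt=0.1, x0=0.0):
    """Simulate x[t+1] = x[t] + dt * u with u = gain * (ref - x).

    Returns the summed squared tracking error and its exact derivative
    with respect to `gain`, obtained by differentiating every simulator step.
    """
    x, dx_dgain = x0, 0.0
    loss, dloss_dgain = 0.0, 0.0
    for ref in reference:
        error = ref - x
        u = gain * error
        # Product rule: d(gain * error)/d(gain), with d(error)/d(gain) = -dx_dgain.
        du_dgain = error - gain * dx_dgain
        x = x + dt * u
        dx_dgain = dx_dgain + dt * du_dgain
        loss += (x - ref) ** 2
        dloss_dgain += 2.0 * (x - ref) * dx_dgain
    return loss, dloss_dgain


def train_gain(reference, gain=1.0, learning_rate=0.05, iterations=200):
    """Plain gradient descent on the tracking loss."""
    for _ in range(iterations):
        _, grad = rollout_with_sensitivity(gain, reference)
        gain -= learning_rate * grad
    return gain


reference = [1.0] * 50  # hypothetical step reference
initial_loss, _ = rollout_with_sensitivity(1.0, reference)
trained_gain = train_gain(reference)
final_loss, _ = rollout_with_sensitivity(trained_gain, reference)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

In a realistic setting the sensitivity recursion would be replaced by automatic differentiation through the simulator and the scalar gain by a neural network policy; the structure of the training loop stays the same.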
- Aerospace & Defense > Aircraft (0.68)
- Energy > Oil & Gas > Upstream (0.34)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning
Hoang, Christopher, Sohn, Sungryull, Choi, Jongwook, Carvalho, Wilka, Lee, Honglak
Operating in the real world often requires agents to learn about a complex environment and apply this understanding to achieve a breadth of goals. This problem, known as goal-conditioned reinforcement learning (GCRL), becomes especially challenging for long-horizon goals. Current methods have tackled this problem by augmenting goal-conditioned policies with graph-based planning algorithms. However, they struggle to scale to large, high-dimensional state spaces and assume access to exploration mechanisms for efficiently collecting training data. In this work, we introduce Successor Feature Landmarks (SFL), a framework for exploring large, high-dimensional environments so as to obtain a policy that is proficient for any goal. SFL leverages the ability of successor features (SF) to capture transition dynamics, using them to drive exploration by estimating state novelty and to enable high-level planning by abstracting the state space as a non-parametric landmark-based graph. We further exploit SF to directly compute a goal-conditioned policy for inter-landmark traversal, which we use to execute plans to "frontier" landmarks at the edge of the explored state space. We show in our experiments on MiniGrid and ViZDoom that SFL enables efficient exploration of large, high-dimensional state spaces and outperforms state-of-the-art baselines on long-horizon GCRL tasks.
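How successor features capture transition dynamics can be seen in a small tabular sketch. It is not the SFL method: it assumes one-hot state features (in which case successor features reduce to the successor representation), a hypothetical deterministic three-state ring as the environment, and a fixed policy that always moves to the next state. The function name and all parameter values are illustrative.

```python
def learn_successor_features(next_state, gamma=0.9, alpha=0.5, steps=3000):
    """TD learning of successor features with one-hot features.

    `next_state[s]` is the state reached from `s` under the fixed policy.
    psi[s][j] estimates the discounted number of future visits to state j
    when starting from state s.
    """
    n = len(next_state)
    psi = [[0.0] * n for _ in range(n)]
    s = 0
    for _ in range(steps):
        s_next = next_state[s]
        for j in range(n):
            feature = 1.0 if j == s else 0.0
            # Bootstrapped target: current feature plus discounted successor estimate.
            target = feature + gamma * psi[s_next][j]
            psi[s][j] += alpha * (target - psi[s][j])
        s = s_next
    return psi


# Ring 0 -> 1 -> 2 -> 0 (hypothetical environment).
psi = learn_successor_features(next_state=[1, 2, 0])
for row in psi:
    print([round(value, 3) for value in row])
```

States that are reached sooner receive larger entries, so comparing rows of `psi` gives a notion of how close two states are under the policy, which is the kind of signal a landmark graph can be built on.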
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Michigan (0.04)
- North America > Puerto Rico > San Juan > San Juan (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Faster drug discovery through machine learning
Drugs can only work if they stick to their target proteins in the body. Assessing that stickiness is a key hurdle in the drug discovery and screening process. The new technique, dubbed DeepBAR, quickly calculates the binding affinities between drug candidates and their targets. The approach yields precise calculations in a fraction of the time compared to previous state-of-the-art methods. The researchers say DeepBAR could one day quicken the pace of drug discovery and protein engineering.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.40)
- North America > United States > California > San Diego County > San Diego (0.05)