direct interaction
Bayesian Control of Large MDPs with Unknown Dynamics in Data-Poor Environments
We propose a Bayesian decision making framework for control of Markov Decision Processes (MDPs) with unknown dynamics and large, possibly continuous, state, action, and parameter spaces in data-poor environments. Most of the existing adaptive controllers for MDPs with unknown dynamics are based on the reinforcement learning framework and rely on large data sets acquired by sustained direct interaction with the system or via a simulator. This is not feasible in many applications, due to ethical, economic, and physical constraints. The proposed framework addresses the data poverty issue by decomposing the problem into an offline planning stage that does not rely on sustained direct interaction with the system or simulator and an online execution stage. In the offline process, parallel Gaussian process temporal difference (GPTD) learning techniques are employed for near-optimal Bayesian approximation of the expected discounted reward over a sample drawn from the prior distribution of unknown parameters. In the online stage, the action with the maximum expected return with respect to the posterior distribution of the parameters is selected. This is achieved by an approximation of the posterior distribution using a Markov Chain Monte Carlo (MCMC) algorithm, followed by constructing multiple Gaussian processes over the parameter space for efficient prediction of the means of the expected return at the MCMC sample. The effectiveness of the proposed framework is demonstrated using a simple dynamical system model with continuous state and action spaces, as well as a more complex model for a metastatic melanoma gene regulatory network observed through noisy synthetic gene expression data.
Bayesian Control of Large MDPs with Unknown Dynamics in Data-Poor Environments
We propose a Bayesian decision making framework for control of Markov Decision Processes (MDPs) with unknown dynamics and large, possibly continuous, state, action, and parameter spaces in data-poor environments. Most of the existing adaptive controllers for MDPs with unknown dynamics are based on the reinforcement learning framework and rely on large data sets acquired by sustained direct interaction with the system or via a simulator. This is not feasible in many applications, due to ethical, economic, and physical constraints. The proposed framework addresses the data poverty issue by decomposing the problem into an offline planning stage that does not rely on sustained direct interaction with the system or simulator and an online execution stage. In the offline process, parallel Gaussian process temporal difference (GPTD) learning techniques are employed for near-optimal Bayesian approximation of the expected discounted reward over a sample drawn from the prior distribution of unknown parameters. In the online stage, the action with the maximum expected return with respect to the posterior distribution of the parameters is selected. This is achieved by an approximation of the posterior distribution using a Markov Chain Monte Carlo (MCMC) algorithm, followed by constructing multiple Gaussian processes over the parameter space for efficient prediction of the means of the expected return at the MCMC sample. The effectiveness of the proposed framework is demonstrated using a simple dynamical system model with continuous state and action spaces, as well as a more complex model for a metastatic melanoma gene regulatory network observed through noisy synthetic gene expression data.
ExDDI: Explaining Drug-Drug Interaction Predictions with Natural Language
Sun, Zhaoyue, Li, Jiazheng, Pergola, Gabriele, He, Yulan
Predicting unknown drug-drug interactions (DDIs) is crucial for improving medication safety. Previous efforts in DDI prediction have typically focused on binary classification or predicting DDI categories, with the absence of explanatory insights that could enhance trust in these predictions. In this work, we propose to generate natural language explanations for DDI predictions, enabling the model to reveal the underlying pharmacodynamics and pharmacokinetics mechanisms simultaneously as making the prediction. To do this, we have collected DDI explanations from DDInter and DrugBank and developed various models for extensive experiments and analysis. Our models can provide accurate explanations for unknown DDIs between known drugs. This paper contributes new tools to the field of DDI prediction and lays a solid foundation for further research on generating explanations for DDI predictions.
Robotic Blended Sonification: Consequential Robot Sound as Creative Material for Human-Robot Interaction
Johansen, Stine S., Browning, Yanto, Brumpton, Anthony, Donovan, Jared, Rittenbruch, Markus
Current research in robotic sounds generally focuses on either masking the consequential sound produced by the robot or on sonifying data about the robot to create a synthetic robot sound. We propose to capture, modify, and utilise rather than mask the sounds that robots are already producing. In short, this approach relies on capturing a robot's sounds, processing them according to contextual information (e.g., collaborators' proximity or particular work sequences), and playing back the modified sound. Previous research indicates the usefulness of non-semantic, and even mechanical, sounds as a communication tool for conveying robotic affect and function. Adding to this, this paper presents a novel approach which makes two key contributions: (1) a technique for real-time capture and processing of consequential robot sounds, and (2) an approach to explore these sounds through direct human-robot interaction. Drawing on methodologies from design, human-robot interaction, and creative practice, the resulting 'Robotic Blended Sonification' is a concept which transforms the consequential robot sounds into a creative material that can be explored artistically and within application-based studies.
A Graph Dynamics Prior for Relational Inference
Pan, Liming, Shi, Cheng, Dokmanić, Ivan
Relational inference aims to identify interactions between parts of a dynamical system from the observed dynamics. Current state-of-the-art methods fit the dynamics with a graph neural network (GNN) on a learnable graph. They use one-step message-passing GNNs -- intuitively the right choice since non-locality of multi-step or spectral GNNs may confuse direct and indirect interactions. But the \textit{effective} interaction graph depends on the sampling rate and it is rarely localized to direct neighbors, leading to poor local optima for the one-step model. In this work, we propose a \textit{graph dynamics prior} (GDP) for relational inference. GDP constructively uses error amplification in non-local polynomial filters to steer the solution to the ground-truth graph. To deal with non-uniqueness, GDP simultaneously fits a ``shallow'' one-step model and a polynomial multi-step model with shared graph topology. Experiments show that GDP reconstructs graphs far more accurately than earlier methods, with remarkable robustness to under-sampling. Since appropriate sampling rates for unknown dynamical systems are not known a priori, this robustness makes GDP suitable for real applications in scientific machine learning. Reproducible code is available at https://github.com/DaDaCheng/GDP.
Pinaki Laskar on LinkedIn: #AI #Engineering #machinelearning
AI Researcher, Cognitive Technologist Inventor - AI Thinking, Think Chain Innovator - AIOT, XAI, Autonomous Cars, IIOT Founder Fisheyebox Spatial Computing Savant, Transformative Leader, Industry X.0 Practitioner Why Is Transdisciplinary #AI Needed? TransAI should be human-centred by including the following aspects: Explainable AI, i.e., they allow humans to understand the reasons behind their recommendations or decisions; Verifiable AI, i.e., they guarantee fundamental properties like safety, privacy and security; Physical AI, it refers to the use of AI techniques to solve problems that involve direct interaction with the physical world, e.g., by observing the world through sensors or by modifying the world through actuators. What distinguishes Physical AI systems is their direct interaction with the physical world, contrasting with other AI types, e.g., financial recommendation systems (where AI is between the human and a database); chatbots (where AI interacts with the human via Internet); or AI chess-players (where the chess board state to the AI algorithm). Collaborative AI, i.e., they can share knowledge with humans and take decisions jointly with them; Integrative AI, i.e., they can combine different requirements and methods into one AI system. TransAI embraces interdependent elements: Philosophical AI AI has closer scientific connections with philosophy than do other sciences, because AI shares many concepts with philosophy, e.g.
Bayesian Control of Large MDPs with Unknown Dynamics in Data-Poor Environments
Imani, Mahdi, Ghoreishi, Seyede Fatemeh, Braga-Neto, Ulisses M.
We propose a Bayesian decision making framework for control of Markov Decision Processes (MDPs) with unknown dynamics and large, possibly continuous, state, action, and parameter spaces in data-poor environments. Most of the existing adaptive controllers for MDPs with unknown dynamics are based on the reinforcement learning framework and rely on large data sets acquired by sustained direct interaction with the system or via a simulator. This is not feasible in many applications, due to ethical, economic, and physical constraints. The proposed framework addresses the data poverty issue by decomposing the problem into an offline planning stage that does not rely on sustained direct interaction with the system or simulator and an online execution stage. In the offline process, parallel Gaussian process temporal difference (GPTD) learning techniques are employed for near-optimal Bayesian approximation of the expected discounted reward over a sample drawn from the prior distribution of unknown parameters.
Towards Con-Resistant Trust Models for Distributed Agent Systems
Salehi-Abari, Amirali (Carleton University) | White, Tony (Carleton University)
Artificial societies — distributed systems of autonomous agents — are becoming increasingly important in e-commerce. Agents base their decisions on trust and reputation in ways analogous to human societies. Many different definitions for trust and reputation have been proposed that incorporate many sources of information; however, system designs have tended to focus much of their attention on direct interactions. Furthermore, trust updating schemes for direct interactions have tended to uncouple updates for positive and negative feedback. Consequently, behaviour in which cycles of positive feedback followed by a single negative feedback results in untrustworthy agents remaining undetected. This con-man style of behaviour is formally described and desirable characteristics of con-resistant trust schemes proposed. A con-resistant scheme is proposed and compared with FIRE, Regret and Yu and Singh's model. Simulation experiments demonstrate the utility of the con-resistant scheme.