Agents
Improving performance in multi-objective decision-making in Bottles environments with soft maximin approaches
Smith, Benjamin J, Klassert, Robert, Pihlakas, Roland
Balancing multiple competing and conflicting objectives is an essential task for any artificial intelligence tasked with satisfying human values or preferences. Conflict arises both from misalignment between individuals with competing values and from conflicting value systems held by a single human. Starting with the principle of loss aversion, we designed a set of soft maximin function approaches to multi-objective decision-making. Benchmarking these functions in a set of previously developed environments, we found that one new approach in particular, 'split-function exp-log loss aversion' (SFELLA), learns faster than the state-of-the-art thresholded alignment objective method (Vamplew et al., 2021) on three of the four tasks it was tested on, and achieves the same optimal performance after learning. SFELLA also showed relative robustness improvements against changes in objective scale, which may highlight an advantage in dealing with distribution shifts in the environment dynamics. Due to publishing rules, further work could not be presented in the preprint, but in the final published version, we will further compare SFELLA to the multi-objective reward exponentials (MORE) approach (Rolf, 2020), demonstrating that SFELLA performs similarly to MORE in a simple, previously described foraging task, but that in a modified foraging environment with a new resource that was not depleted as the agent worked, SFELLA collected more of the new resource at very little cost in terms of the old resource. Overall, we found SFELLA useful for avoiding problems that sometimes occur with a thresholded approach, and more reward-responsive than MORE while retaining its conservative, loss-averse incentive structure.
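The abstract does not give the SFELLA formula, so the following is only a minimal sketch of what a split exp-log, loss-averse transform could look like. The piecewise form (logarithmic above zero, exponential below) and the sum-based aggregation are assumptions made for illustration, and the names `sfella` and `aggregate` are hypothetical rather than taken from the paper.

```python
import math


def sfella(x: float) -> float:
    """Assumed split exp-log transform (not confirmed by the abstract).

    Gains are compressed logarithmically; losses are amplified exponentially,
    so a shortfall on one objective weighs more than an equal gain on another.
    """
    if x > 0:
        return math.log(x + 1.0)
    return 1.0 - math.exp(-x)


def aggregate(rewards):
    """Soft-maximin style score: sum of transformed per-objective rewards."""
    return sum(sfella(r) for r in rewards)


# Two reward vectors with the same plain sum (2.0) but different balance.
balanced = aggregate([1.0, 1.0])
lopsided = aggregate([4.0, -2.0])
print(balanced > lopsided)  # the balanced outcome scores higher
```

Under these assumptions, an agent maximising the aggregated score prefers outcomes that avoid a large loss on any single objective, which is the soft maximin behaviour the abstract describes.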
Multi-Agent Reinforcement Learning with Graph Convolutional Neural Networks for optimal Bidding Strategies of Generation Units in Electricity Markets
Finding optimal bidding strategies for generation units in electricity markets would result in higher profit. However, this is a challenging problem because of system uncertainty, which stems from the unknown strategies of the other generation units. Distributed optimization, where each entity or agent decides on its bid individually, has become the state of the art. However, it cannot overcome the challenges of system uncertainty. Deep reinforcement learning is a promising approach for learning the optimal strategy in uncertain environments. Nevertheless, it is not able to integrate information on the spatial system topology into the learning process. This paper proposes a distributed learning algorithm based on deep reinforcement learning (DRL) combined with a graph convolutional neural network (GCN). The proposed framework helps the agents update their decisions using feedback from the environment, so that it can overcome the challenges of the uncertainties. In the proposed algorithm, the states of the nodes and the connections between them are the inputs of the GCN, which makes the agents aware of the structure of the system. This information on the system topology helps the agents improve their bidding strategies and increase their profit. We evaluate the proposed algorithm on the IEEE 30-bus system under different scenarios. To investigate the generalization ability of the proposed approach, we also test the trained model on the IEEE 39-bus system. The results show that the proposed algorithm generalizes better than DRL alone and can yield higher profit when the topology of the system changes.
3, 2, 1, Drones Go! A Testbed to Take off UAV Swarm Intelligence for Distributed Sensing
Qin, Chuhao, Candan, Fethi, Mihaylova, Lyudmila S., Pournaras, Evangelos
This paper introduces a testbed to study distributed sensing problems of Unmanned Aerial Vehicles (UAVs) exhibiting swarm intelligence. Several Smart City applications, such as transport and disaster response, require efficient collection of sensor data by a swarm of intelligent and cooperative UAVs. This often proves to be too complex and costly to study systematically and rigorously without compromising scale, realism and external validity. With the proposed testbed, this paper sets a stepping stone to emulate, within small laboratory spaces, large sensing areas of interest originating from empirical data and simulation models. Over this sensing map, a swarm of low-cost drones can fly, allowing the study of a large spectrum of problems such as energy consumption, charging control, navigation and collision avoidance. The applicability of a decentralized multi-agent collective learning algorithm (EPOS) for UAV swarm intelligence, along with the assessment of power consumption measurements, provides a proof of concept and validates the accuracy of the proposed testbed.
Capturing Dependencies within Machine Learning via a Formal Process Model
Ritz, Fabian, Phan, Thomy, Sedlmeier, Andreas, Altmann, Philipp, Wieghardt, Jan, Schmid, Reiner, Sauer, Horst, Klein, Cornel, Linnhoff-Popien, Claudia, Gabor, Thomas
The development of Machine Learning (ML) models is more than just a special case of software development (SD): ML models acquire properties and fulfill requirements even without direct human interaction, in a seemingly uncontrollable manner. Nonetheless, the underlying processes can be described in a formal way. We define a comprehensive SD process model for ML that encompasses most tasks and artifacts described in the literature in a consistent way. In addition to the production of the necessary artifacts, we also focus on generating and validating fitting descriptions in the form of specifications. We stress the importance of further evolving the ML model throughout its life-cycle, even after initial training and testing. Thus, we provide various interaction points with standard SD processes, in which ML is often an encapsulated task. Further, our SD process model allows ML to be formulated as a (meta-) optimization problem. If automated rigorously, it can be used to realize self-adaptive autonomous systems. Finally, our SD process model features a description of time that makes it possible to reason about the progress within ML development processes. This might lead to further applications of formal methods within the field of ML.
Incorporating social norms into a configurable agent-based model of the decision to perform commuting behaviour
Greener, Robert, Lewis, Daniel, Reades, Jon, Miles, Simon, Cummins, Steven
Interventions to increase active commuting have been recommended as a method to increase population physical activity, but evidence is mixed. Social norms related to travel behaviour may influence the uptake of active commuting interventions but are rarely considered in their design and evaluation. In this study we develop an agent-based model that incorporates social norms related to travel behaviour and demonstrate its utility by implementing car-free Wednesdays. A synthetic population of Waltham Forest, London, UK was generated using a microsimulation approach with data from the UK Census 2011 and UK HLS datasets. An agent-based model was created using this synthetic population, which modelled how the actions of peers and neighbours, subculture, habit, weather, bicycle ownership, car ownership, environmental supportiveness, and congestion affect the decision to travel. The developed model (MOTIVATE) is a configurable agent-based model where social norms related to travel behaviour are used to provide a more realistic representation of the socio-ecological systems in which active commuting interventions may be deployed. The utility of this model is demonstrated using car-free days as a hypothetical intervention. In the control scenario, the odds of active travel were plausible at 0.091 (89% HPDI: [0.091, 0.091]). Compared to the control scenario, the odds of active travel in the intervention scenario were increased by 70.3% (89% HPDI: [70.3%, 70.3%]) on non-car-free days; that is, the effect carries over to non-car-free days. The model is a useful tool for investigating how social networks and social norms influence the effectiveness of various interventions. If configured using real-world built environment data, it may be useful for investigating how social norms interact with the built environment to cause the emergence of commuting conventions.
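To put the reported odds in more familiar terms, the short calculation below converts them to probabilities. It assumes the 70.3% figure is a relative increase applied multiplicatively to the control odds, which the abstract does not state explicitly; the function name `odds_to_probability` is illustrative only.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds p / (1 - p) back to the probability p."""
    return odds / (1.0 + odds)


control_odds = 0.091       # reported odds of active travel, control scenario
relative_increase = 0.703  # reported 70.3% increase (assumed multiplicative)

intervention_odds = control_odds * (1.0 + relative_increase)

print(round(odds_to_probability(control_odds), 3))       # 0.083
print(round(odds_to_probability(intervention_odds), 3))  # 0.134
```

Under that reading, the share of trips made by active travel rises from roughly 8% to roughly 13% on non-car-free days.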
EvolveHypergraph: Group-Aware Dynamic Relational Reasoning for Trajectory Prediction
Li, Jiachen, Hua, Chuanbo, Park, Jinkyoo, Ma, Hengbo, Dax, Victoria, Kochenderfer, Mykel J.
While the modeling of pair-wise relations has been widely studied in multi-agent interacting systems, its ability to capture higher-level and larger-scale group-wise activities is limited. In this paper, we propose a group-aware relational reasoning approach (named EvolveHypergraph) with explicit inference of the underlying dynamically evolving relational structures, and we demonstrate its effectiveness for multi-agent trajectory prediction. In addition to the edges between a pair of nodes (i.e., agents), we propose to infer hyperedges that adaptively connect multiple nodes to enable group-aware relational reasoning in an unsupervised manner without fixing the number of hyperedges. The proposed approach infers the dynamically evolving relation graphs and hypergraphs over time to capture the evolution of relations, which are used by the trajectory predictor to obtain future states. Moreover, we propose to regularize the smoothness of the relation evolution and the sparsity of the inferred graphs or hypergraphs, which effectively improves training stability and enhances the explainability of inferred relations. The proposed approach is validated on both synthetic crowd simulations and multiple real-world benchmark datasets. Our approach infers explainable, reasonable group-aware relations and achieves state-of-the-art performance in long-term prediction.
Inaccuracy rates for distributed inference over random networks with applications to social learning
This paper studies probabilistic rates of convergence for consensus+innovations type of algorithms in random, generic networks. For each node, we find a lower and also a family of upper bounds on the large deviations rate function, thus enabling the computation of the exponential convergence rates for the events of interest on the iterates. Relevant applications include error exponents in distributed hypothesis testing, rates of convergence of beliefs in social learning, and inaccuracy rates in distributed estimation. The bounds on the rate function have a very particular form at each node: they are constructed as the convex envelope between the rate function of the hypothetical fusion center and the rate function corresponding to a certain topological mode of the node's presence. We further show tightness of the discovered bounds for several cases, such as pendant nodes and regular networks, thus establishing the first proof of the large deviations principle for consensus+innovations and social learning in random networks.
Meet ML@GT: Lara J. Martin Trains AI Agents to Become Storytellers
The Machine Learning Center at Georgia Tech (ML@GT) is home to many talented students from across campus, representing all six of Georgia Tech's colleges and the Georgia Tech Research Institute (GTRI). These students have diverse backgrounds and a wide variety of interests both inside and outside of the classroom. Today, we'd like you to meet Lara Martin, a fifth-year Ph.D. student who is interested in teaching artificial intelligence agents to tell interesting and coherent stories. Tell us about your research interests. Where might people be impacted by them in everyday life?
Machine Learning in Event-Triggered Control: Recent Advances and Open Issues
Sedghi, Leila, Ijaz, Zohaib, Noor-A-Rahim, Md., Witheephanich, Kritchai, Pesch, Dirk
Networked control systems have gained considerable attention over the last decade as a result of the trend towards decentralised control applications and the emergence of cyber-physical system applications. However, real-world wireless networked control systems suffer from limited communication bandwidths, reliability issues, and a lack of awareness of network dynamics due to the complex nature of wireless networks. Combining machine learning and event-triggered control has the potential to alleviate some of these issues. For example, machine learning can be used to overcome the problem of a lack of network models by learning system behavior or adapting to dynamically changing models by continuously learning model dynamics. Event-triggered control can help to conserve communication bandwidth by transmitting control information only when necessary or when resources are available. The purpose of this article is to conduct a review of the literature on the use of machine learning in combination with event-triggered control. Machine learning techniques such as statistical learning, neural networks, and reinforcement learning-based approaches such as deep reinforcement learning are being investigated in combination with event-triggered control. We discuss how these learning algorithms can be used for different applications depending on the purpose of the machine learning use. Following the review and discussion of the literature, we highlight open research questions and challenges associated with machine learning-based event-triggered control and suggest potential solutions.
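As an illustration of transmitting "only when necessary", the sketch below uses a simple send-on-delta rule: a sample is sent only when it differs from the last transmitted value by more than a threshold. This is one common triggering rule chosen for illustration, not a method taken from the review; the function name `send_on_delta` and the sample values are hypothetical.

```python
def send_on_delta(measurements, threshold):
    """Return the indices of samples that would be transmitted.

    A sample is sent if nothing has been sent yet, or if it deviates from
    the last transmitted value by more than `threshold`.
    """
    sent_indices = []
    last_sent = None
    for k, y in enumerate(measurements):
        if last_sent is None or abs(y - last_sent) > threshold:
            sent_indices.append(k)
            last_sent = y
    return sent_indices


samples = [0.0, 0.05, 0.1, 0.4, 0.45, 0.9, 0.92]
transmitted = send_on_delta(samples, threshold=0.25)
print(transmitted)  # [0, 3, 5]
```

Here three of the seven samples are transmitted. A learning-based scheme of the kind surveyed in the article could, for example, adapt the threshold or predict the skipped values at the receiver.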
Vehicle Type Specific Waypoint Generation
Liu, Yunpeng, Lavington, Jonathan Wilder, Scibior, Adam, Wood, Frank
We develop a generic mechanism for generating vehicle-type-specific sequences of waypoints from a probabilistic foundation model of driving behavior. Many foundation behavior models are trained on data that does not include vehicle information, which limits their utility in downstream applications such as planning. Our novel methodology conditionally specializes such a behavior predictive model to a vehicle type by utilizing byproducts of the reinforcement learning algorithms used to produce vehicle-specific controllers. We show how to compose a vehicle-specific value function estimate with a generic probabilistic behavior model to generate vehicle-type-specific waypoint sequences that are more likely to be physically plausible than their vehicle-agnostic counterparts.