Goto

Collaborating Authors

 Agents


dm_control: Software and Tasks for Continuous Control

arXiv.org Artificial Intelligence

The dm_control software package is a collection of Python libraries and task suites for reinforcement learning agents in an articulated-body simulation. A MuJoCo wrapper provides convenient bindings to functions and data structures. The PyMJCF and Composer libraries enable procedural model manipulation and task authoring. The Control Suite is a fixed set of tasks with standardised structure, intended to serve as performance benchmarks. The Locomotion framework provides high-level abstractions and examples of locomotion tasks. A set of configurable manipulation tasks with a robot arm and snap-together bricks is also included. dm_control is publicly available at https://www.github.com/deepmind/dm_control


How This Startup Is Using Swarm AI To Make Deep Learning Technology Accessible For Everyone

#artificialintelligence

Swarm AI is a modern AI technology that is relatively new to organisations. It blends global and local insights to improve and optimise business decisions. Though the concept of swarm intelligence is now new in literature, it is increasingly being used to predict everything from stock market movements to forecasting sales. Advances in the Internet of Things technology, machine learning and 5G has made artificial swarm systems faster and more efficient. In today's world of business that constantly witnesses increasing flux, scale, and complexity, artificial swarm intelligence will help them identify new growth opportunities as well as to anticipate and manage disruption.


PSO-PS: Parameter Synchronization with Particle Swarm Optimization for Distributed Training of Deep Neural Networks

arXiv.org Artificial Intelligence

Parameter updating is an important stage in parallelism-based distributed deep learning. Synchronous methods are widely used in distributed training the Deep Neural Networks (DNNs). To reduce the communication and synchronization overhead of synchronous methods, decreasing the synchronization frequency (e.g., every $n$ mini-batches) is a straightforward approach. However, it often suffers from poor convergence. In this paper, we propose a new algorithm of integrating Particle Swarm Optimization (PSO) into the distributed training process of DNNs to automatically compute new parameters. In the proposed algorithm, a computing work is encoded by a particle, the weights of DNNs and the training loss are modeled by the particle attributes. At each synchronization stage, the weights are updated by PSO from the sub weights gathered from all workers, instead of averaging the weights or the gradients. To verify the performance of the proposed algorithm, the experiments are performed on two commonly used image classification benchmarks: MNIST and CIFAR10, and compared with the peer competitors at multiple different synchronization configurations. The experimental results demonstrate the competitiveness of the proposed algorithm.


Real-time and Large-scale Fleet Allocation of Autonomous Taxis: A Case Study in New York Manhattan Island

arXiv.org Artificial Intelligence

Nowadays, autonomous taxis become a highly promising transportation mode, which helps relieve traffic congestion and avoid road accidents. However, it hinders the wide implementation of this service that traditional models fail to efficiently allocate the available fleet to deal with the imbalance of supply (autonomous taxis) and demand (trips), the poor cooperation of taxis, hardly satisfied resource constraints, and on-line platform's requirements. To figure out such urgent problems from a global and more farsighted view, we employ a Constrained Multi-agent Markov Decision Processes (CMMDP) to model fleet allocation decisions, which can be easily split into sub-problems formulated as a 'Dynamic assignment problem' combining both immediate rewards and future gains. We also leverage a Column Generation algorithm to guarantee the efficiency and optimality in a large scale. Through extensive experiments, the proposed approach not only achieves remarkable improvements over the state-of-the-art benchmarks in terms of the individual's efficiency (arriving at 12.40%, 6.54% rise of income and utilization, respectively) and the platform's profit (reaching 4.59% promotion) but also reveals a time-varying fleet adjustment policy to minimize the operation cost of the platform.


A Hierarchical Architecture for Human-Robot Cooperation Processes

arXiv.org Artificial Intelligence

In this paper we propose FlexHRC+, a hierarchical human-robot cooperation architecture designed to provide collaborative robots with an extended degree of autonomy when supporting human operators in high-variability shop-floor tasks. The architecture encompasses three levels, namely for perception, representation, and action. Building up on previous work, here we focus on (i) an in-the-loop decision making process for the operations of collaborative robots coping with the variability of actions carried out by human operators, and (ii) the representation level, integrating a hierarchical AND/OR graph whose online behaviour is formally specified using First Order Logic. The architecture is accompanied by experiments including collaborative furniture assembly and object positioning tasks.


"And the Winner Is...": Dynamic Lotteries for Multi-group Fairness-Aware Recommendation

arXiv.org Artificial Intelligence

As recommender systems are being designed and deployed for an increasing number of socially-consequential applications, it has become important to consider what properties of fairness these systems exhibit. There has been considerable research on recommendation fairness. However, we argue that the previous literature has been based on simple, uniform and often uni-dimensional notions of fairness assumptions that do not recognize the real-world complexities of fairness-aware applications. In this paper, we explicitly represent the design decisions that enter into the trade-off between accuracy and fairness across multiply-defined and intersecting protected groups, supporting multiple fairness metrics. The framework also allows the recommender to adjust its performance based on the historical view of recommendations that have been delivered over a time horizon, dynamically rebalancing between fairness concerns. Within this framework, we formulate lottery-based mechanisms for choosing between fairness concerns, and demonstrate their performance in two recommendation domains.


Particle Swarm Optimized Federated Learning For Industrial IoT and Smart City Services

arXiv.org Machine Learning

Most of the research on Federated Learning (FL) has focused on analyzing global optimization, privacy, and communication, with limited attention focusing on analyzing the critical matter of performing efficient local training and inference at the edge devices. One of the main challenges for successful and efficient training and inference on edge devices is the careful selection of parameters to build local Machine Learning (ML) models. To this aim, we propose a Particle Swarm Optimization (PSO)-based technique to optimize the hyperparameter settings for the local ML models in an FL environment. We evaluate the performance of our proposed technique using two case studies. First, we consider smart city services and use an experimental transportation dataset for traffic prediction as a proxy for this setting. Second, we consider Industrial IoT (IIoT) services and use the real-time telemetry dataset to predict the probability that a machine will fail shortly due to component failures. Our experiments indicate that PSO provides an efficient approach for tuning the hyperparameters of deep Long short-term memory (LSTM) models when compared to the grid search method. Our experiments illustrate that the number of clients-server communication rounds to explore the landscape of configurations to find the near-optimal parameters are greatly reduced (roughly by two orders of magnitude needing only 2%--4% of the rounds compared to state of the art non-PSO-based approaches). We also demonstrate that utilizing the proposed PSO-based technique to find the near-optimal configurations for FL and centralized learning models does not adversely affect the accuracy of the models.


Hybrid DCOP Solvers: Boosting Performance of Local Search Algorithms

arXiv.org Artificial Intelligence

We propose a novel method for expediting both symmetric and asymmetric Distributed Constraint Optimization Problem (DCOP) solvers. The core idea is based on initializing DCOP solvers with greedy fast non-iterative DCOP solvers. This is contrary to existing methods where initialization is always achieved using a random value assignment. We empirically show that changing the starting conditions of existing DCOP solvers not only reduces the algorithm convergence time by up to 50\%, but also reduces the communication overhead and leads to a better solution quality. We show that this effect is due to structural improvements in the variable assignment, which is caused by the spreading pattern of DCOP algorithm activation.) /Subject (Hybrid DCOPs)


Action and Perception as Divergence Minimization

arXiv.org Artificial Intelligence

We introduce a unified objective for action and perception of intelligent agents. Extending representation learning and control, we minimize the joint divergence between the world and a target distribution. Intuitively, such agents use perception to align their beliefs with the world, and use actions to align the world with their beliefs. Minimizing the joint divergence to an expressive target maximizes the mutual information between the agent's representations and inputs, thus inferring representations that are informative of past inputs and exploring future inputs that are informative of the representations. This lets us derive intrinsic objectives, such as representation learning, information gain, empowerment, and skill discovery from minimal assumptions. Moreover, interpreting the target distribution as a latent variable model suggests expressive world models as a path toward highly adaptive agents that seek large niches in their environments, while rendering task rewards optional. The presented framework provides a common language for comparing a wide range of objectives, facilitates understanding of latent variables for decision making, and offers a recipe for designing novel objectives. We recommend deriving future agent objectives from the joint divergence to facilitate comparison, to point out the agent's target distribution, and to identify the intrinsic objective terms needed to reach that distribution.


SEDRo: A Simulated Environment for Developmental Robotics

arXiv.org Artificial Intelligence

Even with impressive advances in application-specific models, we still lack knowledge about how to build a model that can learn in a human-like way and do multiple tasks. To learn in a human-like way, we need to provide a diverse experience that is comparable to humans. In this paper, we introduce our ongoing effort to build a simulated environment for developmental robotics (SEDRo). SEDRo provides diverse human experiences ranging from those of a fetus to a 12th-month-old. A series of simulated tests based on developmental psychology will be used to evaluate the progress of a learning model. We anticipate SEDRo to lower the cost of entry and facilitate research in the developmental robotics community.