Agents
Trustworthiness and Safety for Intelligent Ethical Logical Agents via Interval Temporal Logic and Runtime Self-Checking
Costantini, Stefania (Universita') | Gasperis, Giovanni De (degli Studi dell'Aquila) | Dyoub, Abeer (Universita') | Pitoni, Valentina (degli Studi dell'Aquila)
Implementing Machine Ethics in Intelligent Agents involves trustworthiness and safety, meaning that agents should do what they are expected to do (at least concerning high-priority goals, even in case of malfunctioning of any kind) and should not behave in unexpected, potentially harmful ways. These topics are strongly related to "assurance", i.e., ensuring that system users can rely upon the system. This paper deals with assurance of logical agent systems via temporal-logic-based runtime self-monitoring and checking.
SUCAG: Stochastic Unbiased Curvature-aided Gradient Method for Distributed Optimization
Wai, Hoi-To, Freris, Nikolaos M., Nedic, Angelia, Scaglione, Anna
We propose and analyze a new stochastic gradient method, which we call Stochastic Unbiased Curvature-aided Gradient (SUCAG), for finite sum optimization problems. SUCAG constitutes an unbiased total gradient tracking technique that uses Hessian information to accelerate convergence. We analyze our method under the general asynchronous model of computation, in which functions are selected infinitely often, but with delays that can grow sublinearly. For strongly convex problems, we establish linear convergence for the SUCAG method. When the initialization point is sufficiently close to the optimal solution, the established convergence rate is only dependent on the condition number of the problem, making it strictly faster than the known rate for the SAGA method. Furthermore, we describe a Markov-driven approach of implementing the SUCAG method in a distributed asynchronous multi-agent setting, via gossiping along a random walk on the communication graph. We show that our analysis applies as long as the undirected graph is connected and, notably, establishes an asymptotic linear convergence rate that is robust to the graph topology. Numerical results demonstrate the merit of our algorithm over existing methods.
Swarm Optimization: Goodbye Gradients
These combinations of real-time biological systems can blend knowledge, exploration, and exploitation to unify intelligence and solve problems more efficiently. These simple agents interact locally, within their environment, and new behaviors emerge from the group as a whole. In the world of evolutionary algorithms, one such inspired method is particle swarm optimization (PSO). It is a swarm-intelligence-based computational technique that can be used to find an approximate solution to a problem by iteratively moving candidate solutions (called particles) through the search space with regard to a given measure of quality. The movements of the particles are guided by their own best known position in the search space as well as the entire swarm's best known position.
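The particle update described above can be sketched in a few lines of Python. This is a minimal illustration of a common textbook PSO variant, not the article's own implementation: the function name `pso`, the inertia and attraction coefficients, the clamping to `bounds`, and the `sphere` test objective are all assumptions chosen for the example.

```python
import random

def pso(objective, dim, bounds, n_particles=30, iters=200,
        inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` with a basic particle swarm (illustrative sketch)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random starting positions, zero starting velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position so far.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    # The swarm remembers the best position found by any particle.
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the particle's own best,
                # and pull toward the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (gbest_pos[d] - pos[i][d]))
                # Move, keeping the particle inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val

def sphere(x):
    """Hypothetical test objective: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)

if __name__ == "__main__":
    position, value = pso(sphere, dim=2, bounds=(-5.0, 5.0))
    print(position, value)
```

No gradient of `objective` is ever computed; the swarm only compares function values, which is why the approach also applies to non-differentiable objectives.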
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
Machado, Marlos C., Bellemare, Marc G., Talvitie, Erik, Veness, Joel, Hausknecht, Matthew, Bowling, Michael
The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games. It supports a variety of different problem settings and it has been receiving increasing attention from the scientific community, leading to some high-profile success stories such as the much publicized Deep Q-Networks (DQN). In this article we take a big picture look at how the ALE is being used by the research community. We show how diverse the evaluation methodologies in the ALE have become with time, and highlight some key concerns when evaluating agents in the ALE. We use this discussion to present some methodological best practices and provide new benchmark results using these best practices. To further the progress in the field, we introduce a new version of the ALE that supports multiple game modes and provides a form of stochasticity we call sticky actions. We conclude this big picture look by revisiting challenges posed when the ALE was introduced, summarizing the state-of-the-art in various problems and highlighting problems that remain open.
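As a rough illustration of the sticky-actions idea, the sketch below wraps an environment so that, with some probability, the previously executed action is repeated instead of the action the agent just chose. It is not the ALE's own code: the `StickyActions` wrapper, the `RecordingEnv` stand-in, the `reset`/`step` interface, the default stickiness of 0.25, and the handling of the first step are assumptions made for the example.

```python
import random

class RecordingEnv:
    """Hypothetical stand-in environment that just records executed actions."""
    def __init__(self):
        self.executed = []

    def reset(self):
        self.executed = []
        return None  # no real observation in this toy environment

    def step(self, action):
        self.executed.append(action)
        return action


class StickyActions:
    """Wrapper that repeats the previous action with probability `stickiness`."""
    def __init__(self, env, stickiness=0.25, seed=None):
        self.env = env
        self.stickiness = stickiness
        self.rng = random.Random(seed)
        self.prev_action = None

    def reset(self):
        self.prev_action = None
        return self.env.reset()

    def step(self, action):
        # After the first step, the agent's choice is sometimes overridden by
        # the action that was actually executed on the previous step.
        if self.prev_action is not None and self.rng.random() < self.stickiness:
            action = self.prev_action
        self.prev_action = action
        return self.env.step(action)


if __name__ == "__main__":
    env = StickyActions(RecordingEnv(), stickiness=0.25, seed=0)
    env.reset()
    for requested in [0, 1, 2, 3, 4, 5]:
        env.step(requested)
    print(env.env.executed)
```

Because the agent cannot know in advance which of its actions will be overridden, it cannot succeed simply by replaying a memorized action sequence.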
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines
Wu, Cathy, Rajeswaran, Aravind, Duan, Yan, Kumar, Vikash, Bayen, Alexandre M, Kakade, Sham, Mordatch, Igor, Abbeel, Pieter
Policy gradient methods have enjoyed great success in deep reinforcement learning but suffer from high variance of gradient estimates. The high variance problem is particularly exacerbated in problems with long horizons or high-dimensional action spaces. To mitigate this issue, we derive a bias-free action-dependent baseline for variance reduction which fully exploits the structural form of the stochastic policy itself and does not make any additional assumptions about the MDP. We demonstrate and quantify the benefit of the action-dependent baseline through both theoretical analysis and numerical results, including an analysis of the suboptimality of the optimal state-dependent baseline. The result is a computationally efficient policy gradient algorithm, which scales to high-dimensional control problems, as demonstrated by a synthetic 2000-dimensional target matching task. Our experimental results indicate that action-dependent baselines allow for faster learning on standard reinforcement learning benchmarks and high-dimensional hand manipulation and synthetic tasks. Finally, we show that the general idea of including additional information in baselines for improved variance reduction can be extended to partially observed and multi-agent tasks.
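A toy calculation can make the variance argument concrete. The sketch below is not the paper's algorithm: it assumes a single state, two independent binary action dimensions, a made-up reward, and baselines picked by hand. It computes, by exact enumeration, the mean and variance of the score-function gradient estimate for the first action dimension under no baseline, a constant baseline, and a baseline that depends on the other action dimension.

```python
from itertools import product

# Hypothetical factorized policy: P(a_i = 1) for two independent binary dimensions.
P = (0.5, 0.5)
ACTIONS = list(product((0, 1), repeat=2))

def reward(a):
    """Hypothetical reward, dominated by the second action dimension."""
    return a[0] + 10 * a[1]

def prob(a):
    """Probability of joint action `a` under the factorized policy."""
    out = 1.0
    for a_i, p_i in zip(a, P):
        out *= p_i if a_i == 1 else 1.0 - p_i
    return out

def score_dim0(a):
    """d/d(logit_0) of log pi(a) for a Bernoulli dimension with a logit parameter."""
    return a[0] - P[0]

def no_baseline(a):
    return 0.0

def constant_baseline(a):
    """Expected reward over all joint actions (independent of the action)."""
    return sum(prob(x) * reward(x) for x in ACTIONS)

def action_dependent_baseline(a):
    """Expected reward over a_0 only, holding a_1 fixed.

    It does not depend on a_0, so subtracting it leaves the estimate unbiased.
    """
    return sum((P[0] if a0 == 1 else 1.0 - P[0]) * reward((a0, a[1]))
               for a0 in (0, 1))

def mean_and_variance(baseline):
    """Exact mean and variance of score_dim0(a) * (reward(a) - baseline(a))."""
    samples = [(prob(a), score_dim0(a) * (reward(a) - baseline(a))) for a in ACTIONS]
    mean = sum(p * g for p, g in samples)
    var = sum(p * (g - mean) ** 2 for p, g in samples)
    return mean, var

if __name__ == "__main__":
    for name, b in [("none", no_baseline),
                    ("constant", constant_baseline),
                    ("action-dependent", action_dependent_baseline)]:
        print(name, mean_and_variance(b))
```

All three estimators share the same mean, so none of them introduces bias; in this toy setup the variance shrinks as the baseline absorbs more of the reward variation caused by the other action dimension.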
Swarm AI: Shaping the Conscience of Tomorrow's Artificial Intelligence - 1redDrop
Artificial intelligence might arguably be the newest frontier of human experience, but there's no denying that man has been fascinated with the concept for millennia. From the mythical stories of Hephaestus creating mechanical servants and brazen-footed bulls that puffed fire from their mouths, to the talking heads of the 13th century, to IBM Watson and modern forms of AI, the subject has been bubbling on the surface of human consciousness. The time is now here for AI to come of age; and, in many ways, it already has. But now there's a new problem, and it's not one of how AI can be implemented, as has been the major challenge in the past. AI has now sprouted into a plethora of forms, each rivaling the other in an attempt to showcase its superior capabilities.
The Near Future: See How Healthcare Tech Will Transform Our Lives
CableLabs just released a cool short film called The Near Future: A Better Place that explores how emerging technologies in healthcare will transform our daily lives. A substantial percentage of the population worldwide is over the age of 60, and it will dramatically increase in the next two decades. This really underscores the importance of healthcare advancements, and connectivity is the underlying component that will power the emerging technologies that can transform our daily lives, such as IoT, telemedicine, intelligent agents and new sensors. For example, Cookie, the little robot AI agent in the film, is an in-home companion that provides social interaction, around-the-clock monitoring, as well as a direct interface with the complex system of care at the hospital. With this short film, CableLabs wants to inspire you and the entire tech and healthcare industry to help make this vision a reality in the near future.
DEPSO Algorithm: Project Portal – Xiao-Feng Xie, Ph.D.
DEPSO [1], also called DEPS, is an algorithm for (constrained) numerical optimization problems (NOP). DEPSO combines the advantages of Particle Swarm Optimization (PSO) and Differential Evolution (DE). It is incorporated into the cooperative group optimization (CGO) system [2]. The DEPSO paper has been cited over 400 times with various applications. DEPSO was also implemented (by Sun Microsystems Inc.) into NLPSolver (Solver for Nonlinear Programming), an extension of Calc in Apache OpenOffice.
New Ideas for Brain Modelling 4
This paper continues the research that considers a new cognitive model based strongly on the human brain. In particular, it considers the neural binding structure of an earlier paper. It also describes some new methods in the areas of image processing and behaviour simulation. The work is all based on earlier research by the author and the new additions are intended to fit in with the overall design. For image processing, a grid-like structure is used with 'full linking'. Each cell in the classifier grid stores a list of all other cells it gets associated with and this is used as the learned image that new input is compared to. For the behaviour metric, a new prediction equation is suggested, as part of a simulation, that uses feedback and history to dynamically determine its course of action. While the new methods are from widely different topics, both can be compared with the binary-analog type of interface that is the main focus of the paper. It is suggested that the simplest of linking between a tree and ensemble can explain neural binding and variable signal strengths.
Video Friday: Human-Drone Interaction, Soft Robotics, and Basketball Robot
Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We'll also be posting a weekly calendar of upcoming robotics events for the next few months; here's what we have so far (send us your events!): Let us know if you have suggestions for next week, and enjoy today's videos. We were at the 2018 Human Robot Interaction conference all this week, and on Wednesday, there was a special video session. The audience, who was provided with popcorn, voted by applause, and here are the top three videos.