Optimal PAC-Bayesian Posteriors for Stochastic Classifiers and their use for Choice of SVM Regularization Parameter

arXiv.org Machine Learning

The PAC-Bayesian setup involves a stochastic classifier characterized by a posterior distribution on a classifier set; it offers a high-probability bound on the classifier's averaged true risk and is robust to the training sample used. For a given posterior, this bound captures the trade-off between the averaged empirical risk and a KL-divergence-based model complexity term. Our goal is to identify an optimal posterior with the least PAC-Bayesian bound. We consider a finite classifier set and 5 distance functions: KL-divergence, its Pinsker's and sixth-degree polynomial approximations, and linear and squared distances. The linear-distance-based model results in a convex optimization problem. We obtain a closed-form expression for its optimal posterior. For a uniform prior, this posterior has full support with weights negative-exponentially proportional to the number of misclassifications. The squared distance and Pinsker's approximation bounds are possibly quasi-convex and are observed to have a single local minimum. We derive fixed point equations (FPEs) using a partial KKT system with strict positivity constraints. This obviates the combinatorial search for the subset support of the optimal posterior. For a uniform prior, the exponential search on a full-dimensional simplex can be limited to an ordered subset of classifiers with increasing empirical risk values. These FPEs converge rapidly to a stationary point, even for a large classifier set when a solver fails. We apply these approaches to SVMs generated using a finite set of SVM regularization parameter values on 9 UCI datasets. These posteriors yield stochastic SVM classifiers with tight bounds. The KL-divergence-based bound is the tightest, but is computationally expensive due to non-convexity and multiple calls to a root-finding algorithm. Optimal posteriors for all 5 distance functions have the lowest 10% test error values on most datasets, with linear distance being the easiest to obtain.
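To make the shape of the uniform-prior posterior concrete, here is a minimal Python sketch of weights that decay exponentially with the number of misclassifications. It is an illustration only: the abstract does not give the closed-form constant, so the `scale` parameter and the function name are hypothetical.

```python
import math


def exp_weight_posterior(misclassifications, scale):
    """Posterior over a finite classifier set, assuming a uniform prior.

    The weight of classifier i is proportional to
    exp(-scale * misclassifications[i]). `scale` is a hypothetical positive
    constant; the paper's closed form determines its actual value.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Subtract the smallest count before exponentiating for numerical stability.
    lowest = min(misclassifications)
    unnormalized = [math.exp(-scale * (m - lowest)) for m in misclassifications]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Four hypothetical classifiers with 3, 5, 5 and 10 training errors.
posterior = exp_weight_posterior([3, 5, 5, 10], scale=0.5)
```

Every classifier keeps a strictly positive weight (full support), and classifiers with equal error counts receive equal weight.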


Learning Deep Generative Models with Short Run Inference Dynamics

arXiv.org Machine Learning

This paper studies the fundamental problem of learning deep generative models that consist of one or more layers of latent variables organized in top-down architectures. Learning such a generative model requires inferring the latent variables for each training example based on the posterior distribution of these latent variables. The inference typically requires Markov chain Monte Carlo (MCMC), which can be time-consuming. In this paper, we propose to use short run inference dynamics guided by the log-posterior, such as a finite-step gradient descent algorithm initialized from the prior distribution of the latent variables, as an approximate sampler of the posterior distribution, where the step size of the gradient descent dynamics is optimized by minimizing the Kullback-Leibler divergence between the distribution produced by the short run inference dynamics and the posterior distribution. Our experiments show that the proposed method outperforms the variational auto-encoder (VAE) in terms of reconstruction error and synthesis quality. The advantage of the proposed method is that it is natural and automatic, even for models with multiple layers of latent variables.
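The idea of short run inference can be sketched on a toy model. The Python example below assumes a scalar linear-Gaussian model, x = w * z + noise with z ~ N(0, 1), which is not the deep generative model of the paper, and it uses a fixed, hand-picked step size instead of one tuned by minimizing the KL divergence. All names are hypothetical.

```python
import random


def short_run_inference(x, w, sigma, num_steps, step_size, rng):
    """Run a fixed number of gradient steps on log p(z | x).

    Toy model (an assumption for illustration): z ~ N(0, 1) and
    x | z ~ N(w * z, sigma**2), so up to a constant
    log p(z | x) = -(x - w * z)**2 / (2 * sigma**2) - z**2 / 2.
    """
    z = rng.gauss(0.0, 1.0)  # initialize from the prior
    for _ in range(num_steps):
        grad = w * (x - w * z) / sigma**2 - z  # d/dz of the log-posterior
        z += step_size * grad  # step uphill on the log-posterior
    return z


rng = random.Random(0)
z_short = short_run_inference(x=3.0, w=2.0, sigma=1.0, num_steps=5, step_size=0.1, rng=rng)
z_long = short_run_inference(x=3.0, w=2.0, sigma=1.0, num_steps=60, step_size=0.1, rng=rng)
```

In this toy model the exact posterior mean is w * x / (w**2 + sigma**2), so the quality of a short run can be checked directly; with these settings each step halves the distance to that mean.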


Adapting Behaviour for Learning Progress

arXiv.org Artificial Intelligence

Abstract. Determining what experience to generate to best facilitate learning (i.e. The advent of distributed agents that interact with parallel instances of the environment has enabled larger scales and greater flexibility, but has not removed the need to tune exploration to the task, because the ideal data for the learning algorithm necessarily depends on its process of learning. We propose to dynamically adapt the data generation by using a non-stationary multi-armed bandit to optimize a proxy of the learning progress. The data distribution is controlled by modulating multiple parameters of the policy (such as stochasticity, consistency or optimism) without significant overhead. The adaptation speed of the bandit can be increased by exploiting the factored modulation structure. We demonstrate on a suite of Atari 2600 games how this unified approach produces results comparable to per-task tuning at a fraction of the cost. 1 Introduction. Reinforcement learning (RL) is a general formalism modelling sequential decision making, which supports making minimal assumptions about the task at hand and reducing the need for prior knowledge. By learning behaviour from scratch, RL agents have the potential to surpass human expertise or tackle complex domains where human intuition is not applicable. In practice, however, generality is often traded for performance and efficiency, with RL practitioners tuning algorithms, architectures and hyper-parameters to the task at hand (Hessel et al., 2019). A side-effect is that the resulting methods can be brittle, or difficult to reliably reproduce (Nagarajan et al., 2018). Exploration is one of the main aspects commonly designed or tuned specifically for the task being solved.
Previous work has shown that large sample-efficiency gains are possible, for example, when the exploratory behaviour's level of stochasticity is adjusted to the environment's hazard rate (García & Fernández, 2015), or when an appropriate prior is used in large action spaces (Dulac-Arnold et al., 2015; Czarnecki et al., 2018; Vinyals et al., 2019). Exploration in the presence of function approximation should ideally be agent-centred. It ought to focus more on generating data that supports the agent's learning at its current parameters θ, rather than making progress on objective measurements of information gathering.
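As a rough illustration of a non-stationary bandit choosing among behaviour modulations, the Python sketch below uses a standard epsilon-greedy bandit with constant-step-size (recency-weighted) value estimates. This is a textbook stand-in, not the bandit described in the paper; the class name, the arm meanings and the reward signal are all hypothetical.

```python
import random


class RecencyWeightedBandit:
    """Epsilon-greedy bandit whose estimates track drifting rewards.

    Each arm stands for one hypothetical modulation setting (for example,
    one level of policy stochasticity). The reward passed to `update` is
    assumed to be some proxy of learning progress.
    """

    def __init__(self, num_arms, step_size=0.1, explore_prob=0.1, seed=0):
        self.values = [0.0] * num_arms
        self.step_size = step_size
        self.explore_prob = explore_prob
        self.rng = random.Random(seed)

    def select(self):
        # Explore uniformly with probability explore_prob, else act greedily.
        if self.rng.random() < self.explore_prob:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # A constant step size weights recent rewards more heavily, so old
        # evidence fades when the reward distribution changes.
        self.values[arm] += self.step_size * (reward - self.values[arm])


bandit = RecencyWeightedBandit(num_arms=3, explore_prob=0.0)
bandit.update(1, 1.0)        # arm 1 looks useful at first
first_choice = bandit.select()
for _ in range(5):
    bandit.update(1, -5.0)   # later, arm 1 stops helping
second_choice = bandit.select()
```

Because old rewards decay geometrically, the bandit abandons an arm whose usefulness has faded instead of averaging over its entire history.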


Spatial Influence-aware Reinforcement Learning for Intelligent Transportation System

arXiv.org Artificial Intelligence

Intelligent transportation systems (ITSs) are envisioned to be crucial for smart cities; they aim to improve traffic flow, raising the quality of life of urban residents, and to reduce congestion, making commuting more efficient. However, several challenges need to be resolved before such systems can be deployed. For example, conventional solutions for Markov decision processes (MDPs) and single-agent Reinforcement Learning (RL) algorithms suffer from poor scalability, and multi-agent systems suffer from poor communication and coordination. In this paper, we explore the potential of mutual information sharing, or in other words, spatial-influence-based communication, to optimize traffic light control policy. First, we mathematically analyze the transportation system. We conclude that the transportation system does not have a stationary Nash equilibrium, so reinforcement learning algorithms offer suitable solutions. Secondly, we describe how to build a multi-agent Deep Deterministic Policy Gradient (DDPG) system with spatial influence and social group utility incorporated. Then we use a grid-topology road network to empirically demonstrate the scalability of the new system. We demonstrate three types of directed communications to show the effect of the direction of social influence on the entire network utility and individual utility. Lastly, we define a "selfish index" and analyze its effect on total group utility.


Resolving Congestions in the Air Traffic Management Domain via Multiagent Reinforcement Learning Methods

arXiv.org Artificial Intelligence

In this article, we report on the efficiency and effectiveness of multiagent reinforcement learning (MARL) methods for the computation of flight delays to resolve congestion problems in the Air Traffic Management (ATM) domain. Specifically, we aim to resolve cases where demand for airspace use exceeds capacity (demand-capacity problems) by imposing ground delays on flights at the pre-tactical stage of operations (i.e. a few days to a few hours before operation). Casting this into the multiagent domain, agents, representing flights, need to decide on their own delays w.r.t. their own preferences, having no information about others' payoffs, preferences and constraints, while they plan to execute their trajectories jointly with others, adhering to operational constraints. Specifically, we formalize the problem as a multiagent Markov Decision Process (MA-MDP) and we show that it can be considered as a Markov game in which interacting agents need to reach an equilibrium: What makes the problem more interesting is the dynamic setting in which agents operate, which is also due to the unforeseen, emergent effects of their decisions in the whole system. We propose collaborative multiagent reinforcement learning methods to resolve demand-capacity imbalances: An extensive experimental study on real-world cases shows the potential of the proposed approaches in resolving problems, while advanced visualizations provide detailed views towards understanding the quality of solutions provided.


Ten AI Stepping Stones for Cybersecurity

arXiv.org Artificial Intelligence

With the turmoil in cybersecurity and the mind-blowing advances in AI, it is only natural that cybersecurity practitioners consider further employing learning techniques to help secure their organizations and improve the efficiency of their security operation centers. But with great fears come great opportunities for both the good and the evil, and a myriad of bad deals. This paper discusses ten issues in cybersecurity that hopefully will make it easier for practitioners to ask detailed questions about what they want from an AI system in their cybersecurity operations. We draw on the state of the art to provide factual arguments for a discussion on well-established AI in cybersecurity issues, including the current scope of AI and its application to cybersecurity, the impact of privacy concerns on the cybersecurity data that can be collected and shared externally to the organization, how an AI decision can be explained to the person running the operations center, and the implications of the adversarial nature of cybersecurity in the learning techniques. We then discuss the use of AI by attackers on a level playing field including several issues in an AI battlefield, and an AI perspective on the old cat-and-mouse game including how the adversary may assess your AI power.


PODDP: Partially Observable Differential Dynamic Programming for Latent Belief Space Planning

arXiv.org Artificial Intelligence

Autonomous agents are limited in their ability to observe the world state. Partially observable Markov decision processes (POMDPs) formally model the problem of planning under world state uncertainty, but POMDPs with continuous actions and nonlinear dynamics suitable for robotics applications are challenging to solve. In this paper, we present an efficient differential dynamic programming (DDP) algorithm for belief space planning in POMDPs with uncertainty over a discrete latent state, and continuous states, actions, observations, and nonlinear dynamics. This representation allows planning of dynamic trajectories which are sensitive to structured uncertainty over discrete latent world states. We develop dynamic programming techniques to optimize a contingency plan over a tree of possible observations and belief space trajectories, and also derive a hierarchical version of the algorithm. Our method is applicable to problems with uncertainty over the cost or reward function (e.g., the configuration of goals or obstacles), uncertainty over the dynamics (e.g., the dynamical mode of a hybrid system), and uncertainty about interactions, where other agents' behavior is conditioned on latent intentions. Benchmarks show that our algorithm outperforms popular heuristic approaches to planning under uncertainty, and results from an autonomous lane changing task demonstrate that our algorithm can synthesize robust interactive trajectories.
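Planning over a discrete latent state rests on updating a belief as observations arrive. The Python sketch below shows only that generic Bayes-rule step, under the assumption that the observation likelihood for each latent state is already available; it is not the PODDP algorithm, and the names are hypothetical.

```python
def update_belief(belief, likelihoods):
    """Bayes-rule update of a belief over discrete latent states.

    belief[s] is the prior probability of latent state s, and
    likelihoods[s] is the probability (or density) of the new observation
    given s. Both inputs are assumed to be supplied by the caller.
    """
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero likelihood under every state")
    return [u / total for u in unnormalized]


# Two hypothetical latent intentions of another driver: "yield" and "not yield".
new_belief = update_belief([0.5, 0.5], [0.9, 0.1])
```

A belief-space planner would branch on possible observations and apply an update of this kind along each branch of the observation tree.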


Apple acquires AI startup that uses machine learning to make pictures crisper

Daily Mail - Science & tech

Apple is working on technology for the perfect selfie. The tech giant acquired Spectral Edge, a UK-based AI startup that uses machine learning to make smartphone pictures crisper, with more accurate colors. The system captures and blends an infrared shot with a standard shot to enhance a photograph's overall depth, detail and color. The startup uses a process that relies completely on machine learning and can be combined with both hardware and software to improve pictures. The news was first revealed by Bloomberg, which obtained secret documents 'that Apple now controls Spectral.'


Edge compute creates exciting possibilities for emerging technology - SiliconANGLE

#artificialintelligence

Edge computing provides groundbreaking innovations to enterprise cloud organizations, including nearly instant code transfer, reduced latency, and enhanced performance. The lightning speed of edge compute is due to the placement of the platform. Unlike public cloud, edge compute is placed as close as possible to the point of interaction with humans, electronics, and various connected devices. Edge compute becomes more and more relevant to companies as applications evolve, including virtual reality, augmented reality, and video analytics, which rely on artificial intelligence. AI needs real-time code transfer to be extremely precise, and as AI evolves, every millisecond counts, according to Paul Savill (pictured), senior vice president of core network and technology solutions at CenturyLink Inc.


Where are the opportunities for medtech and pharma in 2020?

#artificialintelligence

It's that time when we start to look ahead to what next year holds for the life science sector. Lu Rahman outlines 2020's big medtech players. A decade ago, the healthcare advances created by AI would have seemed the stuff of dreams. But back in 2018 Theresa May announced plans to use artificial intelligence and data to transform the way certain diseases like cancer. The technology is moving at a pace – this year we heard that a team led by the University of Surrey had filed the first ever patent for inventions autonomously created by AI without a human inventor. Professor Ryan Abbott explained the implications this had for the life science sector: "These filings are important to any area of research and development as well as any area that relies on patents. Patents are more important in the life sciences than in many other areas, particularly for drug discovery. AI has also been used extensively in the drug discovery process for a long time for tasks like screening of compounds and in silico analysis. These tasks can be the foundation for patent filings. "As AI is becoming increasingly sophisticated, it is likely to play an increasing role in R&D including in the life sciences.