Zeroth-order Deterministic Policy Gradient

arXiv.org Machine Learning

Deterministic Policy Gradient (DPG) removes a level of randomness from standard randomized-action Policy Gradient (PG), and demonstrates substantial empirical success for tackling complex dynamic problems involving Markov decision processes. At the same time, though, DPG loses its ability to learn in a model-free (i.e., actor-only) fashion, frequently necessitating the use of critics in order to obtain consistent estimates of the associated policy-reward gradient. In this work, we introduce Zeroth-order Deterministic Policy Gradient (ZDPG), which approximates policy-reward gradients via two-point stochastic evaluations of the $Q$-function, constructed by properly designed low-dimensional action-space perturbations. Exploiting the idea of random horizon rollouts for obtaining unbiased estimates of the $Q$-function, ZDPG lifts the dependence on critics and restores true model-free policy learning, while enjoying built-in and provable algorithmic stability. Additionally, we present new finite sample complexity bounds for ZDPG, which improve upon existing results by up to two orders of magnitude. Our findings are supported by several numerical experiments, which showcase the effectiveness of ZDPG in a practical setting, and its advantages over both PG and Baseline PG.
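To make the "two-point" idea concrete, here is a minimal sketch of a generic two-point zeroth-order gradient estimator. It is not the ZDPG algorithm from the abstract: the Gaussian perturbation, the smoothing radius `mu`, the sample count, and the toy `quadratic` objective (standing in for a $Q$-function evaluation) are all assumptions chosen for illustration.

```python
import random


def two_point_gradient(f, x, mu=1e-3, num_samples=2000, seed=0):
    """Estimate the gradient of f at x using only function evaluations.

    Each sample draws a random Gaussian direction u and uses the symmetric
    difference (f(x + mu*u) - f(x - mu*u)) / (2*mu) as a directional slope.
    Averaging slope * u over many directions approximates the gradient.
    """
    rng = random.Random(seed)
    d = len(x)
    grad = [0.0] * d
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
        slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)
        for i in range(d):
            grad[i] += slope * u[i] / num_samples
    return grad


def quadratic(x):
    # Hypothetical stand-in objective; its true gradient is 2 * x.
    return sum(xi * xi for xi in x)


estimate = two_point_gradient(quadratic, [1.0, -2.0])
print(estimate)  # a noisy estimate of the true gradient [2.0, -4.0]
```

The estimate is noisy for any finite number of samples; the sketch only shows that two function evaluations per random direction are enough to recover gradient information without differentiating `f`.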


How Artificial Intelligence Is Going To Change Hotel Stays

#artificialintelligence

ModiHost is a new platform for hotels that uses artificial intelligence to offer a better hotel management system, centered around personalization of the guest experience. In turn, the company aims to drive increased spending and brand loyalty. They say they've cracked the code that many hotels haven't, offering a solution for remembering guest preferences and anticipating their needs that most hotels wouldn't be able to employ on their own. As the company says in its whitepaper: "Hotel management is a complex and convoluted industry. It is also a highly inefficient one. The need to operate multiple systems, integrate different booking systems, and process reservations via mediums ranging from email to fax, have made hotel management hopelessly complicated."


Artificial Intelligence Market Demand & Future Scope Including Top Players – Jewish Market Reports

#artificialintelligence

Brandessence Market Research publishes market research reports and business insights produced by highly qualified and experienced industry analysts. Our research reports are available in a wide range of industry verticals including aviation, food & beverage, healthcare, ICT, construction, chemicals, and many more. Brandessence Market Research reports are best suited to senior executives, business development managers, marketing managers, consultants, CEOs, CIOs, COOs, and directors, as well as governments, agencies, organizations, and Ph.D. students. We have a delivery center in Pune, India and our sales office is in London.


EVO-RL: Evolutionary-Driven Reinforcement Learning

arXiv.org Artificial Intelligence

In this work, we propose a novel approach for reinforcement learning driven by evolutionary computation. Our algorithm, dubbed Evolutionary-Driven Reinforcement Learning (evo-RL), embeds the reinforcement learning algorithm in an evolutionary cycle, where we distinctly differentiate between purely evolvable (instinctive) behaviour and purely learnable behaviour. Furthermore, we propose that this distinction is decided by the evolutionary process, thus allowing evo-RL to be adaptive to different environments. In addition, evo-RL facilitates learning on environments with rewardless states, which makes it more suited for real-world problems with incomplete information. To show that evo-RL leads to state-of-the-art performance, we present the performance of different state-of-the-art reinforcement learning algorithms when operating within evo-RL and compare it with the case when these same algorithms are executed independently. Results show that reinforcement learning algorithms embedded within our evo-RL approach significantly outperform the stand-alone versions of the same RL algorithms on OpenAI Gym control problems with rewardless states constrained by the same computational budget.


Solving the Clustered Traveling Salesman Problem via TSP methods

arXiv.org Artificial Intelligence

The Clustered Traveling Salesman Problem (CTSP) is a variant of the popular Traveling Salesman Problem (TSP) arising from a number of real-life applications. In this work, we explore an uncharted solution approach that solves the CTSP by transforming it to the well-studied TSP. For this purpose, we first investigate a technique to convert a CTSP instance to a TSP and then apply popular TSP solvers (including exact and heuristic solvers) to solve the resulting TSP instance. We want to answer the following questions: How do state-of-the-art TSP solvers perform on clustered instances converted from the CTSP? Do state-of-the-art TSP solvers compete well with the best performing methods specifically designed for the CTSP? For this purpose, we present intensive computational experiments on various CTSP benchmark instances to draw conclusions.
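The abstract does not say which conversion is used, so the following is only a sketch of one well-known transformation, assumed here for illustration: add a large constant `big_m` to every inter-cluster edge so that an optimal TSP tour is forced to visit each cluster contiguously. The point coordinates, the cluster assignment, and the brute-force solver are all hypothetical and only suitable for tiny instances.

```python
import itertools
import math


def ctsp_to_tsp(dist, cluster_of):
    """Return a TSP distance matrix that penalizes inter-cluster edges."""
    n = len(dist)
    # Larger than the length of any tour in the original instance.
    big_m = sum(sum(row) for row in dist) + 1.0
    transformed = [
        [dist[i][j] + (big_m if cluster_of[i] != cluster_of[j] else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return transformed, big_m


def tour_length(dist, tour):
    """Length of the closed tour that returns to its starting city."""
    return sum(dist[tour[k]][tour[(k + 1) % len(tour)]]
               for k in range(len(tour)))


def brute_force_tsp(dist):
    """Exact solver by enumeration; city 0 is fixed as the start."""
    n = len(dist)
    candidates = ((0,) + perm for perm in itertools.permutations(range(1, n)))
    return list(min(candidates, key=lambda tour: tour_length(dist, tour)))


# Hypothetical instance: six cities in three clusters of two.
points = [(0, 0), (1, 0), (10, 0), (10, 1), (5, 8), (6, 8)]
cluster_of = [0, 0, 1, 1, 2, 2]
dist = [[math.dist(p, q) for q in points] for p in points]

transformed, big_m = ctsp_to_tsp(dist, cluster_of)
tour = brute_force_tsp(transformed)
# Number of tour edges that cross between clusters.
crossings = sum(
    cluster_of[tour[k]] != cluster_of[tour[(k + 1) % len(tour)]]
    for k in range(len(tour))
)
ctsp_cost = tour_length(dist, tour)
print(tour, crossings, ctsp_cost)
```

With three clusters, a tour that keeps every cluster contiguous uses exactly three inter-cluster edges, so the original CTSP cost is the transformed tour length minus `3 * big_m`. In practice the brute-force step would be replaced by an exact or heuristic TSP solver.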


Neuromorphic Processing and Sensing: Evolutionary Progression of AI to Spiking

arXiv.org Artificial Intelligence

The rise of machine learning and deep learning applications requires ever more computational resources to meet the growing demands of an always-connected, automated world. Neuromorphic technologies based on Spiking Neural Network (SNN) algorithms hold the promise of implementing advanced artificial intelligence using a fraction of the computations and power requirements by modeling the functioning, and spiking, of the human brain. With the proliferation of tools and platforms aiding data scientists and machine learning engineers to develop the latest innovations in artificial and deep neural networks, a transition to a new paradigm will require building from the current well-established foundations. This paper explains the theoretical workings of neuromorphic technologies based on spikes, and overviews the state of the art in hardware processors, software platforms and neuromorphic sensing devices. A progression path is paved for current machine learning specialists to update their skillset, as well as to migrate classification or predictive models from the current generation of deep neural networks to SNNs. This can be achieved by leveraging existing, specialized hardware in the form of SpiNNaker and the Nengo migration toolkit. First-hand, experimental results of converting a VGG-16 neural network to an SNN are shared. A forward gaze into industrial, medical and commercial applications that can readily benefit from SNNs wraps up this investigation into the neuromorphic computing future.


The Computational Limits of Deep Learning

arXiv.org Machine Learning

Deep learning's recent history has been one of achievement: from triumphing over humans in the game of Go to world-leading performance in image recognition, voice recognition, translation, and other tasks. But this progress has come with a voracious appetite for computing power. This article reports on the computational demands of Deep Learning applications in five prominent application areas and shows that progress in all five is strongly reliant on increases in computing power. Extrapolating forward this reliance reveals that progress along current lines is rapidly becoming economically, technically, and environmentally unsustainable. Thus, continued progress in these applications will require dramatically more computationally-efficient methods, which will either have to come from changes to deep learning or from moving to other machine learning methods.


Predicting Illegal Fishing on the Patagonia Shelf from Oceanographic Seascapes

arXiv.org Machine Learning

Many of the world's most important fisheries are experiencing increases in illegal fishing, undermining efforts to sustainably conserve and manage fish stocks. A major challenge to ending illegal, unreported, and unregulated (IUU) fishing is improving our ability to identify whether a vessel is fishing illegally and where illegal fishing is likely to occur in the ocean. However, monitoring and patrolling the oceans is costly, time-consuming, and logistically challenging for maritime authorities. To address this problem, we use vessel tracking data and machine learning to predict illegal fishing on the Patagonian Shelf, one of the world's most productive regions for fisheries. Specifically, we focus on Chinese fishing vessels, which have consistently fished illegally in this region. We combine vessel location data with oceanographic seascapes -- classes of oceanic areas based on oceanographic variables -- as well as other remotely sensed oceanographic variables to train a series of machine learning models of varying levels of complexity. These models are able to predict whether a Chinese vessel is operating illegally with 69-96% confidence, depending on the year and predictor variables used. These results offer a promising step towards preempting illegal activities, rather than reacting to them forensically.


Generalized Maximum Entropy for Supervised Classification

arXiv.org Machine Learning

The maximum entropy principle advocates evaluating events' probabilities using a distribution that maximizes entropy among those that satisfy certain expectation constraints. This principle can be generalized to arbitrary decision problems, where it corresponds to minimax approaches. This paper establishes a framework for supervised classification based on the generalized maximum entropy principle that leads to minimax risk classifiers (MRCs). We develop learning techniques that determine MRCs for general entropy functions and provide performance guarantees by means of convex optimization. In addition, we describe the relationship of the presented techniques with existing classification methods, and quantify MRCs' performance in comparison with the proposed bounds and conventional methods.
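As a small illustration of the classical (not the generalized) maximum entropy principle, the sketch below finds the maximum entropy distribution over the faces of a die subject to a single expectation constraint on the mean. It relies on the standard result that the solution has the exponential form p_i ∝ exp(λ·v_i), and solves for λ by bisection. The die example, the target mean of 4.5, and the function name are assumptions for illustration; this is not the MRC learning procedure from the paper.

```python
import math


def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Maximum entropy distribution on `values` with a fixed mean.

    The solution has the form p_i proportional to exp(lam * v_i); the mean
    is increasing in lam, so lam can be found by bisection.
    """
    def distribution(lam):
        # Subtract the largest exponent for numerical stability.
        shift = max(lam * v for v in values)
        weights = [math.exp(lam * v - shift) for v in values]
        total = sum(weights)
        return [w / total for w in weights]

    def mean(lam):
        return sum(p * v for p, v in zip(distribution(lam), values))

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return distribution((lo + hi) / 2.0)


faces = [1, 2, 3, 4, 5, 6]
p = maxent_distribution(faces, target_mean=4.5)
achieved_mean = sum(pi * v for pi, v in zip(p, faces))
print(p, achieved_mean)
```

Because the target mean exceeds the uniform mean of 3.5, the resulting probabilities increase with the face value while staying as spread out as the constraint allows.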


Generating Adversarial Inputs Using A Black-box Differential Technique

arXiv.org Machine Learning

Neural Networks (NNs) are known to be vulnerable to adversarial attacks. A malicious agent initiates these attacks by perturbing an input into another one such that the two inputs are classified differently by the NN. In this paper, we consider a special class of adversarial examples, which can exhibit not only the weakness of NN models, as typical adversarial examples do, but also the different behavior between two NN models. We call them difference-inducing adversarial examples or DIAEs. Specifically, we propose DAEGEN, the first black-box differential technique for adversarial input generation. DAEGEN takes as input two NN models of the same classification problem and outputs an adversarial example. The obtained adversarial example is a DIAE, so that it represents a point-wise difference in the input space between the two NN models. Algorithmically, DAEGEN uses a local search-based optimization algorithm to find DIAEs by iteratively perturbing an input to maximize the difference between the two models' predictions on that input. We conduct experiments on a spectrum of benchmark datasets (e.g., MNIST, ImageNet, and Driving) and NN models (e.g., LeNet, ResNet, Dave, and VGG). Experimental results are promising. First, we compare DAEGEN with two existing white-box differential techniques (DeepXplore and DLFuzz) and find that under the same setting, DAEGEN is 1) effective, i.e., it is the only technique that succeeds in generating attacks in all cases, 2) precise, i.e., the adversarial attacks are very likely to fool machines and humans, and 3) efficient, i.e., it requires a reasonable number of classification queries. Second, we compare DAEGEN with state-of-the-art black-box adversarial attack methods (simba and tremba), by adapting them to work in a differential setting. The experimental results show that DAEGEN performs better than both of them.
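The local-search idea described above can be sketched on a toy problem. This is not DAEGEN itself: the two "models" are hypothetical logistic scorers on two features, the search is a plain greedy coordinate search, and the step size and iteration budget are arbitrary assumptions. The search only queries the models for output probabilities, which is what makes it black-box.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model_a(x):
    # Hypothetical black-box model: probability of class 1.
    return sigmoid(x[0] + x[1])


def model_b(x):
    # Second hypothetical model for the same binary task.
    return sigmoid(x[0] - x[1])


def label(prob):
    return int(prob >= 0.5)


def find_difference_inducing_input(f, g, x0, step=0.1, max_iters=100):
    """Greedy local search for an input that f and g label differently.

    At each iteration, every single-coordinate perturbation of +/- step is
    tried, and the search moves to the neighbour with the largest gap
    between the two models' output probabilities.
    """
    x = list(x0)
    for _ in range(max_iters):
        if label(f(x)) != label(g(x)):
            return x
        neighbours = []
        for i in range(len(x)):
            for delta in (step, -step):
                candidate = list(x)
                candidate[i] += delta
                neighbours.append(candidate)
        x = max(neighbours, key=lambda c: abs(f(c) - g(c)))
    return x if label(f(x)) != label(g(x)) else None


seed_input = [1.0, 0.0]  # both toy models assign class 1 here
found = find_difference_inducing_input(model_a, model_b, seed_input)
print(found)
```

Starting from an input on which both toy models agree, the search drifts along the feature the models weight differently until their predicted labels diverge. A practical method would additionally bound the perturbation so the result stays close to the original input.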