AITopics

1902.10619

Country:

Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (0.50)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Korotin, Alexander, V'yugin, Vladimir, Burnaev, Evgeny

Adaptive Hedging under Delayed Feedback

arXiv.org Machine Learning, Feb-27-2019

The article investigates the application of hedging strategies to online expert weight allocation under delayed feedback. As the main result, we develop the General Hedging algorithm $\mathcal{G}$ based on the exponential reweighting of experts' losses. We build an artificial probabilistic framework and use it to prove adversarial loss bounds for the algorithm $\mathcal{G}$ in the delayed feedback setting. The designed algorithm $\mathcal{G}$ can be applied to both countable and continuous sets of experts. We also show how algorithm $\mathcal{G}$ extends the classical Hedge (Multiplicative Weights) and adaptive Fixed Share algorithms to delayed feedback, and we derive their regret bounds for the delayed setting from our main result.

algorithm, dt 1, sequence, (13 more...)

1902.10433

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Machine Learning, Feb-27-2019

Distributed Edge Caching via Reinforcement Learning in Fog Radio Access Networks

Lu, Liuyang, Jiang, Yanxiang, Bennis, Mehdi, Ding, Zhiguo, Zheng, Fu-Chun, You, Xiaohu

In this paper, the distributed edge caching problem in fog radio access networks (F-RANs) is investigated. By considering the unknown spatio-temporal content popularity and user preference, a user request model based on a hidden Markov process is proposed to characterize the fluctuant spatio-temporal traffic demands in F-RANs. Then, the Q-learning method based on the reinforcement learning (RL) framework is put forth to seek the optimal caching policy in a distributed manner, which enables fog access points (F-APs) to learn and track the potential dynamic process without extra communications cost. Furthermore, we propose a more efficient Q-learning method with value function approximation (Q-VFA-learning) to reduce complexity and accelerate convergence. Simulation results show that the performance of our proposed method is superior to that of the traditional methods.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

1902.10574

Country: Asia > China > Jiangsu Province (0.14)

Genre: Research Report (0.70)

Industry: Telecommunications (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Lauri, Mikko, Pajarinen, Joni, Peters, Jan

Information Gathering in Decentralized POMDPs by Policy Graph Improvement

arXiv.org Artificial Intelligence, Feb-26-2019

Decentralized policies for information gathering are required when multiple autonomous agents are deployed to collect data about a phenomenon of interest without the ability to communicate. Decentralized partially observable Markov decision processes (Dec-POMDPs) are a general, principled model well suited for such decentralized multiagent decision-making problems. In this paper, we investigate Dec-POMDPs for decentralized information gathering problems. An optimal solution of a Dec-POMDP maximizes the expected sum of rewards over time. To encourage information gathering, we set the reward as a function of the agents' state information, for example the negative Shannon entropy. We prove that if the reward is convex, then the finite-horizon value function of the corresponding Dec-POMDP is also convex. We propose the first heuristic algorithm for information gathering Dec-POMDPs, and empirically prove its effectiveness by solving problems an order of magnitude larger than the previous state of the art.

artificial intelligence, dec-pomdp, machine learning, (16 more...)

1902.0984

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
North America > Canada > Quebec > Montreal (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Europe > Germany > Hamburg (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Garcia-Barcos, Javier, Martinez-Cantin, Ruben

Fully Distributed Bayesian Optimization with Stochastic Policies

arXiv.org Artificial Intelligence, Feb-26-2019

Bayesian optimization has become a popular method for high-throughput computing, like the design of computer experiments or hyperparameter tuning of expensive models, where sample efficiency is mandatory. In these applications, distributed and scalable architectures are a necessity. However, Bayesian optimization is mostly sequential. Even parallel variants require certain computations between samples, limiting the parallelization bandwidth. Thompson sampling has been previously applied for distributed Bayesian optimization. But when compared with other acquisition functions in the sequential setting, Thompson sampling is known to perform suboptimally. In this paper, we present a new method for fully distributed Bayesian optimization, which can be combined with any acquisition function. Our approach considers Bayesian optimization as a partially observable Markov decision process. In this context, stochastic policies, such as the Boltzmann policy, have some interesting properties which can also be studied for Bayesian optimization. Furthermore, the Boltzmann policy trivially allows a distributed Bayesian optimization implementation with a high level of parallelism and scalability. We present results on several benchmarks and applications that show the performance of our method.

artificial intelligence, machine learning, optimization, (16 more...)

1902.09992

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Tonutti, Michele, Ruffaldi, Emanuele, Cattaneo, Alessandro, Avizzano, Carlo Alberto

Robust and Subject-Independent Driving Manoeuvre Anticipation through Domain-Adversarial Recurrent Neural Networks

arXiv.org Machine Learning, Feb-26-2019

Through deep learning and computer vision techniques, driving manoeuvres can be predicted accurately a few seconds in advance. Even though adapting a learned model to new drivers and different vehicles is key for robust driver-assistance systems, this problem has received little attention so far. This work proposes to tackle this challenge through domain adaptation, a technique closely related to transfer learning. A proof of concept for the application of a Domain-Adversarial Recurrent Neural Network (DA-RNN) to multi-modal time series driving data is presented, in which domain-invariant features are learned by maximizing the loss of an auxiliary domain classifier. Our implementation is evaluated using a leave-one-driver-out approach on individual drivers from the Brain4Cars dataset, as well as using a new dataset acquired through driving simulations, yielding an average increase in performance of 30% and 114% respectively compared to no adaptation. We also show the importance of fine-tuning sections of the network to optimise the extraction of domain-independent features. The results demonstrate the applicability of the approach to driver-assistance systems as well as training and simulation environments.

artificial intelligence, deep learning, machine learning, (15 more...)

doi: 10.1016/j.robot.2019.02.007

1902.0982

Country: Europe (0.28)

Genre: Research Report > New Finding (0.87)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Durrande, Nicolas, Adam, Vincent, Bordeaux, Lucas, Eleftheriadis, Stefanos, Hensman, James

Banded Matrix Operators for Gaussian Markov Models in the Automatic Differentiation Era

arXiv.org Machine Learning, Feb-26-2019

Banded matrices can be used as precision matrices in several models including linear state-space models, some Gaussian processes, and Gaussian Markov random fields. The aim of the paper is to make modern inference methods (such as variational inference or gradient-based sampling) available for Gaussian models with banded precision.

These two limitations have been thoroughly studied over the past decades and several approaches have been proposed to overcome them. The most popular method for reducing computational complexity is the sparse GP framework (Candela and Rasmussen, 2005; Titsias, 2009), where computations are focussed on a set of "inducing variables", allowing a tradeoff between computational requirements and the accuracy

artificial intelligence, machine learning, matrix, (16 more...)

1902.10078

Country: North America > United States > California (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)

Van Huynh, Nguyen, Hoang, Dinh Thai, Nguyen, Diep N., Dutkiewicz, Eryk

Optimal and Fast Real-time Resources Slicing with Deep Dueling Neural Networks

arXiv.org Artificial Intelligence, Feb-25-2019

Effective network slicing requires an infrastructure/network provider to deal with the uncertain demand and real-time dynamics of network resource requests. Another challenge is the combinatorial optimization of numerous resources, e.g., radio, computing, and storage. This article develops an optimal and fast real-time resource slicing framework that maximizes the long-term return of the network provider while taking into account the uncertainty of resource demand from tenants. Specifically, we first propose a novel system model which enables the network provider to effectively slice various types of resources to different classes of users under separate virtual slices. We then capture the real-time arrival of slice requests by a semi-Markov decision process. To obtain the optimal resource allocation policy under the dynamics of slicing requests, e.g., uncertain service time and resource demands, a Q-learning algorithm is often adopted in the literature. However, such an algorithm is notorious for its slow convergence, especially for problems with large state/action spaces. This makes Q-learning practically inapplicable to our case, in which multiple resources are simultaneously optimized. To tackle this, we propose a novel network slicing approach with an advanced deep learning architecture, called deep dueling, that attains the optimal average reward much faster than the conventional Q-learning algorithm. This property is especially desirable for coping with real-time resource requests and the dynamic demands of users. Extensive simulations show that the proposed framework yields up to 40% higher long-term average return while being a few thousand times faster, compared with state-of-the-art network slicing approaches.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1902.09696

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(6 more...)

Genre:

Workflow (0.67)
Research Report (0.63)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Telecommunications (1.00)
Information Technology > Networks (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Francis, Anthony, Faust, Aleksandra, Chiang, Hao-Tien Lewis, Hsu, Jasmine, Kew, J. Chase, Fiser, Marek, Lee, Tsang-Wei Edward

Long-Range Indoor Navigation with PRM-RL

arXiv.org Artificial Intelligence, Feb-25-2019

Long-range indoor navigation requires guiding robots with noisy sensors and controls through cluttered environments along paths that span a variety of buildings. We achieve this with PRM-RL, a hierarchical robot navigation method in which reinforcement learning agents that map noisy sensors to robot controls learn to solve short-range obstacle avoidance tasks, and then sampling-based planners map where these agents can reliably navigate in simulation; these roadmaps and agents are then deployed on-robot, guiding the robot along the shortest path where the agents are likely to succeed. Here we use Probabilistic Roadmaps (PRMs) as the sampling-based planner and AutoRL as the reinforcement learning method in the indoor navigation context. We evaluate the method in simulation for kinematic differential drive and kinodynamic car-like robots in several environments, and on-robot for differential-drive robots at two physical sites. Our results show PRM-RL with AutoRL is more successful than several baselines, is robust to noise, and can guide robots over hundreds of meters in the face of noise and obstacles in both simulation and on-robot, including over 3.3 kilometers of physical robot navigation.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1902.09458

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.68)

Industry:

Energy (0.67)
Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Ahn, Hongjoon, Moon, Taesup

Iterative Channel Estimation for Discrete Denoising under Channel Uncertainty

arXiv.org Artificial Intelligence, Feb-24-2019

We propose a novel iterative channel estimation (ICE) algorithm that essentially removes the critical known noisy channel assumption for the universal discrete denoising problem. Our algorithm is based on Neural DUDE (N-DUDE), a recently proposed neural network-based discrete denoiser, and it estimates the channel transition matrix as well as the neural network parameters in an alternating manner until convergence. While we do not make any probabilistic assumption on the underlying clean data, our ICE resembles Expectation-Maximization (EM) with variational approximation, and it takes advantage of the property of N-DUDE being locally robust around the true channel. With extensive experiments on several radically different types of data, we show that the ICE-equipped N-DUDE (dubbed ICE-N-DUDE) can perform universally well regardless of the uncertainties in both the channel and the clean source. Moreover, we show that ICE-N-DUDE becomes extremely robust to its hyperparameters and significantly outperforms the strong baseline that can deal with channel uncertainties for denoising, the widely used Baum-Welch (BW) algorithm for hidden Markov models (HMMs).

artificial intelligence, ice-n-dude, machine learning, (14 more...)

1902.08921

Country: Asia > South Korea > Gyeonggi-do > Suwon (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)