Collaborating Authors

 Edmonton


Count-Based Exploration with the Successor Representation

arXiv.org Artificial Intelligence

The problem of exploration in reinforcement learning is well-understood in the tabular case and many sample-efficient algorithms are known. Nevertheless, it is often unclear how the algorithms in the tabular setting can be extended to tasks with large state-spaces where generalization is required. Recent promising developments generally depend on problem-specific density models or handcrafted features. In this paper we introduce a simple approach for exploration that allows us to develop theoretically justified algorithms in the tabular case but that also gives us intuitions for new algorithms applicable to settings where function approximation is required. Our approach and its underlying theory are based on the substochastic successor representation, a concept we develop here. While the traditional successor representation is a representation that defines state generalization by the similarity of successor states, the substochastic successor representation is also able to implicitly count the number of times each state (or feature) has been observed. This extension connects two until now disjoint areas of research. We show in traditional tabular domains (RiverSwim and SixArms) that our algorithm empirically performs as well as other sample-efficient algorithms. We then describe a deep reinforcement learning algorithm inspired by these ideas and show that it matches the performance of recent pseudo-count-based methods in hard exploration Atari 2600 games.
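
To make the "implicit counting" idea concrete, here is a minimal sketch. It assumes the substochastic transition model divides each observed transition count by n(s) + 1, where n(s) is the number of recorded transitions out of state s; the abstract does not spell out this definition, so treat it as an assumption. It also uses only a first-order truncation of the successor representation (M ≈ I + γP) rather than the full matrix inverse. The function names are hypothetical.

```python
def substochastic_rows(counts):
    """counts[s][t] = number of observed transitions s -> t.

    Assumption: each row is normalized by n(s) + 1 instead of n(s),
    so every row sums to n(s) / (n(s) + 1) < 1 (hence "substochastic").
    """
    rows = []
    for row in counts:
        n_s = sum(row)
        rows.append([c / (n_s + 1) for c in row])
    return rows


def truncated_ssr_row_sums(counts, gamma):
    """Row sums of the first-order truncation M ~ I + gamma * P."""
    return [1.0 + gamma * sum(row) for row in substochastic_rows(counts)]


def implied_visit_bonus(counts, gamma):
    """(1 + gamma) minus the row sum, which equals gamma / (n(s) + 1).

    Rarely visited states therefore receive a larger value.
    """
    return [(1.0 + gamma) - r for r in truncated_ssr_row_sums(counts, gamma)]


if __name__ == "__main__":
    # State 0 was left three times, state 1 once.
    counts = [[0, 3], [1, 0]]
    print(implied_visit_bonus(counts, gamma=0.9))
```

Under these assumptions the quantity shrinks like 1 / (n(s) + 1), which is why the row sums of such a representation can stand in for visit counts.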


Call Detail Records Driven Anomaly Detection and Traffic Prediction in Mobile Cellular Networks

arXiv.org Artificial Intelligence

Mobile networks possess information about the users as well as the network. Such information is useful for making the network end-to-end visible and intelligent. Big data analytics can efficiently analyze user and network information and unearth meaningful insights with the help of machine learning tools. Utilizing big data analytics and machine learning, this work contributes in three ways. First, we utilize the call detail records (CDR) data to detect anomalies in the network. For authentication and verification of anomalies, we use k-means clustering, an unsupervised machine learning algorithm. Through effective detection of anomalies, we can proceed to suitable design for resource distribution as well as fault detection and avoidance. Second, we prepare anomaly-free data by removing anomalous activities and train a neural network model. By passing anomalous and anomaly-free data through this model, we observe the effect of anomalous activities on the training of the model and also compare the mean square error on anomalous and anomaly-free data. Lastly, we use an autoregressive integrated moving average (ARIMA) model to predict future traffic for a user. Through simple visualization, we show that learning models generalize better on anomaly-free data and perform better on the prediction task.
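
As a rough illustration of clustering-based anomaly flagging, the sketch below runs a one-dimensional k-means with two clusters over a series of activity values and flags the smaller cluster as anomalous. The rule "smaller cluster means anomaly", the deterministic initialization, and the sample numbers are all assumptions made for this example; the abstract does not describe the features or the decision rule actually used.

```python
def kmeans_1d(values, k=2, iters=50):
    """Plain 1-D k-means (k >= 2) with centroids initialized evenly
    between the minimum and maximum value."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest centroid for every value.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j]))
                  for v in values]
        # Update step: mean of each cluster (keep old centroid if empty).
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(sum(members) / len(members)
                                 if members else centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def flag_anomalies(values):
    """Return indices in the smaller of two clusters (assumed anomalous)."""
    _, labels = kmeans_1d(values, k=2)
    sizes = [labels.count(j) for j in range(2)]
    rare = sizes.index(min(sizes))
    return [i for i, lab in enumerate(labels) if lab == rare]


if __name__ == "__main__":
    # Hypothetical hourly call counts with one burst of activity.
    hourly_calls = [10, 12, 11, 9, 13, 95, 10, 11]
    print(flag_anomalies(hourly_calls))
```

Dropping the flagged indices before training is one simple way to obtain the kind of anomaly-free data the abstract refers to.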


Per-decision Multi-step Temporal Difference Learning with Control Variates

arXiv.org Machine Learning

Multi-step temporal difference (TD) learning is an important approach in reinforcement learning, as it unifies one-step TD learning with Monte Carlo methods in a way where intermediate algorithms can outperform either extreme. Multi-step methods address a bias-variance trade-off between reliance on current estimates, which could be poor, and incorporating longer sampled reward sequences into the updates. Especially in the off-policy setting, where the agent aims to learn about a policy different from the one generating its behaviour, the variance in the updates can cause learning to diverge as the number of sampled rewards used in the estimates increases. In this paper, we introduce per-decision control variates for multi-step TD algorithms, and compare them to existing methods. Our results show that including the control variates can greatly improve performance on both on-policy and off-policy multi-step temporal difference learning tasks.
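
One textbook form of a per-decision control variate for state values is the recursion G_{t:h} = ρ_t (R_{t+1} + γ G_{t+1:h}) + (1 − ρ_t) V(S_t), with G_{h:h} = V(S_h), where ρ_t is the importance sampling ratio. The sketch below computes that return; whether it matches the exact estimators studied in the paper is an assumption, and the function name and argument layout are hypothetical.

```python
def cv_n_step_return(rewards, rhos, values, gamma):
    """Off-policy n-step return with a per-decision control variate.

    rewards[k] = R_{t+k+1}
    rhos[k]    = importance sampling ratio rho_{t+k}
    values[k]  = current estimate V(S_{t+k}); len(values) == len(rewards) + 1
    """
    if len(values) != len(rewards) + 1 or len(rhos) != len(rewards):
        raise ValueError("inconsistent trajectory lengths")
    g = values[-1]  # bootstrap from the value of the final state
    for k in reversed(range(len(rewards))):
        # When rho is 0 the estimate falls back to V(S_{t+k}) instead of 0,
        # which is what keeps the variance of the update down.
        g = rhos[k] * (rewards[k] + gamma * g) + (1.0 - rhos[k]) * values[k]
    return g


if __name__ == "__main__":
    rewards = [1.0, 1.0]
    values = [0.5, 0.5, 2.0]
    print(cv_n_step_return(rewards, [1.0, 1.0], values, gamma=0.5))
    print(cv_n_step_return(rewards, [0.0, 1.0], values, gamma=0.5))
```

With all ratios equal to 1 (the on-policy case) the recursion reduces to the ordinary n-step return.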


LeapsAndBounds: A Method for Approximately Optimal Algorithm Configuration

arXiv.org Artificial Intelligence

We consider the problem of configuring general-purpose solvers to run efficiently on problem instances drawn from an unknown distribution. The goal of the configurator is to find a configuration that runs fast on average on most instances, and do so with the least amount of total work. It can run a chosen solver on a random instance until the solver finishes or a timeout is reached. We propose LeapsAndBounds, an algorithm that tests configurations on randomly selected problem instances for increasingly long times. We prove that the capped expected runtime of the configuration returned by LeapsAndBounds is close to the optimal expected runtime, while our algorithm's running time is near-optimal. Our results show that LeapsAndBounds is more efficient than the recent algorithm of Kleinberg et al. (2017), which, to our knowledge, is the only other algorithm configuration method with non-trivial theoretical guarantees. Experimental results on configuring a public SAT solver on a new benchmark dataset also stand witness to the superiority of our method.
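
The two ingredients named in the abstract, capped runtime and progressively longer time budgets, can be illustrated with a toy loop. This is not the LeapsAndBounds procedure and carries none of its guarantees: the stopping rule (accept once the best capped mean is at most half the current cap), the doubling schedule, and the precomputed runtime table are assumptions chosen only to keep the example short and deterministic.

```python
def capped_mean(runtimes, cap):
    """Average runtime when every run is cut off at `cap` seconds."""
    return sum(min(r, cap) for r in runtimes) / len(runtimes)


def doubling_search(runtime_table, start_cap=1.0, max_cap=1024.0):
    """Double the timeout until some configuration looks fast under it.

    runtime_table maps a configuration name to its runtimes on sampled
    instances. In a real setting these would come from running the
    solver with a timeout rather than from a precomputed table.
    """
    cap = start_cap
    while True:
        means = {cfg: capped_mean(rs, cap) for cfg, rs in runtime_table.items()}
        best = min(means, key=means.get)
        # Toy acceptance rule: the best capped mean sits well below the cap.
        if means[best] <= cap / 2 or cap >= max_cap:
            return best, cap, means[best]
        cap *= 2


if __name__ == "__main__":
    table = {
        "a": [1.0, 2.0, 30.0],  # usually fast, occasionally very slow
        "b": [3.0, 3.0, 3.0],   # consistently moderate
    }
    print(doubling_search(table))
```

Small caps make configuration "a" look best because its slow run is truncated; only once the cap grows does the steadier configuration "b" win, which is the reason for testing under longer and longer budgets.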


The 13th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment

AI Magazine

The 13th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE 2017) was held at the Snowbird Ski and Summer Resort in Little Cottonwood Canyon in the Wasatch Range of the Rocky Mountains near Salt Lake City, Utah. Along with the main conference presentations, the meeting included two tutorials, three workshops, and invited keynotes. This report summarizes the main conference. It also includes contributions from the organizers of the three workshops.


AAAI Conferences Calendar

AI Magazine

This page includes forthcoming AAAI sponsored conferences, conferences presented by AAAI Affiliates, and conferences held in cooperation with AAAI. AI Magazine also maintains a calendar listing that includes nonaffiliated conferences at www.aaai.org/Magazine/calendar.php. KR 2018 will be held October 30 - November 2, 2018, in Tempe, Arizona. The IAAI-19 Conference will be held January 29-31, 2019 at the Hilton Hawaiian Village in Honolulu, Hawaii USA. We invite all interested individuals to check out our Facebook site by searching for AAAI.


AAAI News

AI Magazine

While artificial intelligence (AI) and human-computer interaction (HCI) represent traditional mainstays of the conference, HCOMP believes strongly in inviting, fostering, and promoting broad, interdisciplinary research. This field is particularly ...

AAAI-19 will comprise a host of programs, including the Senior Member Track, the Technical Demonstration Program, the Tutorial and Workshop Programs, and several student programs, such as the Student Abstract and Poster Program and the Doctoral ...

... well as strong outreach programs for students, women, and sister conferences. They have absorbed all former special tracks into the main conference technical program, with provision for distinguished oversight of reviews for these areas.


A Bot Backed by Elon Musk Has Made an AI Breakthrough in Video Game World

#artificialintelligence

Artificial-intelligence research group OpenAI said it created software capable of beating teams of five skilled human players in the video game Dota 2, a milestone in computer science. The achievement puts San Francisco-based OpenAI, whose backers include billionaire Elon Musk, ahead of other artificial-intelligence researchers in developing software that can master complex games combining fast, real-time action, longer-term strategy, imperfect information and team play. The ability to learn these kinds of video games at human or super-human levels is important for the advancement of AI because they more closely approximate the uncertainties and complexity of the real world than games such as chess, which IBM's software mastered in the late 1990s, or Go, which was conquered in 2016 with software created by DeepMind, the London-based AI company owned by Alphabet Inc. Dota 2 is a multiplayer science-fiction fantasy video game created by Bellevue, Washington-based Valve Corp. Each team is assigned a base on opposing ends of a map that can only be learned through exploration. Each player controls a separate character with unique powers and weapons. Each team must battle to reach the opposing team's territory and destroy a structure called an Ancient.


The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces

arXiv.org Artificial Intelligence

Dyna is an architecture for reinforcement learning agents that interleaves planning, acting, and learning in an online setting. This architecture aims to make fuller use of limited experience to achieve better performance with fewer environmental interactions. Dyna has been well studied in problems with a tabular representation of states, and has also been extended to some settings with larger state spaces that require function approximation. However, little work has studied Dyna in environments with high-dimensional state spaces like images. In Dyna, the environment model is typically used to generate one-step transitions from selected start states. We applied one-step Dyna to several games from the Arcade Learning Environment and found that the model-based updates offered surprisingly little benefit, even with a perfect model. However, when the model was used to generate longer trajectories of simulated experience, performance improved dramatically. This observation also holds when using a model that is learned from experience; even though the learned model is flawed, it can still be used to accelerate learning.
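
The difference between planning shapes can be sketched by spending the same budget of model steps either as many one-step transitions or as fewer, longer rollouts. The chain environment, the fixed policy, and all names below are hypothetical and chosen for illustration; the experiments in the paper use Arcade Learning Environment games, not this toy model.

```python
GOAL = 4  # last state of a hypothetical 5-state chain


def chain_model(state, action):
    """Perfect model of the chain: returns (reward, next_state, done)."""
    next_state = min(state + action, GOAL)
    done = next_state == GOAL
    return (1.0 if done else 0.0), next_state, done


def always_right(state):
    """Fixed policy used only for this illustration."""
    return 1


def simulate(model, policy, start_states, rollout_len):
    """Generate simulated transitions: one rollout per start state."""
    transitions = []
    for state in start_states:
        for _ in range(rollout_len):
            action = policy(state)
            reward, next_state, done = model(state, action)
            transitions.append((state, action, reward, next_state, done))
            if done:
                break
            state = next_state
    return transitions


if __name__ == "__main__":
    # Same budget of four model steps, two different shapes.
    one_step = simulate(chain_model, always_right, [0, 0, 0, 0], rollout_len=1)
    rollout = simulate(chain_model, always_right, [0], rollout_len=4)
    print(sum(t[2] for t in one_step), sum(t[2] for t in rollout))
```

In this toy case the one-step shape only ever revisits the transition out of the start state, while the single longer rollout reaches the rewarding state, so it supplies learning targets the one-step shape cannot.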


Non-Intrusive Signature Extraction for Major Residential Loads

arXiv.org Artificial Intelligence

The data collected by smart meters contain a lot of useful information. One potential use of the data is to track the energy consumption and operating status of major home appliances. The results will enable homeowners to make sound decisions on how to save energy and how to participate in demand response programs. This paper presents a new method to break down the total power demand measured by a smart meter into the demand of individual appliances. A unique feature of the proposed method is that it utilizes diverse signatures associated with the entire operating window of an appliance for identification. As a result, appliances with complicated intermediate processes can be tracked. A novel appliance registration device and scheme is also proposed to automate the creation of the appliance signature database and to eliminate the need for massive training before identification. The software and system have been developed and deployed to real houses in order to verify the proposed method.
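
Matching against a signature that spans an appliance's whole operating window can be pictured as sliding a stored power profile along the metered signal and keeping the best-fitting position. The sketch below does this with a sum of squared errors. The signature values, the assumption that the household baseline has already been subtracted, and the function name are all hypothetical; the abstract does not describe the matching criterion actually used.

```python
def best_match(aggregate, signature):
    """Return (offset, squared_error) of the window in `aggregate`
    that most closely matches `signature`.

    Assumes both series share the same sampling interval and that the
    always-on baseline load was removed from `aggregate` beforehand.
    """
    if len(signature) > len(aggregate):
        raise ValueError("signature is longer than the measured signal")
    best_offset, best_err = None, float("inf")
    for offset in range(len(aggregate) - len(signature) + 1):
        window = aggregate[offset:offset + len(signature)]
        err = sum((w - s) ** 2 for w, s in zip(window, signature))
        if err < best_err:
            best_offset, best_err = offset, err
    return best_offset, best_err


if __name__ == "__main__":
    # Hypothetical appliance cycle: ramp up, steady run, spin down.
    signature = [5, 20, 20, 8]
    metered = [0, 0, 5, 20, 20, 8, 0, 0]
    print(best_match(metered, signature))
```

Because the whole window is compared, an appliance with a distinctive middle phase can still be located even when its switch-on edge resembles that of another device.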