Goto

Collaborating Authors

 Oceania


Iroko: A Framework to Prototype Reinforcement Learning for Data Center Traffic Control

arXiv.org Machine Learning

Recent networking research has identified that data-driven congestion control (CC) can be more efficient than traditional CC in TCP. Deep reinforcement learning (RL), in particular, has the potential to learn optimal network policies. However, RL suffers from instability and over-fitting, deficiencies which so far render it unacceptable for use in datacenter networks. In this paper, we analyze the requirements for RL to succeed in the datacenter context. We present a new emulator, Iroko, which we developed to support different network topologies, congestion control algorithms, and deployment scenarios. Iroko interfaces with the OpenAI gym toolkit, which allows for fast and fair evaluation of different RL and traditional CC algorithms under the same conditions. We present initial benchmarks on three deep RL algorithms compared to TCP New Vegas and DCTCP. Our results show that these algorithms are able to learn a CC policy which exceeds the performance of TCP New Vegas on a dumbbell and fat-tree topology. We make our emulator open-source and publicly available: https://github.com/dcgym/iroko


Decentralized Decision-Making Over Multi-Task Networks

arXiv.org Machine Learning

In important applications involving multi-task networks with multiple objectives, agents in the network need to decide between these multiple objectives and reach an agreement about which single objective to follow for the network. In this work we propose a distributed decision-making algorithm. The agents are assumed to observe data that may be generated by different models. Through localized interactions, the agents reach agreement about which model to track and interact with each other in order to enhance the network performance. We investigate the approach for both static and mobile networks. The simulations illustrate the performance of the proposed strategies.


Transition-Based Neural Word Segmentation Using Word-Level Features

Journal of Artificial Intelligence Research

Character-based and word-based methods are two different solutions for Chinese word segmentation, the former exploiting sequence labeling models over characters and the latter using word-level features. Neural models have been exploited for character-based Chinese word segmentation, giving high accuracies by making use of external character embeddings, yet requiring less feature engineering. In this paper, we study a neural model for word-based Chinese word segmentation, by replacing the manually-designed discrete features with neural features in a transition-based word segmentation framework. Experimental results demonstrate that word features lead to comparable performance to the best systems in the literature, and a further combination of discrete and neural features obtains top accuracies on several benchmarks.


Modified Causal Forests for Estimating Heterogeneous Causal Effects

arXiv.org Machine Learning

Although science and the public celebrated the amazing predictive power of the new machine learningmethods, many researchers are left with some unease, simply because prediction does not imply causation. The ability to uncover causal relations is, however, at the core of most questions concerning the effects of particular policies, medical treatments, marketing campaigns, businessdecisions, etc. (see Athey, 2017, for a recent discussion). The recently rapidly expanding causal machine learning literature holds great promise for the improved estimation of causal effects by merging the statistics and econometrics literature oncausality with the supervised statistical and machine learning (ML) literature focussing on prediction. The classical causality literature clarifies the conditions needed for being able to estimate causal effects. It also shows how to transform a counterfactual causal problem into specific prediction problems (e.g., Imbens and Wooldridge, 2009). The latter literature on ML provides tools that can be highly effective in solving prediction problems (e.g.


Reinforcement Learning for Adaptive Caching with Dynamic Storage Pricing

arXiv.org Artificial Intelligence

Small base stations (SBs) of fifth-generation (5G) cellular networks are envisioned to have storage devices to locally serve requests for reusable and popular contents by \emph{caching} them at the edge of the network, close to the end users. The ultimate goal is to shift part of the predictable load on the back-haul links, from on-peak to off-peak periods, contributing to a better overall network performance and service experience. To enable the SBs with efficient \textit{fetch-cache} decision-making schemes operating in dynamic settings, this paper introduces simple but flexible generic time-varying fetching and caching costs, which are then used to formulate a constrained minimization of the aggregate cost across files and time. Since caching decisions per time slot influence the content availability in future slots, the novel formulation for optimal fetch-cache decisions falls into the class of dynamic programming. Under this generic formulation, first by considering stationary distributions for the costs and file popularities, an efficient reinforcement learning-based solver known as value iteration algorithm can be used to solve the emerging optimization problem. Later, it is shown that practical limitations on cache capacity can be handled using a particular instance of the generic dynamic pricing formulation. Under this setting, to provide a light-weight online solver for the corresponding optimization, the well-known reinforcement learning algorithm, $Q$-learning, is employed to find optimal fetch-cache decisions. Numerical tests corroborating the merits of the proposed approach wrap up the paper.


Classification of load forecasting studies by forecasting problem to select load forecasting techniques and methodologies

arXiv.org Machine Learning

This article proposes a two-dimensional classification methodology to select the relevant forecasting tools developed by the scientific community based on a classification of load forecasting studies. The inputs of the classifier are the articles of the literature and the outputs are articles classified into categories. The classification process relies on two couple of parameters that defines a forecasting problem. The temporal couple is the forecasting horizon and the forecasting resolution. The system couple is the system size and the load resolution. Each article is classified with key information about the dataset used and the forecasting tools implemented: the forecasting techniques (probabilistic or deterministic) and methodologies, the cleansing data techniques and the error metrics. This process is illustrated by reviewing and classifying thirty-four articles.


Dynamic Graph Representation Learning via Self-Attention Networks

arXiv.org Machine Learning

Learning latent representations of nodes in graphs is an important and ubiquitous task with widespread applications such as link prediction, node classification, and graph visualization. Previous methods on graph representation learning mainly focus on static graphs, however, many real-world graphs are dynamic and evolve over time. In this paper, we present Dynamic Self-Attention Network (DySAT), a novel neural architecture that operates on dynamic graphs and learns node representations that capture both structural properties and temporal evolutionary patterns. Specifically, DySAT computes node representations by jointly employing self-attention layers along two dimensions: structural neighborhood and temporal dynamics. We conduct link prediction experiments on two classes of graphs: communication networks and bipartite rating networks. Our experimental results show that DySAT has a significant performance gain over several different state-of-the-art graph embedding baselines.


Learning to Navigate the Web

arXiv.org Machine Learning

Learning in environments with large state and action spaces, and sparse rewards, can hinder a Reinforcement Learning (RL) agent's learning through trial-and-error. For instance, following natural language instructions on the Web (such as booking a flight ticket) leads to RL settings where input vocabulary and number of actionable elements on a page can grow very large. Even though recent approaches improve the success rate on relatively simple environments with the help of human demonstrations to guide the exploration, they still fail in environments where the set of possible instructions can reach millions. We approach the aforementioned problems from a different perspective and propose guided RL approaches that can generate unbounded amount of experience for an agent to learn from. Instead of learning from a complicated instruction with a large vocabulary, we decompose it into multiple sub-instructions and schedule a curriculum in which an agent is tasked with a gradually increasing subset of these relatively easier sub-instructions. In addition, when the expert demonstrations are not available, we propose a novel meta-learning framework that generates new instruction following tasks and trains the agent more effectively. We train DQN, deep reinforcement learning agent, with Q-value function approximated with a novel QWeb neural network architecture on these smaller, synthetic instructions. We evaluate the ability of our agent to generalize to new instructions on World of Bits benchmark, on forms with up to 100 elements, supporting 14 million possible instructions. The QWeb agent outperforms the baseline without using any human demonstration achieving 100% success rate on several difficult environments.


Multi-component Image Translation for Deep Domain Generalization

arXiv.org Artificial Intelligence

Domain adaption (DA) and domain generalization (DG) are two closely related methods which are both concerned with the task of assigning labels to an unlabeled data set. The only dissimilarity between these approaches is that DA can access the target data during the training phase, while the target data is totally unseen during the training phase in DG. The task of DG is challenging as we have no earlier knowledge of the target samples. If DA methods are applied directly to DG by a simple exclusion of the target data from training, poor performance will result for a given task. In this paper, we tackle the domain generalization challenge in two ways. In our first approach, we propose a novel deep domain generalization architecture utilizing synthetic data generated by a Generative Adversarial Network (GAN). The discrepancy between the generated images and synthetic images is minimized using existing domain discrepancy metrics such as maximum mean discrepancy or correlation alignment. In our second approach, we introduce a protocol for applying DA methods to a DG scenario by excluding the target data from the training phase, splitting the source data to training and validation parts, and treating the validation data as target data for DA. We conduct extensive experiments on four cross-domain benchmark datasets. Experimental results signify our proposed model outperforms the current state-of-the-art methods for DG.


Lifelong Testing of Smart Autonomous Systems by Shepherding a Swarm of Watchdog Artificial Intelligence Agents

arXiv.org Artificial Intelligence

Artificial Intelligence (AI) technologies could be broadly categorised into Analytics and Autonomy. Analytics focuses on algorithms offering perception, comprehension, and projection of knowledge gleaned from sensorial data. Autonomy revolves around decision making, and influencing and shaping the environment through action production. A smart autonomous system (SAS) combines analytics and autonomy to understand, learn, decide and act autonomously. To be useful, SAS must be trusted and that requires testing. Lifelong learning of a SAS compounds the testing process. In the remote chance that it is possible to fully test and certify the system pre-release, which is theoretically an undecidable problem, it is near impossible to predict the future behaviours that these systems, alone or collectively, will exhibit. While it may be feasible to severely restrict such systems\textquoteright \ learning abilities to limit the potential unpredictability of their behaviours, an undesirable consequence may be severely limiting their utility. In this paper, we propose the architecture for a watchdog AI (WAI) agent dedicated to lifelong functional testing of SAS. We further propose system specifications including a level of abstraction whereby humans shepherd a swarm of WAI agents to oversee an ecosystem made of humans and SAS. The discussion extends to the challenges, pros, and cons of the proposed concept.