AITopics

Owing to its unique literary and aesthetic characteristics, automatic generation of Chinese poetry remains challenging in Artificial Intelligence and can hardly be realized straightforwardly by end-to-end methods. In this paper, we propose a novel iterative polishing framework for high-quality Chinese poetry generation. In the first stage, an encoder-decoder structure is used to generate a poem draft. Afterwards, our proposed Quality-Aware Masked Language Model (QA-MLM) is employed to polish the draft towards higher quality in terms of linguistics and literariness. Based on a multi-task learning scheme, QA-MLM is able to determine from the poem draft whether polishing is needed. Furthermore, QA-MLM is able to localize improper characters in the poem draft and substitute them with newly predicted ones. Benefiting from the masked language model structure, QA-MLM incorporates global context information into the polishing process, which yields more appropriate polishing results than unidirectional sequential decoding. Moreover, the iterative polishing process terminates automatically when QA-MLM regards the processed poem as a qualified one. Both human and automatic evaluations have been conducted, and the results demonstrate that our approach effectively improves the performance of the encoder-decoder structure.
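The polishing loop described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `PolishStep` interface, the `max_iters` safety cap, and the `toy_step` stand-in are hypothetical and are not the authors' QA-MLM, which is a trained neural model.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: given the current poem, return None when no
# polishing is needed, otherwise (position, replacement_character).
PolishStep = Callable[[List[str]], Optional[Tuple[int, str]]]


def iterative_polish(draft: List[str], step: PolishStep, max_iters: int = 10) -> List[str]:
    """Repeatedly localize and replace one character until the step says stop."""
    poem = list(draft)
    for _ in range(max_iters):  # max_iters is an assumed safety cap
        proposal = step(poem)
        if proposal is None:  # the model regards the poem as qualified
            break
        position, new_char = proposal
        poem[position] = new_char  # substitute the localized character
    return poem


def toy_step(poem: List[str]) -> Optional[Tuple[int, str]]:
    """Stand-in for the model: flags the first '?' and proposes a fixed character."""
    for i, ch in enumerate(poem):
        if ch == "?":
            return i, "月"
    return None


polished = iterative_polish(list("床前明?光"), toy_step)
print("".join(polished))
```

In a real system the step function would be replaced by a model that scores the draft, picks the position to mask, and predicts the replacement from the full bidirectional context.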

poem, poem line, qa-mlm, (14 more...)

1911.13182

Country: Asia > China (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Shen, William, Trevizan, Felipe, Thiébaux, Sylvie

Learning Domain-Independent Planning Heuristics with Hypergraph Networks

We present the first approach capable of learning domain-independent planning heuristics entirely from scratch. The heuristics we learn map the hypergraph representation of the delete-relaxation of the planning problem at hand, to a cost estimate that approximates that of the least-cost path from the current state to the goal through the hypergraph. We generalise Graph Networks to obtain a new framework for learning over hypergraphs, which we specialise to learn planning heuristics by training over state/value pairs obtained from optimal cost plans. Our experiments show that the resulting architecture, STRIPS-HGNs, is capable of learning heuristics that are competitive with existing delete-relaxation heuristics including LM-cut. We show that the heuristics we learn are able to generalise across different problems and domains, including to domains that were not seen during training.
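To make the hypergraph view of the delete-relaxation concrete, here is a small sketch in which each action is a hyperedge from its preconditions to its add effects. It computes the classical hand-crafted h_max estimate by fixed-point iteration; it is not the learned STRIPS-HGN heuristic, and the `Hyperedge` class and the toy delivery problem are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Hyperedge:
    """One action of the delete-relaxed problem (delete effects are ignored)."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    cost: float = 1.0


def h_max(state: FrozenSet[str], goal: FrozenSet[str], edges: List[Hyperedge]) -> float:
    """Cost of the most expensive goal proposition in the relaxed problem."""
    cost: Dict[str, float] = {p: 0.0 for p in state}
    changed = True
    while changed:  # iterate until no proposition cost improves
        changed = False
        for e in edges:
            if all(p in cost for p in e.preconditions):
                reach = e.cost + max((cost[p] for p in e.preconditions), default=0.0)
                for q in e.add_effects:
                    if reach < cost.get(q, math.inf):
                        cost[q] = reach
                        changed = True
    return max((cost.get(g, math.inf) for g in goal), default=0.0)


edges = [
    Hyperedge("pick", frozenset({"at_a", "hand_empty"}), frozenset({"holding"})),
    Hyperedge("move", frozenset({"at_a"}), frozenset({"at_b"})),
    Hyperedge("drop", frozenset({"holding", "at_b"}), frozenset({"delivered"})),
]
state = frozenset({"at_a", "hand_empty"})
h = h_max(state, frozenset({"delivered"}), edges)
print(h)  # 2.0 for this toy problem
```

A learned heuristic would take the same hypergraph as input but replace the fixed update rule with trained message-passing functions.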

hyperedge, hypergraph, strips-hgn, (15 more...)

1911.13101

Country:

North America > Canada > Alberta (0.14)
Europe > Germany > Saarland (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Risi, Sebastian, Togelius, Julian

Procedural Content Generation: From Automatically Generating Game Levels to Increasing Generality in Machine Learning

The idea behind procedural content generation (PCG) in games is to create content automatically, using algorithms, instead of relying on user-designed content. While PCG approaches have traditionally focused on creating content for video games, they are now being applied to all kinds of virtual environments, thereby enabling training of machine learning systems that are significantly more general. For example, PCG's ability to generate never-ending streams of new levels has allowed DeepMind's Capture the Flag agent to reach beyond human-level performance. Additionally, PCG-inspired methods such as domain randomization enabled OpenAI's robot arm to learn to manipulate objects with unprecedented dexterity. Level generation in 2D arcade games has also illuminated some shortcomings of standard deep RL methods, suggesting potential ways to train more general policies. This Review looks at key aspects of PCG approaches, including their ability to (1) enable new video games (such as No Man's Sky), (2) create open-ended learning environments, (3) combat overfitting in supervised and reinforcement learning tasks, and (4) create better benchmarks that could ultimately spur the development of better learning algorithms. We hope this article can introduce the broader machine learning community to PCG, which we believe will be a critical tool in creating a more general machine intelligence.
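As a rough illustration of how PCG can supply a never-ending stream of training levels, the sketch below generates small grid levels from integer seeds. The grid format, the `wall_prob` parameter, and both function names are assumptions for illustration; they are not taken from any system mentioned in the article.

```python
import random
from typing import Iterator, List


def generate_level(seed: int, width: int = 12, height: int = 6,
                   wall_prob: float = 0.25) -> List[str]:
    """Build one level: '#' is a wall, '.' is free space. Same seed, same level."""
    rng = random.Random(seed)  # private generator keeps levels reproducible
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            border = x in (0, width - 1) or y in (0, height - 1)
            row += "#" if border or rng.random() < wall_prob else "."
        rows.append(row)
    return rows


def level_stream(start_seed: int = 0) -> Iterator[List[str]]:
    """Endless supply of levels, one per consecutive seed."""
    seed = start_seed
    while True:
        yield generate_level(seed)
        seed += 1


stream = level_stream()
first = next(stream)
print("\n".join(first))
```

Training on a fresh level per episode, rather than a fixed set, is one simple way to discourage a policy from memorizing particular layouts.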

agent, arxiv preprint arxiv, domain randomization, (11 more...)

1911.13071

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre:

Research Report (0.50)
Overview (0.48)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Distributed Soft Actor-Critic with Multivariate Reward Representation and Knowledge Distillation

Akimov, Dmitry

In this paper, we describe the physics-based environment of the NeurIPS 2019 Learning to Move - Walk Around challenge and present our solution to this competition, which scored 1303.727 mean reward points and took 3rd place. Our method combines recent advances from both continuous- and discrete-action space reinforcement learning, such as Soft Actor-Critic and Recurrent Experience Replay in Distributed Reinforcement Learning. We trained our agent in two stages: to move somewhere in the first stage and to follow the target velocity field in the second stage. We also introduce a novel Q-function split technique, which we believe facilitates the task of training an agent, allows pretraining the critic and reusing it for solving harder problems, and mitigates reward shaping design efforts.

agent, multivariate reward representation, target velocity, (11 more...)

1911.13056

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Lopes, Manuel, Melo, Francisco

Class Teaching for Inverse Reinforcement Learners

In this paper we propose the first machine teaching algorithm for multiple inverse reinforcement learners. Specifically, our contributions are: (i) we formally introduce the problem of teaching a sequential task to a heterogeneous group of learners; (ii) we identify conditions under which it is possible to conduct such teaching using the same demonstration for all learners; and (iii) we propose and evaluate a simple algorithm that computes a demonstration(s) ensuring that all agents in a heterogeneous class learn a task description that is compatible with the target task. Our analysis shows that, contrary to other teaching problems, teaching a heterogeneous class with a single demonstration may not be possible as the differences between agents increase. We also showcase the advantages of our proposed machine teaching approach against several possible alternatives.

demonstration, learner, teaching, (16 more...)

1911.13009

Country: Europe > Portugal (0.04)

Genre: Research Report (0.64)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.47)
Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Gambardella, Andrew, Baydin, Atılım Güneş, Torr, Philip H. S.

Transflow Learning: Repurposing Flow Models Without Retraining

It is well known that deep generative models have a rich latent space, and that it is possible to smoothly manipulate their outputs by traversing this latent space. Recently, architectures have emerged that allow for more complex manipulations, such as making an image look as though it were from a different class, or painted in a certain style. These methods typically require large amounts of training in order to learn a single class of manipulations. We present Transflow Learning, a method for transforming a pre-trained generative model so that its outputs more closely resemble data that we provide afterwards. In contrast to previous methods, Transflow Learning does not require any training at all, and instead warps the probability distribution from which we sample latent vectors using Bayesian inference. Transflow Learning can be used to solve a wide variety of tasks, such as neural style transfer and few-shot classification.
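One simple way to picture "warping the latent distribution with Bayesian inference" is a conjugate Gaussian update on the mean of the latent prior. The sketch below assumes a standard normal prior on each latent dimension, a known observation variance, and latent codes already obtained by passing the provided data through the (invertible) flow; these assumptions and the function name are illustrative and may differ from the method in the paper.

```python
import math
import random
from typing import List, Tuple


def posterior_latent(latents: List[List[float]], obs_var: float = 1.0) -> Tuple[List[float], float]:
    """Posterior N(mean, var) of the latent mean under an N(0, 1) prior per dimension."""
    n = len(latents)
    dim = len(latents[0])
    post_var = 1.0 / (1.0 + n / obs_var)  # posterior precision = prior + data precision
    post_mean = [post_var * sum(z[d] for z in latents) / obs_var for d in range(dim)]
    return post_mean, post_var


# Two hypothetical latent codes of user-provided examples.
codes = [[2.0, -1.0], [4.0, -3.0]]
mean, var = posterior_latent(codes)

# Sample a new latent vector from the warped distribution; a real pipeline
# would then decode it with the unchanged, pre-trained flow model.
rng = random.Random(0)
sample = [rng.gauss(m, math.sqrt(var)) for m in mean]
print(mean, var)
```

No model weights change in this picture: only the distribution that latent vectors are drawn from is shifted towards the provided data.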

flow model, generative model, transflow learning, (15 more...)

1911.1327

Country:

North America > United States (0.15)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > France (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.90)
Information Technology > Artificial Intelligence > Vision (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Awasthi, Pranjal, Chatziafratis, Vaggos, Chen, Xue, Vijayaraghavan, Aravindan

Adversarially Robust Low Dimensional Representations

Adversarial or test time robustness measures the susceptibility of a machine learning system to small perturbations made to the input at test time. This has attracted much interest on the empirical side, since many existing ML systems perform poorly under imperceptible adversarial perturbations to the test inputs. On the other hand, our theoretical understanding of this phenomenon is limited, and has mostly focused on supervised learning tasks. In this work we study the problem of computing adversarially robust representations of data. We formulate a natural extension of Principal Component Analysis (PCA) where the goal is to find a low dimensional subspace to represent the given data with minimum projection error, and that is in addition robust to small perturbations measured in $\ell_q$ norm (say $q=\infty$). Unlike PCA which is solvable in polynomial time, our formulation is computationally intractable to optimize as it captures the well-studied sparse PCA objective. We show the following algorithmic and statistical results.
- Polynomial time algorithms in the worst-case that achieve constant factor approximations to the objective while only violating the robustness constraint by a constant factor.
- We prove that our formulation (and algorithms) also enjoy significant statistical benefits in terms of sample complexity over standard PCA on account of a "regularization effect", that is formalized using the well-studied spiked covariance model.
- Surprisingly, we show that our algorithmic techniques can also be made robust to corruptions in the training data, in addition to yielding representations that are robust at test time! Here an adversary is allowed to corrupt potentially every data point up to a specified amount in the $\ell_q$ norm.
We further apply these techniques for mean estimation and clustering under adversarial corruptions to the training data.

algorithm, anull 2, matrix, (15 more...)

1911.13268

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Provoost, Jesper, Wismans, Luc, Van der Drift, Sander, Kamilaris, Andreas, Van Keulen, Maurice

Short Term Prediction of Parking Area states Using Real Time Data and Machine Learning Techniques

Public road authorities and private mobility service providers need information derived from the current and predicted traffic states to act upon the daily urban system and its spatial and temporal dynamics. In this research, a real-time parking area state (occupancy, in- and outflux) prediction model (up to 60 minutes ahead) has been developed using publicly available historical and real-time data sources. Based on a case study in a real-life scenario in the city of Arnhem, a Neural Network-based approach outperforms a Random Forest-based one on all assessed performance measures, although the differences are small. Both outperform a naive seasonal random walk model. Although the performance degrades with increasing prediction horizon, the model shows a performance gain of over 150% at a prediction horizon of 60 minutes compared with the naive model. Furthermore, it is shown that predicting the in- and outflux is a far more difficult task (i.e. performance gains of 30%), which needs more training data that is not based exclusively on occupancy rate. However, the performance of predicting in- and outflux is less sensitive to the prediction horizon. In addition, it is shown that real-time information on the current occupancy rate is the independent variable with the highest contribution to the performance, although time, traffic flow and weather variables also deliver a significant contribution. During real-time deployment, the model performs three times better than the naive model on average. As a result, it can provide valuable information for proactive traffic management as well as mobility service providers.
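The naive seasonal random walk baseline mentioned in this abstract can be sketched as follows: the forecast for a future time step is simply the value observed one season earlier. The season length, the toy occupancy series, and the function names are assumptions for illustration; the abstract does not state which season length or error measure the authors used.

```python
from typing import List


def seasonal_naive(history: List[float], horizon: int, season: int) -> float:
    """Forecast the value `horizon` steps ahead as the value one season before it."""
    if not 1 <= horizon <= season:
        raise ValueError("horizon must be between 1 and the season length")
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    # The target sits at index len(history) - 1 + horizon; step back one season.
    return history[len(history) - 1 + horizon - season]


def mean_absolute_error(predictions: List[float], actuals: List[float]) -> float:
    """Average absolute difference between forecasts and observed values."""
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(actuals)


# Toy occupancy counts with an assumed season of 4 time steps.
occupancy = [10.0, 20.0, 30.0, 40.0, 12.0, 22.0, 32.0, 42.0]
one_ahead = seasonal_naive(occupancy, horizon=1, season=4)
two_ahead = seasonal_naive(occupancy, horizon=2, season=4)
print(one_ahead, two_ahead)  # 12.0 22.0
```

Any learned model can then be compared against this baseline by computing the same error measure for both on held-out data.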

occupancy rate, prediction, provoost, (15 more...)

1911.13178

Country:

Europe > Netherlands > Gelderland > Arnhem (0.25)
Europe > Germany (0.04)
Africa > Nigeria > Plateau State (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Moran, Sean, Slabaugh, Gregory

DIFAR: Deep Image Formation and Retouching

Figure caption: Given a poorly exposed image (a), DIFAR (c) produces an image with pleasing contrast and colour better matching the groundtruth (d) compared to the state-of-the-art DeepUPE model [42] (b).
Abstract: We present a novel neural network architecture for the image signal processing (ISP) pipeline. In a camera system, the ISP is a critical component that forms a high quality RGB image from RAW camera sensor data. Typical ISP pipelines sequentially apply a complex set of traditional image processing modules, such as demosaicing, denoising, tone mapping, etc. We introduce a new deep network that replaces all these modules, dubbed Deep Image Formation And Retouching (DIFAR). DIFAR introduces a multi-scale context-aware pixel-level block for local de-noising/demosaicing operations and a retouching block for global refinement of image colour, luminance and saturation. DIFAR can also be trained for RGB to RGB image enhancement. DIFAR is parameter-efficient and outperforms recently proposed deep learning approaches in both objective and perceptual metrics, setting new state-of-the-art performance on multiple datasets including Samsung S7 [38] and MIT-Adobe 5k [6].
1. Introduction: Image quality is of fundamental importance in any imaging system, including DSLR and smartphone cameras. At the imaging sensor, RAW data is normally captured on a color filter array (such as the well-known Bayer pattern) where at each pixel, only a red, green, or blue color is available. This mosaiced RAW data suffers from noise, vignetting, lack of white balance, and many other defects and additionally has a high dynamic range. The camera's image signal processing (ISP) pipeline is responsible for forming a high quality RGB image with minimal noise, pleasing colors, sharp detail, and good contrast from the degraded RAW data.
In most cases, the ISP is realised as a modular sequence of traditional image signal processing algorithms (Figure 2) each responsible for a single well-defined image operation (e.g.

dataset, dif ar, difar, (14 more...)

1911.13175

Country: North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report (0.82)

Industry: Semiconductors & Electronics (0.36)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

VIABLE: Fast Adaptation via Backpropagating Learned Loss

Feng, Leo, Zintgraf, Luisa, Peng, Bei, Whiteson, Shimon

In few-shot learning, typically, the loss function which is applied at test time is the one we are ultimately interested in minimising, such as the mean-squared-error loss for a regression problem. However, given that we have few samples at test time, we argue that the loss function that we are interested in minimising is not necessarily the loss function most suitable for computing gradients in a few-shot setting. We propose VIABLE, a generic meta-learning extension that builds on existing meta-gradient-based methods by learning a differentiable loss function, replacing the pre-defined inner-loop loss function in performing task-specific updates. We show that learning a loss function capable of leveraging relational information between samples reduces underfitting, and significantly improves performance and sample efficiency on a simple regression task. Furthermore, we show VIABLE is scalable by evaluating on the Mini-Imagenet dataset.

loss function, loss network, viable, (12 more...)

1911.13159

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
North America > Canada (0.04)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.51)