Learning to Navigate the Web – Arxiv Vanity

#artificialintelligence

Environments with large state and action spaces and sparse rewards can hinder a Reinforcement Learning (RL) agent's learning through trial and error. For instance, following natural language instructions on the Web (such as booking a flight ticket) leads to RL settings where the input vocabulary and the number of actionable elements on a page can grow very large. Even though recent approaches improve the success rate on relatively simple environments with the help of human demonstrations to guide the exploration, they still fail in environments where the set of possible instructions can reach millions. We approach these problems from a different perspective and propose guided RL approaches that can generate an unbounded amount of experience for an agent to learn from. Instead of learning from a complicated instruction with a large vocabulary, we decompose it into multiple sub-instructions and schedule a curriculum in which an agent is tasked with a gradually increasing subset of these relatively easier sub-instructions.
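
The curriculum idea in this excerpt can be illustrated with a small sketch. It assumes an instruction is a flat mapping of fields to values and that a hypothetical oracle pre-fills every field not assigned to the agent; the names `split_instruction` and `curriculum`, the example fields, and the one-field-per-stage schedule are all illustrative assumptions, not the paper's implementation.

```python
import random


def split_instruction(instruction, num_agent_fields, rng):
    """Split an instruction into oracle-filled fields and agent sub-instructions.

    Assumption: each field of the instruction is one sub-instruction.
    """
    fields = sorted(instruction)
    k = max(0, min(num_agent_fields, len(fields)))
    agent_fields = set(rng.sample(fields, k))
    prefilled = {f: v for f, v in instruction.items() if f not in agent_fields}
    remaining = {f: v for f, v in instruction.items() if f in agent_fields}
    return prefilled, remaining


def curriculum(instruction, episodes_per_stage, seed=0):
    """Yield (stage, (prefilled, remaining)) with a growing agent workload.

    Stage k leaves k sub-instructions to the agent, so the task becomes
    gradually harder until the agent handles the full instruction.
    """
    rng = random.Random(seed)
    for k in range(1, len(instruction) + 1):
        for _ in range(episodes_per_stage):
            yield k, split_instruction(instruction, k, rng)


if __name__ == "__main__":
    # Hypothetical flight-booking instruction.
    booking = {"from": "SFO", "to": "JFK", "date": "2019-03-01"}
    for stage, (prefilled, remaining) in curriculum(booking, episodes_per_stage=1):
        print(stage, sorted(prefilled), sorted(remaining))
```

In the final stage nothing is pre-filled, so the agent faces the original instruction.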


Deep Learning on Graphs: A Survey – Arxiv Vanity

#artificialintelligence

For CNNs, convolution is the most fundamental operation. However, standard convolution for images or text cannot be directly applied to graphs because of the lack of a grid structure [6]. Bruna et al. [33] first introduced convolution for graph data from the spectral domain using the graph Laplacian matrix L [54], which plays a role similar to that of the Fourier basis in signal processing [6]. The idea of Eq. (6) is similar to conventional convolutions: passing the input signals through a set of learnable filters to aggregate the information, followed by some non-linear transformation. By using the node features F^V as the input layer and stacking multiple convolutional layers, the overall architecture is similar to CNNs. Theoretical analysis shows that this definition of the convolution operation on graphs can mimic certain geometric properties of CNNs; we refer readers to [7] for a comprehensive survey.
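
As a rough illustration of the spectral view described here, the sketch below filters node signals in the eigenbasis of the graph Laplacian and then applies a non-linearity. Eq. (6) is not reproduced in this excerpt, so the details are assumptions: the unnormalized Laplacian L = D - A, a single diagonal filter `theta` shared across all feature channels, and ReLU as the non-linearity. The function name `spectral_graph_conv` is hypothetical.

```python
import numpy as np


def spectral_graph_conv(adjacency, features, theta):
    """One spectral graph convolution layer (illustrative sketch).

    adjacency: (n, n) symmetric adjacency matrix
    features:  (n, c) node signals, one column per channel
    theta:     (n,) filter, one coefficient per Laplacian eigenvalue
    """
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency              # assumed: unnormalized L = D - A
    _, eigvecs = np.linalg.eigh(laplacian)      # eigenvalues in ascending order
    spectral = eigvecs.T @ features             # graph Fourier transform
    filtered = theta[:, None] * spectral        # apply the diagonal filter
    output = eigvecs @ filtered                 # inverse transform
    return np.maximum(output, 0.0)              # ReLU non-linearity


if __name__ == "__main__":
    # Path graph on four nodes with a single feature channel.
    A = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    X = np.array([[1.0], [2.0], [3.0], [4.0]])
    print(spectral_graph_conv(A, X, np.ones(4)))
```

With an all-ones filter the layer reduces to ReLU applied to the input; keeping only the coefficient of the smallest eigenvalue acts as a low-pass filter that, on a connected graph, replaces every node signal by the graph-wide mean.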


Local Linear Forests – Arxiv Vanity

#artificialintelligence

In order to address this weakness, we take the perspective of random forests as an adaptive kernel method. This interpretation follows work by Athey et al. (2018), Hothorn et al. (2004), and Meinshausen (2006), and complements the traditional view of forests as an ensemble method (i.e., an average of predictions made by individual trees). These types of adjustments are particularly important near boundaries, where neighborhoods are asymmetric by necessity, but with many covariates, the adjustments are also important away from boundaries given that local neighborhoods are often unbalanced due to sampling variation. The goal of this paper is to improve the accuracy of forests on smooth signals using regression adjustments, potentially in many dimensions. By using the local regression adjustment, it is possible to adjust for asymmetries and imbalances in the set of nearby points used for prediction, ensuring that the weighted average of the feature vector of neighboring points is approximately equal to the target feature vector, and that predictions are centered.
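
To make the boundary effect concrete, the following sketch compares a plain kernel-weighted average with a local linear adjustment for a single covariate. It is a minimal illustration under stated assumptions: the weights are supplied directly rather than derived from a trained forest, the ridge penalty applies only to the slope, and the function names are hypothetical.

```python
def weighted_average(ys, weights):
    """Kernel-style prediction: weighted mean of the neighbors' outcomes."""
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def local_linear_predict(xs, ys, weights, x0, ridge=0.0):
    """Intercept of a weighted (ridge) regression of y on (x - x0).

    Minimizes sum_i w_i * (y_i - mu - theta * (x_i - x0))**2 + ridge * theta**2
    and returns mu, the adjusted prediction at x0.
    """
    sw = swd = swdd = swy = swdy = 0.0
    for x, y, w in zip(xs, ys, weights):
        d = x - x0
        sw += w
        swd += w * d
        swdd += w * d * d
        swy += w * y
        swdy += w * d * y
    if sw == 0.0:
        raise ValueError("weights must not sum to zero")
    det = sw * (swdd + ridge) - swd * swd
    if abs(det) < 1e-12:
        # Degenerate neighborhood: fall back to the plain weighted average.
        return swy / sw
    return (swy * (swdd + ridge) - swd * swdy) / det


if __name__ == "__main__":
    # Target point on the boundary: every neighbor lies to its right.
    xs = [0.0, 0.1, 0.2, 0.3]
    ys = [1.0 + 2.0 * x for x in xs]   # smooth signal y = 1 + 2x
    weights = [0.25] * 4               # assumed, forest-like weights
    print(weighted_average(ys, weights))
    print(local_linear_predict(xs, ys, weights, x0=0.0))
```

Because the neighborhood is one-sided, the weighted average is pulled toward the interior of the data, while the regression adjustment recenters the prediction at the target point.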


GAN Q-learning – Arxiv Vanity

#artificialintelligence

Up to now, deep learning methods in RL used multiple function approximators (typically a network with shared hidden layers) to fit a state value or state-action value distribution. For instance, Bootstrapped DQN [bootstrappedDQN] used k heads on the state-action value function Q for every available action and used them to model a distribution. In [bayesianpol], a Bayesian framework was applied to the actor-critic architecture by fitting a Gaussian Process (GP) instead of the critic, hence allowing for a closed-form derivation of update rules. More recently, [bellemare2017distributional] introduced a distributional algorithm, C51, which aimed to solve the RL problem by learning a categorical probability vector over returns Q. Unlike [GANRL], which uses a generative network to learn the underlying transition model of the environment, we utilize a generative network to model the distribution approximation of the Bellman updates.


Sentence-State LSTM for Text Representation – Arxiv Vanity

#artificialintelligence

Hyperparameters: Table 2 shows the development results of various S-LSTM settings, where Time refers to training time per epoch. Adding one additional sentence-level node as described in Section 3.2 does not lead to accuracy improvements, although the number of parameters and decoding time increase accordingly. As a result, we use only one sentence-level node for the remaining experiments. The accuracy of S-LSTM increases as the hidden layer size for each node increases from 100 to 300, but does not increase further when the size grows beyond 300. We fix the hidden size to 300 accordingly.


Machine Learning in Compiler Optimisation – Arxiv Vanity

#artificialintelligence

EAs are useful for exploring a large optimisation space where it is infeasible to simply enumerate all possible solutions. This is because an EA can often converge to the most promising area in the optimisation space more quickly than a general search heuristic. The EA is also shown to be faster than a dynamic programming based search [24] in finding the optimal transformation for the Fast Fourier Transform (FFT) [102]. When compared to supervised learning, EAs have the advantage of requiring little problem-specific knowledge, and hence can be applied to a broad range of problems. However, because an EA typically relies on empirical evidence (e.g.


Failure Prediction for Autonomous Driving – Arxiv Vanity

#artificialintelligence

Failure prediction is more useful the earlier it can be done, i.e. the more time the human driver is given to take over. By learning to predict g[t, t+m], our model will alert the human driver if either the speed prediction or the steering angle prediction is going to fail at any of the time points in the time period [t, t+m]. The learning goal then becomes training a deep network model to predict the driving actions for the current time t and the drivability score for the time period from t to a future time point t+m. A different length can be used if the application requires it. Please see Figure 2 for the illustrative flowchart of the training procedure and solution space of our driving model and the failure prediction model.


Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks – Arxiv Vanity

#artificialintelligence

We use the addition of two vectors of 4-bit numbers to explain how addition works in the SRAM. The two words that are going to be added together have to be placed on the same bit line. The vectors A and B should be aligned in the array as shown in Figure 5. Vector A occupies the first 4 rows of the SRAM array and vector B the next 4 rows. Another 4 empty rows of storage are reserved for the results. There is a row of latches inside the column peripheral for the carry storage.
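
The layout described above can be mimicked in software to see how a bit-serial addition proceeds. The sketch below is a functional model only, under stated assumptions: bits are stored least-significant first (row 0 holds bit 0), one bit position is processed per step, and the final carry stays in the latches instead of being written to an extra row. The function names are hypothetical and the model says nothing about the actual circuit timing.

```python
def transpose_store(values, bits):
    """Lay out values so each column (bit line) holds one word.

    Row i holds bit i of every word, least-significant bit first (assumed).
    """
    return [[(v >> i) & 1 for v in values] for i in range(bits)]


def bit_serial_add(vec_a, vec_b, bits=4):
    """Add two vectors element-wise, one bit position per step.

    Returns the low `bits` bits of each sum and the carry left in the latches.
    """
    n = len(vec_a)
    # Rows 0..bits-1: vector A, next `bits` rows: vector B, last `bits` rows: results.
    array = (transpose_store(vec_a, bits)
             + transpose_store(vec_b, bits)
             + [[0] * n for _ in range(bits)])
    carry = [0] * n  # one carry latch per bit line
    for i in range(bits):
        row_a, row_b = array[i], array[bits + i]
        # Full adder applied to every bit line in parallel.
        sum_row = [a ^ b ^ c for a, b, c in zip(row_a, row_b, carry)]
        carry = [(a & b) | (c & (a ^ b)) for a, b, c in zip(row_a, row_b, carry)]
        array[2 * bits + i] = sum_row
    results = [sum(array[2 * bits + i][col] << i for i in range(bits))
               for col in range(n)]
    return results, carry


if __name__ == "__main__":
    sums, carries = bit_serial_add([3, 15, 7, 0], [5, 1, 8, 0])
    print(sums, carries)
```

Every bit line computes its own sum in the same step, so the number of steps depends on the word width rather than on the number of elements in the vectors.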


Dialog-based Interactive Image Retrieval – Arxiv Vanity

#artificialintelligence

In this work, we propose a new approach to interactive image search by introducing a novel form of user feedback based on natural language. The proposed approach, which we call dialog-based interactive retrieval, enables users to express directly in natural language the most prominent visual attributes of the image they have in mind, improving image search results and allowing for a more natural human-computer interaction. We formulate the task as a reinforcement learning (RL) problem, and train a dialog system that takes natural language responses as user input, and produces retrieved images as output. We train this system by directly optimizing the rank of the target image, which is a non-differentiable objective. To avoid the cumbersome, inefficient, and costly process of collecting and annotating human-machine dialogs as the system learns, we utilize a model-based reinforcement learning approach by training a user simulator based on human-written relative descriptions.


Learning to Sketch with Shortcut Cycle Consistency – Arxiv Vanity

#artificialintelligence

In order to achieve photo-to-sketch synthesis with noisy photo-sketch pairs as supervision, we address the limitations of existing cross-domain image translation models by proposing a novel framework based on multi-task supervised and unsupervised hybrid learning (see Figure 2(c)). Taking an encoder-decoder architecture, our primary task is D(E(photo)) → sketch, where a photo is first encoded by E and then decoded into a sketch by D. To help learn a better encoder and decoder, we introduce the inverse problem (D(E(sketch)) → photo) so that the supervised model learning can be done in both directions. Importantly, we also introduce two unsupervised learning tasks for within-domain reconstruction, i.e., D(E(photo)) → photo and D(E(sketch)) → sketch. This hybrid learning framework differs significantly from existing approaches in that: (1) It combines supervised and unsupervised learning in a multi-task learning framework in order to make the best use of the noisy supervision signal. In particular, by sharing the encoder and decoder in various tasks, a more robust and effective encoder and decoder for the main photo-to-sketch synthesis task can be obtained.