Duke University
Low-Power Image Recognition Challenge
Lu, Yung-Hsiang (Purdue University) | Berg, Alexander C. (University of North Carolina at Chapel Hill) | Chen, Yiran (Duke University)
Energy is limited in mobile systems, so for image recognition on mobile devices to become a viable opportunity, energy usage must be conservative. The Low-Power Image Recognition Challenge (LPIRC) is the only competition integrating image recognition with low power. LPIRC has been held annually since 2015 as an on-site competition. To encourage innovation, LPIRC has no restriction on hardware or software platforms: the only requirement is that a solution be able to use HTTP to communicate with the referee system to retrieve images and report answers. Each team has 10 minutes to recognize the objects in 5,000 (year 2015) or 20,000 (years 2016 and 2017) images.
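The referee interaction described above can be sketched as a small HTTP client. The endpoint paths and the answer format below are illustrative assumptions, not the official LPIRC protocol.

```python
# Hypothetical sketch of an LPIRC client; the referee address, endpoint
# paths, and answer format are assumptions for illustration only.
import urllib.parse

REFEREE = "http://referee.example.org:8080"  # hypothetical referee address

def image_url(image_id):
    """Build the URL to fetch one test image from the referee (assumed endpoint)."""
    return f"{REFEREE}/image?" + urllib.parse.urlencode({"id": image_id})

def encode_answer(image_id, detections):
    """Encode detections as form data for an HTTP POST (assumed format).

    detections: list of (class_label, confidence, x1, y1, x2, y2) tuples.
    """
    lines = [
        f"{image_id} {label} {conf:.3f} {x1} {y1} {x2} {y2}"
        for label, conf, x1, y1, x2, y2 in detections
    ]
    return urllib.parse.urlencode({"answer": "\n".join(lines)}).encode()

# A real client would loop over images for at most 10 minutes:
#   data = urllib.request.urlopen(image_url(i)).read()          # fetch image
#   dets = recognize(data)                                      # team's recognizer
#   urllib.request.urlopen(REFEREE + "/result", encode_answer(i, dets))
```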
Finite Sample Analyses for TD(0) With Function Approximation
Dalal, Gal (Technion, Israel Institute of Technology) | Szörényi, Balázs (Technion, Israel Institute of Technology) | Thoppe, Gugan (Duke University) | Mannor, Shie (Technion, Israel Institute of Technology)
TD(0) is one of the most commonly used algorithms in reinforcement learning. Despite this, there is no existing finite sample analysis for TD(0) with function approximation, even for the linear case. Our work is the first to provide such results. Existing convergence rates for Temporal Difference (TD) methods apply only to somewhat modified versions, e.g., projected variants or ones where stepsizes depend on unknown problem parameters. Our analyses obviate these artificial alterations by exploiting strong properties of TD(0). We provide convergence rates both in expectation and with high-probability. The two are obtained via different approaches that use relatively unknown, recently developed stochastic approximation techniques.
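For readers unfamiliar with the algorithm being analyzed, a minimal TD(0) update with linear function approximation looks as follows. The toy 2-state chain, features, and step size are illustrative choices, not the paper's setting.

```python
# Minimal sketch of TD(0) with linear function approximation on a toy
# 2-state Markov chain with uniform random transitions.
import random

def td0_linear(n_steps=20000, gamma=0.9, alpha=0.01, seed=0):
    rng = random.Random(seed)
    phi = {0: 1.0, 1: 2.0}       # one-dimensional features; V(s) = theta * phi(s)
    reward = {0: 0.0, 1: 1.0}
    theta = 0.0
    s = 0
    for _ in range(n_steps):
        s_next = rng.randrange(2)  # uniform random transition
        r = reward[s]
        # TD(0) update: theta += alpha * (r + gamma*V(s') - V(s)) * phi(s)
        delta = r + gamma * theta * phi[s_next] - theta * phi[s]
        theta += alpha * delta * phi[s]
        s = s_next
    return theta
```

Running `td0_linear()` drives `theta` toward the TD fixed point of this chain (roughly 2.1 for these features), illustrating the unmodified iterate — no projection, no problem-dependent step sizes — that the paper's analysis covers.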
Deep Learning for Case-Based Reasoning Through Prototypes: A Neural Network That Explains Its Predictions
Li, Oscar (Duke University) | Liu, Hao (Nanjing University) | Chen, Chaofan (Duke University) | Rudin, Cynthia (Duke University)
Deep neural networks are widely used for classification. These deep models often suffer from a lack of interpretability---they are particularly difficult to understand because of their non-linear nature. As a result, neural networks are often treated as "black box" models, and in the past, have been trained purely to optimize the accuracy of predictions. In this work, we create a novel network architecture for deep learning that naturally explains its own reasoning for each prediction. This architecture contains an autoencoder and a special prototype layer, where each unit of that layer stores a weight vector that resembles an encoded training input. The encoder of the autoencoder allows us to do comparisons within the latent space, while the decoder allows us to visualize the learned prototypes. The training objective has four terms: an accuracy term, a term that encourages every prototype to be similar to at least one encoded input, a term that encourages every encoded input to be close to at least one prototype, and a term that encourages faithful reconstruction by the autoencoder. The distances computed in the prototype layer are used as part of the classification process. Since the prototypes are learned during training, the learned network naturally comes with explanations for each prediction, and the explanations are loyal to what the network actually computes.
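The two prototype-interpretability terms of the four-term objective can be sketched directly from the description above. The squared-distance choice and the weighting coefficients are illustrative assumptions.

```python
# Illustrative sketch of the four-term objective described above, with plain
# Python lists standing in for encoded inputs and prototype vectors.
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def prototype_regularizers(encoded_inputs, prototypes):
    """r1: every prototype should be close to at least one encoded input;
       r2: every encoded input should be close to at least one prototype."""
    r1 = sum(min(sq_dist(p, z) for z in encoded_inputs)
             for p in prototypes) / len(prototypes)
    r2 = sum(min(sq_dist(z, p) for p in prototypes)
             for z in encoded_inputs) / len(encoded_inputs)
    return r1, r2

def total_loss(accuracy_term, recon_term, encoded_inputs, prototypes,
               lam1=0.05, lam2=0.05, lam_recon=0.05):
    # accuracy + reconstruction + the two prototype terms (weights are made up)
    r1, r2 = prototype_regularizers(encoded_inputs, prototypes)
    return accuracy_term + lam_recon * recon_term + lam1 * r1 + lam2 * r2
```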
Deconvolutional Latent-Variable Model for Text Sequence Matching
Shen, Dinghan (Duke University) | Zhang, Yizhe (Duke University) | Henao, Ricardo (Duke University) | Su, Qinliang (Duke University) | Carin, Lawrence (Duke University)
A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives. To alleviate typical optimization challenges in latent-variable models for text, we employ deconvolutional networks as the sequence decoder (generator), providing learned latent codes with more semantic information and better generalization. Our model, trained in an unsupervised manner, yields stronger empirical predictive performance than a decoder based on Long Short-Term Memory (LSTM), with fewer parameters and considerably faster training. Further, we apply it to text sequence-matching problems. The proposed model significantly outperforms several strong sentence-encoding baselines, especially in the semi-supervised setting.
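The core operation a deconvolutional decoder stacks is a transposed convolution that expands a short latent code into a longer sequence. A minimal 1-D version, with made-up sizes and weights, is:

```python
# Minimal 1-D transposed ("deconvolutional") operation: each latent position
# scatters a scaled copy of the kernel into the (longer) output sequence.
def conv1d_transpose(latent, kernel, stride=2):
    """latent, kernel: lists of floats; returns the expanded sequence."""
    out_len = (len(latent) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, z in enumerate(latent):
        for j, w in enumerate(kernel):
            out[i * stride + j] += z * w
    return out
```

Stacking a few such layers turns a fixed-length latent code into a full-length token-feature sequence without the step-by-step recurrence of an LSTM decoder, which is one reason for the faster training noted above.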
Learning Conditional Generative Models for Temporal Point Processes
Xiao, Shuai (Shanghai Jiao Tong University) | Xu, Hongteng (Duke University) | Yan, Junchi (Shanghai Jiao Tong University) | Farajtabar, Mehrdad (Georgia Institute of Technology) | Yang, Xiaokang (Shanghai Jiao Tong University) | Song, Le (Georgia Institute of Technology) | Zha, Hongyuan (Georgia Institute of Technology)
The ability to look into the future is a challenging but alluring task. People want to estimate the occurrence probability of events they are interested in so that they can take preemptive action. For example, after reviewing the admission history of patients, doctors may warn patients who are at high risk of certain diseases. With access to the working experience of job seekers, headhunters can evaluate one's future career path and recommend a suitable position at the proper time. In these cases, historical observations always provide important guidance for predicting future events: not only the order of events but also the time spans between them contain useful information. Our learning method is based on the following two facts. On one hand, the MLE loss (KL divergence) requires strict matching between two probability distributions and yields an unbiased estimation of parameters, but it is sensitive to sample noise and outliers. On the other hand, unlike the MLE loss, which does not consider how close two samples are but only their relative probability, the Wasserstein distance is sensitive to the underlying geometric structure of the samples but has biased gradients (Bellemare et al. 2017). To take advantage of the strengths of these two methods and mitigate bias exposure in long-term prediction, our method incorporates the Wasserstein distance alongside MLE: both the KL divergence and the Wasserstein distance between generated and real samples are minimized jointly.
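The joint objective the abstract describes can be sketched concretely. For equal-size one-dimensional empirical samples (e.g., event times), the Wasserstein-1 distance reduces to the mean absolute difference of sorted values; the weighting `lam` below is an illustrative assumption.

```python
# Sketch of jointly minimizing an MLE (negative log-likelihood) term and a
# Wasserstein term between generated and real event-time samples.
def wasserstein_1d(gen_times, real_times):
    """Wasserstein-1 distance between equal-size 1-D empirical samples."""
    g, r = sorted(gen_times), sorted(real_times)
    return sum(abs(a - b) for a, b in zip(g, r)) / len(g)

def joint_loss(nll, gen_times, real_times, lam=1.0):
    # total objective: KL/MLE term plus lambda times the Wasserstein term
    return nll + lam * wasserstein_1d(gen_times, real_times)
```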
Coalition Manipulation of Gale-Shapley Algorithm
Shen, Weiran (Tsinghua University) | Tang, Pingzhong (Tsinghua University) | Deng, Yuan (Duke University)
It is well-known that the Gale-Shapley algorithm is not truthful for all agents. Previous studies in this category concentrate on manipulations using incomplete preference lists by a single woman and by the set of all women. Little is known about manipulations by a subset of women. In this paper, we consider manipulations by any subset of women with arbitrary preferences. We show that a strong Nash equilibrium of the induced manipulation game always exists among the manipulators, and that the equilibrium outcome is unique and Pareto-dominant. In addition, the set of matchings achievable by manipulations has a lattice structure. Finally, we also examine super-strong Nash equilibria.
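For reference, the men-proposing Gale-Shapley algorithm whose manipulation is studied above can be written compactly; preference lists here are small made-up examples.

```python
# Men-proposing Gale-Shapley deferred acceptance.
# men_prefs / women_prefs: dicts mapping each agent to an ordered list of
# preferred partners (most preferred first).
def gale_shapley(men_prefs, women_prefs):
    free = list(men_prefs)                  # men not yet matched
    next_choice = {m: 0 for m in men_prefs} # next woman each man will propose to
    engaged = {}                            # woman -> man
    rank = {w: {m: i for i, m in enumerate(p)} for w, p in women_prefs.items()}
    while free:
        m = free.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if w not in engaged:
            engaged[w] = m                  # w accepts her first proposal
        elif rank[w][m] < rank[w][engaged[w]]:
            free.append(engaged[w])         # w trades up; old partner is freed
            engaged[w] = m
        else:
            free.append(m)                  # w rejects m; he proposes again later
    return engaged
```

This produces the man-optimal stable matching, which is exactly why women (the receiving side) have an incentive to misreport their preferences.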
On the Distortion of Voting With Multiple Representative Candidates
Cheng, Yu (Duke University) | Dughmi, Shaddin (University of Southern California) | Kempe, David (University of Southern California)
We study positional voting rules when candidates and voters are embedded in a common metric space, and cardinal preferences are naturally given by distances in the metric space. In a positional voting rule, each candidate receives a score from each ballot based on the ballot's rank order; the candidate with the highest total score wins the election. The cost of a candidate is his sum of distances to all voters, and the distortion of an election is the ratio between the cost of the elected candidate and the cost of the optimum candidate. We consider the case when candidates are representative of the population, in the sense that they are drawn i.i.d. from the population of the voters, and analyze the expected distortion of positional voting rules. Our main result is a clean and tight characterization of positional voting rules that have constant expected distortion (independent of the number of candidates and the metric space). Our characterization result immediately implies constant expected distortion for Borda Count and elections in which each voter approves a constant fraction of all candidates. On the other hand, we obtain super-constant expected distortion for Plurality, Veto, and approving a constant number of candidates. These results contrast with previous results on voting with metric preferences: When the candidates are chosen adversarially, all of the preceding voting rules have distortion linear in the number of candidates or voters. Thus, the model of representative candidates allows us to distinguish voting rules which seem equally bad in the worst case.
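The notions of positional scoring and distortion can be made concrete on a toy one-dimensional metric space; the positions below are made-up numbers, not data from the paper.

```python
# Toy illustration: Borda count and distortion when voters and candidates
# are points on a line and preferences come from distances.
def borda_winner(voters, candidates):
    m = len(candidates)
    scores = [0] * m
    for v in voters:
        # rank candidates by distance to this voter (closest first)
        order = sorted(range(m), key=lambda c: abs(candidates[c] - v))
        for pos, c in enumerate(order):
            scores[c] += m - 1 - pos   # Borda: m-1 points for the top choice, ...
    return max(range(m), key=lambda c: scores[c])

def distortion(voters, candidates, winner):
    cost = lambda c: sum(abs(candidates[c] - v) for v in voters)
    best = min(cost(c) for c in range(len(candidates)))
    return cost(winner) / best
```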
When Will You Arrive? Estimating Travel Time Based on Deep Neural Networks
Wang, Dong (Duke University) | Zhang, Junbo (Microsoft Research) | Cao, Wei (Tsinghua University, Institute for Interdisciplinary Information Sciences) | Li, Jian (Tsinghua University, Institute for Interdisciplinary Information Sciences) | Zheng, Yu (Microsoft Research)
Estimating the travel time of any path (denoted by a sequence of connected road segments) in a city is of great importance to traffic monitoring, route planning, ridesharing, taxi/Uber dispatching, etc. However, it is a very challenging problem, affected by diverse complex factors, including spatial correlations, temporal dependencies, and external conditions (e.g., weather, traffic lights). Prior work usually focuses on estimating the travel times of individual road segments or sub-paths and then summing up these times, which leads to an inaccurate estimation because such approaches do not consider road intersections/traffic lights, and local errors may accumulate. To address these issues, we propose an end-to-end Deep learning framework for Travel Time Estimation, called DeepTTE, that estimates the travel time of the whole path directly. More specifically, we present a geo-convolution operation that integrates geographic information into the classical convolution, making it capable of capturing spatial correlations. By stacking recurrent units on the geo-convolution layer, DeepTTE can capture the temporal dependencies simultaneously. A multi-task learning component is placed on top of DeepTTE to estimate the travel time of both the entire path and each local path simultaneously during the training phase. Extensive experiments on two large-scale datasets show that DeepTTE significantly outperforms the state-of-the-art methods.
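The geo-convolution idea, mapping raw GPS points to features and then convolving over the point sequence, can be sketched as follows. The distance-based feature map and the fixed kernel are stand-ins for the paper's learned parameters.

```python
# Hedged sketch of a "geo-convolution": each raw GPS point is first mapped
# to a geographic feature, then a 1-D convolution runs over the sequence.
import math

def geo_features(path):
    """path: list of (lat, lng) pairs. Feature = distance to the previous
    point in km, via a rough equirectangular approximation."""
    feats = [0.0]
    for (la1, lo1), (la2, lo2) in zip(path, path[1:]):
        dx = (lo2 - lo1) * 111.32 * math.cos(math.radians((la1 + la2) / 2))
        dy = (la2 - la1) * 111.32
        feats.append(math.hypot(dx, dy))
    return feats

def conv1d(seq, kernel):
    """Valid 1-D convolution (no padding) over the feature sequence."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]
```

In the full model the convolved feature sequence would feed the stacked recurrent units; here the two pieces just show how geography enters before the convolution.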
Disarmament Games With Resource
Deng, Yuan (Duke University) | Conitzer, Vincent (Duke University)
A paper by Deng and Conitzer in AAAI'17 introduces disarmament games, in which players alternatingly commit not to play certain pure strategies. However, in practice, disarmament usually does not consist in removing a strategy, but rather in removing a resource (and doing so simultaneously rules out all the strategies in which that resource is used). In this paper, we introduce a model of disarmament games in which resources, rather than strategies, are removed. We prove NP-completeness of several formulations of the problem of achieving desirable outcomes via disarmament. We then study the case where resources can be fractionally removed, and prove a result analogous to the folk theorem that all desirable outcomes can be achieved. We show that we can approximately achieve any desirable outcome in a polynomial number of rounds, though determining whether a given outcome can be obtained in a given number of rounds remains NP-complete.
Zero-Shot Learning via Class-Conditioned Deep Generative Models
Wang, Wenlin (Duke University) | Pu, Yunchen (Duke University) | Verma, Vinay Kumar (IIT Kanpur) | Fan, Kai (Duke University) | Zhang, Yizhe (Duke University) | Chen, Changyou (SUNY at Buffalo) | Rai, Piyush (IIT Kanpur) | Carin, Lawrence (Duke University)
We present a deep generative model for Zero-Shot Learning (ZSL). Unlike most existing methods for this problem, which represent each class as a point (via a semantic embedding), we represent each seen/unseen class using a class-specific latent-space distribution, conditioned on class attributes. We use these latent-space distributions as a prior for a supervised variational autoencoder (VAE), which also facilitates learning highly discriminative feature representations for the inputs. The entire framework is learned end-to-end using only the seen-class training data. At test time, the label for an unseen-class test input is the class that maximizes the VAE lower bound. We further extend the model to (i) a semi-supervised/transductive setting by leveraging unlabeled unseen-class data via an unsupervised learning module, and (ii) few-shot learning, where we also have a small number of labeled inputs from the unseen classes. We compare our model with several state-of-the-art methods through a comprehensive set of experiments on a variety of benchmark data sets.
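The test-time rule above, scoring an input against each class-conditioned latent distribution and picking the best, can be sketched with diagonal Gaussians standing in for the VAE lower bound. The class means and variances below are illustrative, not learned values.

```python
# Sketch of the test-time classification rule: pick the class whose
# class-specific (Gaussian) distribution assigns the input the highest
# log-likelihood, a stand-in for maximizing the VAE lower bound per class.
import math

def gaussian_logpdf(x, mean, var):
    """Log-density of x under an isotropic Gaussian with the given mean/variance."""
    return sum(-0.5 * (math.log(2 * math.pi * var) + (xi - mi) ** 2 / var)
               for xi, mi in zip(x, mean))

def classify(x, class_priors):
    """class_priors: {label: (mean_vector, variance)}; returns the argmax label."""
    return max(class_priors,
               key=lambda c: gaussian_logpdf(x, *class_priors[c]))
```

Because unseen-class distributions are generated from class attributes, the same rule applies to classes with no training examples at all.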