AITopics | He, Xiaodong

Collaborating Authors

He, Xiaodong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion

Shang, Chao, Tang, Yun, Huang, Jing, Bi, Jinbo, He, Xiaodong, Zhou, Bowen

arXiv.org Artificial IntelligenceNov-14-2018

Knowledge graph embedding has been an active research topic for knowledge base completion, with progressive improvement from the initial TransE, TransH, DistMult et al to the current state-of-the-art ConvE. ConvE uses 2D convolution over embeddings and multiple layers of nonlinear features to model knowledge graphs. The model can be efficiently trained and scalable to large knowledge graphs. However, there is no structure enforcement in the embedding space of ConvE. The recent graph convolutional network (GCN) provides another way of learning graph node embedding by successfully utilizing graph connectivity structure. In this work, we propose a novel end-to-end Structure-Aware Convolutional Network (SACN) that takes the benefit of GCN and ConvE together. SACN consists of an encoder of a weighted graph convolutional network (WGCN), and a decoder of a convolutional network called Conv-TransE. WGCN utilizes knowledge graph node structure, node attributes and edge relation types. It has learnable weights that adapt the amount of information from neighbors used in local aggregation, leading to more accurate embeddings of graph nodes. Node attributes in the graph are represented as additional nodes in the WGCN. The decoder Conv-TransE enables the state-of-the-art ConvE to be translational between entities and relations while keeps the same link prediction performance as ConvE. We demonstrate the effectiveness of the proposed SACN on standard FB15k-237 and WN18RR datasets, and it gives about 10% relative improvement over the state-of-the-art ConvE in terms of HITS@1, HITS@3 and HITS@10.

expert system, neural network, node, (17 more...)

arXiv.org Artificial Intelligence

1811.04441

Country: North America > United States > Connecticut > Tolland County > Storrs (0.14)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.62)

Add feedback

Attentive Tensor Product Learning

Huang, Qiuyuan, Deng, Li, Wu, Dapeng, Liu, Chang, He, Xiaodong

arXiv.org Artificial IntelligenceNov-1-2018

This paper proposes a new architecture - Attentive Tensor Product Learning (ATPL) - to represent grammatical structures in deep learning models. ATPL is a new architecture to bridge this gap by exploiting Tensor Product Representations (TPR), a structured neural-symbolic model developed in cognitive science, aiming to integrate deep learning with explicit language structures and rules. The key ideas of ATPL are: 1) unsupervised learning of role-unbinding vectors of words via TPR-based deep neural network; 2) employing attention modules to compute TPR; and 3) integration of TPR with typical deep learning architectures including Long Short-Term Memory (LSTM) and Feedforward Neural Network (FFNN). The novelty of our approach lies in its ability to extract the grammatical structure of a sentence by using role-unbinding vectors, which are obtained in an unsupervised manner. This ATPL approach is applied to 1) image captioning, 2) part of speech (POS) tagging, and 3) constituency parsing of a sentence. Experimental results demonstrate the effectiveness of the proposed approach.

architecture, deep learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

1802.07089

Country:

North America > United States > California (0.28)
North America > United States > Florida > Alachua County > Gainesville (0.14)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation

Huang, Qiuyuan, Gan, Zhe, Celikyilmaz, Asli, Wu, Dapeng, Wang, Jianfeng, He, Xiaodong

arXiv.org Artificial IntelligenceSep-7-2018

We propose a hierarchically structured reinforcement learning approach to address the challenges of planning for generating coherentmulti-sentence stories for the visual storytelling task. Within our framework, the task of generating a story given a sequence of images is divided across a two-level hierarchical decoder.The high-level decoder constructs a plan by generating a semantic concept (i.e., topic) for each image in sequence. The low-level decoder generates a sentence for each image using a semantic compositional network, which effectively grounds the sentence generation conditioned on the topic. The two decoders are jointly trained end-to-end using reinforcement learning. We evaluate our model on the visual storytelling (VIST) dataset. Empirical results from both automatic and human evaluations demonstrate that the proposed hierarchicallystructured reinforced training achieves significantly better performance compared to a strong flat deep reinforcement learning baseline.

deep learning, great time, neural network, (18 more...)

arXiv.org Artificial Intelligence

1805.08191

Country:

North America > United States (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Add feedback

Stacked Cross Attention for Image-Text Matching

Lee, Kuang-Huei, Chen, Xi, Hua, Gang, Hu, Houdong, He, Xiaodong

arXiv.org Artificial IntelligenceMar-21-2018

In this paper, we study the problem of image-text matching. Inferring the latent semantic alignment between objects or other salient stuffs (e.g. snow, sky, lawn) and the corresponding words in sentences allows to capture fine-grained interplay between vision and language, and makes image-text matching more interpretable. Prior works either simply aggregate the similarity of all possible pairs of regions and words without attending differentially to more and less important words or regions, or use a multi-step attentional process to capture limited number of semantic alignments which is less interpretable. In this paper, we present Stacked Cross Attention to discover the full latent alignments using both image regions and words in sentence as context and infer the image-text similarity. Our approach achieves the state-of-the-art results on the MS-COCO and Flickr30K datasets. On Flickr30K, our approach outperforms the current best methods by 22.1% in text retrieval from image query, and 18.2% in image retrieval with text query (based on Recall@1). On MS-COCO, our approach improves sentence retrieval by 17.8% and image retrieval by 16.6% (based on Recall@1 using the 5K test set).

deep learning, image region, neural network, (19 more...)

arXiv.org Artificial Intelligence

1803.08024

Genre: Research Report (0.50)

Industry:

Transportation > Ground > Road (0.46)
Leisure & Entertainment > Sports > Baseball (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

On the Discrimination-Generalization Tradeoff in GANs

Zhang, Pengchuan, Liu, Qiang, Zhou, Dengyong, Xu, Tao, He, Xiaodong

arXiv.org Machine LearningFeb-23-2018

Generative adversarial training can be generally understood as minimizing certain moment matching loss defined by a set of discriminator functions, typically neural networks. The discriminator set should be large enough to be able to uniquely identify the true distribution (discriminative), and also be small enough to go beyond memorizing samples (generalizable). In this paper, we show that a discriminator set is guaranteed to be discriminative whenever its linear span is dense in the set of bounded continuous functions. This is a very mild condition satisfied even by neural networks with a single neuron. Further, we develop generalization bounds between the learned distribution and true distribution under different evaluation metrics. When evaluated with neural distance, our bounds show that generalization is guaranteed as long as the discriminator set is small enough, regardless of the size of the generator or hypothesis set. When evaluated with KL divergence, our bound provides an explanation on the counter-intuitive behaviors of testing likelihood in GAN training. Our analysis sheds lights on understanding the practical performance of GANs.

artificial intelligence, discriminator, neural network, (15 more...)

arXiv.org Machine Learning

1711.02771

Country: North America > United States > Virginia (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Add feedback

Question-Answering with Grammatically-Interpretable Representations

Palangi, Hamid (Microsoft Research AI) | Smolensky, Paul (Microsoft Research AI, Johns Hopkins University) | He, Xiaodong (Microsoft Research AI) | Deng, Li (Citadel)

AAAI ConferencesFeb-8-2018

We introduce an architecture, the Tensor Product RecurrentNetwork (TPRN). In our application of TPRN, internal representations—learned by end-to-end optimization in a deep neural network performing a textual question-answering(QA) task—can be interpreted using basic concepts from linguistic theory. No performance penalty need be paid for this increased interpretability: the proposed model performs comparably to a state-of-the-art system on the SQuAD QA task.The internal representation which is interpreted is a Tensor Product Representation: for each input word, the model selects a symbol to encode the word, and a role in which to place the symbol, and binds the two together. The selection is via soft attention. The overall interpretation is built from interpretations of the symbols, as recruited by the trained model, and interpretations of the roles as used by the model. We find support for our initial hypothesis that symbols can be interpreted as lexical-semantic word meanings, while roles can be interpreted as approximations of grammatical roles (or categories)such as subject, wh-word, determiner, etc. Fine-grained analysis reveals specific correspondences between the learned roles and parts of speech as assigned by a standard tagger(Toutanova et al. 2003), and finds several discrepancies in the model’s favor. In this sense, the model learns significant aspectsof grammar, after having been exposed solely to linguistically unannotated text, questions, and answers: no prior linguistic knowledge is given to the model. What is given is the means to build representations using symbols and roles, with an inductive bias favoring use of these in an approximately discrete manner.

deep learning, neural network, representation, (20 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

Europe (1.00)
North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Adversarial Ranking for Language Generation

Lin, Kevin, Li, Dianqi, He, Xiaodong, Zhang, Zhengyou, Sun, Ming-ting

Neural Information Processing SystemsDec-31-2017

Generative adversarial networks (GANs) have great successes on synthesizing data. However, the existing GANs restrict the discriminator to be a binary classifier, and thus limit their learning capacity for tasks that need to synthesize output with rich structures such as natural language descriptions. In this paper, we propose a novel generative adversarial network, RankGAN, for generating high-quality language descriptions. Rather than training the discriminator to learn and assign absolute binary predicate for individual data sample, the proposed RankGAN is able to analyze and rank a collection of human-written and machine-written sentences by giving a reference group. By viewing a set of data samples collectively and evaluating their quality through relative ranking scores, the discriminator is able to make better assessment which in turn helps to learn a better generator. The proposed RankGAN is optimized through the policy gradient technique. Experimental results on multiple public datasets clearly demonstrate the effectiveness of the proposed approach.

deep learning, generator, neural network, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Deep Reinforcement Learning with a Natural Language Action Space

He, Ji, Chen, Jianshu, He, Xiaodong, Gao, Jianfeng, Li, Lihong, Deng, Li, Ostendorf, Mari

arXiv.org Artificial IntelligenceJun-8-2016

This paper introduces a novel architecture for reinforcement learning with deep neural networks designed to handle state and action spaces characterized by natural language, as found in text-based games. Termed a deep reinforcement relevance network (DRRN), the architecture represents action and state spaces with separate embedding vectors, which are combined with an interaction function to approximate the Q-function in reinforcement learning. We evaluate the DRRN on two popular text games, showing superior performance over other deep Q-learning architectures. Experiments with paraphrased action descriptions show that the model is extracting meaning rather than simply memorizing strings of text.

computer game, deep learning, drrn, (20 more...)

arXiv.org Artificial Intelligence

1511.04636

Country: North America > United States > Washington > King County > Seattle (0.14)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture

Chen, Jianshu, He, Ji, Shen, Yelong, Xiao, Lin, He, Xiaodong, Gao, Jianfeng, Song, Xinying, Deng, Li

Neural Information Processing SystemsDec-31-2015

We develop a fully discriminative learning approach for supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i.e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document. Different from traditional variational learning or Gibbs sampling approaches, the proposed learning method applies (i) the mirror descent algorithm for maximum a posterior inference and (ii) back propagation over a deep architecture together with stochastic gradient/mirror descent for model parameter estimation, leading to scalable and end-to-end discriminative learning of the model. As a byproduct, we also apply this technique to develop a new learning method for the traditional unsupervised LDA model (i.e., BP-LDA). Experimental results on three real-world regression and classification tasks show that the proposed methods significantly outperform the previous supervised topic models, neural networks, and is on par with deep neural networks.

deep learning, neural network, topic model, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Washington > King County > Seattle (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Language Models for Image Captioning: The Quirks and What Works

Devlin, Jacob, Cheng, Hao, Fang, Hao, Gupta, Saurabh, Deng, Li, He, Xiaodong, Zweig, Geoffrey, Mitchell, Margaret

arXiv.org Artificial IntelligenceOct-14-2015

Two recent approaches have achieved state-of-the-art results in image captioning. The first uses a pipelined process where a set of candidate words is generated by a convolutional neural network (CNN) trained on images, and then a maximum entropy (ME) language model is used to arrange these words into a coherent sentence. The second uses the penultimate activation layer of the CNN as input to a recurrent neural network (RNN) that then generates the caption sequence. In this paper, we compare the merits of these different language modeling approaches for the first time by using the same state-of-the-art CNN as input. We examine issues in the different approaches, including linguistic irregularities, caption repetition, and data set overlap. By combining key aspects of the ME and RNN methods, we achieve a new record performance over previously published results on the benchmark COCO dataset. However, the gains we see in BLEU do not translate to human judgments.

caption, deep learning, neural network, (17 more...)

arXiv.org Artificial Intelligence

1505.01809

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Add feedback