AITopics | Xu, Can

Collaborating Authors

Xu, Can

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Multimodal Dialogue Response Generation

Sun, Qingfeng, Wang, Yujing, Xu, Can, Zheng, Kai, Yang, Yaming, Hu, Huang, Xu, Fei, Zhang, Jessica, Geng, Xiubo, Jiang, Daxin

arXiv.org Artificial IntelligenceOct-16-2021

Responsing with image has been recognized as an important capability for an intelligent conversational agent. Yet existing works only focus on exploring the multimodal dialogue models which depend on retrieval-based methods, but neglecting generation methods. To fill in the gaps, we first present a multimodal dialogue generation model, which takes the dialogue history as input, then generates a textual sequence or an image as response. Learning such a model often requires multimodal dialogues containing both texts and images which are difficult to obtain. Motivated by the challenge in practice, we consider multimodal dialogue generation under a natural assumption that only limited training examples are available. In such a low-resource setting, we devise a novel conversational agent, Divter, in order to isolate parameters that depend on multimodal dialogues from the entire generation model. By this means, the major part of the model can be learned from a large number of text-only dialogues and text-image pairs respectively, then the whole parameters can be well fitted using the limited training examples. Extensive experiments demonstrate our method achieves state-of-the-art results in both automatic and human evaluation, and can generate informative text and high-resolution image responses.

artificial intelligence, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2110.08515

Country:

Europe (1.00)
Asia (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)

Add feedback

Learning Neural Templates for Recommender Dialogue System

Liang, Zujie, Hu, Huang, Xu, Can, Miao, Jian, He, Yingying, Chen, Yining, Geng, Xiubo, Liang, Fan, Jiang, Daxin

arXiv.org Artificial IntelligenceSep-25-2021

Though recent end-to-end neural models have shown promising progress on Conversational Recommender System (CRS), two key challenges still remain. First, the recommended items cannot be always incorporated into the generated replies precisely and appropriately. Second, only the items mentioned in the training corpus have a chance to be recommended in the conversation. To tackle these challenges, we introduce a novel framework called NTRD for recommender dialogue system that decouples the dialogue generation from the item recommendation. NTRD has two key components, i.e., response template generator and item selector. The former adopts an encoder-decoder model to generate a response template with slot locations tied to target items, while the latter fills in slot locations with the proper items using a sufficient attention mechanism. Our approach combines the strengths of both classical slot filling approaches (that are generally controllable) and modern neural NLG approaches (that are generally more natural and accurate). Extensive experiments on the benchmark ReDial show our NTRD significantly outperforms the previous state-of-the-art methods. Besides, our approach has the unique advantage to produce novel items that do not appear in the training set of dialogue corpus. The code is available at \url{https://github.com/jokieleung/NTRD}.

neural network, recommendation, survey article, (21 more...)

arXiv.org Artificial Intelligence

2109.12302

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.84)

Industry:

Media > Film (1.00)
Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Maria: A Visual Experience Powered Conversational Agent

Liang, Zujie, Hu, Huang, Xu, Can, Tao, Chongyang, Geng, Xiubo, Chen, Yining, Liang, Fan, Jiang, Daxin

arXiv.org Artificial IntelligenceMay-27-2021

Arguably, the visual perception of conversational agents to the physical world is a key way for them to exhibit the human-like intelligence. Image-grounded conversation is thus proposed to address this challenge. Existing works focus on exploring the multimodal dialog models that ground the conversation on a given image. In this paper, we take a step further to study image-grounded conversation under a fully open-ended setting where no paired dialog and image are assumed available. Specifically, we present Maria, a neural conversation agent powered by the visual world experiences which are retrieved from a large-scale image index. Maria consists of three flexible components, i.e., text-to-image retriever, visual concept detector and visual-knowledge-grounded response generator. The retriever aims to retrieve a correlated image to the dialog from an image index, while the visual concept detector extracts rich visual knowledge from the image. Then, the response generator is grounded on the extracted visual knowledge and dialog context to generate the target response. Extensive experiments demonstrate Maria outperforms previous state-of-the-art methods on automatic metrics and human evaluation, and can generate informative responses that have some visual commonsense of the physical world.

chatbot, neural network, proceedings, (20 more...)

arXiv.org Artificial Intelligence

2105.13073

Country:

Europe (1.00)
Asia (0.68)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.90)

Add feedback

Learning Matching Representations for Individualized Organ Transplantation Allocation

Xu, Can, Alaa, Ahmed M., Bica, Ioana, Ershoff, Brent D., Cannesson, Maxime, van der Schaar, Mihaela

arXiv.org Machine LearningFeb-1-2021

Organ transplantation is often the last resort for treating end-stage illness, but the probability of a successful transplantation depends greatly on compatibility between donors and recipients. Current medical practice relies on coarse rules for donor-recipient matching, but is short of domain knowledge regarding the complex factors underlying organ compatibility. In this paper, we formulate the problem of learning data-driven rules for organ matching using observational data for organ allocations and transplant outcomes. This problem departs from the standard supervised learning setup in that it involves matching the two feature spaces (i.e., donors and recipients), and requires estimating transplant outcomes under counterfactual matches not observed in the data. To address these problems, we propose a model based on representation learning to predict donor-recipient compatibility; our model learns representations that cluster donor features, and applies donor-invariant transformations to recipient features to predict outcomes for a given donor-recipient feature instance. Experiments on semi-synthetic and real-world datasets show that our model outperforms state-of-art allocation methods and policies executed by human experts.

health & medicine, neural network, representation, (16 more...)

arXiv.org Machine Learning

2101.11769

Country:

North America > United States (0.46)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Surgery > Transplant Surgery (0.52)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Large Margin Discriminant Dimensionality Reduction in Prediction Space

Saberian, Mohammad, Pereira, Jose Costa, Xu, Can, Yang, Jian, Nvasconcelos, Nuno

Neural Information Processing SystemsFeb-14-2020, 08:58:49 GMT

In this paper we establish a duality between boosting and SVM, and use this to derive a novel discriminant dimensionality reduction algorithm. In particular, using the multiclass formulation of boosting and SVM we note that both use a combination of mapping and linear classification to maximize the multiclass margin. In SVM this is implemented using a pre-defined mapping (induced by the kernel) and optimizing the linear classifiers. In boosting the linear classifiers are pre-defined and the mapping (predictor) is learned through combination of weak learners. We argue that the intermediate mapping, e.g.

artificial intelligence, dimensionality reduction, machine learning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.70)

Add feedback

Towards Neural Mixture Recommender for Long Range Dependent User Sequences

Tang, Jiaxi, Belletti, Francois, Jain, Sagar, Chen, Minmin, Beutel, Alex, Xu, Can, Chi, Ed H.

arXiv.org Machine LearningFeb-22-2019

Understanding temporal dynamics has proved to be highly valuable for accurate recommendation. Sequential recommenders have been successful in modeling the dynamics of users and items over time. However, while different model architectures excel at capturing various temporal ranges or dynamics, distinct application contexts require adapting to diverse behaviors. In this paper we examine how to build a model that can make use of different temporal ranges and dynamics depending on the request context. We begin with the analysis of an anonymized Youtube dataset comprising millions of user sequences. We quantify the degree of long-range dependence in these sequences and demonstrate that both short-term and long-term dependent behavioral patterns co-exist. We then propose a neural Multi-temporal-range Mixture Model (M3) as a tailored solution to deal with both short-term and long-term dependencies. Our approach employs a mixture of models, each with a different temporal range. These models are combined by a learned gating mechanism capable of exerting different model combinations given different contextual information. In empirical evaluations on a public dataset and our own anonymized YouTube dataset, M3 consistently outperforms state-of-the-art sequential recommendation methods.

deep learning, neural network, sequence, (21 more...)

arXiv.org Machine Learning

doi: 10.1145/3308558.3313650

1902.08588

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications > Social Media (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.95)
(2 more...)

Add feedback

Playing 20 Question Game with Policy-Based Reinforcement Learning

Hu, Huang, Wu, Xianchao, Luo, Bingfeng, Tao, Chongyang, Xu, Can, Wu, Wei, Chen, Zhan

arXiv.org Artificial IntelligenceAug-26-2018

The 20 Questions (Q20) game is a well known game which encourages deductive reasoning and creativity. In the game, the answerer first thinks of an object such as a famous person or a kind of animal. Then the questioner tries to guess the object by asking 20 questions. In a Q20 game system, the user is considered as the answerer while the system itself acts as the questioner which requires a good strategy of question selection to figure out the correct object and win the game. However, the optimal policy of question selection is hard to be derived due to the complexity and volatility of the game environment. In this paper, we propose a novel policy-based Reinforcement Learning (RL) method, which enables the questioner agent to learn the optimal policy of question selection through continuous interactions with users. To facilitate training, we also propose to use a reward network to estimate the more informative reward. Compared to previous methods, our RL method is robust to noisy answers and does not rely on the Knowledge Base of objects. Experimental results show that our RL method clearly outperforms an entropy-based engineering system and has competitive performance in a noisy-free simulation environment.

agent, computer game, neural network, (21 more...)

arXiv.org Artificial Intelligence

1808.07645

Country:

Asia (0.68)
North America > United States (0.28)

Genre: Research Report (0.70)

Industry: Leisure & Entertainment > Games > Computer Games (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Towards Explainable and Controllable Open Domain Dialogue Generation with Dialogue Acts

Xu, Can, Wu, Wei, Wu, Yu

arXiv.org Artificial IntelligenceJul-19-2018

We study open domain dialogue generation with dialogue acts designed to explain how people engage in social chat. To imitate human behavior, we propose managing the flow of human-machine interactions with the dialogue acts as policies. The policies and response generation are jointly learned from human-human conversations, and the former is further optimized with a reinforcement learning approach. With the dialogue acts, we achieve significant improvement over state-of-the-art methods on response quality for given contexts and dialogue length in both machine-machine simulation and human-machine conversation.

deep learning, dialogue act, neural network, (19 more...)

arXiv.org Artificial Intelligence

1807.07255

Country:

Asia > China (0.68)
North America > United States > Arizona (0.14)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Add feedback

Neural Response Generation With Dynamic Vocabularies

Wu, Yu (Beihang University) | Wu, Wei (Microsoft Research) | Yang, Dejian (Beihang University) | Xu, Can (Microsoft Research) | Li, Zhoujun (Beihang University)

AAAI ConferencesFeb-8-2018

We study response generation for open domain conversation in chatbots. Existing methods assume that words in responses are generated from an identical vocabulary regardless of their inputs, which not only makes them vulnerable to generic patterns and irrelevant noise, but also causes a high cost in decoding. We propose a dynamic vocabulary sequence-to-sequence (DVS2S) model which allows each input to possess their own vocabulary in decoding. In training, vocabulary construction and response generation are jointly learned by maximizing a lower bound of the true objective with a Monte Carlo sampling method. In inference, the model dynamically allocates a small vocabulary for an input with the word prediction model, and conducts decoding only with the small vocabulary. Because of the dynamic vocabulary mechanism, DVS2S eludes many generic patterns and irrelevant words in generation, and enjoys efficient decoding at the same time. Experimental results on both automatic metrics and human annotations show that DVS2S can significantly outperform state-of-the-art methods in terms of response quality, but only requires 60% decoding time compared to the most efficient baseline.

deep learning, neural network, response generation, (23 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.15)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Knowledge Enhanced Hybrid Neural Network for Text Matching

Wu, Yu (Beihang University) | Wu, Wei (Microsoft Research) | Xu, Can (Microsoft Research) | Li, Zhoujun (Beihang University)

AAAI ConferencesFeb-8-2018

Long text brings a big challenge to neural network based text matching approaches due to their complicated structures. To tackle the challenge, we propose a knowledge enhanced hybrid neural network (KEHNN) that leverages prior knowledge to identify useful information and filter out noise in long text and performs matching from multiple perspectives. The model fuses prior knowledge into word representations by knowledge gates and establishes three matching channels with words, sequential structures of text given by Gated Recurrent Units (GRUs), and knowledge enhanced representations. The three channels are processed by a convolutional neural network to generate high level features for matching, and the features are synthesized as a matching score by a multilayer perceptron. In this paper, we focus on exploring the use of taxonomy knowledge for text matching. Evaluation results from extensive experiments on public data sets of question answering and conversation show that KEHNN can significantly outperform state-of-the-art matching models and particularly improve matching accuracy on pairs with long text.

deep learning, knowledge, neural network, (19 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

Asia (0.47)
North America > United States > Colorado (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback