AITopics

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

AAAI Conferences, Feb-8-2018

Lateral Inhibition-Inspired Convolutional Neural Network for Visual Attention and Saliency Detection

Cao, Chunshui (University of Science and Technology of China) | Huang, Yongzhen (Institute of Automation, Chinese Academy of Sciences) | Wang, Zilei (University of Science and Technology of China) | Wang, Liang (Institute of Automation, Chinese Academy of Sciences) | Xu, Ninglong (Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Institute of Neuroscience) | Tan, Tieniu (Institute of Automation, Chinese Academy of Sciences)

Lateral inhibition in top-down feedback is widespread in visual neurobiology, but this important mechanism has not yet been well explored in computer vision. In our recent research, we find that modeling lateral inhibition in a convolutional neural network (LICNN) is very useful for visual attention and saliency detection. In this paper, we propose to formulate lateral inhibition inspired by related studies from neurobiology, and embed it into the top-down gradient computation of a general CNN for classification, i.e., only category-level information is used. After this operation (conducted only once), the network can generate accurate category-specific attention maps. Further, we apply LICNN to weakly-supervised salient object detection. Extensive experimental studies on a set of databases, e.g., ECSSD, HKU-IS, PASCAL-S and DUT-OMRON, demonstrate the great advantage of LICNN, which achieves state-of-the-art performance. It is especially impressive that LICNN, with only category-level supervised information, even outperforms some recent methods with segmentation-level supervised learning.

deep learning, lateral inhibition, neural network

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AAAI Conferences, Feb-8-2018

Privacy-Preserving Policy Iteration for Decentralized POMDPs

Wu, Feng (University of Science and Technology of China) | Zilberstein, Shlomo (University of Massachusetts Amherst) | Chen, Xiaoping (University of Science and Technology of China)

We propose the first privacy-preserving approach to address the privacy issues that arise in multi-agent planning problems modeled as a Dec-POMDP. Our solution is a distributed message-passing algorithm based on trials, where the agents' policies are optimized using the cross-entropy method. In our algorithm, the agents' private information is protected using a public-key homomorphic cryptosystem. We prove the correctness of our algorithm and analyze its complexity in terms of message passing and encryption/decryption operations. Furthermore, we analyze several privacy aspects of our algorithm and show that it can preserve the agent privacy of non-neighbors, model privacy, and decision privacy. Our experimental results on several common Dec-POMDP benchmark problems confirm the effectiveness of our approach.
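The cross-entropy method mentioned in the abstract can be illustrated with a minimal, generic sketch. This is not the authors' distributed algorithm: it has no message passing and no homomorphic encryption, and the function name `cross_entropy_search`, its parameters, and the toy objective are all assumptions made for illustration. It only shows the core loop of sampling candidate policies, scoring them in trials, and refitting the sampling distribution to the best candidates.

```python
import random


def cross_entropy_search(score, n_slots, n_choices, iters=30, samples=60,
                         elite_frac=0.2, smoothing=0.7, seed=0):
    """Generic cross-entropy method over n_slots discrete choices (hypothetical helper)."""
    rng = random.Random(seed)
    # Start from a uniform categorical distribution for every slot.
    probs = [[1.0 / n_choices] * n_choices for _ in range(n_slots)]
    n_elite = max(1, int(samples * elite_frac))
    best, best_score = None, float("-inf")
    for _ in range(iters):
        # Sample candidate policies and score each one (one trial per candidate).
        batch = []
        for _ in range(samples):
            cand = [rng.choices(range(n_choices), weights=p)[0] for p in probs]
            batch.append((score(cand), cand))
        batch.sort(key=lambda item: item[0], reverse=True)
        if batch[0][0] > best_score:
            best_score, best = batch[0]
        # Refit each slot's distribution to the elite samples, with smoothing.
        elites = [cand for _, cand in batch[:n_elite]]
        for i in range(n_slots):
            for a in range(n_choices):
                freq = sum(1 for cand in elites if cand[i] == a) / n_elite
                probs[i][a] = smoothing * freq + (1 - smoothing) * probs[i][a]
    return best, best_score


# Toy objective (an assumption): count slots that agree with a hidden target.
target = [2, 0, 1, 3, 1]


def toy_score(cand):
    return sum(c == t for c, t in zip(cand, target))


best, best_score = cross_entropy_search(toy_score, n_slots=len(target), n_choices=4)
```

In a Dec-POMDP setting each slot would correspond to an action choice in an agent's policy and the score would come from simulated trials; here both are stand-ins.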

agent, artificial intelligence, machine learning

Thirty-Second AAAI Conference on Artificial Intelligence

Country: North America > United States > Massachusetts (0.14)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.95)

Infinitely Many-Armed Bandits with Budget Constraints

Li, Haifang (Institute of Automation, Chinese Academy of Sciences) | Xia, Yingce (University of Science and Technology of China)

We study the infinitely many-armed bandit problem with budget constraints, where the number of arms can be infinite and much larger than the number of possible experiments. The player aims at maximizing his/her total expected reward under a budget constraint B for the cost of pulling arms. We introduce a weak stochastic assumption on the ratio of expected reward to expected cost of a newly pulled arm, which characterizes its probability of being a near-optimal arm. We propose an algorithm named RCB-I for this new problem, in which the player first randomly picks K arms, whose order is sub-linear in terms of B, and then runs the algorithm for the finite-arm setting on the selected arms. Theoretical analysis shows that this simple algorithm enjoys a sub-linear regret in terms of the budget B. We also provide a lower bound for any algorithm under the Bernoulli setting. The regret bound of RCB-I matches the lower bound up to a logarithmic factor. We further extend this algorithm to the any-budget setting (i.e., the budget is unknown in advance) and conduct the corresponding theoretical analysis.
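The two-phase structure described in the abstract (pick K arms at random, with K sub-linear in B, then run a finite-arm algorithm on them) can be sketched roughly as follows. The abstract does not give the exact value of K or name the finite-arm algorithm, so the choice K = ceil(B^0.5), the UCB-style reward-to-cost index, the Bernoulli rewards and costs, and every name in the code are assumptions made for illustration.

```python
import math
import random


def rcb_i_sketch(budget, draw_arm, exponent=0.5, seed=0):
    """Sample K arms, then run a UCB-style budgeted bandit on them (illustrative only)."""
    rng = random.Random(seed)
    # Phase 1: draw K arms from the infinite reservoir; K grows sub-linearly in the budget.
    k = max(1, math.ceil(budget ** exponent))
    arms = [draw_arm(rng) for _ in range(k)]  # each arm is (reward_mean, cost_mean)
    pulls = [0] * k
    reward_sum = [0.0] * k
    cost_sum = [0.0] * k
    total_reward, spent, t = 0.0, 0.0, 0

    def index(i):
        # Optimistic reward divided by pessimistic cost (assumed index form).
        bonus = math.sqrt(2.0 * math.log(t) / pulls[i])
        optimistic_reward = min(reward_sum[i] / pulls[i] + bonus, 1.0)
        pessimistic_cost = max(cost_sum[i] / pulls[i] - bonus, 1e-6)
        return optimistic_reward / pessimistic_cost

    # Phase 2: pull each selected arm once, then follow the index until the budget runs out.
    while True:
        t += 1
        i = t - 1 if t <= k else max(range(k), key=index)
        reward_mean, cost_mean = arms[i]
        reward = 1.0 if rng.random() < reward_mean else 0.0
        cost = 1.0 if rng.random() < cost_mean else 0.0
        if spent + cost > budget:
            break
        pulls[i] += 1
        reward_sum[i] += reward
        cost_sum[i] += cost
        total_reward += reward
        spent += cost
    return {"k": k, "spent": spent, "reward": total_reward, "pulls": pulls}


def draw_arm(rng):
    # Hypothetical reservoir: uniform reward mean, cost mean kept away from zero.
    return rng.random(), 0.2 + 0.8 * rng.random()


result = rcb_i_sketch(budget=200, draw_arm=draw_arm)
```

Keeping the cost mean bounded away from zero in the toy reservoir guarantees that the budget is eventually exhausted and the loop terminates.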

algorithm, artificial intelligence, big data

Thirty-First AAAI Conference on Artificial Intelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.89)
Information Technology > Data Science > Data Mining > Big Data (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Patch Reordering: A Novel Way to Achieve Rotation and Translation Invariance in Convolutional Neural Networks

Shen, Xu (University of Science and Technology of China) | Tian, Xinmei (University of Science and Technology of China) | Sun, Shaoyan (University of Science and Technology of China) | Tao, Dacheng (University of Technology Sydney)

Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance on many visual recognition tasks. However, the combination of convolution and pooling operations only shows invariance to small local location changes of meaningful objects in the input. Sometimes, such networks are trained using data augmentation to encode this invariance into the parameters, which restricts the capacity of the model to learn the content of these objects. A more efficient use of the parameter budget is to encode rotation or translation invariance into the model architecture, which relieves the model of the need to learn it. To enable the model to focus on learning the content of objects rather than their locations, we propose to conduct patch ranking of the feature maps before feeding them into the next layer. When patch ranking is combined with convolution and pooling operations, we obtain consistent representations regardless of the location of meaningful objects in the input. We show that the patch ranking module improves the performance of the CNN on many benchmark tasks, including MNIST digit recognition, large-scale image recognition, and image retrieval.
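The idea of ranking patches of a feature map can be illustrated with a small sketch. The abstract does not state the ranking criterion or the patch layout, so ranking by summed activation ("energy"), using non-overlapping square patches, and the function name `reorder_patches` are assumptions for illustration only.

```python
def reorder_patches(fmap, patch):
    """Sort non-overlapping patch x patch blocks of a 2D map by summed activation."""
    h, w = len(fmap), len(fmap[0])
    assert h % patch == 0 and w % patch == 0, "map must tile evenly into patches"
    cols = w // patch
    blocks = []
    for pr in range(h // patch):
        for pc in range(cols):
            block = [row[pc * patch:(pc + 1) * patch]
                     for row in fmap[pr * patch:(pr + 1) * patch]]
            energy = sum(sum(r) for r in block)
            blocks.append((energy, block))
    # Highest-energy patch first; Python's sort is stable, so ties keep raster order.
    blocks.sort(key=lambda item: item[0], reverse=True)
    out = [[0] * w for _ in range(h)]
    for idx, (_, block) in enumerate(blocks):
        pr, pc = divmod(idx, cols)
        for dy in range(patch):
            out[pr * patch + dy][pc * patch:(pc + 1) * patch] = block[dy]
    return out


# The same 2x2 "object" placed in two different corners of a 4x4 map.
top_left = [[1, 2, 0, 0],
            [3, 4, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]
bottom_right = [[0, 0, 0, 0],
                [0, 0, 0, 0],
                [0, 0, 1, 2],
                [0, 0, 3, 4]]
same = reorder_patches(top_left, 2) == reorder_patches(bottom_right, 2)
```

In this toy case the reordered maps coincide because the object is shifted by a whole number of patches; shifts that split the object across patch boundaries are not handled by this simplified version.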

deep learning, invariance, neural network

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia > China > Anhui Province (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Context-Enriched Neural Network Method for Recognizing Lexical Entailment

Zhang, Kun (University of Science and Technology of China) | Chen, Enhong (University of Science and Technology of China) | Liu, Qi (University of Science and Technology of China) | Liu, Chuanren (Drexel University) | Lv, Guangyi (University of Science and Technology of China)

Recognizing lexical entailment (RLE), i.e., identifying whether one word entails another (for example, fox entails animal), plays an important role in natural language inference. In the literature, automatically recognizing lexical entailment for word pairs relies heavily on words' contextual representations. However, as a "prototype" vector, a single representation cannot reveal multifaceted aspects of the words due to their homonymy and polysemy. In this paper, we propose a supervised Context-Enriched Neural Network (CENN) method for recognizing lexical entailment. To be specific, we first utilize multiple embedding vectors from different contexts to represent the input word pairs. Then, through different combination methods and an attention mechanism, we integrate the different embedding vectors and optimize their weights to predict whether there are entailment relations in word pairs. Moreover, our proposed framework is flexible and open to handling different word contexts and entailment perspectives in the text corpus. Extensive experiments on five datasets show that our approach significantly improves the performance of automatic RLE in comparison with several state-of-the-art methods.

dataset, deep learning, neural network

Thirty-First AAAI Conference on Artificial Intelligence

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Differentiating Between Posed and Spontaneous Expressions with Latent Regression Bayesian Network

Gan, Quan (University of Science and Technology of China) | Nie, Siqi (Rensselaer Polytechnic Institute) | Wang, Shangfei (University of Science and Technology of China) | Ji, Qiang (Rensselaer Polytechnic Institute)

Spatial patterns embedded in human faces are crucial for differentiating posed expressions from spontaneous ones, yet they have not been thoroughly exploited in the literature. To tackle this problem, we present a generative model, i.e., Latent Regression Bayesian Network (LRBN), to effectively capture the spatial patterns embedded in facial landmark points to differentiate between posed and spontaneous facial expressions. The LRBN is a directed graphical model consisting of one latent layer and one visible layer. Due to the “explaining away” effect in Bayesian networks, LRBN is able to capture both the dependencies among the latent variables given the observation and the dependencies among visible variables. We believe that such dependencies are crucial for faithful data representation. Specifically, during training, we construct two LRBNs to capture spatial patterns inherent in displacements of landmark points from spontaneous facial expressions and posed facial expressions, respectively. During testing, the samples are classified into posed or spontaneous expressions according to their likelihoods under the two models. Efficient learning and inference algorithms are proposed. Experimental results on two benchmark databases demonstrate the advantages of the proposed approach in modeling spatial patterns as well as its superior performance to the existing methods in differentiating between posed and spontaneous expressions.
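The decision rule in the abstract (train one generative model per class, then label a test sample by comparing its likelihood under the two models) can be sketched as below. The sketch deliberately swaps the LRBN for a much simpler diagonal Gaussian model, so it illustrates only the two-model likelihood comparison, not the authors' model; the function names and the toy displacement data are assumptions.

```python
import math


def fit_diag_gaussian(samples, min_var=1e-6):
    """Fit per-feature mean and variance (a stand-in for training one generative model)."""
    n, d = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(d)]
    variances = [max(sum((s[j] - means[j]) ** 2 for s in samples) / n, min_var)
                 for j in range(d)]
    return means, variances


def log_likelihood(x, model):
    """Log density of x under a diagonal Gaussian model."""
    means, variances = model
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, means, variances))


def classify(x, posed_model, spontaneous_model):
    """Assign the label whose model gives the sample the higher likelihood."""
    if log_likelihood(x, posed_model) > log_likelihood(x, spontaneous_model):
        return "posed"
    return "spontaneous"


# Toy landmark-displacement features (made-up numbers, for illustration only).
posed_train = [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9]]
spontaneous_train = [[0.1, 0.0], [0.0, 0.2], [-0.1, 0.1]]
posed_model = fit_diag_gaussian(posed_train)
spontaneous_model = fit_diag_gaussian(spontaneous_train)
label_a = classify([1.0, 1.0], posed_model, spontaneous_model)
label_b = classify([0.0, 0.1], posed_model, spontaneous_model)
```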

artificial intelligence, bayesian inference, expression

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bridging Video Content and Comments: Synchronized Video Description with Temporal Summarization of Crowdsourced Time-Sync Comments

Xu, Linli (University of Science and Technology of China) | Zhang, Chao (University of Science and Technology of China)

With the rapid growth of online sharing media, we are facing a huge collection of videos. At the same time, due to the volume and complexity of video data, it can be tedious and time-consuming to index or annotate videos. In this paper, we propose to generate temporal descriptions of videos by exploiting the information in crowdsourced time-sync comments, which are gaining popularity on many video sharing websites. In this framework, representative and interesting comments of a video are selected and highlighted along the timeline, providing an informative description of the video in a time-sync manner. The challenge of the proposed application comes from the extremely informal and noisy nature of the comments, which are usually short sentences on very different topics. To resolve these issues, we propose a novel temporal summarization model based on the data reconstruction principle, where representative comments are selected in order to best reconstruct the original corpus at the text level as well as the topic level while incorporating the temporal correlations of the comments. Experimental results on real-world data demonstrate the effectiveness of the proposed framework and justify the idea of exploiting crowdsourced time-sync comments as a bridge to describe videos.

artificial intelligence, optimization problem, summarization

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia (0.14)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Communications > Social Media > Crowdsourcing (0.82)

Capturing Dependencies among Labels and Features for Multiple Emotion Tagging of Multimedia Data

Wu, Shan (University of Science and Technology of China) | Wang, Shangfei (University of Science and Technology of China) | Ji, Qiang (Rensselaer Polytechnic Institute)

In this paper, we tackle the problem of emotion tagging of multimedia data by modeling the dependencies among multiple emotions in both the feature and label spaces. These dependencies, which carry crucial top-down and bottom-up evidence for improving multimedia affective content analysis, have not been thoroughly exploited yet. To this end, we propose two hierarchical models that independently and dependently learn the shared features and global semantic relationships among emotion labels to jointly tag multiple emotion labels of multimedia data. Efficient learning and inference algorithms of the proposed models are also developed. Experiments on three benchmark emotion databases demonstrate the superior performance of our methods to existing methods.

emotion, neural network, survey article

Thirty-First AAAI Conference on Artificial Intelligence

Country:

North America > United States > New York (0.14)
Asia > China > Anhui Province (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.90)
Information Technology > Artificial Intelligence > Vision (0.68)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.46)

Question Difficulty Prediction for READING Problems in Standard Tests

Huang, Zhenya (University of Science and Technology of China) | Liu, Qi (University of Science and Technology of China) | Chen, Enhong (University of Science and Technology of China) | Zhao, Hongke (University of Science and Technology of China) | Gao, Mingyong (iFLYTEK Co., Ltd.) | Wei, Si (iFLYTEK Co., Ltd.) | Su, Yu (Anhui University) | Hu, Guoping (iFLYTEK Co., Ltd.)

Standard tests aim to evaluate the performance of examinees using different tests with consistent difficulties. Thus, a critical demand is to predict the difficulty of each test question before the test is conducted. Existing studies are usually based on the judgments of education experts (e.g., teachers), which may be subjective and labor intensive. In this paper, we propose a novel Test-aware Attention-based Convolutional Neural Network (TACNN) framework to automatically solve this Question Difficulty Prediction (QDP) task for READING problems (a typical problem style in English tests) in standard tests. Specifically, given the abundant historical test logs and text materials of questions, we first design a CNN-based architecture to extract sentence representations for the questions. Then, we utilize an attention strategy to qualify the difficulty contribution of each sentence to questions. Considering the incomparability of question difficulties in different tests, we propose a test-dependent pairwise strategy for training TACNN and generating the difficulty prediction value. Extensive experiments on a real-world dataset not only show the effectiveness of TACNN, but also give interpretable insights to track the attention information for questions.
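The test-dependent pairwise idea can be illustrated with a short sketch: because difficulties observed in different tests are not directly comparable, training pairs are formed only between questions from the same test, and the loss compares predicted and observed difficulty differences. The abstract does not give the loss, so the squared-difference form, the record fields `test_id` and `difficulty`, and the function names are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def within_test_pairs(records):
    """Return index pairs (i, j) of questions that belong to the same test."""
    by_test = defaultdict(list)
    for idx, rec in enumerate(records):
        by_test[rec["test_id"]].append(idx)
    pairs = []
    for ids in by_test.values():
        pairs.extend(combinations(ids, 2))
    return pairs


def pairwise_loss(records, predictions):
    """Mean squared gap between predicted and observed difficulty differences."""
    pairs = within_test_pairs(records)
    if not pairs:
        return 0.0
    total = sum(((predictions[i] - predictions[j])
                 - (records[i]["difficulty"] - records[j]["difficulty"])) ** 2
                for i, j in pairs)
    return total / len(pairs)


# Toy logs (made-up numbers): two tests with observed per-question difficulties.
records = [
    {"test_id": "T1", "difficulty": 0.2},
    {"test_id": "T1", "difficulty": 0.5},
    {"test_id": "T2", "difficulty": 0.7},
    {"test_id": "T2", "difficulty": 0.9},
    {"test_id": "T2", "difficulty": 0.4},
]
# Predictions that are off by a different constant in each test still score ~0,
# since only within-test differences enter the loss.
offsets = {"T1": 0.3, "T2": -0.1}
shifted = [r["difficulty"] + offsets[r["test_id"]] for r in records]
loss_shifted = pairwise_loss(records, shifted)
loss_constant = pairwise_loss(records, [0.5] * len(records))
```

In a full model the predictions would come from the network's output for each question; here they are plain numbers so the loss can be checked by hand.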

deep learning, neural network, tacnn

Thirty-First AAAI Conference on Artificial Intelligence

Country: North America > United States (0.14)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.62)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)