AITopics

Applications are increasingly expected to make smart decisions based on what humans consider basic commonsense. An often overlooked but essential form of commonsense involves comparisons, e.g. the fact that bears are typically more dangerous than dogs, that tables are heavier than chairs, or that ice is colder than water. In this paper, we first rely on open information extraction methods to obtain large amounts of comparisons from the Web. We then develop a joint optimization model for cleaning and disambiguating this knowledge with respect to WordNet. This model relies on integer linear programming and semantic coherence scores. Experiments show that our model outperforms strong baselines and allows us to obtain a large knowledge base of disambiguated commonsense assertions.

artificial intelligence, knowledge, natural language, (19 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.29)
Asia (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.89)

Fast and Accurate Influence Maximization on Large Networks with Pruned Monte-Carlo Simulations

Ohsaka, Naoto (The University of Tokyo) | Akiba, Takuya (The University of Tokyo) | Yoshida, Yuichi (National Institute of Informatics) | Kawarabayashi, Ken-ichi (National Institute of Informatics)

Influence maximization is the problem of finding small sets of highly influential individuals in a social network so as to maximize the spread of influence under stochastic cascade models of propagation. Although the problem has been well studied, it is still highly challenging to find high-quality solutions in today's large-scale networks. While Monte-Carlo-simulation-based methods produce near-optimal solutions with a theoretical guarantee, they are prohibitively slow for large graphs. As a result, many heuristic methods without any theoretical guarantee have been developed, but all of them substantially compromise solution quality. To address this issue, we propose a new method for the influence maximization problem. Unlike other recent heuristic methods, the proposed method is Monte-Carlo-simulation-based, and thus it consistently produces high-quality solutions with a theoretical guarantee. At the same time, unlike previous Monte-Carlo-simulation-based methods, it runs as fast as other state-of-the-art methods and can be applied to today's large networks. Through extensive experiments, we demonstrate the scalability and the solution quality of the proposed method.

algorithm, artificial intelligence, optimization problem, (16 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)

Genre: Research Report > New Finding (0.31)

Technology:

Information Technology > Communications > Networks (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)
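
The Monte-Carlo-simulation-based approach mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the independent cascade model with a single uniform activation probability `p` and uses plain greedy seed selection; it does not include the pruning that the paper proposes. The names `simulate_ic`, `estimate_spread`, and `greedy_seeds` are hypothetical and chosen only for this illustration.

```python
import random


def simulate_ic(graph, seeds, p, rng):
    """Run one independent-cascade simulation and return the number of active nodes.

    graph maps each node to a list of its out-neighbors (assumed representation).
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # A newly activated node gets one chance to activate each neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_spread(graph, seeds, p, runs, rng):
    """Average the cascade size over several simulations (Monte Carlo estimate)."""
    return sum(simulate_ic(graph, seeds, p, rng) for _ in range(runs)) / runs


def greedy_seeds(graph, k, p=0.1, runs=200, seed=0):
    """Pick k seeds greedily by estimated spread (no pruning, so slow on large graphs)."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    chosen = []
    for _ in range(k):
        candidates = sorted(nodes - set(chosen))
        best = max(
            candidates,
            key=lambda v: estimate_spread(graph, chosen + [v], p, runs, rng),
        )
        chosen.append(best)
    return chosen
```

Every greedy step re-runs many simulations for every candidate node, which is the cost that makes naive Monte Carlo methods prohibitively slow on large graphs.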

Source Free Transfer Learning for Text Classification

Lu, Zhongqi (Hong Kong University of Science and Technology) | Zhu, Yin (Hong Kong University of Science and Technology) | Pan, Sinno Jialin (Institute for Infocomm Research) | Xiang, Evan Wei (Baidu Inc.) | Wang, Yujing (Microsoft Research Asia, Beijing) | Yang, Qiang (Hong Kong University of Science and Technology)

Transfer learning uses relevant auxiliary data to help the learning task in a target domain where labeled data is usually insufficient to train an accurate model. Given appropriate auxiliary data, researchers have proposed many transfer learning models. How to find such auxiliary data, however, has received little research attention so far. In this paper, we focus on the problem of auxiliary data retrieval, and propose a transfer learning framework that effectively selects helpful auxiliary data from an open knowledge space (e.g. the World Wide Web). Because there is no need to manually select auxiliary data for different target domain tasks, we call our framework Source Free Transfer Learning (SFTL). For each target domain task, the SFTL framework iteratively queries for helpful auxiliary data based on the learned model and then updates the model using the retrieved auxiliary data. We highlight the automatic construction of queries and the robustness of the SFTL framework. Our experiments on the 20NewsGroup dataset and a Google search snippets dataset suggest that the framework is capable of achieving performance comparable to state-of-the-art methods that rely on dedicated selection of auxiliary data.

artificial intelligence, auxiliary data, machine learning, (11 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > China (0.29)
North America > United States (0.28)

Genre: Research Report (0.67)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Fraudulent Support Telephone Number Identification Based on Co-Occurrence Information on the Web

Li, Xin (Tsinghua University) | Liu, Yiqun (Tsinghua University) | Zhang, Min (Tsinghua University) | Ma, Shaoping (Tsinghua University)

"Fraudulent support phones" refers to the misleading telephone numbers placed on Web pages or other media that claim to provide services with which they are not associated. Most fraudulent support phone information is found on search engine result pages (SERPs), and such information substantially degrades the search engine user experience. In this paper, we propose an approach to identify fraudulent support telephone numbers on the Web based on the co-occurrence relations between telephone numbers that appear on SERPs. We start from a small set of seed official support phone numbers and seed fraudulent numbers. Then, we construct a co-occurrence graph according to the co-occurrence relationships of the telephone numbers that appear on Web pages. Additionally, we take the page layout information into consideration on the assumption that telephone numbers that appear in nearby page blocks should be regarded as more closely related. Finally, we develop a propagation algorithm to diffuse the trust scores of seed official support phone numbers and the distrust scores of the seed fraudulent numbers on the co-occurrence graph to detect additional fraudulent numbers. Experimental results based on over 1.5 million SERPs produced by a popular Chinese commercial search engine indicate that our approach outperforms TrustRank, Anti-TrustRank and Good-Bad Rank algorithms by achieving an AUC value of over 0.90.

information retrieval, machine learning, natural language, (18 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.28)

Industry: Information Technology > Security & Privacy (0.95)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.79)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
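
The propagation step described in the abstract above (diffusing trust from seed official numbers and distrust from seed fraudulent numbers over a co-occurrence graph) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: trust and distrust are folded into one signed score, edge weights are assumed to already encode co-occurrence strength (for example, larger for numbers in nearby page blocks), and the update is a damped neighbor average. The function name `propagate_scores` and its parameters are hypothetical.

```python
def propagate_scores(edges, trusted, fraudulent, damping=0.85, iterations=100):
    """Diffuse signed seed scores over an undirected weighted co-occurrence graph.

    edges maps (number_a, number_b) pairs to a positive co-occurrence weight.
    Returns a dict of scores: positive leans official, negative leans fraudulent.
    """
    # Build a symmetric adjacency structure from the edge list.
    neighbors = {}
    for (u, v), weight in edges.items():
        neighbors.setdefault(u, {})[v] = weight
        neighbors.setdefault(v, {})[u] = weight
    # Make sure seeds without any edges still appear in the result.
    for node in list(trusted) + list(fraudulent):
        neighbors.setdefault(node, {})

    # Seed scores: +1 for official numbers, -1 for fraudulent ones, 0 otherwise.
    seed = {node: 0.0 for node in neighbors}
    for node in trusted:
        seed[node] = 1.0
    for node in fraudulent:
        seed[node] = -1.0

    score = dict(seed)
    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            total = sum(nbrs.values())
            # Weighted average of neighbor scores (0 for isolated nodes).
            avg = sum(w * score[m] for m, w in nbrs.items()) / total if total else 0.0
            updated[node] = (1 - damping) * seed[node] + damping * avg
        score = updated
    return score
```

Numbers with strongly negative scores would be candidates for manual review; any decision threshold would have to be tuned on labeled data.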

Predicting Emotions in User-Generated Videos

Jiang, Yu-Gang (Fudan University, Shanghai) | Xu, Baohan (Fudan University, Shanghai) | Xue, Xiangyang (Fudan University, Shanghai)

User-generated video collections have expanded rapidly in recent years, and systems for automatic analysis of these collections are in high demand. While extensive research efforts have been devoted to recognizing semantics like "birthday party" and "skiing", few attempts have been made to understand the emotions carried by the videos, e.g., "joy" and "sadness". In this paper, we propose a comprehensive computational framework for predicting emotions in user-generated videos. We first introduce a rigorously designed dataset collected from popular video-sharing websites with manual annotations, which can serve as a valuable benchmark for future research. A large set of features is extracted from this dataset, ranging from popular low-level visual descriptors and audio features to high-level semantic attributes. Results of a comprehensive set of experiments indicate that combining multiple types of features (such as the joint use of audio and visual clues) is important, and that attribute features such as those containing sentiment-level semantics are very effective.

artificial intelligence, machine learning, natural language, (19 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.15)

Industry: Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.70)
Information Technology > Artificial Intelligence > Natural Language (0.69)

User Group Oriented Temporal Dynamics Exploration

Hu, Zhiting (Peking University) | Yao, Junjie (University of California, Santa Barbara) | Cui, Bin (Peking University)

Temporal online content has become the zeitgeist that reflects our interests and changes. Active users are the essential participants and promoters behind it. Temporal dynamics are becoming a viable way to investigate users. However, most current work uses only global temporal trends and fails to distinguish fine-grained patterns across groups. Different users have diverse interests and exhibit distinct behaviors, and their temporal dynamics tend to differ. This paper proposes GrosToT (Group Specific Topics-over-Time), a unified probabilistic model to infer latent user groups and temporal topics at the same time. It models group-specific temporal topic variation from social content. By leveraging the comprehensive group-specific temporal patterns, GrosToT significantly outperforms state-of-the-art dynamics modeling methods. Our proposed approach shows advantages not only in temporal dynamics but also in group content modeling. The dynamics vary across groups, reflecting the groups' intentions. GrosToT uncovers the interplay between group interest and temporal dynamics. Specifically, a group's attention to its medium-interest topics is event-driven, showing rich bursts, while its engagement in its dominating topics is interest-driven, remaining stable over time.

artificial intelligence, natural language, social media, (19 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: Asia (0.29)

Industry:

Media (0.68)
Leisure & Entertainment > Sports > Tennis (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)

Influence Maximization with Novelty Decay in Social Networks

Feng, Shanshan (Nanyang Technological University) | Chen, Xuefeng (University of Electronic Science and Technology of China) | Cong, Gao (Nanyang Technological University) | Zeng, Yifeng (Teesside University) | Chee, Yeow Meng (Nanyang Technological University) | Xiang, Yanping (University of Electronic Science and Technology of China)

The influence maximization problem is to find a set of seed nodes in a social network such that their influence spread is maximized under certain propagation models. A few algorithms have been proposed for solving this problem. However, they have not considered the impact of novelty decay on influence propagation, i.e., that repeated exposures have diminishing influence on users. In this paper, we consider the problem of influence maximization with novelty decay (IMND). We investigate the effect of novelty decay on influence propagation on real-life datasets and formulate the IMND problem. We further analyze the problem properties and propose an influence estimation technique. We demonstrate the performance of our algorithms on four social networks.

algorithm, artificial intelligence, social media, (15 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: Asia (0.15)

Industry: Information Technology > Services (0.96)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
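
The notion of novelty decay in the abstract above (repeated exposures have diminishing influence on a user) can be made concrete with a small worked example. The exponential form used here, where the k-th exposure succeeds with probability `base_p * decay ** k`, is an assumption made purely for illustration; the abstract does not state the paper's decay function. The name `activation_probability` is hypothetical.

```python
def activation_probability(base_p, decay, exposures):
    """Probability that a user is activated by at least one of several exposures.

    Assumed model: exposure number k (starting at 0) independently activates
    the user with probability base_p * decay ** k, with 0 <= decay <= 1.
    """
    not_activated = 1.0
    for k in range(exposures):
        # Each further exposure is less persuasive than the previous one.
        not_activated *= 1.0 - base_p * decay ** k
    return 1.0 - not_activated
```

With `decay = 1.0` the exposures are equally effective, as in the standard setting without novelty decay; smaller values of `decay` make later exposures contribute less to the total.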

Machine Translation with Real-Time Web Search

Cui, Lei (Harbin Institute of Technology) | Zhou, Ming (Microsoft Research) | Chen, Qiming (Shanghai Jiao Tong University) | Zhang, Dongdong (Microsoft Research) | Li, Mu (Microsoft Research)

Contemporary machine translation systems usually rely on offline data retrieved from the web for individual model training, such as translation models and language models. In contrast to existing methods, we propose a novel approach that treats machine translation as a web search task and utilizes the web on the fly to acquire translation knowledge. This end-to-end approach takes advantage of fresh web search results that are capable of leveraging tremendous web knowledge to obtain phrase-level candidates on demand and then compose sentence-level translations. Experimental results show that our web-based machine translation method demonstrates very promising performance in leveraging fresh translation knowledge and making translation decisions. Furthermore, when combined with offline models, it significantly outperforms a state-of-the-art phrase-based statistical machine translation system.

artificial intelligence, natural language, translation, (15 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.69)
Asia > China (0.68)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Improving Context and Category Matching for Entity Search

Chen, Yueguo (Renmin University of China) | Gao, Lexi (Renmin University of China) | Shi, Shuming (Microsoft Research Asia) | Du, Xiaoyong (Renmin University of China) | Wen, Ji-Rong (Renmin University of China)

Entity search aims to retrieve a ranked list of named entities of target types for a given query. In this paper, we propose an approach to entity search that formalizes both context matching and category matching. In addition, we propose a result re-ranking strategy that can be easily adapted to achieve a hybrid of two context matching strategies. Experiments on the INEX 2009 entity ranking task show that the proposed approach achieves a significant improvement in entity search performance (xinfAP from 0.27 to 0.39) over existing solutions.

category, machine learning, natural language, (22 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: Asia (0.15)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Context-Aware Collaborative Topic Regression with Social Matrix Factorization for Recommender Systems

Chen, Chaochao (Zhejiang University) | Zheng, Xiaolin (Zhejiang University) | Wang, Yan (Macquarie University) | Hong, Fuxing (Zhejiang University) | Lin, Zhen (Zhejiang University)

Online social networking sites have become popular platforms on which users can link with each other and share information, not only basic rating information but also information such as contexts, social relationships, and item contents. However, as far as we know, no existing work systematically combines diverse types of information to build more accurate recommender systems. In this paper, we propose a novel context-aware hierarchical Bayesian method. First, we propose the use of spectral clustering for user-item subgrouping, so that users and items in similar contexts are grouped. We then propose a novel hierarchical Bayesian model that can make predictions for each user-item subgroup. Our model incorporates not only topic modeling to mine item content but also social matrix factorization to handle ratings and social relationships. Experiments on an Epinions dataset show that our method significantly improves recommendation performance compared with six categories of state-of-the-art recommendation methods in terms of both prediction accuracy and recall. We have also conducted experiments to study the extent to which ratings, contexts, social relationships, and item contents contribute to recommendation performance in terms of prediction accuracy and recall.

artificial intelligence, information, machine learning, (18 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: Asia (0.14)

Genre: Research Report (0.93)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)