AITopics

Transfer learning, as a new machine learning paradigm, has gained increasing attention lately. In situations where the training data in a target domain are not sufficient to learn predictive models effectively, transfer learning leverages auxiliary source data from other related source domains for learning. While most existing work in this area has focused only on source data with the same structure as the target data, in this paper we push this boundary further by proposing a heterogeneous transfer learning framework for knowledge transfer between text and images. We observe that for a target-domain classification problem, some annotated images can be found on many social Web sites, which can serve as a bridge to transfer knowledge from the abundant text documents available over the Web. A key question is how to effectively transfer the knowledge in the source data even though the text documents are collected arbitrarily. Our solution is to enrich the representation of the target images with semantic concepts extracted from the auxiliary source data through a novel matrix factorization method. By using the latent semantic features generated by the auxiliary data, we are able to build a better integrated image classifier. We empirically demonstrate the effectiveness of our algorithm on the Caltech-256 image dataset.

artificial intelligence, classification, machine learning

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > California (0.04)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

SemRec: A Semantic Enhancement Framework for Tag Based Recommendation

Xu, Guandong (Victoria University) | Gu, Yanhui (University of Tokyo) | Dolog, Peter (Aalborg University) | Zhang, Yanchun (Victoria University) | Kitsuregawa, Masaru (University of Tokyo)

Collaborative tagging services provided by various social web sites have become a popular means of marking web resources for different purposes, such as categorization and expression of a preference. However, tags are syntactic in nature, free-form, and do not reflect semantics, which leads to redundancy, ambiguity, and a lack of semantic meaning. Current tag-based recommender systems mainly take the explicit structural information among users, resources and tags into consideration, while neglecting the important implicit semantic relationships hidden in tagging data. In this study, we propose a Semantic Enhancement Recommendation strategy (SemRec), based on both structural information and semantic information through a unified fusion model. Extensive experiments conducted on two real datasets demonstrate the effectiveness of our approaches.

artificial intelligence, machine learning, natural language

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > New York > New York County > New York City (0.05)
Asia > Middle East > Jordan (0.04)

Industry: Information Technology (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Predicting Author Blog Channels with High Value Future Posts for Monitoring

Wu, Shanchan (University of Maryland, College Park) | Elsayed, Tamer (King Abdullah University of Science and Technology (KAUST)) | Rand, William (University of Maryland, College Park) | Raschid, Louiqa (University of Maryland, College Park)

The phenomenal growth of social media, both in scale and importance, has created a unique opportunity to track information diffusion and the spread of influence, but can also make efficient tracking difficult. Given data streams representing blog posts on multiple blog channels and a focal query post on some topic of interest, our objective is to predict which of those channels are most likely to contain a future post that is relevant, or similar, to the focal query post. We denote this task as the future author prediction problem (FAPP). This problem has applications in information diffusion for brand monitoring and blog channel personalization and recommendation. We develop prediction methods inspired by (naive) information retrieval approaches that use historical posts in the blog channel for prediction. We also train a ranking support vector machine (SVM) to solve the problem. We evaluate our methods on an extensive social media dataset; despite the difficulty of the task, all methods perform reasonably well. Results show that ranking SVM prediction can exploit blog channel and diffusion characteristics to improve prediction accuracy. Moreover, it is surprisingly good for prediction in emerging topics and identifying inconsistent authors.

artificial intelligence, blog channel, machine learning

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

CCRank: Parallel Learning to Rank with Cooperative Coevolution

Wang, Shuaiqiang (Shandong University of Finance) | Gao, Byron J. (Texas State University-San Marcos) | Wang, Ke (Simon Fraser University) | Lauw, Hady W. (Institute for Infocomm Research)

We propose CCRank, the first parallel algorithm for learning to rank, targeting simultaneous improvement in learning accuracy and efficiency. CCRank is based on cooperative coevolution (CC), a divide-and-conquer framework that has demonstrated high promise in function optimization for problems with large search spaces and complex structures. Moreover, CC naturally allows parallelization of sub-solutions to the decomposed subproblems, which can substantially boost learning efficiency. With CCRank, we investigate parallel CC in the context of learning to rank. Extensive experiments on benchmarks, in comparison with state-of-the-art algorithms, show that CCRank gains in both accuracy and efficiency.

artificial intelligence, evolutionary algorithm, machine learning

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Texas > Hays County > San Marcos (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Burnaby (0.04)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.47)

Cross-Language Latent Relational Search: Mapping Knowledge across Languages

Duc, Nguyen Tuan (The University of Tokyo) | Bollegala, Danushka (The University of Tokyo) | Ishizuka, Mitsuru (The University of Tokyo)

Latent relational search (LRS) is a novel approach for mapping knowledge across two domains. Given source-domain knowledge concerning the Moon, "The Moon is a satellite of the Earth," one can form a question {(Moon, Earth), (Ganymede, ?)} to query an LRS engine for new knowledge in the target domain concerning Ganymede. An LRS engine relies on supporting sentences such as "Ganymede is a natural satellite of Jupiter." to retrieve and rank "Jupiter" as the first answer. This paper proposes cross-language latent relational search (CLRS) to extend the knowledge mapping capability of LRS from cross-domain knowledge mapping to cross-domain and cross-language knowledge mapping. In CLRS, the supporting sentences for the source pair might be in a different language from that of the target pair. We represent the relation between two entities in an entity pair by lexical patterns of the context surrounding the two entities. We then propose a novel hybrid lexical pattern clustering algorithm to capture the semantic similarity between paraphrased lexical patterns across languages. Experiments on Japanese-English datasets show that the proposed method achieves an MRR of 0.579 for the CLRS task, which is comparable to the MRR of an existing monolingual LRS engine.
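For readers unfamiliar with the MRR figure quoted in this abstract, the following is a minimal sketch of how mean reciprocal rank is usually computed: for each query, take the reciprocal of the rank at which the first correct answer appears (0 if it never appears), then average over queries. The function name and the toy queries are illustrative assumptions and are not taken from the paper.

```python
def mean_reciprocal_rank(ranked_answers, gold_answers):
    """Average of 1/rank of the first correct answer per query (0 if absent)."""
    total = 0.0
    for candidates, gold in zip(ranked_answers, gold_answers):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == gold:
                total += 1.0 / rank
                break  # only the first correct answer counts
    return total / len(ranked_answers) if ranked_answers else 0.0


# Toy queries (illustrative data only, not from the paper).
ranked = [
    ["Jupiter", "Saturn"],        # correct answer at rank 1 -> 1.0
    ["Mars", "Earth", "Venus"],   # correct answer at rank 2 -> 0.5
    ["Pluto"],                    # correct answer missing   -> 0.0
]
gold = ["Jupiter", "Earth", "Neptune"]
mrr = mean_reciprocal_rank(ranked, gold)
```

On these three toy queries the reciprocal ranks are 1, 1/2 and 0, so the mean is 0.5. An MRR near 0.58 therefore indicates that the correct answer tends to appear at or near the top of the ranking.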

lexical pattern, machine learning, natural language

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.17)
Africa > South Africa > Gauteng > Pretoria (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.57)

Generating True Relevance Labels in Chinese Search Engine Using Clickthrough Data

Song, Hengjie (Nanyang Technological University) | Miao, Chunyan (Nanyang Technological University) | Shen, Zhiqi (Nanyang Technological University)

In current search engines, ranking functions are learned from a large number of labeled <query, URL> pairs in which the labels are assigned by human judges, describing how well the URLs match the different queries. However, in commercial search engines, collecting high-quality labels is time-consuming and labor-intensive. To tackle this issue, this paper studies how to produce the true relevance labels for <query, URL> pairs using clickthrough data. By analyzing the correlations between query frequency, true relevance labels and users’ behaviors, we demonstrate that users who search queries with similar frequency have similar search intents and behavioral characteristics. Based on such properties, we propose an efficient discriminative parameter estimation in a multiple instance learning algorithm (MIL) to automatically produce true relevance labels for <query, URL> pairs. Furthermore, we test our approach using a set of real-world data extracted from a Chinese commercial search engine. Experimental results not only validate the effectiveness of the proposed approach, but also indicate that our approach is more likely to agree with the aggregation of the multiple judgments when strong disagreements exist in the panel of judges. When the panel of judges reaches consensus, our approach provides more accurate automatic labels. In contrast with other models, our approach effectively improves the correlation between automatic labels and manual labels.

information retrieval, machine learning, natural language

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > New York (0.04)
North America > United States > Nevada (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (0.66)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Fast Query Recommendation by Search

Jiang, Qixia (Tsinghua University) | Sun, Maosong (Tsinghua University)

Query recommendation can not only help users obtain the information they want but also increase ads’ click-through rates. This paper presents a general and highly efficient method for query recommendation. Given query sessions, we automatically generate many similar and dissimilar query pairs as the prior knowledge. Then we learn a transformation from the prior knowledge to move similar queries closer such that similar queries tend to have similar hash values. This is formulated as minimizing the empirical error on the prior knowledge while maximizing the gap between the data and some partition hyperplanes randomly generated in advance. In the recommendation stage, we search for queries that have similar hash values to the given query, rank the found queries and return the top K queries as the recommendation result. Experimental results demonstrate that our method achieves encouraging results in terms of efficiency and recommendation performance.
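The recommendation stage described in this abstract can be illustrated with a minimal sketch of hashing by random partition hyperplanes: each query vector gets one bit per hyperplane, and candidates are ranked by Hamming distance to the given query's code. This sketch deliberately omits the learned transformation that the paper proposes, and all function names, the 4-dimensional feature vectors and the parameter values are hypothetical choices made for illustration.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw `num_bits` random hyperplane normals in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return sum(x != y for x, y in zip(a, b))


def recommend(query_vector, index, hyperplanes, k):
    """Return the k indexed queries whose codes are closest to the query's code."""
    code = hash_code(query_vector, hyperplanes)
    ranked = sorted(index, key=lambda name: hamming(code, index[name]))
    return ranked[:k]


# Hypothetical query feature vectors (illustrative only).
vectors = {
    "cheap flights": [0.9, 0.1, 0.0, 0.2],
    "budget airfare": [0.8, 0.2, 0.1, 0.3],
    "python tutorial": [-0.7, 0.9, -0.5, 0.0],
}
planes = make_hyperplanes(num_bits=16, dim=4)
index = {name: hash_code(vec, planes) for name, vec in vectors.items()}
top = recommend(vectors["cheap flights"], index, planes, k=2)
```

Because the given query is itself in the toy index, it comes back first with Hamming distance 0; a real system would presumably exclude it and might bucket codes in a hash table instead of sorting the whole index.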

data mining, information retrieval, machine learning

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.05)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Maximum Entropy Context Models for Ranking Biographical Answers to Open-Domain Definition Questions

Figueroa, Alejandro (Yahoo! Research Latin America) | Atkinson, John (Universidad de Concepcion)

In the context of question-answering systems, there are several strategies for scoring candidate answers to definition queries, including centroid vectors, bi-term and context language models. These techniques use only positive examples (i.e., descriptions) when building their models. In this work, a maximum entropy based extension is proposed for context language models so as to account for regularities across non-descriptions mined from web snippets. Experiments show that this extension outperforms other strategies, increasing the precision of the top five ranked answers by more than 5%. Results suggest that web snippets are a cost-efficient source of non-descriptions, and that some relationships extracted from dependency trees are effective for mining candidate answer sentences.

artificial intelligence, machine learning, natural language

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
South America > Chile > Biobío Region > Concepción Province > Concepción (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.61)

Identifying Missing Node Information in Social Networks

Eyal, Ron (Bar Ilan University) | Kraus, Sarit (Bar Ilan University) | Rosenfeld, Avi (Jerusalem College of Technology)

In recent years, social networks have surged in popularity as one of the main applications of the Internet. This has generated great interest in researching these networks across various fields in the scientific community. One key aspect of social network research is identifying important missing information which is not explicitly represented in the network, or is not visible to all. To date, this line of research has typically focused on what connections were missing between nodes, or what is termed the "Missing Link Problem." This paper introduces a new Missing Nodes Identification problem where missing members in the social network structure must be identified. Towards solving this problem, we present an approach based on clustering algorithms combined with measures from missing link research. We show that this approach has beneficial results in the missing nodes identification process and we measure its performance in several different scenarios.

data mining, machine learning, node

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.06)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
North America > United States > Idaho (0.04)

Genre:

Workflow (0.46)
Research Report > New Finding (0.46)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.52)

User-Controllable Learning of Location Privacy Policies With Gaussian Mixture Models

Cranshaw, Justin (Carnegie Mellon University) | Mugan, Jonathan (Carnegie Mellon University) | Sadeh, Norman (Carnegie Mellon University)

With smartphones becoming increasingly commonplace, there has been a subsequent surge in applications that continuously track the location of users. However, serious privacy concerns arise as people start to widely adopt these applications. Users will need to maintain policies to determine under which circumstances to share their location. Specifying these policies, however, is a cumbersome task, suggesting that machine learning might be helpful. In this paper, we present a user-controllable method for learning location sharing policies. We use a classifier based on multivariate Gaussian mixtures that is suitably modified so as to restrict the evolution of the underlying policy to favor incremental and therefore human-understandable changes as new data arrives. We evaluate the model on real location-sharing policies collected from a live location-sharing social network, and we show that our method can learn policies in a user-controllable setting that are just as accurate as policies that do not evolve incrementally. Additionally, we highlight the strength of the generative modeling approach we take by showing how our model easily extends to the semi-supervised setting.

artificial intelligence, gaussian, machine learning

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > District of Columbia > Washington (0.04)
Europe > Portugal (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (0.89)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)