AITopics

Statutory records of underground utility apparatus (such as pipes andcables) are notoriously inaccurate, so street surveys are usually undertakenbefore road excavation takes place to minimize the extent and duration ofexcavation and for health and safety reasons. This involves the use ofsensors such as Ground Penetrating Radar (GPR). The GPR scans are thenmanually interpreted and combined with the expectations from the utilityrecords and other data such as surveyed manholes. The task is complex owingto the difficulty in interpreting the sensor data, and the spatialcomplexity and extent of under street assets. We explore the application ofAI techniques, in particular Bayesian data fusion (BDF), to automaticallygenerate maps of buried apparatus. Hypotheses about the spatial location anddirection of buried assets are extracted by identifying hyperbolae in theGPR scans. The spatial location of surveyed manholes provides further inputto the algorithm, as well as the prior expectations from the statutoryrecords. These three data sources are used to produce the most probable mapof the buried assets. Experimental results on real and simulated data setsare presented.

algorithm, pipe, statutory record, (13 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country: Europe > United Kingdom > England > West Yorkshire > Leeds (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)

Recommender Systems from "Words of Few Mouths"

Zhang, Richong (University of Ottawa) | Tran, Thomas (University of Ottawa) | Mao, Yongyi (University of Ottawa)

This paper identifies a widely existing phenomenon in web data, which we call the "words of few mouths" phenomenon. This phenomenon, in the context of online reviews, refers to the case that a large fraction of the reviews are each voted only by very few users. We discuss the challenges of "words of few mouths" in the development of recommender systems based on users' opinions and advocate probabilistic methodologies to handle such challenges. We develop a probabilistic model and correspondingly a logistic regression based learning algorithm for review helpfulness prediction. Our experimental results indicate that the proposed model outperforms the current state-of-the-art algorithms not only in the presence of the "words of few mouths" phenomenon, but also in the absence of such phenomena.

algorithm, helpfulness, widget, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States (0.04)
North America > Canada > Ontario > National Capital Region > Ottawa (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Efficient Searching Top-k Semantic Similar Words

Yang, Zhenglu (The University of Tokyo) | Kitsuregawa, Masaru (The University of Tokyo)

Measuring the semantic meaning between words is an important issue because it is the basis for many applications, such as word sense disambiguation, document summarization, and so forth. Although it has been explored for several decades, most of the studies focus on improving the effectiveness of the problem, i.e., precision and recall. In this paper, we propose to address the efficiency issue, that given a collection of words, how to efficiently discover the top-k most semantic similar words to the query. This issue is very important for real applications yet the existing state-of-the-art strategies cannot satisfy users with reasonable performance. Efficient strategies on searching top-k semantic similar words are proposed. We provide an extensive comparative experimental evaluation demonstrating the advantages of the introduced strategies over the state-of-the-art approaches.

pagecount, query, similarity, (16 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Predicting Epidemic Tendency through Search Behavior Analysis

The possibility that influenza activity can be generally detected through search log analysis has been explored in recent years. However, previous studies have mainly focused on influenza, and little attention has been paid to other epidemics. With an analysis of web user behavior data, we consider the problem of predicting the tendency of hand-foot -and-mouth disease (HFMD), whose out-break in 2010 resulted in a great panic in China. In addi-tion to search queries, we consider users’ interactions with search engines. Given the collected search logs, we cluster HFMD-related search queries, medical pages and news reports into the following sets: epidemic-related queries (ERQs), epidemic-related pages (ERPs) and ep-idemic-related news (ERNs). Furthermore, we count their own frequencies as different features, and we conduct a regression analysis with current HFMD occurrences. The experimental results show that these features exhibit good performances on both accuracy and timeliness.

erq, frequency, query, (16 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Line Orthogonality in Adjacency Eigenspace with Application to Community Partition

Wu, Leting (University of North Carolina at Charlotte) | Ying, Xiaowei (University of North Carolina at Charlotte) | Wu, Xintao (University of North Carolina at Charlotte) | Zhou, Zhi-Hua (Nanjing University)

Different from Laplacian or normal matrix, the properties of the adjacency eigenspace received much less attention. Recent work showed that nodes projected into the adjacency eigenspace exhibit an orthogonal line pattern and nodes from the same community locate along the same line. In this paper, we conduct theoretical studies based on graph perturbation to demonstrate why this line orthogonality property holds in the adjacency eigenspace and why it generally disappears in the Laplacian and normal eigenspaces. Using the orthogonality property in the adjacency eigenspace, we present a graph partition algorithm, AdjCluster, which first projects node coordinates to the unit sphere and then applies the classic k-means to find clusters. Empirical evaluations on synthetic data and real-world social networks validate our theoretical findings and show the effectiveness of our graph partition algorithm.

eigenspace, matrix, spectral coordinate, (16 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > North Carolina (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)

Tang, Jintao (National University of Defense Technology) | Wang, Ting (National University of Defense Technology) | Lu, Qin (Hong Kong Polytechnic University) | Wang, Ji (National Laboratory for Parallel and Distributed Processing) | Li, Wenjie (Hong Kong Polytechnic University)

A Wikipedia Based Semantic Graph Model for Topic Tracking in Blogosphere

There are two key issues for information diffusion in blogosphere: (1) blog posts are usually short, noisy and contain multiple themes, (2) information diffusion through blogosphere is primarily driven by the “word-of-mouth” effect, thus making topics evolve very fast. This paper presents a novel topic tracking approach to deal with these issues by modeling a topic as a semantic graph in which the semantic relatedness between terms are learned from Wikipedia. For a given topic/post, the named entities, Wikipedia concepts, and the semantic relatedness are extracted to generate the graph model. Noises are filtered out through a graph clustering algorithm. To handle topic evolution, the topic model is enriched by using Wikipedia as background knowledge. Furthermore, graph edit distance is used to measure the similarity between a topic and its posts. The proposed method is tested using real-world blog data. Experimental results show the advantage of the proposed method on tracking topics in short, noisy text.

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > China > Hong Kong (0.04)
Africa > Southern Africa (0.04)
Oceania > Australia (0.04)
(8 more...)

Genre: Research Report (0.48)

Industry: Media > News (0.82)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Pan, Weike (Hong Kong University of Science and Technology) | Liu, Nathan N. (Hong Kong University of Science and Technology) | Xiang, Evan W. (Hong Kong University of Science and Technology) | Yang, Qiang (Hong Kong University of Science and Technology)

Transfer Learning to Predict Missing Ratings Via Heterogeneous User Feedbacks

Data sparsity due to missing ratings is a major challenge for collaborative filtering (CF) techniques in recommender systems. This is especially true for CF domains where the ratings are expressed numerically. We observe that, while we may lack the information in numerical ratings, we may have more data in the form of binary ratings. This is especially true when users can easily express themselves with their likes and dislikes for certain items. In this paper, we explore how to use the binary preference data expressed in the form of like/dislike to help reduce the impact of data sparsity of more expressive numerical ratings. We do this by transferring the rating knowledge from some auxiliary data source in binary form (that is, likes or dislikes), to a target numerical rating matrix. Our solution is to model both numerical ratings and like/dislike in a principled way, using a novel framework of Transfer by Collective Factorization (TCF). In particular, we construct the shared latent space collectively and learn the data-dependent effect separately. A major advantage of the TCF approach over previous collective matrix factorization (or bi-factorization) methods is that we are able to capture the data-dependent effect when sharing the data-independent knowledge, so as to increase the overall quality of knowledge transfer. Experimental results demonstrate the effectiveness of TCF at various sparsity levels as compared to several state-of-the-art methods.

auxiliary data, factorization, matrix factorization, (14 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country: Asia > China > Hong Kong (0.05)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Finding the Hidden Gems: Recommending Untagged Music

Horsburgh, Ben (Robert Gordon University) | Craw, Susan (Robert Gordon University) | Massie, Stewart (Robert Gordon University) | Boswell, Robin (Robert Gordon University)

We have developed a novel hybrid representation for Music Information Retrieval. Our representation is built by incorporating audio content into the tag space in a tag-track matrix, and then learning hybrid concepts using latent semantic analysis. We apply this representation to the task of music recommendation, using similarity-based retrieval from a query music track. We also develop a new approach to evaluating music recommender systems, which is based upon the relationship of users liking tracks. We are interested in measuring the recommendation quality, and the rate at which cold-start tracks are recommended. Our hybrid representation is able to outperform a tag-only representation, in terms of both recommendation quality and the rate that cold-start tracks are included as recommendations.

recommendation, recommender system, representation, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Genre: Research Report (0.68)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Fujiwara, Yasuhiro (Nippon Telegraph and Telephone Corporation) | Irie, Go (Nippon Telegraph and Telephone Corporation) | Kitahara, Tomoe (Nihon University)

Fast Algorithm for Affinity Propagation

Affinity Propagation is a state-of-the-art clustering method recently proposed by Frey and Dueck. It has been successfully applied to broad areas of computer science research because it has much better clustering performance than traditional clustering methods such as k -means. In order to obtain high quality sets of clusters, the original Affinity Propagation algorithm iteratively exchanges real-valued messages between all pairs of data points until convergence. However, this algorithm does not scale for large datasets because it requires quadratic CPU time in the number of data points to compute the messages. This paper proposes an efficient Affinity Propagation algorithm that guarantees the same clustering result as the original algorithm after convergence. The heart of our approach is (1) to prune unnecessary message exchanges in the iterations and (2) to compute the convergence values of pruned messages after the iterations to determine clusters. Experimental evaluations on several different datasets demonstrate the effectiveness of our algorithm.

algorithm, fsap, original algorithm, (16 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > New York (0.16)
North America > United States > California > Los Angeles County > Los Angeles (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Leveraging Unlabeled Data to Scale Blocking for Record Linkage

Record linkage is the process of matching records between two (or multiple) data sets that represent the same real-world entity. An exhaustive record linkage process involves computing the similarities between all pairs of records, which can be very expensive for large data sets. Blocking techniques alleviate this problem by dividing the records into blocks and only comparing records within the same block. To be adaptive from domain to domain, one category of blocking technique formalizes 'construction of blocking scheme' as a machine learning problem. In the process of learning the best blocking scheme, previous learning-based techniques utilize only a set of labeled data. However, since the set of labeled data is usually not large enough to well characterize the unseen (unlabeled) data, the resultant blocking scheme may poorly perform on the unseen data by generating too many candidate matches. To address that, in this paper, we propose to utilize unlabeled data (in addition to labeled data) for learning blocking schemes. Our experimental results show that using unlabeled data in learning can remarkably reduce the number of candidate matches while keeping the same level of coverage for true matches.

candidate match, conjunction, unlabeled data, (16 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > New York (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Liaoning Province > Dalian (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)