AITopics

doi: 10.1613/jair.3872

AI Access Foundation

10840

Country:

Europe > Germany (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Greece > Attica > Athens (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Communications > Web > Semantic Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)

Active Evaluation of Ranking Functions Based on Graded Relevance (Extended Abstract)

Sawade, Christoph (University of Potsdam) | Bickel, Steffen (Nokia gate5 GmbH) | Oertzen, Timo von (University of Virginia) | Scheffer, Tobias (University of Potsdam) | Landwehr, Niels (University of Potsdam)

Evaluating the quality of ranking functions is a core task in web search and other information retrieval domains. Because query distributions and item relevance change over time, ranking models often cannot be evaluated accurately on held-out training data. Instead, considerable effort is spent on manually labeling the relevance of query results for test queries in order to track ranking performance. We address the problem of estimating ranking performance as accurately as possible on a fixed labeling budget. Estimates are based on a set of most informative test queries selected by an active sampling distribution. Query labeling costs depend on the number of result items and item-specific attributes such as document length. We derive cost-optimal sampling distributions for commonly used ranking performance measures. Experiments on web search engine data illustrate significant reductions in labeling costs.

active evaluation, extended abstract, graded relevance, (1 more...)

Twenty-Third International Joint Conference on Artificial Intelligence

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.87)

Integrating Semantic Relatedness and Words' Intrinsic Features for Keyword Extraction

Zhang, Wei (Tsinghua University) | Feng, Wei (Tsinghua University) | Wang, Jianyong (Tsinghua University)

integrating semantic relatedness and word, intrinsic feature, keyword extraction

Keyword extraction attracts much attention for its significant role in various natural language processing tasks. While some existing methods for keyword extraction have considered using single type of semantic relatedness between words or inherent attributes of words, almost all of them ignore two important issues: 1) how to fuse multiple types of semantic relations between words into a uniform semantic measurement and automatically learn the weights of the edges between the words in the word graph of each document, and 2) how to integrate the relations between words and words' intrinsic features into a unified model. In this work, we tackle the two issues based on the supervised random walk model. We propose a supervised ranking based method for keyword extraction, which is called SEAFARER. It can not only automatically learn the weights of the edges in the unified graph of each document which includes multiple semantic relations but also combine the merits of semantic relations of edges and intrinsic attributes of nodes together. We conducted extensive experimental study on an established benchmark and the experimental results demonstrate that SEAFARER outperforms the state-of-the-art supervised and unsupervised methods.

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Research Report > New Finding (0.53)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Harmonious Hashing

Hashing-based fast nearest neighbor search technique has attracted great attention in both research and industry areas recently.Many existing hashing approaches encode data with projection-based hash functions and represent each projected dimension by 1-bit.However, the dimensions with high variance hold large energy or information of data but treated equivalently as dimensions with low variance,which leads to a serious information loss.In this paper, we introduce a novel hashing algorithm called Harmonious Hashing which aims at learning hash functions with low information loss.Specifically, we learn a set of optimized projections to preserve the maximum cumulative energy and meet the constraint of equivalent variance on each dimension as much as possible.In this way, we could minimize the information loss after binarization.Despite the extreme simplicity, our method outperforms superiorly to many state-of-the-art hashing methods in large-scale and high-dimensional nearest neighbor search experiments.

harmonious hashing

Twenty-Third International Joint Conference on Artificial Intelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.73)

A Unified Approximate Nearest Neighbor Search Scheme by Combining Data Structure and Hashing

approximate nearest neighbor search scheme, combining data structure and hashing

Nowadays, Nearest Neighbor Search becomes more and more important when facing the challenge of big data. Traditionally, to solve this problem, researchers mainly focus on building effective data structures such as hierarchical k-means tree or using hashing methods to accelerate the query process. In this paper, we propose a novel unified approximate nearest neighbor search scheme to combine the advantages of both the effective data structure and the fast Hamming distance computation in hashing methods. In this way, the searching procedure can be further accelerated. Computational complexity analysis and extensive experiments have demonstrated the effectiveness of our proposed scheme.

Twenty-Third International Joint Conference on Artificial Intelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.80)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.80)

Moffitt, Michael D. (IBM Corp.)

Search Strategies for Optimal Multi-Way Number Partitioning

The number partitioning problem seeks to divide a set of n numbers across k distinct subsets so as to minimize the sum of the largest partition. In this work, we develop a new optimal algorithm for multi-way number partitioning. A critical observation motivating our methodology is that a globally optimal k -way partition may be recursively constructed by obtaining suboptimal solutions to subproblems of size k – 1. We introduce a new principle of optimality that provides necessary and sufficient conditions for this construction, and use it to strengthen the relationship between sequential decompositions by enforcing upper and lower bounds on intermediate solutions. We also demonstrate how to further prune unpromising partial assignments by detecting and eliminating dominated solutions. Our approach outperforms the previous state-of-the-art by up to four orders of magnitude , reducing average runtime on the largest benchmarks from several hours to less than a second.

optimal multi-way number partitioning, search strategy

Twenty-Third International Joint Conference on Artificial Intelligence

Technology:

Information Technology > Information Management > Search (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

Journal of Artificial Intelligence ResearchJul-22-2013

Topic Segmentation and Labeling in Asynchronous Conversations

Joty, S., Carenini, G., Ng, R. T.

Topic segmentation and labeling is often considered a prerequisite for higher-level conversation analysis and has been shown to be useful in many Natural Language Processing (NLP) applications. We present two new corpora of email and blog conversations annotated with topics, and evaluate annotator reliability for the segmentation and labeling tasks in these asynchronous conversations. We propose a complete computational framework for topic segmentation and labeling in asynchronous conversations. Our approach extends state-of-the-art methods by considering a fine-grained structure of an asynchronous conversation, along with other conversational features by applying recent graph-based methods for NLP. For topic segmentation, we propose two novel unsupervised models that exploit the fine-grained conversational structure, and a novel graph-theoretic supervised model that combines lexical, conversational and topic features. For topic labeling, we propose two novel (unsupervised) random walk models that respectively capture conversation specific clues from two different sources: the leading sentences and the fine-grained conversational structure. Empirical evaluation shows that the segmentation and the labeling performed by our best models beat the state-of-the-art, and are highly correlated with human annotations.

annotation, asynchronous conversation, segmentation, (15 more...)

doi: 10.1613/jair.3940

AI Access Foundation

10826

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)
(29 more...)

Genre:

Research Report > New Finding (1.00)
Workflow (0.92)
Research Report > Promising Solution (0.87)
Research Report > Experimental Study (0.68)

Industry:

Leisure & Entertainment > Games (0.46)
Media > News (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.67)
(2 more...)

AAAI ConferencesJul-9-2013

Data-Parallel Computing Meets STRIPS

Karpas, Erez (Technion - Israel Institute of Technology) | Sagi, Tomer (Technion - Israel Institute of Technology) | Domshlak, Carmel (Technion - Israel Institute of Technology) | Gal, Avigdor ( Technion - Israel Institute of Technology ) | Mendelson, Avi (Technion - Israel Institute of Technology ) | Tennenholtz, Moshe (Microsoft Research and Technion)

The increased demand for distributed computations on “big data” has led to solutions such as SCOPE, DryadLINQ, Pig, and Hive, which allow the user to specify queries in an SQL-like language, enriched with sets of user-defined operators. The lack of exact semantics for user-defined operators interferes with the query optimization process, thus putting the burden of suggesting, at least partial, query plans on the user. In an attempt to ease this burden, we propose a formal model that allows for data-parallel program synthesis (DPPS) in a semantically well-defined manner. We show that this model generalizes existing frameworks for data-parallel computation, while providing the flexibility of query plan generation that is currently absent from these frameworks. In particular, we show how existing, off-the-shelf, AI planning tools can be used for solving DPPS tasks.

computation, data chunk, operator, (13 more...)

Twenty-Seventh AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.14)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)

AAAI ConferencesJul-9-2013

RockIt: Exploiting Parallelism and Symmetry for MAP Inference in Statistical Relational Models

Noessner, Jan (University of Mannheim) | Niepert, Mathias (University of Washington) | Stuckenschmidt, Heiner (University of Mannheim)

RockIt is a maximum a-posteriori (MAP) query engine for statistical relational models. MAP inference in graphical models is an optimization problem which can be compiled to integer linear programs (ILPs).We describe several advances in translating MAP queries to ILP instances and present the novel meta-algorithm cutting plane aggregation (CPA). CPA exploits local context-specific symmetries and bundles up sets of linear constraints. The resulting counting constraints lead to more compact ILPs and make the symmetry of the ground model more explicit to state-of-the-art ILP solvers. Moreover, RockIt parallelizes most parts of the MAP inference pipeline taking advantage of ubiquitous shared-memory multi-core architectures. We report on extensive experiments with Markov logic network (MLN) benchmarks showing that RockIt outperforms the state-of-the-art systems Alchemy, Markov TheBeast, and Tuffy both in terms of efficiency and quality of results.

constraint, ground clause, proceedings, (15 more...)

Twenty-Seventh AAAI Conference on Artificial Intelligence

Country:

Europe > Germany (0.04)
North America > United States > Washington > King County > Seattle (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)

AAAI ConferencesJul-9-2013

Reciprocal Hash Tables for Nearest Neighbor Search

Liu, Xianglong (Beihang University) | He, Junfeng (Columbia University and Facebook) | Lang, Bo (Beihang University)

Recent years have witnessed the success of hashingtechniques in approximate nearest neighbor search. Inpractice, multiple hash tables are usually employed toretrieve more desired results from all hit buckets ofeach table. However, there are rare works studying theunified approach to constructing multiple informativehash tables except the widely used random way. In thispaper, we regard the table construction as a selectionproblem over a set of candidate hash functions. Withthe graph representation of the function set, we proposean efficient solution that sequentially applies normal-ized dominant set to finding the most informative andindependent hash functions for each table. To furtherreduce the redundancy between tables, we explore thereciprocal hash tables in a boosting manner, where thehash function graph is updated with high weights em-phasized on the misclassified neighbor pairs of previoushash tables. The construction method is general andcompatible with different types of hashing algorithmsusing different feature spaces and/or parameter settings.Extensive experiments on two large-scale benchmarksdemonstrate that the proposed method outperforms bothnaive construction method and state-of-the-art hashingalgorithms, with up to 65.93% accuracy gains.

hash function, information retrieval, natural language, (18 more...)

Twenty-Seventh AAAI Conference on Artificial Intelligence

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
Asia > China > Beijing > Beijing (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.99)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.89)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.62)