AITopics

Many applications in Multilingual and Multimodal Information Access involve searching large databases of high dimensional data objects with multiple (conditionally independent) views. In this work we consider the problem of learning hash functions for similarity search across the views for such applications. We propose a principled method for learning a hash function for each view given a set of multiview training data objects. The hash functions map similar objects to similar codes across the views, thus enabling cross-view similarity search. We present results from an extensive empirical study of the proposed approach, which demonstrate its effectiveness on Japanese language People Search and Multilingual People Search problems.
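
As a rough illustration of the search step only (not the learning method proposed in the paper), the sketch below assumes each view already has a hash function that produces fixed-length binary codes, and ranks database items by Hamming distance to a query code that comes from another view. The function names, the integer code representation, and the toy codes are all hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as integers."""
    return bin(a ^ b).count("1")


def cross_view_search(query_code: int, db_codes: dict, top_k: int = 2) -> list:
    """Return the ids of the top_k database items closest to query_code.

    db_codes maps item id -> binary code (as int) produced by the hash
    function of the other view. Ties are broken by item id.
    """
    ranked = sorted(
        db_codes,
        key=lambda item: (hamming(query_code, db_codes[item]), item),
    )
    return ranked[:top_k]


# Toy 8-bit codes (hypothetical): items hashed in one view, query hashed in another.
db = {"a": 0b10110010, "b": 0b10110011, "c": 0b01001101}
result = cross_view_search(0b10110110, db)
```

If the hash functions do map similar objects to similar codes, the nearest codes in Hamming space are the cross-view matches; a linear scan is used here for brevity, whereas a real system would typically index the codes.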

codeword, hash function, similarity search

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.05)
Asia > India > Karnataka > Bengaluru (0.04)
North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.91)

Fast Approximate Nearest-Neighbor Search with k-Nearest Neighbor Graph

Hajebi, Kiana (University of Alberta) | Abbasi-Yadkori, Yasin (University of Alberta) | Shahbazi, Hossein (University of Alberta) | Zhang, Hong (University of Alberta)

We introduce a new nearest neighbor search algorithm. The algorithm builds a nearest neighbor [...] There are a number of papers that use hill-climbing or k-NN graphs for nearest neighbor search, but to the best of our knowledge, using hill-climbing on k-NN graphs is a new idea.
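
The excerpt above does not give the algorithm's details, so the following is only a generic sketch of hill-climbing on a k-NN graph: build the graph by brute force, then repeatedly step to the neighbor closest to the query until no neighbor improves. The random restarts, the function names, and the parameters are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def build_knn_graph(points, k):
    """Brute-force k-NN graph: node index -> indices of its k nearest points."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        graph[i] = others[:k]
    return graph


def hill_climb(points, graph, query, start):
    """Greedy descent: move to the neighbor closest to the query until none improves."""
    current = start
    while True:
        best = min(graph[current], key=lambda j: math.dist(points[j], query))
        if math.dist(points[best], query) >= math.dist(points[current], query):
            return current  # local minimum of the distance to the query
        current = best


def search(points, graph, query, restarts=5, seed=0):
    """Run hill-climbing from several random start nodes and keep the best result."""
    rng = random.Random(seed)
    candidates = [
        hill_climb(points, graph, query, rng.randrange(len(points)))
        for _ in range(restarts)
    ]
    return min(candidates, key=lambda j: math.dist(points[j], query))
```

The result is approximate: greedy descent can stop at a local minimum, which is why this sketch restarts from several random nodes and keeps the closest candidate.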

algorithm, dataset, neighbor

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > Canada > Alberta (0.14)
North America > United States > New York > New York County > New York City (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.89)

An Efficient Framework for Constructing Generalized Locally-Induced Text Metrics

Amizadeh, Saeed (University of Pittsburgh) | Wang, Shuguang (University of Pittsburgh) | Hauskrecht, Milos (University of Pittsburgh)

In this paper, we propose a new framework for constructing text metrics which can be used to compare and support inferences among terms and sets of terms. Our metric is derived from data-driven kernels on graphs that let us capture global relations among terms and sets of terms, regardless of their complexity and size. To compute the metric efficiently for any two subsets of terms, we develop an approximation technique that relies on the precompiled term-term similarities. To scale up the approach to problems with a huge number of terms, we develop and experiment with a solution that subsamples the term space. We demonstrate the benefits of the whole framework on two text inference tasks: prediction of terms in the article from its abstract and query expansion in information retrieval.

graph, kernel, query expansion

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > Middle East > Lebanon (0.05)
North America > United States > District of Columbia > Washington (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.72)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)

An Assertion Retrieval Algebra for Object Queries over Knowledge Bases

Pound, Jeffrey (University of Waterloo) | Toman, David (University of Waterloo) | Weddell, Grant (University of Waterloo) | Wu, Jiewen (University of Waterloo)

We consider a generalization of instance retrieval over knowledge bases that provides users with assertions in which descriptions of qualifying objects are given in addition to their identifiers. Notably, this involves a transfer of basic database paradigms involving caching and query rewriting in the context of an assertion retrieval algebra. We present an optimization framework for this algebra, with a focus on finding plans that avoid any need for general knowledge base reasoning at query execution time when sufficient cached results of earlier requests exist.

assertion, knowledge base, query

Twenty-Second International Joint Conference on Artificial Intelligence

Country: North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.93)

AAAI Conferences, Jul-12-2011

Why do People Retweet? Anti-Homophily Wins the Day!

Macskassy, Sofus A. (Fetch Technologies) | Michelson, Matthew (Fetch Technologies)

Twitter and other microblogs have rapidly become a significant means by which people communicate with the world and each other in near realtime. There have been a large number of studies surrounding these social media, focusing on areas such as information spread, various centrality measures, topic detection and more. However, one area which has not received much attention is trying to better understand what information is being spread and why it is being spread. This work looks to get a better understanding of what makes people spread information in tweets or microblogs through the use of retweeting. Several retweet behavior models are presented and evaluated on a Twitter data set consisting of over 768,000 tweets gathered from monitoring over 30,000 users for a period of one month. We evaluate the proposed models against each user and show how people use different retweet behavior models. For example, we find that although users in the majority of cases do not retweet information on topics that they themselves tweet about or from people who are "like them" (hence anti-homophily), we do find that models which do take homophily, or similarity, into account fit the observed retweet behaviors much better than other more general models which do not take this into account. We further find that, not surprisingly, people's retweeting behavior is better explained through multiple different models rather than one model.

category, retweet, tweet

Fifth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Middle East (0.04)
Asia > Middle East (0.04)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.34)

Using the H-Index to Estimate Blog Authority

Devezas, José (Labs SAPO/UP) | Nunes, Sérgio (Instituto de Engenharia de Sistemas e Computadores do Porto, Universidade do Porto) | Ribeiro, Cristina (Instituto de Engenharia de Sistemas e Computadores do Porto)

AAAI Conferences, Jul-12-2011

Link analysis is a technique frequently used in the ranking of web sites. On the web, we often encounter content that is organized by entries, sorted from recent to old, and generally follows the structure of a blog. In this paper we explore and evaluate the usage of a bibliometric measure, the h-index, for the task of blog ranking in an information retrieval context. We base our experiments on the TREC Blogs08 collection, which comprises over 28 million posts. The results obtained indicate that the h-index is a robust metric that allows for improved relevance discrimination between blogs when compared to the in-degree. Additionally, tests performed using distinct versions of the post graph indicate that this metric might tolerate a certain level of link clutter.
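
For readers unfamiliar with the measure, here is a minimal sketch of the standard h-index computation. Applying it to a blog by treating each post's in-link count like a paper's citation count is an assumption made for illustration; the abstract does not spell out the exact adaptation, and the counts below are hypothetical.

```python
def h_index(inlink_counts):
    """Largest h such that at least h posts have at least h in-links each."""
    h = 0
    ranked = sorted(inlink_counts, reverse=True)
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical in-link counts per post for two blogs.
blog_a = [10, 8, 5, 4, 3, 0]  # several consistently linked posts
blog_b = [40, 1, 1, 0]        # a single heavily linked post
```

In this toy data, blog_b has the larger total in-degree (42 versus 30), yet blog_a has the higher h-index, which shows how the two measures can rank blogs differently.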

artificial intelligence, information retrieval, natural language

Fifth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States (0.04)
Europe > Portugal > Porto > Porto (0.04)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Social Media (0.69)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.35)

AAAI Conferences, Jul-12-2011

Event Summarization Using Tweets

Chakrabarti, Deepayan (Yahoo! Research) | Punera, Kunal (Yahoo! Research)

Twitter has become exceedingly popular, with hundreds of millions of tweets being posted every day on a wide variety of topics. This has helped make real-time search applications possible with leading search engines routinely displaying relevant tweets in response to user queries. Recent research has shown that a considerable fraction of these tweets are about "events," and the detection of novel events in the tweet-stream has attracted a lot of research interest. However, very little research has focused on properly displaying this real-time information about events. For instance, the leading search engines simply display all tweets matching the queries in reverse chronological order. In this paper we argue that for some highly structured and recurring events, such as sports, it is better to use more sophisticated techniques to summarize the relevant tweets. We formalize the problem of summarizing event-tweets and give a solution based on learning the underlying hidden state representation of the event via Hidden Markov Models. In addition, through extensive experiments on real-world data we show that our model significantly outperforms some intuitive and competitive baselines.

information retrieval, machine learning, real time system

Fifth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Santa Clara County > Sunnyvale (0.04)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Information Technology > Services (0.66)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.69)

NPCEditor: Creating Virtual Human Dialogue Using Information Retrieval Techniques

Leuski, Anton (Institute for Creative Technologies) | Traum, David (Institute for Creative Technologies)

AI Magazine, Jul-9-2011

See Leuski et al. (2006) and Leuski and Traum (2008) for more details. The final parameter is the classification threshold on the KL-divergence value: only answers that score above the threshold value are returned from the classifier. The threshold is determined by tuning [...] to the same question -- for example, "What is your name?" -- depending on who the interactor is looking at. NPCEditor's user interface allows the designer to define arbitrary annotation classes or categories and specify which of these annotation categories should be used in classification.

interactor, npceditor, proceedings

AI Magazine

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > New York (0.05)
North America > United States > District of Columbia > Washington (0.04)

Genre:

Instructional Material (1.00)
Research Report (0.68)

Industry:

Government > Military > Army (0.94)
Education > Educational Setting (0.93)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.86)

Cancer: A Computational Disease that AI Can Cure

Tenenbaum, Jay M. (CommerceNet) | Shrager, Jeff (CollabRx)

AI Magazine, Jul-9-2011

Cancer kills millions of people each year. From an AI perspective, finding effective treatments for cancer is a high-dimensional search problem characterized by many molecularly distinct cancer subtypes, many potential targets and drug combinations, and a dearth of high quality data to connect molecular subtypes and treatments to responses. The broadening availability of molecular diagnostics and electronic medical records presents both opportunities and challenges to apply AI techniques to personalize and improve cancer treatment. We discuss these in the context of Cancer Commons, a “rapid learning” community where patients, physicians, and researchers collect and analyze the molecular and clinical data from every cancer patient, and use these results to individualize therapies. Research opportunities include: adaptively planning and executing individual treatment experiments across the whole patient population, inferring the causal mechanisms of tumors, predicting drug response in individuals, and generalizing these findings to new cases. The goal is to treat each patient in accord with the best available knowledge, and to continually update that knowledge to benefit subsequent patients. Achieving this goal is a worthy grand challenge for AI.

cancer, pathway, therapy

AI Magazine

Country:

North America > United States > Virginia > Alexandria County > Alexandria (0.04)
North America > United States > Oregon (0.04)
North America > United States > New York (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.46)

AAAI Conferences, Jul-5-2011

A Preliminary Evaluation of Machine Learning in Algorithm Selection for Search Problems

Kotthoff, Lars (University of St. Andrews) | Gent, Ian P. (University of St. Andrews) | Miguel, Ian (University of St. Andrews)

Machine learning is an established method of selecting algorithms to solve hard search problems. Despite this, to date no systematic comparison and evaluation of the different techniques has been performed and the performance of existing systems has not been critically compared to other approaches. We compare machine learning techniques for algorithm selection on real-world data sets of hard search problems. In addition to well-established approaches, for the first time we also apply statistical relational learning to this problem. We demonstrate that most machine learning techniques and existing systems perform less well than one might expect. To guide practitioners, we close by giving clear recommendations as to which machine learning techniques are likely to perform well based on our experiments.

algorithm, majority predictor, portfolio

Fourth Annual Symposium on Combinatorial Search

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)