Goto

Collaborating Authors

 Genre


Improving Twitter Retrieval by Exploiting Structural Information

AAAI Conferences

Most Twitter search systems generally treat a tweet as a plain text when modeling relevance. However, a series of conventions allows users to tweet in structural ways using combination of different blocks of texts.These blocks include plain texts, hashtags, links, mentions, etc. Each block encodes a variety of communicative intent and sequence of these blocks captures changing discourse. Previous work shows that exploiting the structural information can improve the structured document (e.g., web pages) retrieval. In this paper we utilize the structure of tweets, induced by these blocks, for Twitter retrieval. A set of features, derived from the blocks of text and their combinations, is used into a learning-to-rank scenario. We show that structuring tweets can achieve state-of-the-art performance. Our approach does not rely upon social media features, but when we do add this additional information, performance improves significantly.


Content Recommendation for Attention Management in Unified Social Messaging

AAAI Conferences

With the growing popularity of social networks and collaboration systems, people are increasingly working with or socially connected with each other. Unified messaging system provides a single interface for users to receive and process information from multiple sources. It is highly desirable to design attention management solution that can help users easily navigate and process dozens of unread messages from a unified message system. Moreover, with the proliferation of mobile devices people are now selectively consuming the most important messages on the go between different activities in their daily life. The information overload problem is especially acute for mobile users with small screen to display. In this paper, we present \PAM, an intelligent end-to-end Personalized Attention Management solution that employs analytical techniques that can learn user interests and organize and prioritize incoming messages based on user interests. For a list of unread messages, \PAM generates a concise attention report that allows users to quickly scan the important new messages from his important social connections as well as messages about his most important tasks that the user is involved with. Our solution can also be applied in other applications such as news filtering and alerts on mobile devices. Our evaluation results demonstrate the effectiveness of \PAM.


Document Summarization Based on Data Reconstruction

AAAI Conferences

Document summarization is of great value to many real world applications, such as snippets generation for search results and news headlines generation. Traditionally, document summarization is implemented by extracting sentences that cover the main topics of a document with a minimum redundancy. In this paper, we take a different perspective from data reconstruction and propose a novel framework named Document Summarization based on Data Reconstruction (DSDR). Specifically, our approach generates a summary which consist of those sentences that can best reconstruct the original document. To model the relationship among sentences, we introduce two objective functions: (1) linear reconstruction, which approximates the document by linear combinations of the selected sentences; (2) nonnegative linear reconstruction, which allows only additive, not subtractive, linear combinations. In this framework, the reconstruction error becomes a natural criterion for measuring the quality of the summary. For each objective function, we develop an efficient algorithm to solve the corresponding optimization problem. Extensive experiments on summarization benchmark data sets DUC 2006 and DUC 2007 demonstrate the effectiveness of our proposed approach.


Choosing Linguistics over Vision to Describe Images

AAAI Conferences

In this paper, we address the problem of automatically generating human-like descriptions for unseen images, given a collection of images and their corresponding human-generated descriptions. Previous attempts for this task mostly rely on visual clues and corpus statistics, but do not take much advantage of the semantic information inherent in the available image descriptions. Here, we present a generic method which benefits from all these three sources (i.e. visual clues, corpus statistics and available descriptions) simultaneously, and is capable of constructing novel descriptions. Our approach works on syntactically and linguistically motivated phrases extracted from the human descriptions. Experimental evaluations demonstrate that our formulation mostly generates lucid and semantically correct descriptions, and significantly outperforms the previous methods on automatic evaluation metrics. One of the significant advantages of our approach is that we can generate multiple interesting descriptions for an image. Unlike any previous work, we also test the applicability of our method on a large dataset containing complex images with rich descriptions.


Predicting Satisfiability at the Phase Transition

AAAI Conferences

Uniform random 3-SAT at the solubility phase transition is one of the most widely studied and empirically hardest distributions of SAT instances. For 20 years, this distribution has been used extensively for evaluating and comparing algorithms. In this work, we demonstrate that simple rules can predict the solubility of these instances with surprisingly high accuracy. Specifically, we show how classification accuracies of about 70% can be obtained based on cheaply (polynomial-time) computable features on a wide range of instance sizes. We argue in two ways that classification accuracy does not decrease with instance size: first, we show that our models' predictive accuracy remains roughly constant across a wide range of problem sizes; second, we show that a classifier trained on small instances is sufficient to achieve very accurate predictions across the entire range of instance sizes currently solvable by complete methods. Finally, we demonstrate that a simple decision tree based on only two features, and again trained only on the smallest instances, achieves predictive accuracies close to those of our most complex model. We conjecture that this two-feature model outperforms random guessing asymptotically; due to the model's extreme simplicity, we believe that this conjecture is a worthwhile direction for future theoretical work.


Trap Avoidance in Local Search Using Pseudo-Conflict Learning

AAAI Conferences

A key challenge in developing efficient local search solvers is to effectively minimise search stagnation (i.e. avoiding traps or local minima). A majority of the state-of-the-art local search solvers perform random and/or Novelty-based walks to overcome search stagnation. Although such strategies are effective in diversifying a search from its current local minimum, they do not actively prevent the search from visiting previously encountered local minima. In this paper, we propose a new preventative strategy to effectively minimise search stagnation using pseudo-conflict learning. We define a pseudo-conflict as a derived path from the search trajectory that leads to a local minimum. We then introduce a new variable selection scheme that penalises variables causing those pseudo-conflicts. Our experimental results show that the new preventative approach significantly improves the performance of local search solvers on a wide range of structured and random benchmarks.


From Streamlined Combinatorial Search to Efficient Constructive Procedures

AAAI Conferences

In recent years, significant progress in the area of search, constraint satisfaction, and automated reasoning has been driven in part by the study of challenge problems from combinatorics and finite algebra. This work has led to the discovery of interesting discrete structures with intricate mathematical properties. While some of those results have resolved open questions and conjectures, a shortcoming is that they generally do not provide further mathematical insights, from which one could derive more general observations. We propose an approach that integrates specialized combinatorial search, using so-called streamlining, with a human computation component. We use this approach to discover efficient constructive procedures for generating certain classes of combinatorial objects of any size. More specifically, using our framework, we discovered two complementary efficient constructions for generating so-called Spatially Balanced Latin squares (SBLS) of any order N, such that 2N+1 is prime. Previously constructions for SBLSs were not known. Our approach also enabled us to derive a new lower bound for so-called weak Schur numbers, improving on a series of earlier results for Schur numbers.


Solving Temporal Problems Using SMT: Weak Controllability

AAAI Conferences

Temporal problems with uncertainty are a well established formalism to model time constraints of a system interacting with an uncertain environment. Several works have addressed the definition and the solving of controllability problems, and three degrees of controllability have been proposed: weak, strong, and dynamic. In this work we focus on weak controllability: we address both the decision and the strategy extraction problems. Extracting a strategy means finding a function from assignments to uncontrollable time points to assignments to controllable time points that fulfills all the temporal constraints. We address the two problems in the satisfiability modulo theory framework. We provide a clean and complete formalization of the problems, and we propose novel techniques to extract strategies. We also provide experimental evidence of the scalability and efficiency of the proposed techniques.


Two New Local Search Strategies for Minimum Vertex Cover

AAAI Conferences

In this paper, we propose two new strategies to design efficient local search algorithms for the minimum vertex cover (MVC) problem. There are two main drawbacks in state-of-the-art MVC local search algorithms: First, they select a pair of vertices to be exchanged simultaneously, which is time consuming; Second, although they use edge weighting techniques, they do not have a strategy to decrease the weights. To address these drawbacks, we propose two new strategies: two stage exchange and edge weighting with forgetting. The two stage exchange strategy selects two vertices to be exchanged separately and performs the exchange in two stages. The strategy of edge weighting with forgetting not only increases weights of uncovered edges, but also decreases some weights for each edge periodically. We utilize these two strategies to design a new algorithm dubbed NuMVC. The experimental results show that NuMVC significantly outperforms existing state-of-the-art heuristic algorithms on most of the hard DIMACS instances and all instances in the hard random BHOSLIB benchmark.


Configuration Checking with Aspiration in Local Search for SAT

AAAI Conferences

An interesting strategy called configuration checking (CC) was recently proposed to handle the cycling problem in local search for Minimum Vertex Cover. A natural question is whether this CC strategy also works for SAT. The direct application of CC did not result in stochastic local search (SLS) algorithms that can compete with the current best SLS algorithms for SAT. In this paper, we propose a new heuristic based on CC for SLS algorithms for SAT, which is called configuration checking with aspiration (CCA). It is used to develop a new SLS algorithm called Swcca. The experiments on random 3-SAT instances show that Swcca significantly outperforms Sparrow2011, the winner of the random satisfiable category of the SAT Competition 2011, which is considered to be the best local search solver for random 3-SAT instances. Moreover, the experiments on structured instances show that Swcca is competitive with Sattime, the best local search solver for the crafted benchmark in the SAT Competition 2011.