Goto

Collaborating Authors

 Asia


Multi-Armed Bandit with Budget Constraint and Variable Costs

AAAI Conferences

We study the multi-armed bandit problems with budget constraint and variable costs (MAB-BV). In this setting, pulling an arm will receive a random reward together with a random cost, and the objective of an algorithm is to pull a sequence of arms in order to maximize the expected total reward with the costs of pulling those arms complying with a budget constraint. This new setting models many Internet applications (e.g., ad exchange, sponsored search, and cloud computing) in a more accurate manner than previous settings where the pulling of arms is either costless or with a fixed cost. We propose two UCB based algorithms for the new setting. The first algorithm needs prior knowledge about the lower bound of the expected costs when computing the exploration term. The second algorithm eliminates this need by estimating the minimal expected costs from empirical observations, and therefore can be applied to more real-world applications where prior knowledge is not available. We prove that both algorithms have nice learning abilities, with regret bounds of O(ln B). Furthermore, we show that when applying our proposed algorithms to a previous setting with fixed costs (which can be regarded as our special case), one can improve the previously obtained regret bound. Our simulation results on real-time bidding in ad exchange verify the effectiveness of the algorithms and are consistent with our theoretical analysis.


From Interest to Function: Location Estimation in Social Media

AAAI Conferences

Recent years have witnessed the tremendous development of social media, which attracts a vast number of Internet users. The high-dimension content generated by these users provides an unique opportunity to understand their behavior deeply. As one of the most fundamental topics, location estimation attracts more and more research efforts. Different from the previous literature, we find that user's location is strongly related to user interest. Based on this, we first build a detection model to mine user interest from short text. We then establish the mapping between location function and user interest before presenting an efficient framework to predict the user's location with convincing fidelity. Thorough evaluations and comparisons on an authentic data set show that our proposed model significantly outperforms the state-of-the-arts approaches. Moreover, the high efficiency of our model also guarantees its applicability in real-world scenarios.


Goal-Oriented Euclidean Heuristics with Manifold Learning

AAAI Conferences

Recently, a Euclidean heuristic (EH) has been proposed for A* search. EH exploits manifold learning methods to construct an embedding of the state space graph, and derives an admissible heuristic distance between two states from the Euclidean distance between their respective embedded points. EH has shown good performance and memory efficiency in comparison to other existing heuristics such as differential heuristics. However, its potential has not been fully explored. In this paper, we propose a number of techniques that can significantly improve the quality of EH. We propose a goal-oriented manifold learning scheme that optimizes the Euclidean distance to goals in the embedding while maintaining admissibility and consistency. We also propose a state heuristic enhancement technique to reduce the gap between heuristic and true distances. The enhanced heuristic is admissible but no longer consistent. We then employ a modified search algorithm, known as B' algorithm, that achieves optimality with inconsistent heuristics using consistency check and propagation. We demonstrate the effectiveness of the above techniques and report un-matched reduction in search costs across several non-trivial benchmark search problems.


Uncorrelated Lasso

AAAI Conferences

In this paper, motivated by the previous sparse learning In many regression applications, there are too many unrelated based research, we propose to add variable correlation into predictors which may hide the relationship between the sparse-learning-based variable selection approach. We response and the most related predictors. A common way to note that in previous Lasso-type variable selection, variable resolve this problem is variable selection, that is to select a correlations are not taken into account, while in most subset of the most representative or discriminative predictors real-life data, predictors are often correlated. Strongly correlated from the input predictor set. The central requirement is that predictors share similar properties, and have some good predictor set contains predictors that are highly correlated overlapped information.


Improving WalkSAT for Random k-Satisfiability Problem with k > 3

AAAI Conferences

Stochastic local search (SLS) algorithms are well known for their ability to efficiently find models of random instances of the Boolean satisfiablity (SAT) problem. One of the most famous SLS algorithms for SAT is WalkSAT, which is an initial algorithm that has wide influence among modern SLS algorithms. Recently, there has been increasing interest in WalkSAT, due to the discovery of its great power on large random 3-SAT instances. However, the performance of WalkSAT on random $k$-SAT instances with $k>3$ lags far behind. Indeed, there have been few works in improving SLS algorithms for such instances. This work takes a large step towards this direction. We propose a novel concept namely $multilevel$ $make$. Based on this concept, we design a scoring function called $linear$ $make$, which is utilized to break ties in WalkSAT, leading to a new algorithm called WalkSAT$lm$. Our experimental results on random 5-SAT and 7-SAT instances show that WalkSAT$lm$ improves WalkSAT by orders of magnitudes. Moreover, WalkSAT$lm$ significantly outperforms state-of-the-art SLS solvers on random 5-SAT instances, while competes well on random 7-SAT ones. Additionally, WalkSAT$lm$ performs very well on random instances from SAT Challenge 2012, indicating its robustness.


How Bad Is Selfish Voting?

AAAI Conferences

It is well known that strategic behavior in elections is essentially unavoidable; we therefore ask: how bad can the rational outcome be? We answer this question via the notion of the price of anarchy, using the scores of alternatives as a proxy for their quality and bounding the ratio between the score of the optimal alternative and the score of the winning alternative in Nash equilibrium. Specifically, we are interested in Nash equilibria that are obtained via sequences of rational strategic moves. Focusing on three common voting rules — plurality, veto, and Borda — we provide very positive results for plurality and very negative results for Borda, and place veto in the middle of this spectrum.


Qualitative Planning under Partial Observability in Multi-Agent Domains

AAAI Conferences

Decentralized POMDPs (Dec-POMDPs) provide a rich, attractive model for planning under uncertainty and partial observability in cooperative multi-agent domains with a growing body of research. In this paper we formulate a qualitative, propositional model for multi-agent planning under uncertainty with partial observability, which we call Qualitative Dec-POMDP (QDec-POMDP). We show that the worst-case complexity of planning in QDec-POMDPs is similar to that of Dec-POMDPs. Still, because the model is more “classical” in nature, it is more compact and easier to specify. Furthermore, it eases the adaptation of methods used in classical and contingent planning to solve problems that challenge current Dec-POMDPs solvers. In particular, in this paper we describe a method based on compilation to classical planning, which handles multi-agent planning problems significantly larger than those handled by current Dec-POMDP algorithms.


Teamwork with Limited Knowledge of Teammates

AAAI Conferences

While great strides have been made in multiagent teamwork, existing approaches typically assume extensive information exists about teammates and how to coordinate actions. This paper addresses how robust teamwork can still be created even if limited or no information exists about a specific group of teammates, as in the ad hoc teamwork scenario. The main contribution of this paper is the first empirical evaluation of an agent cooperating with teammates not created by the authors, where the agent is not provided expert knowledge of its teammates. For this purpose, we develop a general-purpose teammate modeling method and test the resulting ad hoc team agent's ability to collaborate with more than 40 unknown teams of agents to accomplish a benchmark task. These agents were designed by people other than the authors without these designers planning for the ad hoc teamwork setting. A secondary contribution of the paper is a new transfer learning algorithm, TwoStageTransfer, that can improve results when the ad hoc team agent does have some limited observations of its current teammates.


Equilibria of Online Scheduling Algorithms

AAAI Conferences

We describe a model for competitive online scheduling algorithms. Two servers, each with a single observable queue, compete for customers. Upon arrival, each customer strategically chooses the queue with minimal expected wait time. Each scheduler wishes to maximize its number of customers, and can strategically select which scheduling algorithm, such as First-Come-First-Served (FCFS), to use for its queue. This induces a game played by the servers and the customers. We consider a non-Bayesian setting, where servers and customers play to maximize worst-case payoffs. We show that there is a unique subgame perfect safety-level equilibrium and we describe the associated scheduling algorithm (which is not FCFS). The uniqueness result holds for both randomized and deterministic algorithms, with a different equilibrium algorithm in each case. When the goal of the servers is to minimize competitive ratio, we prove that it is an equilibrium for each server to apply FCFS: each server obtains the optimal competitive ratio of 2.


A Pattern Matching Based Model for Implicit Opinion Question Identification

AAAI Conferences

This paper presents the results of developing subjectivity classifiers for Implicit Opinion Question (IOQ) identification. IOQs are defined as opinion questions with no opinion words. An IOQ example is "will the U.S. government pay more attention to the Pacific Rim?" Our analysis on community questions of Yahoo! Answers shows that a large proportion of opinion questions are IOQs. It is thus important to develop techniques to identify such questions. In this research, we first propose an effective framework based on mutual information and sequential pattern mining to construct an opinion lexicon that not only contains opinion words but also patterns. The discovered words and patterns are then combined with a machine learning technique to identify opinion questions. The experimental results on two datasets demonstrate the effectiveness of our approach.