Khashabi, Daniel
Combining Retrieval, Statistics, and Inference to Answer Elementary Science Questions
Clark, Peter (Allen Institute for AI) | Etzioni, Oren (Allen Institute for AI) | Khot, Tushar (Allen Institute for AI) | Sabharwal, Ashish (Allen Institute for AI) | Tafjord, Oyvind (Allen Institute for AI) | Turney, Peter (Allen Institute for AI) | Khashabi, Daniel (Univ. Illinois at Urbana-Champaign)
What capabilities are required for an AI system to pass standard 4th Grade Science Tests? Previous work has examined the use of Markov Logic Networks (MLNs) to represent the requisite background knowledge and interpret test questions, but did not improve upon an information retrieval (IR) baseline. In this paper, we describe an alternative approach that operates at three levels of representation and reasoning: information retrieval, corpus statistics, and simple inference over a semi-automatically constructed knowledge base, to achieve substantially improved results. We evaluate the methods on six years of unseen, unedited exam questions from the NY Regents Science Exam (using only non-diagram, multiple choice questions), and show that our overall system's score is 71.3%, an improvement of 23.8% (absolute) over the MLN-based method described in previous work. We conclude with a detailed analysis, illustrating the complementary strengths of each method in the ensemble. Our datasets are being released to enable further research.
Online Learning with Adversarial Delays
Quanrud, Kent, Khashabi, Daniel
We study the performance of standard online learning algorithms when the feedback is delayed by an adversary. We show that online-gradient-descent [1] and follow-the-perturbed-leader [2] achieve regret O(√D) in the delayed setting, where D is the sum of delays of each round's feedback. This bound collapses to an optimal O(√T) bound in the usual setting of no delays (where D = T). Our main contribution is to show that standard algorithms for online learning already have simple regret bounds in the most general setting of delayed feedback, making adjustments to the analysis and not to the algorithms themselves. Our results help affirm and clarify the success of recent algorithms in optimization and machine learning that operate in a delayed feedback model.
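As a rough illustration of the delayed-feedback setting described in this abstract, the sketch below runs projected online gradient descent in one dimension and applies each gradient only when it arrives. It is not the authors' code. The function name `delayed_ogd`, the interval [-1, 1], the quadratic losses, and the step size 1/√D (with problem-dependent constants dropped) are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def delayed_ogd(grad_fns, delays, eta, lo=-1.0, hi=1.0, x0=0.0):
    """Projected online gradient descent on [lo, hi] with delayed feedback.

    grad_fns[t](x) is the gradient of round t's loss at x.
    delays[t] >= 1 is the delay of round t's feedback; a delay of 1 means
    the gradient is delivered at the end of the same round (no delay).
    Returns the list of points played in each round.
    """
    arrivals = defaultdict(list)  # round index -> gradients delivered at its end
    x = x0
    played = []
    for t in range(len(grad_fns)):
        played.append(x)
        # The gradient is taken at the point played in round t, but it only
        # becomes visible at the end of round t + delays[t] - 1.
        arrivals[t + delays[t] - 1].append(grad_fns[t](x))
        # The algorithm itself is unchanged: take one step per delivered gradient.
        for g in arrivals.pop(t, []):
            x = min(hi, max(lo, x - eta * g))
    return played


# Hypothetical example: every round uses the loss (x - 0.5)^2.
T = 200
grads = [lambda x: 2.0 * (x - 0.5)] * T

# No delays: every delay is 1, so D = T.
no_delay = delayed_ogd(grads, [1] * T, eta=1.0 / math.sqrt(T))

# Constant delay of 3 rounds: D = 3T, so the assumed step size shrinks.
delays = [3] * T
delayed = delayed_ogd(grads, delays, eta=1.0 / math.sqrt(sum(delays)))
```

In this toy run both sequences approach the minimizer 0.5, and the delayed one does not move until the first gradient arrives. Gradients scheduled to arrive after the final round are never applied.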
Clustering With Side Information: From a Probabilistic Model to a Deterministic Algorithm
Khashabi, Daniel, Wieting, John, Liu, Jeffrey Yufei, Liang, Feng
In this paper, we propose a model-based clustering method (TVClust) that robustly incorporates noisy side information as soft constraints and seeks a consensus between the side information and the observed data. Our method is based on a nonparametric Bayesian hierarchical model that combines the probabilistic model for the data instances with the one for the side information. An efficient Gibbs sampling algorithm is proposed for posterior inference. Using the small-variance asymptotics of our probabilistic model, we then derive a new deterministic clustering algorithm (RDP-means). It can be viewed as an extension of K-means that allows for the inclusion of side information and has the additional property that the number of clusters does not need to be specified a priori. Empirical studies compare our work with many constrained clustering algorithms from the literature on a variety of data sets and under a variety of conditions, such as noisy side information and erroneous k values. Our experiments show that both our probabilistic and deterministic approaches perform strongly under these conditions compared to other algorithms in the literature.
Heteroscedastic Relevance Vector Machine
Khashabi, Daniel, Ziyadi, Mojtaba, Liang, Feng
In this work we propose a heteroscedastic generalization of the RVM, a fast Bayesian framework for regression, based on some recent similar works. We use variational approximation and expectation propagation to tackle the problem. The work is still in progress; we are examining the results and comparing them with previous work.
Generating Motion Patterns Using Evolutionary Computation in Digital Soccer
Amoozgar, Masoud, Khashabi, Daniel, Heydarian, Milad, Nokhbeh, Mohammad, Shouraki, Saeed Bagheri
Dribbling past an opponent player in a digital soccer environment is an important practical problem in motion planning. It has particular complexities that generalize to important problems in other, similar multi-agent systems. In this paper, we propose a hybrid computational geometry and evolutionary computation approach for generating motion trajectories that avoid a mobile obstacle. Here the opponent agent is not only an obstacle but also an adversary who tries to make dribbling harder. The approach breaks the problem into two stages, offline and online, and reduces the processing cost of the online stage by moving computation to the offline stage, which improves the agents' performance. During the offline stage, the goal is to find the desired trajectory using evolutionary computation and save it as a trajectory plan. A trajectory plan consists of nodes that approximate the information of each trajectory. In the online stage, linear interpolation over a Delaunay triangulation in the xy-plane is applied to the trajectory plan to retrieve the desired action.
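The online lookup step mentioned in this abstract can be pictured with a small sketch of piecewise-linear (barycentric) interpolation over a triangulated set of nodes. This is only an illustration under stated assumptions, not the paper's implementation. The triangulation is taken as already computed (for example by a Delaunay routine in the offline stage), and the names `barycentric` and `interpolate`, the four-node plan, and the scalar "action" values are hypothetical.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of 2-D point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b


def interpolate(p, nodes, values, triangles, eps=1e-12):
    """Linearly interpolate node values at p, or return None outside the mesh.

    nodes:     list of (x, y) positions in the xy-plane
    values:    one scalar per node (a stand-in for a stored action)
    triangles: list of (i, j, k) node-index triples, assumed precomputed
    """
    for i, j, k in triangles:
        w = barycentric(p, nodes[i], nodes[j], nodes[k])
        # p lies in this triangle when all three weights are non-negative.
        if all(wi >= -eps for wi in w):
            return w[0] * values[i] + w[1] * values[j] + w[2] * values[k]
    return None


# Hypothetical plan: four nodes on the unit square, split into two triangles.
nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
triangles = [(0, 1, 2), (1, 3, 2)]
values = [0.0, 1.0, 2.0, 3.0]  # samples of x + 2y, so interpolation is exact

inside_first = interpolate((0.25, 0.25), nodes, values, triangles)
inside_second = interpolate((0.75, 0.75), nodes, values, triangles)
outside = interpolate((2.0, 2.0), nodes, values, triangles)
```

A linear scan over triangles keeps the sketch short. A real-time agent would more likely use a point-location structure so that the online query stays cheap.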