

Active and passive learning of linear separators under log-concave distributions

arXiv.org Machine Learning

We provide new results concerning label efficient, polynomial time, passive and active learning of linear separators. We prove that active learning provides an exponential improvement over PAC (passive) learning of homogeneous linear separators under nearly log-concave distributions. Building on this, we provide a computationally efficient PAC algorithm with optimal (up to a constant factor) sample complexity for such problems. This resolves an open question concerning the sample complexity of efficient PAC algorithms under the uniform distribution in the unit ball. Moreover, it provides the first bound for a polynomial-time PAC algorithm that is tight for an interesting infinite class of hypothesis functions under a general and natural class of data-distributions, providing significant progress towards a longstanding open question. We also provide new bounds for active and passive learning in the case that the data might not be linearly separable, both in the agnostic case and under the Tsybakov low-noise condition. To derive our results, we provide new structural results for (nearly) log-concave distributions, which might be of independent interest as well.


Description Logic Knowledge and Action Bases

Journal of Artificial Intelligence Research

Description logic Knowledge and Action Bases (KAB) are a mechanism for providing both a semantically rich representation of the information on the domain of interest in terms of a description logic knowledge base and actions to change such information over time, possibly introducing new objects. We resort to a variant of DL-Lite where the unique name assumption is not enforced and where equality between objects may be asserted and inferred. Actions are specified as sets of conditional effects, where conditions are based on epistemic queries over the knowledge base (TBox and ABox), and effects are expressed in terms of new ABoxes. In this setting, we address verification of temporal properties expressed in a variant of first-order mu-calculus with quantification across states. Notably, we show decidability of verification, under a suitable restriction inspired by the notion of weak acyclicity in data exchange.


A Theoretical Analysis of NDCG Type Ranking Measures

arXiv.org Machine Learning

A central problem in ranking is to design a ranking measure for evaluation of ranking functions. In this paper we study, from a theoretical perspective, the widely used Normalized Discounted Cumulative Gain (NDCG)-type ranking measures. Although there are extensive empirical studies of NDCG, little is known about its theoretical properties. We first show that, whatever the ranking function is, the standard NDCG, which adopts a logarithmic discount, converges to 1 as the number of items to rank goes to infinity. At first sight, this result is very surprising. It seems to imply that NDCG cannot differentiate good and bad ranking functions, contradicting the empirical success of NDCG in many applications. In order to have a deeper understanding of ranking measures in general, we propose a notion referred to as consistent distinguishability. This notion captures the intuition that a ranking measure should have the following property: for every pair of substantially different ranking functions, the ranking measure can decide which one is better in a consistent manner on almost all datasets. We show that NDCG with logarithmic discount has consistent distinguishability although it converges to the same limit for all ranking functions. We next characterize the set of all feasible discount functions for NDCG according to the concept of consistent distinguishability. Specifically, we show that whether NDCG has consistent distinguishability depends on how fast the discount decays, and 1/r is a critical point. We then turn to the cut-off version of NDCG, i.e., NDCG@k. We analyze the distinguishability of NDCG@k for various choices of k and the discount functions. Experimental results on real Web search datasets agree well with the theory.
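To make the quantities in this abstract concrete, here is a minimal sketch of NDCG and NDCG@k with a pluggable discount function. It is not the paper's code: the function names, the use of raw relevance labels as gains (many implementations use 2^rel - 1 instead), and the default discount 1/log2(r + 1) are assumptions chosen for illustration.

```python
import math


def log_discount(rank):
    """Logarithmic discount 1 / log2(rank + 1) for a 1-based rank (assumed form)."""
    return 1.0 / math.log2(rank + 1)


def dcg(relevances, discount=log_discount):
    """Discounted cumulative gain of relevance labels listed in ranked order."""
    return sum(rel * discount(rank) for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances, k=None, discount=log_discount):
    """NDCG (or NDCG@k when k is given): DCG divided by the DCG of the ideal ordering."""
    ranked = list(relevances)
    ideal = sorted(ranked, reverse=True)
    if k is not None:
        ranked, ideal = ranked[:k], ideal[:k]
    ideal_dcg = dcg(ideal, discount)
    # By convention here, a list with no relevant items scores 0.
    return dcg(ranked, discount) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    print(ndcg([3, 2, 1]))        # ideal ordering
    print(ndcg([1, 2, 3]))        # reversed ordering
    print(ndcg([1, 2, 3], k=2))   # cut-off version
    # Swap in a faster-decaying discount such as 1/r to compare behaviours.
    print(ndcg([1, 2, 3], discount=lambda r: 1.0 / r))
```

Passing a different `discount` callable is one way to experiment with how the decay rate of the discount changes the scores assigned to two different rankings.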


Efficient Computation of the Shapley Value for Game-Theoretic Network Centrality

Journal of Artificial Intelligence Research

The Shapley value---probably the most important normative payoff division scheme in coalitional games---has recently been advocated as a useful measure of centrality in networks. However, although this approach has a variety of real-world applications (including social and organisational networks, biological networks and communication networks), its computational properties have not been widely studied. To date, the only practicable approach to compute Shapley value-based centrality has been via Monte Carlo simulations which are computationally expensive and not guaranteed to give an exact answer. Against this background, this paper presents the first study of the computational aspects of the Shapley value for network centralities. Specifically, we develop exact analytical formulae for Shapley value-based centrality in both weighted and unweighted networks and develop efficient (polynomial time) and exact algorithms based on them. We empirically evaluate these algorithms on two real-life examples (an infrastructure network representing the topology of the Western States Power Grid and a collaboration network from the field of astrophysics) and demonstrate that they deliver significant speedups over the Monte Carlo approach. For instance, in the case of unweighted networks our algorithms are able to return the exact solution about 1600 times faster than the Monte Carlo approximation, even if we allow for a generous 10% error margin for the latter method.
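The contrast between Monte Carlo estimation and exact computation can be illustrated on a toy graph. The sketch below is not the paper's algorithm: it uses an assumed characteristic function (the value of a coalition is the number of nodes that are in it or adjacent to it), computes exact Shapley values by brute-force enumeration of all orderings (feasible only for tiny graphs), and compares them with a sampled estimate. All function names are hypothetical.

```python
import itertools
import random
from fractions import Fraction


def coverage_value(coalition, adjacency):
    """Assumed game: number of nodes inside the coalition or adjacent to it."""
    covered = set(coalition)
    for node in coalition:
        covered.update(adjacency[node])
    return len(covered)


def marginal_contributions(order, adjacency):
    """Marginal contribution of each node when nodes join in the given order."""
    contributions = {}
    coalition = []
    previous = 0
    for node in order:
        coalition.append(node)
        current = coverage_value(coalition, adjacency)
        contributions[node] = current - previous
        previous = current
    return contributions


def exact_shapley(adjacency):
    """Exact Shapley values by averaging over all n! orderings (tiny graphs only)."""
    nodes = list(adjacency)
    totals = {node: Fraction(0) for node in nodes}
    count = 0
    for order in itertools.permutations(nodes):
        for node, gain in marginal_contributions(order, adjacency).items():
            totals[node] += gain
        count += 1
    return {node: total / count for node, total in totals.items()}


def monte_carlo_shapley(adjacency, samples, seed=0):
    """Approximate Shapley values by averaging over randomly sampled orderings."""
    rng = random.Random(seed)
    nodes = list(adjacency)
    totals = dict.fromkeys(nodes, 0.0)
    for _ in range(samples):
        order = nodes[:]
        rng.shuffle(order)
        for node, gain in marginal_contributions(order, adjacency).items():
            totals[node] += gain
    return {node: total / samples for node, total in totals.items()}


if __name__ == "__main__":
    path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}  # path graph a - b - c
    print(exact_shapley(path))
    print(monte_carlo_shapley(path, samples=2000))
```

The enumeration grows factorially with the number of nodes, which is why sampling or closed-form formulae are needed for networks of realistic size.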


Towards more accurate clustering method by using dynamic time warping

arXiv.org Machine Learning

An intrinsic problem of classifiers based on machine learning (ML) methods is that their learning time grows as the size and complexity of the training dataset increase. For this reason, it is important to have efficient computational methods and algorithms that can be applied to large datasets, such that it is still possible to complete the machine learning tasks in reasonable time. In this context, we present in this paper a simple, more accurate process to speed up ML methods. An unsupervised clustering algorithm is combined with the Expectation-Maximization (EM) algorithm to develop an efficient Hidden Markov Model (HMM) training procedure. The proposed process consists of two steps. In the first step, training instances with similar inputs are clustered, and a weight factor which represents the frequency of these instances is assigned to each representative cluster. The Dynamic Time Warping technique is used as a dissimilarity function to cluster similar examples. In the second step, all formulas in the classical HMM training algorithm (EM) associated with the number of training instances are modified to include the weight factor in the appropriate terms. This process significantly accelerates HMM training while maintaining the same initial, transition and emission probability matrices as those obtained with the classical HMM training algorithm. Accordingly, the classification accuracy is preserved. Depending on the size of the training set, speedups of up to 2200 times are possible when the size is about 100,000 instances. The proposed approach is not limited to training HMMs, but can be employed for a large variety of ML methods.
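As an illustration of the dissimilarity function mentioned in the first step, the following is a minimal sketch of the classic dynamic-programming form of Dynamic Time Warping for two numeric sequences. It is not taken from the paper: the function name, the absolute-difference local cost, and the absence of any window constraint are assumptions.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences.

    Uses |x - y| as the local cost and allows unconstrained warping
    (both choices are assumptions made for this sketch).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


if __name__ == "__main__":
    # The second sequence repeats a value, so warping aligns it at zero cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(dtw_distance([0, 0], [1, 1]))
```

Sequences whose pairwise distance falls below a chosen threshold could then be grouped into one cluster, with the cluster size serving as the weight factor described above.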


Parsimonious module inference in large networks

arXiv.org Machine Learning

We investigate the detectability of modules in large networks when the number of modules is not known in advance. We employ the minimum description length (MDL) principle, which seeks to minimize the total amount of information required to describe the network and avoid overfitting. According to this criterion, we obtain general bounds on the detectability of any prescribed block structure, given the number of nodes and edges in the sampled network. We also find that the maximum number of detectable blocks scales as $\sqrt{N}$, where $N$ is the number of nodes in the network, for a fixed average degree $\langle k \rangle$. We also show that the simplicity of the MDL approach yields an efficient multilevel Monte Carlo inference algorithm with a complexity of $O(\tau N\log N)$ if the number of blocks is unknown, and $O(\tau N)$ if it is known, where $\tau$ is the mixing time of the Markov chain. We illustrate the application of the method on a large network of actors and films with over $10^6$ edges and a disassortative, bipartite block structure.


A Real-Time Decision Support System for High Cost Oil-Well Drilling Operations

AI Magazine

In this article we present DrillEdge, a commercial and award-winning software system that monitors oil-well drilling operations in order to reduce non-productive time (NPT). DrillEdge utilizes case-based reasoning with temporal representations on streaming real-time data, pattern matching and agent systems to predict problems and give advice on how to mitigate them. The methods utilized, the architecture, the GUI and the development cost are documented, along with two case studies.


AAAI Conferences Calendar

AI Magazine

IAAI-14 will be held July 27-31, 2014, in Quebec City, Quebec. Sixth Annual Symposium on Combinatorial Search (SoCS 2013). AAAI Spring Symposium Series. ICINCO 2013 will be held July 28-31, 2013 in Reykjavík. Seventh International AAAI Conference on Weblogs and Social Media: ICWSM-13 will be held July 8-11, 2013. Twenty-Sixth International FLAIRS Conference. Twenty-Seventh AAAI Conference on Artificial Intelligence and Twenty-Fifth Innovative Applications of Artificial Intelligence Conference. Twenty-Third International Conference on Automated Planning and Scheduling. COGSCI 2013 will be held July 31 - August 3, 2013 in Berlin, Germany.


Statistical Anomaly Detection for Train Fleets

AI Magazine

The Swedish Institute of Computer Science (SICS) has for several years developed methods for statistical anomaly detection based on a framework called Bayesian principal anomaly (Holst and Ekman 2011). In this article we describe a novel application domain for the anomaly-detection method: condition monitoring of trains (Holst, Ekman, and Larsen 2006). ... statistical models. There are currently many popular anomaly-detection methods based on nonparametric models (see, for example, Ahmed, ... model is very general since the parametric forms of the distributions need not be known. Addtrack is a tool developed originally by Bombardier Transportation for general analysis, monitoring, and visualization of train conditions and ... It is "intelligent" in the sense that analysis modules, such as the one described in this article, can be used to preprocess and visualize data sets. Addtrack, including the anomaly-detection module described in this article, is currently deployed in Sweden, India, China, and ...


RoboCup Rescue Robot and Simulation Leagues

AI Magazine

The RoboCup Rescue Robot and Simulation competitions have been held since 2000. The experience gained during these competitions has increased the maturity level of the field, which has allowed robots to be deployed after real disasters (for example, the Fukushima Daiichi nuclear disaster). This article provides an overview of these competitions and highlights the state of the art and the lessons learned.