Euclidean Distances, soft and spectral Clustering on Weighted Graphs

arXiv.org Machine Learning

We define a class of Euclidean distances on weighted graphs, enabling us to perform thermodynamic soft graph clustering. The class can be constructed from the "raw coordinates" encountered in spectral clustering, and can be extended by means of higher-dimensional embeddings (Schoenberg transformations). Geographical flow data, properly conditioned, illustrate the procedure as well as visualization aspects.
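
A minimal sketch of the idea (illustrative only, not the paper's exact construction): the "raw coordinates" below are eigenvectors of the normalized graph Laplacian, squared Euclidean distances are computed between them, and a power transformation, one classical example of a Schoenberg transformation, is applied to those squared distances. The matrix `W` and the exponent are arbitrary toy choices.

```python
import numpy as np

def spectral_coordinates(W, k=2):
    """Raw spectral coordinates of a weighted graph from its normalized Laplacian."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(d)) - D_inv_sqrt @ W @ D_inv_sqrt   # normalized Laplacian
    vals, vecs = np.linalg.eigh(L)                      # ascending eigenvalues
    return vecs[:, 1:k + 1]                             # skip the trivial eigenvector

def schoenberg_transform(D2, a=0.5):
    """Power transformation phi(d^2) = (d^2)^a with 0 < a <= 1,
    a classical Schoenberg transformation preserving the Euclidean property."""
    return D2 ** a

# toy weighted graph on 4 nodes
W = np.array([[0, 2, 1, 0],
              [2, 0, 1, 0],
              [1, 1, 0, 3],
              [0, 0, 3, 0]], float)
X = spectral_coordinates(W, k=2)
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)     # squared Euclidean distances
print(schoenberg_transform(D2, a=0.5))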


A unified view of Automata-based algorithms for Frequent Episode Discovery

arXiv.org Artificial Intelligence

The Frequent Episode Discovery framework is a popular framework in Temporal Data Mining with many applications. Over the years, many different notions of episode frequency have been proposed, along with different algorithms for episode discovery. In this paper we present a unified view of all such frequency counting algorithms. We present a generic algorithm of which all current algorithms are special cases. This unified view allows one to gain insights into the different frequencies, and we present quantitative relationships among them. Our unified view also helps in obtaining correctness proofs for various algorithms, as we show here. We also point out how this unified view helps us to consider generalizations of the algorithms so that they can discover episodes with general partial orders.
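
As a concrete illustration, here is a minimal sketch of one of the standard frequency notions, counting non-overlapped occurrences of a serial episode with a single-state automaton that tracks the next expected event type and resets on completion. This is a deliberate simplification of the generic automata-based algorithm described above.

```python
def count_non_overlapped(event_stream, episode):
    """Count non-overlapped occurrences of a serial episode (e.g. A -> B -> C)
    in an event stream, using a single state pointer as the automaton."""
    count, state = 0, 0                 # state = index of the next expected event
    for event in event_stream:
        if event == episode[state]:
            state += 1
            if state == len(episode):   # episode completed: count it and reset
                count += 1
                state = 0
    return count

stream = list("ABXACBYCABC")
print(count_non_overlapped(stream, list("ABC")))   # -> 2
```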


Local search for stable marriage problems with ties and incomplete lists

arXiv.org Artificial Intelligence

The stable marriage problem has a wide variety of practical applications, ranging from matching resident doctors to hospitals, to matching students to schools, or more generally to any two-sided market. We consider a useful variation of the stable marriage problem, where the men and women express their preferences using a preference list with ties over a subset of the members of the other sex. Matchings are permitted only with people who appear in these preference lists. In this setting, we study the problem of finding a stable matching that marries as many people as possible. Stability is an envy-free notion: no man and woman who are not married to each other would both prefer each other to their partners or to being single. This problem is NP-hard. We tackle this problem using local search, exploiting properties of the problem to reduce the size of the neighborhood and to make local moves efficiently. Experimental results show that this approach is able to solve large problems, quickly returning stable matchings of large and often optimal size.
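
A minimal sketch of the local-search idea on strict, possibly incomplete preference lists (hypothetical data structures; the move simply satisfies an arbitrary blocking pair, and the paper's neighbourhood reduction and tie handling are not reproduced here):

```python
import random

def blocking_pairs(men_pref, women_pref, wife, husband):
    """Pairs (m, w), not married to each other, who both prefer each other
    to their current partners or to being single."""
    def prefers(pref, person, new, cur):
        ranks = {x: i for i, x in enumerate(pref[person])}
        return new in ranks and (cur is None or ranks[new] < ranks.get(cur, len(ranks)))
    return [(m, w) for m in men_pref for w in men_pref[m]
            if wife.get(m) != w
            and prefers(men_pref, m, w, wife.get(m))
            and prefers(women_pref, w, m, husband.get(w))]

def local_search(men_pref, women_pref, steps=1000):
    wife, husband = {}, {}
    for _ in range(steps):
        bp = blocking_pairs(men_pref, women_pref, wife, husband)
        if not bp:
            return wife                        # stable matching found
        m, w = random.choice(bp)               # local move: satisfy a blocking pair
        if m in wife:
            husband.pop(wife[m], None)         # m's old partner becomes single
        if w in husband:
            wife.pop(husband[w], None)         # w's old partner becomes single
        wife[m], husband[w] = w, m
    return wife

men = {"m1": ["w1", "w2"], "m2": ["w1"]}
women = {"w1": ["m2", "m1"], "w2": ["m1"]}
print(local_search(men, women))
```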


Is Computational Complexity a Barrier to Manipulation?

arXiv.org Artificial Intelligence

When agents are acting together, they may need a simple mechanism to decide on joint actions. One possibility is to have the agents express their preferences in the form of a ballot and use a voting rule to decide the winning action(s). Unfortunately, agents may try to manipulate such an election by misreporting their preferences. Fortunately, it has been shown that it is NP-hard to compute how to manipulate a number of different voting rules. However, NP-hardness only bounds the worst-case complexity. Recent theoretical results suggest that manipulation may often be easy in practice. To address this issue, I suggest studying empirically whether computational complexity is in practice a barrier to manipulation. The basic tool used in my investigations is the identification of computational "phase transitions". Such an approach has been fruitful in identifying hard instances of propositional satisfiability and other NP-hard problems. I show that phase transition behaviour gives insight into the hardness of manipulating voting rules, casting doubt on whether computational complexity is indeed any sort of barrier. Finally, I look at the problem of computing manipulations of other, related problems such as stable marriage and tournament problems.
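
As an illustration of the kind of experiment involved, here is a minimal sketch that brute-forces whether a single extra voter can make a chosen candidate win under the Borda rule, and tracks how often manipulation succeeds on random profiles as the number of voters grows. The rule, candidate count, and trial counts are illustrative choices, not the paper's experimental setup.

```python
import itertools, random

def borda_winner(profile, m):
    """Borda winner for a list of rankings over candidates 0..m-1 (ties broken by index)."""
    scores = [0] * m
    for ranking in profile:
        for pos, c in enumerate(ranking):
            scores[c] += m - 1 - pos
    return max(range(m), key=lambda c: (scores[c], -c))

def manipulable(profile, m, favourite):
    """Can one extra (manipulating) voter make `favourite` win? Brute force over votes."""
    return any(borda_winner(profile + [list(vote)], m) == favourite
               for vote in itertools.permutations(range(m)))

m, trials = 4, 200
for n in range(2, 16, 2):                        # number of sincere voters
    hits = sum(manipulable([random.sample(range(m), m) for _ in range(n)], m, 0)
               for _ in range(trials))
    print(n, hits / trials)                      # fraction of manipulable instances
```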


On The Complexity and Completeness of Static Constraints for Breaking Row and Column Symmetry

arXiv.org Artificial Intelligence

We consider a common type of symmetry where we have a matrix of decision variables with interchangeable rows and columns. A simple and efficient method to deal with such row and column symmetry is to post symmetry breaking constraints like DOUBLELEX and SNAKELEX. We provide a number of positive and negative results on posting such symmetry breaking constraints. On the positive side, we prove that we can compute in polynomial time a unique representative of an equivalence class in a matrix model with row and column symmetry if the number of rows (or of columns) is bounded and in a number of other special cases. On the negative side, we show that whilst DOUBLELEX and SNAKELEX are often effective in practice, they can leave a large number of symmetric solutions in the worst case. In addition, we prove that propagating DOUBLELEX completely is NP-hard. Finally we consider how to break row, column and value symmetry, correcting a result in the literature about the safeness of combining different symmetry breaking constraints. We end with the first experimental study on how much symmetry is left by DOUBLELEX and SNAKELEX on some benchmark problems.
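
For illustration, a minimal sketch of the DOUBLELEX condition itself as a checker rather than a propagator: it tests whether a full assignment has both its rows and its columns in non-decreasing lexicographic order, which is assumed here as the usual reading of DOUBLELEX.

```python
def double_lex(matrix):
    """True iff the rows are lexicographically non-decreasing
    and the columns are lexicographically non-decreasing (DOUBLELEX)."""
    rows = [tuple(r) for r in matrix]
    cols = [tuple(c) for c in zip(*matrix)]
    return all(a <= b for a, b in zip(rows, rows[1:])) and \
           all(a <= b for a, b in zip(cols, cols[1:]))

print(double_lex([[1, 0],
                  [1, 1]]))   # False: rows are ordered but the columns are not
print(double_lex([[0, 1],
                  [1, 1]]))   # True
```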


Decomposition of the NVALUE constraint

arXiv.org Artificial Intelligence

We study decompositions of the global NVALUE constraint. Our main contribution is theoretical: we show that there are propagators for global constraints like NVALUE which decompositions can simulate with the same time complexity but with much greater space complexity. This suggests that the benefit of a global propagator may often lie not in saving time but in saving space. Our other theoretical contribution is to show for the first time that range consistency can be enforced on NVALUE with the same worst-case time complexity as bound consistency. Finally, the decompositions we study are readily encoded as linear inequalities. We are therefore able to use them in integer linear programs.
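
To illustrate how such a decomposition can be written as linear inequalities, here is one standard 0/1 encoding (given as an illustration, not necessarily the exact decomposition studied in the paper), with binary variables $b_{iv} = 1$ iff $x_i = v$ and $u_v = 1$ iff value $v$ is used by some variable:

```latex
\begin{aligned}
& \sum_{v \in D} b_{iv} = 1, \qquad x_i = \sum_{v \in D} v\, b_{iv} && (1 \le i \le n),\\
& b_{iv} \le u_v, \qquad u_v \le \sum_{i=1}^{n} b_{iv} && (1 \le i \le n,\ v \in D),\\
& \mathrm{NVALUE} = \sum_{v \in D} u_v, \qquad b_{iv}, u_v \in \{0,1\}.
\end{aligned}
```

The middle constraints channel value usage: $u_v$ is forced to 1 whenever some $x_i$ takes value $v$ and to 0 when no variable does, so the sum of the $u_v$ counts the distinct values taken.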


Symmetry within and between solutions

arXiv.org Artificial Intelligence

Symmetry can be used to help solve many problems. For instance, Einstein's famous 1905 paper ("On the Electrodynamics of Moving Bodies") uses symmetry to help derive the laws of special relativity. In artificial intelligence, symmetry has played an important role in both problem representation and reasoning. I describe recent work on using symmetry to help solve constraint satisfaction problems. Symmetries occur within individual solutions of problems as well as between different solutions of the same problem. Symmetry can also be applied to the constraints in a problem to give new symmetric constraints. Reasoning about symmetry can speed up problem solving, and has led to the discovery of new results in both graph and number theory.


Discovering Graphical Granger Causality Using the Truncating Lasso Penalty

arXiv.org Machine Learning

Components of biological systems interact with each other in order to carry out vital cell functions. Such information can be used to improve estimation and inference, and to obtain better insights into the underlying cellular mechanisms. Discovering regulatory interactions among genes is therefore an important problem in systems biology. Whole-genome expression data over time provides an opportunity to determine how the expression levels of genes are affected by changes in transcription levels of other genes, and can therefore be used to discover regulatory interactions among genes. In this paper, we propose a novel penalization method, called truncating lasso, for estimation of causal relationships from time-course gene expression data. The proposed penalty can correctly determine the order of the underlying time series, and improves the performance of the lasso-type estimators. Moreover, the resulting estimate provides information on the time lag between activation of transcription factors and their effects on regulated genes. We provide an efficient algorithm for estimation of model parameters, and show that the proposed method can consistently discover causal relationships in the large $p$, small $n$ setting. The performance of the proposed model is evaluated favorably in simulated, as well as real, data examples. The proposed truncating lasso method is implemented in the R-package grangerTlasso and is available at http://www.stat.lsa.umich.edu/~shojaie.
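
A minimal sketch of the underlying graphical Granger idea in Python, using a plain lasso on lagged predictors via scikit-learn; the truncating penalty and the R package grangerTlasso referenced above are not reproduced here, and the lag length and penalty weight are arbitrary.

```python
import numpy as np
from sklearn.linear_model import Lasso

def granger_lasso(X, max_lag=2, alpha=0.05):
    """X: T x p matrix of gene expression over time. For each target gene,
    regress its current value on all genes at lags 1..max_lag with an l1
    penalty; nonzero coefficients suggest Granger-causal influences."""
    T, p = X.shape
    Z = np.hstack([X[max_lag - l:T - l] for l in range(1, max_lag + 1)])
    Y = X[max_lag:]
    adjacency = np.zeros((p, p))
    for j in range(p):
        coef = Lasso(alpha=alpha).fit(Z, Y[:, j]).coef_.reshape(max_lag, p)
        adjacency[:, j] = np.abs(coef).max(axis=0) > 0     # active at any lag
    return adjacency   # adjacency[i, j] = 1: gene i may regulate gene j

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[1:, 1] += 0.8 * X[:-1, 0]          # toy signal: gene 0 drives gene 1 at lag 1
print(granger_lasso(X))
```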


Why Gabor Frames? Two Fundamental Measures of Coherence and Their Role in Model Selection

arXiv.org Machine Learning

This paper studies non-asymptotic model selection for the general case of arbitrary design matrices and arbitrary nonzero entries of the signal. In this regard, it generalizes the notion of incoherence in the existing literature on model selection and introduces two fundamental measures of coherence---termed as the worst-case coherence and the average coherence---among the columns of a design matrix. It utilizes these two measures of coherence to provide an in-depth analysis of a simple, model-order agnostic one-step thresholding (OST) algorithm for model selection and proves that OST is feasible for exact as well as partial model selection as long as the design matrix obeys an easily verifiable property. One of the key insights offered by the ensuing analysis in this regard is that OST can successfully carry out model selection even when methods based on convex optimization such as the lasso fail due to the rank deficiency of the submatrices of the design matrix. In addition, the paper establishes that if the design matrix has reasonably small worst-case and average coherence then OST performs near-optimally when either (i) the energy of any nonzero entry of the signal is close to the average signal energy per nonzero entry or (ii) the signal-to-noise ratio in the measurement system is not too high. Finally, two other key contributions of the paper are that (i) it provides bounds on the average coherence of Gaussian matrices and Gabor frames, and (ii) it extends the results on model selection using OST to low-complexity, model-order agnostic recovery of sparse signals with arbitrary nonzero entries.
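
A minimal sketch of the two coherence measures and of the one-step thresholding (OST) estimator described above; the threshold used here is an ad-hoc choice for illustration, not the value analysed in the paper.

```python
import numpy as np

def coherences(A):
    """Worst-case and average coherence of a design matrix with unit-norm columns."""
    A = A / np.linalg.norm(A, axis=0)            # normalize columns
    G = A.T @ A                                  # Gram matrix
    off = G - np.eye(G.shape[0])                 # off-diagonal inner products
    worst = np.abs(off).max()
    average = np.abs(off.sum(axis=1)).max() / (G.shape[0] - 1)
    return worst, average

def ost(A, y, threshold):
    """One-step thresholding: keep columns whose correlation with y exceeds the threshold."""
    A = A / np.linalg.norm(A, axis=0)
    return np.where(np.abs(A.T @ y) > threshold)[0]

rng = np.random.default_rng(1)
A = rng.normal(size=(64, 256))
x = np.zeros(256)
x[[3, 17, 101]] = 5.0                            # sparse signal
y = A @ x + 0.1 * rng.normal(size=64)
worst, average = coherences(A)
print("worst-case:", round(worst, 3), "average:", round(average, 4))
print("selected model:", ost(A, y, threshold=0.5 * np.abs(A.T @ y).max()))
```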


Improving Iris Recognition Accuracy By Score Based Fusion Method

arXiv.org Artificial Intelligence

Iris recognition technology, used to identify individuals by photographing the iris of the eye, has become popular in security applications because of its ease of use, accuracy, and safety in controlling access to high-security areas. Fusion of multiple algorithms to improve biometric verification performance has received considerable attention. The proposed method combines zero-crossing 1D wavelet, Euler number, and genetic algorithm based methods for feature extraction. The outputs of these three algorithms are normalized and their scores are fused to decide whether the user is genuine or an imposter. This new strategy is discussed in this paper, in order to compute a multimodal combined score.
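
A minimal sketch of the score-level fusion step, using min-max normalization followed by a simple weighted-sum rule; the three feature-extraction algorithms themselves, the score ranges, the weights, and the decision threshold are all placeholders.

```python
def min_max_normalize(score, lo, hi):
    """Map a matcher's raw score into [0, 1] given its observed score range."""
    return (score - lo) / (hi - lo)

def fused_decision(scores, ranges, weights=None, threshold=0.5):
    """Normalize each matcher's score, fuse with a weighted sum,
    and accept the claim as genuine if the fused score exceeds the threshold."""
    weights = weights or [1.0 / len(scores)] * len(scores)
    fused = sum(w * min_max_normalize(s, lo, hi)
                for w, s, (lo, hi) in zip(weights, scores, ranges))
    return ("genuine" if fused >= threshold else "imposter"), fused

# raw scores from the three (hypothetical) matchers: wavelet, Euler number, GA-based
scores = [0.82, 37.0, 0.61]
ranges = [(0.0, 1.0), (0.0, 50.0), (0.0, 1.0)]
print(fused_decision(scores, ranges))
```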