Goto

Collaborating Authors

 Genre


Equitability Analysis of the Maximal Information Coefficient, with Comparisons

arXiv.org Machine Learning

A measure of dependence is said to be equitable if it gives similar scores to equally noisy relationships of different types. Equitability is important in data exploration when the goal is to identify a relatively small set of strongest associations within a dataset as opposed to finding as many non-zero associations as possible, which often are too many to sift through. Thus an equitable statistic, such as the maximal information coefficient (MIC), can be useful for analyzing high-dimensional data sets. Here, we explore both equitability and the properties of MIC, and discuss several aspects of the theory and practice of MIC. We begin by presenting an intuition behind the equitability of MIC through the exploration of the maximization and normalization steps in its definition. We then examine the speed and optimality of the approximation algorithm used to compute MIC, and suggest some directions for improving both. Finally, we demonstrate in a range of noise models and sample sizes that MIC is more equitable than natural alternatives, such as mutual information estimation and distance correlation.


When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity

arXiv.org Machine Learning

Overcomplete latent representations have been very popular for unsupervised feature learning in recent years. In this paper, we specify which overcomplete models can be identified given observable moments of a certain order. We consider probabilistic admixture or topic models in the overcomplete regime, where the number of latent topics can greatly exceed the size of the observed word vocabulary. While general overcomplete topic models are not identifiable, we establish generic identifiability under a constraint, referred to as topic persistence. Our sufficient conditions for identifiability involve a novel set of "higher order" expansion conditions on the topic-word matrix or the population structure of the model. This set of higher-order expansion conditions allow for overcomplete models, and require the existence of a perfect matching from latent topics to higher order observed words. We establish that random structured topic models are identifiable w.h.p. in the overcomplete regime. Our identifiability results allows for general (non-degenerate) distributions for modeling the topic proportions, and thus, we can handle arbitrarily correlated topics in our framework. Our identifiability results imply uniqueness of a class of tensor decompositions with structured sparsity which is contained in the class of Tucker decompositions, but is more general than the Candecomp/Parafac (CP) decomposition.


Universally consistent vertex classification for latent positions graphs

arXiv.org Machine Learning

In this work we show that, using the eigen-decomposition of the adjacency matrix, we can consistently estimate feature maps for latent position graphs with positive definite link function $\kappa$, provided that the latent positions are i.i.d. from some distribution F. We then consider the exploitation task of vertex classification where the link function $\kappa$ belongs to the class of universal kernels and class labels are observed for a number of vertices tending to infinity and that the remaining vertices are to be classified. We show that minimization of the empirical $\varphi$-risk for some convex surrogate $\varphi$ of 0-1 loss over a class of linear classifiers with increasing complexities yields a universally consistent classifier, that is, a classification rule with error converging to Bayes optimal for any distribution F.


Extended Distributed Learning Automata:A New Method for Solving Stochastic Graph Optimization Problems

arXiv.org Artificial Intelligence

In this paper, a new structure of cooperative learning automata so-called extended learning automata (eDLA) is introduced. Based on the proposed structure, a new iterative randomized heuristic algorithm for finding optimal sub-graph in a stochastic edge-weighted graph through sampling is proposed. It has been shown that the proposed algorithm based on new networked-structure can be to solve the optimization problems on stochastic graph through less number of sampling in compare to standard sampling. Stochastic graphs are graphs in which the edges have an unknown distribution probability weights. Proposed algorithm uses an eDLA to find a policy that leads to an induced sub-graph that satisfies some restrictions such as minimum or maximum weight (length). At each stage of the proposed algorithm, eDLA determines which edges to be sampled. This eDLA-based proposed sampling method may result in decreasing unnecessary samples and hence decreasing the time that algorithm requires for finding the optimal sub-graph. It has been shown that proposed method converge to optimal solution, furthermore the probability of this convergence can be made arbitrarily close to 1 by using a sufficiently small learning rate. A new variance-aware threshold value was proposed that can be improving significantly convergence rate of the proposed eDLA-based algorithm. It has been shown that the proposed algorithm is competitive in terms of the quality of the solution


Collective Mind: cleaning up the research and experimentation mess in computer engineering using crowdsourcing, big data and machine learning

arXiv.org Machine Learning

Software and hardware co-design and optimization of HPC systems has become intolerably complex, ad-hoc, time consuming and error prone due to enormous number of available design and optimization choices, complex interactions between all software and hardware components, and multiple strict requirements placed on performance, power consumption, size, reliability and cost. We present our novel long-term holistic and practical solution to this problem based on customizable, plugin-based, schema-free, heterogeneous, open-source Collective Mind repository and infrastructure with unified web interfaces and on-line advise system. This collaborative framework distributes analysis and multi-objective off-line and on-line auto-tuning of computer systems among many participants while utilizing any available smart phone, tablet, laptop, cluster or data center, and continuously observing, classifying and modeling their realistic behavior. Any unexpected behavior is analyzed using shared data mining and predictive modeling plugins or exposed to the community at cTuning.org for collaborative explanation, top-down complexity reduction, incremental problem decomposition and detection of correlating program, architecture or run-time properties (features). Gradually increasing optimization knowledge helps to continuously improve optimization heuristics of any compiler, predict optimizations for new programs or suggest efficient run-time (online) tuning and adaptation strategies depending on end-user requirements. We decided to share all our past research artifacts including hundreds of codelets, numerical applications, data sets, models, universal experimental analysis and auto-tuning pipelines, self-tuning machine learning based meta compiler, and unified statistical analysis and machine learning plugins in a public repository to initiate systematic, reproducible and collaborative research, development and experimentation with a new publication model where experiments and techniques are validated, ranked and improved by the community.


CDfdr: A Comparison Density Approach to Local False Discovery Rate Estimation

arXiv.org Machine Learning

Efron et al. (2001) proposed empirical Bayes formulation of the frequentist Benjamini and Hochbergs False Discovery Rate method (Benjamini and Hochberg,1995). This article attempts to unify the `two cultures' using concepts of comparison density and distribution function. We have also shown how almost all of the existing local fdr methods can be viewed as proposing various model specification for comparison density - unifies the vast literature of false discovery methods under one concept and notation.


Learning Features and their Transformations by Spatial and Temporal Spherical Clustering

arXiv.org Artificial Intelligence

Learning features invariant to arbitrary transformations in the data is a requirement for any recognition system, biological or artificial. It is now widely accepted that simple cells in the primary visual cortex respond to features while the complex cells respond to features invariant to different transformations. We present a novel two-layered feedforward neural model that learns features in the first layer by spatial spherical clustering and invariance to transformations in the second layer by temporal spherical clustering. Learning occurs in an online and unsupervised manner following the Hebbian rule. When exposed to natural videos acquired by a camera mounted on a cat's head, the first and second layer neurons in our model develop simple and complex cell-like receptive field properties. The model can predict by learning lateral connections among the first layer neurons. A topographic map to their spatial features emerges by exponentially decaying the flow of activation with distance from one neuron to another in the first layer that fire in close temporal proximity, thereby minimizing the pooling length in an online manner simultaneously with feature learning.


Applying the Negative Selection Algorithm for Merger and Acquisition Target Identification

arXiv.org Artificial Intelligence

In this paper, we propose a new methodology based on the Negative Selection Algorithm that belongs to the field of Computational Intelligence, specifically, Artificial Immune Systems to identify takeover targets. Although considerable research based on customary statistical techniques and some contemporary Computational Intelligence techniques have been devoted to identify takeover targets, most of the existing studies are based upon multiple previous mergers and acquisitions. Contrary to previous research, the novelty of this proposal lies in its ability to suggest takeover targets for novice firms that are at the beginning of their merger and acquisition spree. We first discuss the theoretical perspective and then provide a case study with details for practical implementation, both capitalizing from unique generalization capabilities of artificial immune systems algorithms.


Precisely Verifying the Null Space Conditions in Compressed Sensing: A Sandwiching Algorithm

arXiv.org Machine Learning

In this paper, we propose new efficient algorithms to verify the null space condition in compressed sensing (CS). Given an $(n-m) \times n$ ($m>0$) CS matrix $A$ and a positive $k$, we are interested in computing $\displaystyle \alpha_k = \max_{\{z: Az=0,z\neq 0\}}\max_{\{K: |K|\leq k\}}$ ${\|z_K \|_{1}}{\|z\|_{1}}$, where $K$ represents subsets of $\{1,2,...,n\}$, and $|K|$ is the cardinality of $K$. In particular, we are interested in finding the maximum $k$ such that $\alpha_k < {1}{2}$. However, computing $\alpha_k$ is known to be extremely challenging. In this paper, we first propose a series of new polynomial-time algorithms to compute upper bounds on $\alpha_k$. Based on these new polynomial-time algorithms, we further design a new sandwiching algorithm, to compute the \emph{exact} $\alpha_k$ with greatly reduced complexity. When needed, this new sandwiching algorithm also achieves a smooth tradeoff between computational complexity and result accuracy. Empirical results show the performance improvements of our algorithm over existing known methods; and our algorithm outputs precise values of $\alpha_k$, with much lower complexity than exhaustive search.


Supplement to "Reversible MCMC on Markov equivalence classes of sparse directed acyclic graphs"

arXiv.org Machine Learning

This supplementary material includes three parts: some preliminary results, four examples, an experiment, three new algorithms, and all proofs of the results in the paper [4]. In this Section, we provide algorithms introduced by Dor and Tarsi [3], and Chickering [1, 2] respectively. These results are necessary to implement our proposed approach technically. Some definitions and notation are introduced first. A directed edge of a DAG is compelled if it occurs in the corresponding completed PDAG, otherwise, the directed edge is reversible and the corresponding parents are reversible parents.