Goto

Collaborating Authors

 Genre


A new framework for optimal classifier design

arXiv.org Machine Learning

Accuracy, Recall, Precision, F-measure, Kappa, ACU [García et al. (2012)] and some other new proposed measures like Informedness and Markedness [Powers (2011)] are examples of different evaluation measures. Depending on the problem and the field of application one measure could be more suitable than another. While in the Behavioral Sciences, Specificity and Sensitivity are commonly used, in the Medical Sciences, ROC analysis is a standard for evaluation. On the other hand, in the Information Retrieval community and fraud detection, Recall, Precision and F-measure are considered appropriate measures for testing effectiveness. In a learning design strategy, the best rule for the specific application will be the one that get the optimal performance for the chosen measure. Looking for the best decision rule, in a Bayesian framework, implies to minimize the overall risk taking into account the different misclassification cost [Duda et al. (2001)]; in an equal misclassification cost problem we can find this optimal solution, with maximum accuracy, selecting the class that has the maximum a posteriori probability. However, finding a decision rule that looks for minimum error rate or maximum accuracy in an imbalanced domain gives solutions strongly biased to favor the majority class, getting poor performance. This problem is particularly important in those applications where the instances of a class (the majority one) heavily outnumber the instances of the other (the minority) class and it is costly to misclassify samples from the minority class.


Using memristor crossbar structure to implement a novel adaptive real time fuzzy modeling algorithm

arXiv.org Artificial Intelligence

Although fuzzy techniques promise fast meanwhile accurate modeling and control abilities for complicated systems, different difficulties have been re-vealed in real situation implementations. Usually there is no escape of it-erative optimization based on crisp domain algorithms. Recently memristor structures appeared promising to implement neural network structures and fuzzy algorithms. In this paper a novel adaptive real-time fuzzy modeling algorithm is proposed which uses active learning method concept to mimic recent understandings of right brain processing techniques. The developed method is based on processing fuzzy numbers to provide the ability of being sensitive to each training data point to expand the knowledge tree leading to plasticity while used defuzzification technique guaranties enough stability. An outstanding characteristic of the proposed algorithm is its consistency to memristor crossbar hardware processing concepts. An analog implemen-tation of the proposed algorithm on memristor crossbars structure is also introduced in this paper. The effectiveness of the proposed algorithm in modeling and pattern recognition tasks is verified by means of computer simulations


Enhancements of Multi-class Support Vector Machine Construction from Binary Learners using Generalization Performance

arXiv.org Machine Learning

We propose several novel methods for enhancing the multi-class SVMs by applying the generalization performance of binary classifiers as the core idea. This concept will be applied on the existing algorithms, i.e., the Decision Directed Acyclic Graph (DDAG), the Adaptive Directed Acyclic Graphs (ADAG), and Max Wins. Although in the previous approaches there have been many attempts to use some information such as the margin size and the number of support vectors as performance estimators for binary SVMs, they may not accurately reflect the actual performance of the binary SVMs. We show that the generalization ability evaluated via a cross-validation mechanism is more suitable to directly extract the actual performance of binary SVMs. Our methods are built around this performance measure, and each of them is crafted to overcome the weakness of the previous algorithm. The proposed methods include the Reordering Adaptive Directed Acyclic Graph (RADAG), Strong Elimination of the classifiers (SE), Weak Elimination of the classifiers (WE), and Voting based Candidate Filtering (VCF). Experimental results demonstrate that our methods give significantly higher accuracy than all of the traditional ones. Especially, WE provides significantly superior results compared to Max Wins which is recognized as the state of the art algorithm in terms of both accuracy and classification speed with two times faster in average. Introduction The support vector machine (SVM) [1, 2] is a high performance learning algorithm constructing a hyperplane to separate two-class data by maximizing the margin between them. There are two approaches for extending SVMs to multi-class problems, i.e., solving the problem by formulating all classes of data under a single optimization, and combining several two-class subproblems. However, the difficulty and complexity to solve the problem with the first method are due to the increase of the number of classes and the size of training data, so the second method is more suitable for practical use. In this paper, we focus on the second approach. For constructing a multi-class classifier from binary ones, the method called one-against-one trains each binary classifier on only two out ofN classes, and builds N (N 1)/ 2 possible classifiers. Several strategies have been proposed for combining the trained classifiers to make the final classification for an unseen data. Friedman [3] suggested the combination strategy called Max Wins .


Sparse and Functional Principal Components Analysis

arXiv.org Machine Learning

Regularized principal components analysis, especially Sparse PCA and Functional PCA, has become widely used for dimension reduction in high-dimensional settings. Many examples of massive data, however, may benefit from estimating both sparse AND functional factors. These include neuroimaging data where there are discrete brain regions of activation (sparsity) but these regions tend to be smooth spatially (functional). Here, we introduce an optimization framework that can encourage both sparsity and smoothness of the row and/or column PCA factors. This framework generalizes many of the existing approaches to Sparse PCA, Functional PCA and two-way Sparse PCA and Functional PCA, as these are all special cases of our method. In particular, our method permits flexible combinations of sparsity and smoothness that lead to improvements in feature selection and signal recovery as well as more interpretable PCA factors. We demonstrate our method on simulated data and a neuroimaging example on EEG data. This work provides a unified framework for regularized PCA that can form the foundation for a cohesive approach to regularization in high-dimensional multivariate analysis.


Compressed Sensing for Block-Sparse Smooth Signals

arXiv.org Machine Learning

Compressed sensing [1], [2] is one of the most exciting topics of present-day signal processing. Signal reconstruction from its low-dimensional representation becomes a possibility, given the sparse nature of the signal and, incoherence and/or restricted isometry property (RIP) [2] of the sensing/measurement process. In this regard, a number of approaches can be used, e.g., basis pursuit (BP) [3], least absolute shrinkage and selection operator (LASSO) [4] and greedy algorithms [5]. In order to exploit the structure of the signal being sensed, a number of variants of LASSO have become popular, e.g., group LASSO (G-LASSO) [6], sparse group LASSO (SG-LASSO) [7] and fused LASSO (F-LASSO) [8], etc. In this letter we propose new LASSO formulations to handle block sparse smooth signals.


Structure Learning of Probabilistic Logic Programs by Searching the Clause Space

arXiv.org Artificial Intelligence

Learning probabilistic logic programming languages is receiving an increasing attention and systems are available for learning the parameters (PRISM, LeProbLog, LFI-ProbLog and EMBLEM) or both the structure and the parameters (SEM-CP-logic and SLIPCASE) of these languages. In this paper we present the algorithm SLIPCOVER for "Structure LearnIng of Probabilistic logic programs by searChing OVER the clause space". It performs a beam search in the space of probabilistic clauses and a greedy search in the space of theories, using the log likelihood of the data as the guiding heuristics. To estimate the log likelihood SLIPCOVER performs Expectation Maximization with EMBLEM. The algorithm has been tested on five real world datasets and compared with SLIPCASE, SEM-CP-logic, Aleph and two algorithms for learning Markov Logic Networks (Learning using Structural Motifs (LSM) and ALEPH++ExactL1). SLIPCOVER achieves higher areas under the precision-recall and ROC curves in most cases.


Task-Driven Dictionary Learning

arXiv.org Machine Learning

Modeling data with linear combinations of a few elements from a learned dictionary has been the focus of much recent research in machine learning, neuroscience and signal processing. For signals such as natural images that admit such sparse representations, it is now well established that these models are well suited to restoration tasks. In this context, learning the dictionary amounts to solving a large-scale matrix factorization problem, which can be done efficiently with classical optimization tools. The same approach has also been used for learning features from data for other purposes, e.g., image classification, but tuning the dictionary in a supervised way for these tasks has proven to be more difficult. In this paper, we present a general formulation for supervised dictionary learning adapted to a wide variety of tasks, and present an efficient algorithm for solving the corresponding optimization problem. Experiments on handwritten digit classification, digital art identification, nonlinear inverse image problems, and compressed sensing demonstrate that our approach is effective in large-scale settings, and is well suited to supervised and semi-supervised classification, as well as regression tasks for data that admit sparse representations.


Exponentially Fast Parameter Estimation in Networks Using Distributed Dual Averaging

arXiv.org Machine Learning

In this paper we present an optimization-based view of distributed parameter estimation and observational social learning in networks. Agents receive a sequence of random, independent and identically distributed (i.i.d.) signals, each of which individually may not be informative about the underlying true state, but the signals together are globally informative enough to make the true state identifiable. Using an optimization-based characterization of Bayesian learning as proximal stochastic gradient descent (with Kullback-Leibler divergence from a prior as a proximal function), we show how to efficiently use a distributed, online variant of Nesterov's dual averaging method to solve the estimation with purely local information. When the true state is globally identifiable, and the network is connected, we prove that agents eventually learn the true parameter using a randomized gossip scheme. We demonstrate that with high probability the convergence is exponentially fast with a rate dependent on the KL divergence of observations under the true state from observations under the second likeliest state. Furthermore, our work also highlights the possibility of learning under continuous adaptation of network which is a consequence of employing constant, unit stepsize for the algorithm.


Spectral Clustering with Imbalanced Data

arXiv.org Machine Learning

Spectral clustering is sensitive to how graphs are constructed from data particularly when proximal and imbalanced clusters are present. We show that Ratio-Cut (RCut) or normalized cut (NCut) objectives are not tailored to imbalanced data since they tend to emphasize cut sizes over cut values. We propose a graph partitioning problem that seeks minimum cut partitions under minimum size constraints on partitions to deal with imbalanced data. Our approach parameterizes a family of graphs, by adaptively modulating node degrees on a fixed node set, to yield a set of parameter dependent cuts reflecting varying levels of imbalance. The solution to our problem is then obtained by optimizing over these parameters. We present rigorous limit cut analysis results to justify our approach. We demonstrate the superiority of our method through unsupervised and semi-supervised experiments on synthetic and real data sets.


Elementos de ingenier\'ia de explotaci\'on de la informaci\'on aplicados a la investigaci\'on tributaria fiscal

arXiv.org Artificial Intelligence

By introducing elements of information mining to tax analysis, by means of data mining software and advanced computational concepts of artificial intelligence, the problem of tax evader's crime against public property has been addressed. Through an empirical approach from a hypothetical case of use, induction algorithms, neural networks and bayesian networks are applied to determine the feasibility of its heuristic application by the tax public administrator. Different strategies are explored to facilitate the work of local and regional federal tax inspectors, considering their limited computational capabilities, but equally effective for those social scientist committed to handcrafting tax research. ----- Apresentando a introdu\c{c}\~ao de elementos de explora\c{c}\~ao de informa\c{c}\~oes para an\'alise fiscal, por meio de software de minera\c{c}\~ao de dados e conceitos avan\c{c}ados computacionais de intelig\^encia artificial, foi abordado o problema do crime de sonegador fiscal contra o patrim\^onio p\'ublico. Atrav\'es de uma abordagem emp\'irica a partir de um caso hipot\'etico de uso, os algoritmos de indu\c{c}\~ao, redes neurais e redes bayesianas s\~ao aplicados para determinar a viabilidade de sua aplica\c{c}\~ao heur\'istica pelo administrador p\'ublico tribut\'ario. Diferentes estrat\'egias s\~ao exploradas para facilitar o trabalho dos inspectores tribut\'arios federais locais e regionais, tendo em conta as suas capacidades computacionais limitados, mas igualmente eficaz para aqueles cientista social comprometido com a investiga\c{c}\~ao fiscal.