 Perceptrons



Neural Networks Learning the Concept of Influence in Go

AAAI Conferences

This paper describes an intelligent agent that uses an MLP (Multi-Layer Perceptron) Neural Network (NN) to evaluate a game state in the game of Go based exclusively on an influence analysis. The NN learns the concept of Influence, which is domain specific to the game of Go. The learned function is used to evaluate board states in order to predict which player will win the match. The results show that, in later stages, the NN can achieve an accuracy of up to 89.3% when predicting the winner of the game. As future work the authors propose incorporating several improvements to the NN and integrating it into intelligent player agents for the game of Go, such as Fuego and GnuGo.


No Match for Machine Learning: How the Future of Computing is Solving Difficult Problems from Terrorism to Cancer to Climate Change

#artificialintelligence

Machine learning and the artificial intelligence that it promises to deliver are clearly here to stay. The only remaining question is what these technologies will conquer next. The algorithms and techniques that have been exciting researchers and practitioners over the last few years are being dramatically improved, tuned for perfection, and in some cases completely replaced by a new generation of increasingly powerful algorithms. Investments in areas such as deep learning, and the promise of building multi-layer perceptrons (networks of artificial neurons) to solve a host of challenging problems, have started to move out of dusty offices and laboratories toward the center of our economy in areas such as healthcare, marketing, communications, finance, energy, education, and even public safety. The number of useful applications is growing rapidly and the early investments by technology giants and influential research institutions are paying off nicely.


Machine Learning FAQ

#artificialintelligence

That's an interesting question, and I will try to answer it in a very general way. The tl;dr version is: Deep learning is essentially a set of techniques that help us parameterize deep neural network structures, that is, neural networks with many, many layers and parameters. And if we are interested, a more concrete example: Let's start with multi-layer perceptrons (MLPs) … On a tangent: The term "perceptron" in MLPs may be a bit confusing since we don't really want only linear neurons in our network. Using MLPs, we want to learn complex functions to solve non-linear problems. Thus, our network is conventionally composed of one or multiple "hidden" layers that connect the input and output layer.
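To make the role of the hidden layer concrete, here is a minimal sketch of an MLP that computes XOR, a problem that no single linear unit can solve. The weights are hand-picked for illustration (an assumption, not learned values), and the step activation and the names `step` and `xor_mlp` are hypothetical choices for this example.

```python
def step(z):
    """Threshold non-linearity: 1 if the input is positive, else 0."""
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    """Two-input MLP with one hidden layer of two units.

    The weights below are hand-picked for illustration, not learned.
    """
    h_or = step(x1 + x2 - 0.5)    # hidden unit 1: fires if x1 OR x2
    h_and = step(x1 + x2 - 1.5)   # hidden unit 2: fires if x1 AND x2
    # Output unit: fires when OR is on but AND is off, which is XOR.
    return step(h_or - h_and - 0.5)


# Evaluate the network on all four binary inputs.
xor_table = {(a, b): xor_mlp(a, b) for a in (0, 1) for b in (0, 1)}
print(xor_table)
```

Removing the hidden layer would leave a single threshold unit, which can only draw one straight line through the input space and therefore cannot separate the XOR cases.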


Inferring Interpersonal Relations in Narrative Summaries

AAAI Conferences

Characterizing relationships between people is fundamental for the understanding of narratives. In this work, we address the problem of inferring the polarity of relationships between people in narrative summaries. We formulate the problem as a joint structured prediction for each narrative, and present a general model that combines evidence from linguistic and semantic features, as well as features based on the structure of the social community in the text. We additionally provide a clustering-based approach that can exploit regularities in narrative types, e.g., to learn an affinity for love triangles in romantic stories. On a dataset of movie summaries from Wikipedia, our structured models provide more than 30% error reduction over a competitive baseline that considers pairs of characters in isolation.


Sparse Perceptron Decision Tree for Millions of Dimensions

AAAI Conferences

Due to their nonlinear but highly interpretable representations, decision tree (DT) models have attracted a lot of attention from researchers. However, DT models usually suffer from the curse of dimensionality, and their performance degrades when there are many noisy features. To address these issues, this paper first presents a novel data-dependent generalization error bound for the perceptron decision tree (PDT), which provides the theoretical justification to learn a sparse linear hyperplane in each decision node and to prune the tree. Following our analysis, we introduce the notion of sparse perceptron decision node (SPDN) with a budget constraint on the weight coefficients, and propose a sparse perceptron decision tree (SPDT) algorithm to achieve nonlinear prediction performance. To avoid generating an unstable and complicated decision tree and to improve the generalization of the SPDT, we present a pruning strategy by learning classifiers to minimize cross-validation errors on each SPDN. Extensive empirical studies verify that our SPDT is more resilient to noisy features and effectively generates a small, yet accurate decision tree. Compared with state-of-the-art DT methods and SVM, our SPDT achieves better generalization performance on ultrahigh dimensional problems with more than 1 million features.


Enhanced perceptrons using contrastive biclusters

arXiv.org Machine Learning

Perceptrons are neuronal devices capable of fully discriminating linearly separable classes. Although straightforward to implement and train, their applicability is usually hindered by non-trivial requirements imposed by real-world classification problems. Therefore, several approaches, such as kernel perceptrons, have been conceived to counteract such difficulties. In this paper, we investigate an enhanced perceptron model based on the notion of contrastive biclusters. From this perspective, a good discriminative bicluster comprises a subset of data instances belonging to one class that show high coherence across a subset of features and high differentiation from the nearest instances of the other class under the same features (referred to as its contrastive bicluster). On each local subspace associated with a pair of contrastive biclusters, a perceptron is trained, and the model with the highest area under the receiver operating characteristic curve (AUC) is selected as the final classifier. Experiments conducted on a range of data sets, including those related to a difficult biosignal classification problem, show that the proposed variant can indeed be very useful, outperforming standard and kernel perceptrons in most cases in terms of accuracy and AUC.
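For readers unfamiliar with the baseline, the following is a minimal sketch of the standard perceptron learning rule on a linearly separable toy set. It illustrates only the classical perceptron, not the bicluster-based variant proposed in the paper. The data points and the names `train_perceptron` and `predict` are assumptions made up for this example.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def train_perceptron(samples, labels, epochs=100):
    """Classical perceptron rule; labels must be +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            # A point is misclassified when the signed activation is <= 0.
            if y * (dot(w, x) + b) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full clean pass means the data is separated
            break
    return w, b


def predict(w, b, x):
    return 1 if dot(w, x) + b > 0 else -1


# Toy, linearly separable data (made up for illustration).
X = [(2.0, 1.0), (3.0, 2.0), (-1.0, -2.0), (-2.0, -1.0)]
y = [1, 1, -1, -1]

w, b = train_perceptron(X, y)
predictions = [predict(w, b, x) for x in X]
print(w, b, predictions)
```

On data that is not linearly separable the loop never reaches a clean pass and simply stops after `epochs` iterations, which is the limitation that kernel perceptrons and similar extensions try to address.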


Variational Auto-encoded Deep Gaussian Processes

arXiv.org Machine Learning

We develop a scalable deep non-parametric generative model by augmenting deep Gaussian processes with a recognition model. Inference is performed in a novel scalable variational framework where the variational posterior distributions are reparametrized through a multilayer perceptron. The key aspect of this reformulation is that it prevents the proliferation of variational parameters, which would otherwise grow linearly in proportion to the sample size. We derive a new formulation of the variational lower bound that allows us to distribute most of the computation in a way that enables us to handle datasets of the size of mainstream deep learning tasks. We show the efficacy of the method on a variety of challenges including deep unsupervised learning and deep Bayesian optimization.


Quantum Perceptron Models

arXiv.org Machine Learning

Quantum computation is an emerging technology that utilizes quantum effects to achieve significant, and in some cases exponential, speedups of algorithms over their classical counterparts. The growing importance of machine learning has in recent years led to a host of studies that investigate the promise of quantum computers for machine learning [1, 2, 12, 13, 17, 21-23]. While a number of important quantum speedups have been found, the majority of these speedups are due to replacing a classical subroutine with an equivalent albeit faster quantum algorithm. The true potential of quantum algorithms may therefore remain underexploited since quantum algorithms have been constrained to follow the same methodology behind traditional machine learning methods [2, 7, 22]. Here we consider an alternate approach: we devise a new machine learning algorithm that is tailored to the speedups that quantum computers can provide.


Towards A Deeper Geometric, Analytic and Algorithmic Understanding of Margins

arXiv.org Artificial Intelligence

Given a matrix $A$, a linear feasibility problem (of which linear classification is a special case) aims to find a solution to a primal problem $w: A^Tw > \textbf{0}$ or a certificate for the dual problem, which is a probability distribution $p: Ap = \textbf{0}$. Inspired by the continued importance of "large-margin classifiers" in machine learning, this paper studies a condition measure of $A$ called its \textit{margin} that determines the difficulty of both the above problems. To aid geometrical intuition, we first establish new characterizations of the margin in terms of relevant balls, cones and hulls. Our second contribution is analytical, where we present generalizations of Gordan's theorem, and variants of Hoffman's theorems, both using margins. We end by proving some new results on a classical iterative scheme, the Perceptron, whose convergence rate famously depends on the margin. Our results are relevant for a deeper understanding of margin-based learning and proving convergence rates of iterative schemes, apart from providing a unifying perspective on this vast topic.
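The dependence of the Perceptron on the margin can be illustrated with the classical mistake bound: if some unit-norm separator achieves margin $\gamma$ on points of norm at most $R$, the Perceptron makes at most $(R/\gamma)^2$ updates. The sketch below checks this textbook bound on toy data. It does not reproduce the paper's new results, and the data, the reference separator `reference_w`, and the function names are assumptions chosen for illustration.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def margin_of(w, samples, labels):
    """Smallest signed distance of any labeled point to the hyperplane w.x = 0."""
    norm = math.sqrt(dot(w, w))
    return min(y * dot(w, x) / norm for x, y in zip(samples, labels))


def perceptron_mistakes(samples, labels, max_epochs=1000):
    """Run the bias-free Perceptron and count how many updates it makes."""
    w = [0.0] * len(samples[0])
    mistakes = 0
    for _ in range(max_epochs):
        clean_pass = True
        for x, y in zip(samples, labels):
            if y * dot(w, x) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
                clean_pass = False
        if clean_pass:
            break
    return w, mistakes


# Toy data separable through the origin (made up for illustration).
X = [(1.0, 0.2), (0.8, -0.5), (-1.0, 0.3), (-0.6, -0.4)]
y = [1, 1, -1, -1]

reference_w = (1.0, 0.0)                   # a known separator, assumed for the demo
gamma = margin_of(reference_w, X, y)       # margin achieved by that separator
radius = max(math.sqrt(dot(x, x)) for x in X)
bound = (radius / gamma) ** 2              # classical mistake bound

w, mistakes = perceptron_mistakes(X, y)
print(f"margin={gamma:.3f} radius={radius:.3f} bound={bound:.2f} mistakes={mistakes}")
```

Because `reference_w` is only one particular separator, its margin is a lower bound on the best achievable margin, so the computed bound may be looser than necessary but remains valid.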