Demo: Kernel methods for machine learning applications · Issue #1 · ohbm/OpenScienceRoom2019

#artificialintelligence

This library fills an important void in the ever-growing Python-based machine learning ecosystem, where users are often limited to a few predefined kernels without the ability to customize or extend them for their own applications. The library defines the KernelMatrix class, which is central to all of its kernel methods. As the key bridge between input data and kernel learning algorithms, it is designed to be highly usable and extensible to different applications and data types. Implemented kernel operations include normalization, centering, products, alignment, linear combinations and ranking. Convenience classes, such as Kernel{Set,Bucket}, are designed for easy management of large collections of kernels. Handling diverse kernels and their fusion is necessary for automatic kernel selection in applications such as Multiple Kernel Learning. Besides numerical kernels, we designed this library to provide categorical, string and graph kernels, with the same attractive properties: an intuitive and highly testable API. Beyond these, we aim to provide a deeply extensible framework for arbitrary input data types, such as sequences and trees, via pyradigm. Moreover, drop-in Estimator classes are provided for seamless usage in the scikit-learn ecosystem.
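
As a concrete illustration of two of the operations listed above, here is how kernel centering and cosine normalization are typically computed. This is a minimal NumPy sketch of the standard kernel-methods math, not the library's actual API.

```python
# Illustrative sketch of kernel centering and cosine normalization,
# written directly in NumPy (generic math, not the library's API).
import numpy as np

def gaussian_kernel_matrix(X, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    return np.exp(-gamma * sq)

def center_kernel(K):
    """Double-center K in feature space: K_c = H K H, H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def normalize_kernel(K):
    """Cosine-normalize so that K[i, i] == 1 for all i."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

X = np.random.default_rng(0).normal(size=(5, 3))
K = normalize_kernel(gaussian_kernel_matrix(X))   # unit diagonal
K_centered = center_kernel(K)                     # zero-mean in feature space
```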


Generative Adversarial Networks - Part IV

#artificialintelligence

This is Part 4 of a short series of posts introducing and building generative adversarial networks, known as GANs. Previously: Part 1 introduced the idea of adversarial learning and we started to build the machinery of a GAN implementation. In Part 2 we extended our code to learn a simple 1-dimensional pattern, 1010. In Part 3 we developed our code to learn to generate 2-dimensional grey-scale images that look like handwritten digits. In this post we'll extend our code again to learn to generate full-colour images, learning from a dataset of celebrity face photos. The ideas should be the same, and the code shouldn't need much new added to it.

Celebrity Faces

A popular dataset for human faces is the celebA dataset, which contains 202,599 photos annotated with some features. A revised version was developed, called the aligned celebA dataset, in which the location of the eyes is consistent across the dataset and the orientation of the heads is vertical, so the mouth is below the eyes where possible. The following shows 6 samples from the dataset.
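
For readers who want to load those samples themselves, here is a minimal sketch of reading the aligned celebA dataset with torchvision; the crop and resize sizes are illustrative choices, not values taken from the post.

```python
# Minimal sketch of loading aligned CelebA faces for GAN training.
import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.CenterCrop(128),   # crop around the (aligned) face
    transforms.Resize(64),        # small images keep training cheap
    transforms.ToTensor(),        # HWC uint8 -> CHW float in [0, 1]
])

celeba = datasets.CelebA(root="data", split="train",
                         transform=transform, download=True)
loader = torch.utils.data.DataLoader(celeba, batch_size=64, shuffle=True)
images, _ = next(iter(loader))    # images: (64, 3, 64, 64)
```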


Graph Convolutional Networks for Geometric Deep Learning

#artificialintelligence

Graph convolutions are very different from the graph embedding methods covered in the previous installment. Instead of transforming a graph to a lower dimension, convolutional methods operate on the input graph itself, leaving its structure and features unchanged. Since the graph remains closest to its original form in a higher dimension, the relational inductive bias is much stronger. Every machine learning algorithm carries some type of inductive bias. In vanilla CNNs, for example, the minimum-features inductive bias states that unless there is good evidence that a feature is useful, it should be discarded.
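
To make the contrast concrete, below is a sketch of the widely used graph-convolution propagation rule of Kipf and Welling (2017), H' = σ(D̃^{-1/2} Ã D̃^{-1/2} H W). Note that the adjacency structure A passes through unchanged, which is the point made above; this is the standard formulation, not necessarily the exact variant the article discusses.

```python
# One graph-convolution layer: only node features are transformed;
# the graph structure A is left untouched. Shapes are illustrative.
import numpy as np

def gcn_layer(A, H, W):
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)                   # degrees of A with self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^{-1/2}
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0)  # ReLU

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # path graph
H = np.random.default_rng(0).normal(size=(3, 4))  # 3 nodes, 4 features
W = np.random.default_rng(1).normal(size=(4, 2))  # project to 2 dims
H_next = gcn_layer(A, H, W)                       # still one row per node
```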


PyTorch internals : Inside 245-5D

#artificialintelligence

This post is a long-form essay version of a talk about PyTorch internals that I gave at the PyTorch NYC meetup on May 14, 2019. Today I want to talk about the internals of PyTorch. This talk is for those of you who have used PyTorch, and thought to yourself, "It would be great if I could contribute to PyTorch," but were scared off by PyTorch's behemoth of a C++ codebase. I'm not going to lie: the PyTorch codebase can be a bit overwhelming at times. The purpose of this talk is to put a map in your hands: to tell you about the basic conceptual structure of a "tensor library that supports automatic differentiation", and to give you some tools and tricks for finding your way around the codebase. I'm going to assume that you've written some PyTorch before, but haven't necessarily delved deeper into how a machine learning library is written. The talk is in two parts: in the first part, I'm going to introduce you to the conceptual universe of a tensor library.
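
As a taste of that conceptual universe, here is a small illustration of one idea the talk covers: a PyTorch tensor is a view over flat storage described by sizes and strides, so operations like transpose are reinterpretations of the same memory rather than copies.

```python
# Strides describe how many storage elements to skip per dimension.
import torch

t = torch.arange(6).reshape(2, 3)
print(t.stride())        # (3, 1): step 3 elements per row, 1 per column
print(t.t().stride())    # (1, 3): transpose just swaps the strides
print(t.t().data_ptr() == t.data_ptr())  # True: same underlying storage
```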


Nonparametric Density Estimation for Stochastic Optimization with an Observable State Variable

Neural Information Processing Systems

We study convex stochastic optimization problems where a noisy objective function value is observed after a decision is made. Many stochastic optimization problems depend on an exogenous state variable which affects the shape of the objective function. Currently, there is no general-purpose algorithm to solve this class of problems. We use nonparametric density estimation of the joint distribution of state-outcome pairs to create weights for previous observations. Observations similar to the current state are used to create a convex, deterministic approximation of the objective function.
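
A generic sketch of this weighting idea, in the style of Nadaraya-Watson kernel weighting (not necessarily the paper's exact estimator), is below: past observations are weighted by how close their state was to the current state, and the resulting convex combination gives a deterministic estimate.

```python
# Kernel weights over past states; the bandwidth is an illustrative choice.
import numpy as np

def state_weights(past_states, current_state, bandwidth=0.5):
    """Larger weights for observations made in states similar to now."""
    sq = np.sum((past_states - current_state) ** 2, axis=1)
    w = np.exp(-sq / (2.0 * bandwidth ** 2))
    return w / w.sum()

rng = np.random.default_rng(0)
past_states = rng.normal(size=(100, 2))   # exogenous states observed so far
outcomes = rng.normal(size=100)           # noisy objective values observed

w = state_weights(past_states, current_state=np.zeros(2))
estimate = w @ outcomes                   # convex combination of past outcomes
```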


Brain-Machine Interfaces Could Give Us All Superpowers

#artificialintelligence

One rainy day, Bill was riding his bicycle when the mail truck in front of him suddenly stopped. The crash left him paralyzed from the chest down. His autonomy, or what's left of it, comes from voice controls that let him lower and lift the blinds in his room or adjust the angle of his motorized bed. For everything else, he relies on round-the-clock care. Bill doesn't know Anne, who has Parkinson's disease; her hands shake when she tries to apply makeup or weed the garden.


Automatic Emotion Recognition (AER) System based on Two-Level Ensemble of Lightweight Deep CNN Models

arXiv.org Machine Learning

Emotions play a crucial role in human interaction, health care, and security investigation and monitoring. Automatic emotion recognition (AER) from electroencephalogram (EEG) signals is an effective method for decoding real emotions, independent of body gestures, but it is a challenging problem. Several AER systems based on traditional hand-engineered approaches have been proposed, but their performance is poor. Motivated by the outstanding performance of deep learning (DL) in many recognition tasks, we introduce an AER system (Deep-AER) based on EEG brain signals using DL. A DL model involves a large number of learnable parameters, and its training needs a large dataset of EEG signals, which is difficult to acquire for the AER problem. To overcome this problem, we propose a lightweight pyramidal one-dimensional convolutional neural network (LP-1D-CNN) model, which involves a small number of learnable parameters. Using LP-1D-CNN, we build a two-level ensemble model. In the first level of the ensemble, each channel is scanned incrementally by LP-1D-CNN to generate predictions, which are fused using majority vote. The second level of the ensemble combines the predictions of all channels of an EEG signal using majority vote to detect the emotional state. We validated the effectiveness and robustness of Deep-AER using DEAP, a benchmark dataset for emotion recognition research. The results indicate that the frontal (FRONT) region plays a dominant role in AER; over this region, Deep-AER achieved accuracies of 98.43% and 97.65% on two AER problems, i.e., high valence vs. low valence (HV vs. LV) and high arousal vs. low arousal (HA vs. LA), respectively. The comparison reveals that Deep-AER outperforms state-of-the-art systems by a large margin. The Deep-AER system will be helpful in health-care monitoring and security investigations.
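
A minimal sketch of the two-level majority-vote fusion described above is given below; the channel and segment counts, and the binary labels, are hypothetical placeholders for the CNN's per-segment outputs.

```python
# Two-level majority voting: per-channel fusion, then cross-channel fusion.
import numpy as np

def majority_vote(labels):
    """Return the most frequent label in a 1-D array of predictions."""
    values, counts = np.unique(labels, return_counts=True)
    return values[np.argmax(counts)]

# Level 1: preds[c, s] = predicted class for segment s of channel c.
rng = np.random.default_rng(0)
preds = rng.integers(0, 2, size=(32, 10))        # 32 channels, 10 segments
channel_votes = np.array([majority_vote(p) for p in preds])

# Level 2: fuse the per-channel decisions into one emotion label.
emotion = majority_vote(channel_votes)
```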


Bayesian Optimization for Policy Search via Online-Offline Experimentation

arXiv.org Machine Learning

Online field experiments are the gold-standard way of evaluating changes to real-world interactive machine learning systems. Yet our ability to explore complex, multi-dimensional policy spaces - such as those found in recommendation and ranking problems - is often constrained by the limited number of experiments that can be run simultaneously. To alleviate these constraints, we augment online experiments with an offline simulator and apply multi-task Bayesian optimization to tune live machine learning systems. We describe practical issues that arise in these types of applications, including biases that arise from using a simulator and assumptions for the multi-task kernel. We measure empirical learning curves which show substantial gains from including data from biased offline experiments, and show how these learning curves are consistent with theoretical results for multi-task Gaussian process generalization. We find that improved kernel inference is a significant driver of multi-task generalization. Finally, we show several examples of Bayesian optimization efficiently tuning a live machine learning system by combining offline and online experiments.
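
As background on the multi-task kernel assumptions mentioned above, below is a sketch of the standard intrinsic-coregionalization construction for sharing information between online and offline observations: K((x, t), (x', t')) = k_x(x, x') · B[t, t'], where B encodes how correlated the tasks are. This is the generic form, not necessarily the paper's exact kernel; the correlation values are illustrative.

```python
# Multi-task GP kernel: parameter similarity times task correlation.
import numpy as np

def rbf(X1, X2, lengthscale=1.0):
    sq = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-0.5 * sq / lengthscale**2)

def multitask_kernel(X1, t1, X2, t2, B, lengthscale=1.0):
    """t1, t2 are task indices (0 = online, 1 = offline simulator)."""
    return rbf(X1, X2, lengthscale) * B[np.ix_(t1, t2)]

B = np.array([[1.0, 0.8],     # offline results are informative about
              [0.8, 1.0]])    # online ones, but not perfectly correlated
X = np.random.default_rng(0).normal(size=(6, 3))   # 6 policy configurations
t = np.array([0, 0, 1, 1, 1, 1])                   # 2 online, 4 offline
K = multitask_kernel(X, t, X, t, B)                # joint Gram matrix
```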


Graph Matching Networks for Learning the Similarity of Graph Structured Objects

arXiv.org Machine Learning

This paper addresses the challenging problem of retrieval and matching of graph structured objects, and makes two key contributions. First, we demonstrate how Graph Neural Networks (GNN), which have emerged as an effective model for various supervised prediction problems defined on structured data, can be trained to produce embeddings of graphs in vector spaces that enable efficient similarity reasoning. Second, we propose a novel Graph Matching Network model that, given a pair of graphs as input, computes a similarity score between them by jointly reasoning over the pair through a new cross-graph attention-based matching mechanism. We demonstrate the effectiveness of our models on different domains, including the challenging problem of control-flow-graph based function similarity search, which plays an important role in the detection of vulnerabilities in software systems. The experimental analysis demonstrates that our models are not only able to exploit structure in the context of similarity learning, but also outperform domain-specific baseline systems that have been carefully hand-engineered for these problems.
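
A simplified sketch of the cross-graph attention idea is given below, with illustrative shapes and a plain dot-product similarity standing in for the paper's learned variant: each node of one graph attends over the nodes of the other, and the gap between a node and its attention-weighted match becomes a matching signal.

```python
# Cross-graph attention matching between two sets of node embeddings.
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_graph_matching(H1, H2):
    """H1: (n1, d) node embeddings of graph 1; H2: (n2, d) of graph 2."""
    scores = H1 @ H2.T                 # similarity of every node pair
    attn = softmax(scores, axis=1)     # each graph-1 node over graph 2
    matched = attn @ H2                # soft "closest" node in graph 2
    return H1 - matched                # large where no good match exists

rng = np.random.default_rng(0)
mu = cross_graph_matching(rng.normal(size=(4, 8)), rng.normal(size=(5, 8)))
```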


Graph Kernels: A Survey

arXiv.org Machine Learning

Graph kernels have attracted a lot of attention during the last decade, and have evolved into a rapidly developing branch of learning on structured data. During the past 20 years, the considerable research activity that occurred in the field resulted in the development of dozens of graph kernels, each focusing on specific structural properties of graphs. Graph kernels have proven successful in a wide range of domains, ranging from social networks to bioinformatics. The goal of this survey is to provide a unifying view of the literature on graph kernels. In particular, we present a comprehensive overview of a wide range of graph kernels. Furthermore, we perform an experimental evaluation of several of those kernels on publicly available datasets, and provide a comparative study. Finally, we discuss key applications of graph kernels, and outline some challenges that remain to be addressed.
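
As an example of the kind of kernel the survey covers, here is a compact sketch of the Weisfeiler-Lehman subtree kernel, one of the best-known members of the family: node labels are iteratively refined by hashing each label together with the sorted multiset of neighbor labels, and graphs are compared via label-histogram dot products.

```python
# Weisfeiler-Lehman subtree kernel on small labeled graphs.
from collections import Counter

def wl_iteration(adj, labels):
    """adj: {node: [neighbors]}; labels: {node: label}. One refinement."""
    return {v: hash((labels[v], tuple(sorted(labels[u] for u in adj[v]))))
            for v in adj}

def wl_kernel(adj1, lab1, adj2, lab2, iterations=2):
    """Sum label-histogram dot products across refinement iterations."""
    k = 0
    for _ in range(iterations + 1):
        c1, c2 = Counter(lab1.values()), Counter(lab2.values())
        k += sum(c1[l] * c2[l] for l in c1)
        lab1, lab2 = wl_iteration(adj1, lab1), wl_iteration(adj2, lab2)
    return k

# Two small labeled graphs: a triangle and a 3-node path.
tri = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
labels = {0: "a", 1: "a", 2: "b"}
print(wl_kernel(tri, dict(labels), path, dict(labels)))
```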