Goto

Collaborating Authors

 Deep Learning


A Deep Generative Deconvolutional Image Model

arXiv.org Machine Learning

A deep generative model is developed for representation and analysis of images, based on a hierarchical convolutional dictionary-learning framework. Stochastic {\em unpooling} is employed to link consecutive layers in the model, yielding top-down image generation. A Bayesian support vector machine is linked to the top-layer features, yielding max-margin discrimination. Deep deconvolutional inference is employed when testing, to infer the latent features, and the top-layer features are connected with the max-margin classifier for discrimination tasks. The model is efficiently trained using a Monte Carlo expectation-maximization (MCEM) algorithm, with implementation on graphical processor units (GPUs) for efficient large-scale learning, and fast testing. Excellent results are obtained on several benchmark datasets, including ImageNet, demonstrating that the proposed model achieves results that are highly competitive with similarly sized convolutional neural networks.


Implementation of deep learning algorithm for automatic detection of brain tumors using intraoperative IR-thermal mapping data

arXiv.org Machine Learning

The efficiency of deep machine learning for automatic delineation of tumor areas has been demonstrated for intraoperative neuronavigation using active IR-mapping with the use of the cold test. The proposed approach employs a matrix IR-imager to remotely register the space-time distribution of surface temperature pattern, which is determined by the dynamics of local cerebral blood flow. The advantages of this technique are non-invasiveness, zero risks for the health of patients and medical staff, low implementation and operational costs, ease and speed of use. Traditional IR-diagnostic technique has a crucial limitation - it involves a diagnostician who determines the boundaries of tumor areas, which gives rise to considerable uncertainty, which can lead to diagnosis errors that are difficult to control. The current study demonstrates that implementing deep learning algorithms allows to eliminate the explained drawback.


DeepWriterID: An End-to-end Online Text-independent Writer Identification System

arXiv.org Machine Learning

--Owing to the rapid growth of touchscreen mobile terminals and pen-based interfa ces, handwriting-based writer identification systems are attracting increasing attention for personal authentication and digital forensics. However, most studies on writer identification have not been satisfying because of the insufficiency of data and th e difficulty of designing good features for various conditions of handwriting samples. Hence, we introduce an end-to-end system called DeepWriterID that employs a deep convolutional neural network (CNN) to address these problems. A key feature of DeepWriterID is a new method we are proposing, called DropSegment. It is designed to achieve data augmentation and to improve the generalized applicability of CNN. For sufficient feature representation, we further introduce path-signature feature maps to impr ove performance. Experiments were conducted on the NLPR handwriting database. Even though we only use pen-position information in the pen-down state of the given handwriting samples, we achieved new state-of-the-art identification rates of 95.72% for Chinese text and 98.51% for English text.


Action-Conditional Video Prediction using Deep Networks in Atari Games

arXiv.org Artificial Intelligence

Motivated by vision-based reinforcement learning (RL) problems, in particular Atari games from the recent benchmark Aracade Learning Environment (ALE), we consider spatio-temporal prediction problems where future (image-)frames are dependent on control variables or actions as well as previous frames. While not composed of natural scenes, frames in Atari games are high-dimensional in size, can involve tens of objects with one or more objects being controlled by the actions directly and many other objects being influenced indirectly, can involve entry and departure of objects, and can involve deep partial observability. We propose and evaluate two deep neural network architectures that consist of encoding, action-conditional transformation, and decoding layers based on convolutional neural networks and recurrent neural networks. Experimental results show that the proposed architectures are able to generate visually-realistic frames that are also useful for control over approximately 100-step action-conditional futures in some games. To the best of our knowledge, this paper is the first to make and evaluate long-term predictions on high-dimensional video conditioned by control inputs.


Variational Dropout and the Local Reparameterization Trick

arXiv.org Machine Learning

We investigate a local reparameterizaton technique for greatly reducing the variance of stochastic gradients for variational Bayesian inference (SGVB) of a posterior over model parameters, while retaining parallelizability. This local reparameterization translates uncertainty about global parameters into local noise that is independent across datapoints in the minibatch. Such parameterizations can be trivially parallelized and have variance that is inversely proportional to the minibatch size, generally leading to much faster convergence. Additionally, we explore a connection with dropout: Gaussian dropout objectives correspond to SGVB with local reparameterization, a scale-invariant prior and proportionally fixed posterior variance. Our method allows inference of more flexibly parameterized posteriors; specifically, we propose variational dropout, a generalization of Gaussian dropout where the dropout rates are learned, often leading to better models. The method is demonstrated through several experiments.


Using machine learning for medium frequency derivative portfolio trading

arXiv.org Machine Learning

Abstract--We use machine learning for designing a medium frequency trading strategy for a portfolio of 5 year and 10 year US Treasury note futures. We formulate this as a classification problem where we predict the weekly direction of movement of the portfolio using features extracted from a deep belief network trained on technical indicators of the portfolio constituents. The experimentation shows that the resulting pipeline is effective in making a profitable trade. I. INTRODUCTION AND RELATED WORK Machine learning application in finance is a challenging problem owing to low signal to noise ratio. Moreover, domain expertise is essential for engineering features which assist in solving an appropriate classification or regression problem.


Domain Adaptation and Transfer Learning in StochasticNets

arXiv.org Machine Learning

Transfer learning is a recent field of machine learning research that aims to resolve the challenge of dealing with insufficient training data in the domain of interest. This is a particular issue with traditional deep neural networks where a large amount of training data is needed. Recently, StochasticNets was proposed to take advantage of sparse connectivity in order to decrease the number of parameters that needs to be learned, which in turn may relax training data size requirements. In this paper, we study the efficacy of transfer learning on StochasticNet frameworks. Experimental results show ~7% improvement on StochasticNet performance when the transfer learning is applied in training step.


Learning optimal nonlinearities for iterative thresholding algorithms

arXiv.org Machine Learning

Iterative shrinkage/thresholding algorithm (ISTA) is a well-studied method for finding sparse solutions to ill-posed inverse problems. In this letter, we present a data-driven scheme for learning optimal thresholding functions for ISTA. The proposed scheme is obtained by relating iterations of ISTA to layers of a simple deep neural network (DNN) and developing a corresponding error backpropagation algorithm that allows to fine-tune the thresholding functions. Simulations on sparse statistical signals illustrate potential gains in estimation quality due to the proposed data adaptive ISTA.


Active Sampler: Light-weight Accelerator for Complex Data Analytics at Scale

arXiv.org Machine Learning

Recent years have witnessed amazing outcomes from "Big Models" trained by "Big Data". Most popular algorithms for model training are iterative. Due to the surging volumes of data, we can usually afford to process only a fraction of the training data in each iteration. Typically, the data are either uniformly sampled or sequentially accessed. In this paper, we study how the data access pattern can affect model training. We propose an Active Sampler algorithm, where training data with more "learning value" to the model are sampled more frequently. The goal is to focus training effort on valuable instances near the classification boundaries, rather than evident cases, noisy data or outliers. We show the correctness and optimality of Active Sampler in theory, and then develop a light-weight vectorized implementation. Active Sampler is orthogonal to most approaches optimizing the efficiency of large-scale data analytics, and can be applied to most analytics models trained by stochastic gradient descent (SGD) algorithm. Extensive experimental evaluations demonstrate that Active Sampler can speed up the training procedure of SVM, feature selection and deep learning, for comparable training quality by 1.6-2.2x.


Efficient Deep Feature Learning and Extraction via StochasticNets

arXiv.org Machine Learning

Deep neural networks are a powerful tool for feature learning and extraction given their ability to model high-level abstractions in highly complex data. One area worth exploring in feature learning and extraction using deep neural networks is efficient neural connectivity formation for faster feature learning and extraction. Motivated by findings of stochastic synaptic connectivity formation in the brain as well as the brain's uncanny ability to efficiently represent information, we propose the efficient learning and extraction of features via StochasticNets, where sparsely-connected deep neural networks can be formed via stochastic connectivity between neurons. To evaluate the feasibility of such a deep neural network architecture for feature learning and extraction, we train deep convolutional StochasticNets to learn abstract features using the CIFAR-10 dataset, and extract the learned features from images to perform classification on the SVHN and STL-10 datasets. Experimental results show that features learned using deep convolutional StochasticNets, with fewer neural connections than conventional deep convolutional neural networks, can allow for better or comparable classification accuracy than conventional deep neural networks: relative test error decrease of ~4.5% for classification on the STL-10 dataset and ~1% for classification on the SVHN dataset. Furthermore, it was shown that the deep features extracted using deep convolutional StochasticNets can provide comparable classification accuracy even when only 10% of the training data is used for feature learning. Finally, it was also shown that significant gains in feature extraction speed can be achieved in embedded applications using StochasticNets. As such, StochasticNets allow for faster feature learning and extraction performance while facilitate for better or comparable accuracy performances.