Traffic map prediction using UNet based deep convolutional neural network

arXiv.org Machine Learning

This paper describes our UNet-based deep convolutional neural network approach to the Traffic4cast challenge 2019. The challenge task is to predict future traffic flow volume, heading, and speed on a high-resolution, whole-city map. We used a UNet-based deep convolutional neural network to train a predictive model for short-term traffic forecasting. In each convolution block, layers are densely connected to subsequent layers, as in DenseNet. Trained and evaluated on a real-world data set collected from three distinct cities, our method achieved the best performance in this challenge.


A psychophysics approach for quantitative comparison of interpretable computer vision models

arXiv.org Artificial Intelligence

The field of transparent Machine Learning (ML) has contributed many novel methods aiming at better interpretability for computer vision and ML models in general. But how useful the explanations provided by transparent ML methods are for humans remains difficult to assess. Most studies evaluate interpretability in qualitative comparisons, use experimental paradigms that do not allow for direct comparisons amongst methods, or report only offline experiments with no humans in the loop. While there are clear advantages of evaluations with no humans in the loop, such as scalability, reproducibility and less algorithmic bias than with humans in the loop, these metrics are limited in their usefulness if we do not understand how they relate to other metrics that take human cognition into account. Here we investigate the quality of interpretable computer vision algorithms using techniques from psychophysics. In crowdsourced annotation tasks we study the impact of different interpretability approaches on annotation accuracy and task time. To relate these findings to quality measures for interpretability without humans in the loop, we compare quality metrics with and without humans in the loop. Our results demonstrate that psychophysical experiments allow for robust quality assessment of transparency in machine learning. Interestingly, the quality metrics computed without humans in the loop neither provided a consistent ranking of interpretability methods nor were representative of how useful an explanation was for humans. These findings highlight the potential of methods from classical psychophysics for modern machine learning applications. We hope that our results provide convincing arguments for evaluating interpretability in its natural habitat, human-ML interaction, if the goal is to obtain an authentic assessment of interpretability.


Machine Learning-based Signal Detection for PMH Signals in Load-modulated MIMO System

arXiv.org Machine Learning

Phase Modulation on the Hypersphere (PMH) is a power-efficient modulation scheme for load-modulated multiple-input multiple-output (MIMO) transmitters with central power amplifiers (CPA). However, it is difficult to obtain precise channel state information (CSI), and the traditional optimal maximum likelihood (ML) detection scheme incurs high complexity, which increases exponentially with the number of antennas and the number of bits carried per antenna in the PMH modulation. To detect PMH signals without prior knowledge of the CSI, we first propose a signal detection scheme, termed the hypersphere clustering scheme based on the expectation maximization (EM) algorithm with maximum likelihood detection (HEM-ML). By leveraging machine learning, the proposed detection scheme can accurately obtain channel information from a few received symbols at little resource cost and achieve detection results comparable to those of the optimal ML detector. To further reduce the computational complexity of the ML detection in HEM-ML, we also propose a second signal detection scheme, termed the hypersphere clustering scheme based on the EM algorithm with KD-tree detection (HEM-KD). The CSI obtained from the EM algorithm is used to build a spatial KD-tree receiver codebook, so that the signal detection problem can be transformed into a nearest neighbor search (NNS) problem. The detection complexity of HEM-KD is significantly reduced without any loss of detection performance compared to HEM-ML. Extensive simulation results verify the effectiveness of our proposed detection schemes.

Introduction. The fifth generation (5G) wireless communication network is forecast to provide over 1000 times higher capacity than the current system. In addition to dramatically expanding the available bandwidth, multiple-input multiple-output (MIMO) technology is playing a key role in improving the spectral efficiency (SE) and enhancing the throughput in future wireless cellular communication systems [1]. This ambitious goal will, however, cause an inevitable energy consumption problem, thus limiting the number of antennas at the base station (BS) and the user terminals in practice [2]. In the traditional design of MIMO transceivers, each antenna is connected to one distinct radio frequency (RF) chain, which includes a power amplifier (PA). With this kind of structure, the power consumption of the transmission grows linearly with the number of antennas. In addition, the use of Orthogonal Frequency Division Multiplexing (OFDM) signals in massive MIMO systems leads to high peak-to-average power ratios (PAPR) and exacerbates the cost of PAs, thus reducing the power efficiency. On the other hand, to alleviate the effects of mutual coupling and correlated fading, the antennas should be set at least half a wavelength apart from each other, which will inevitably cause a size problem [3].
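The reduction of detection to a nearest neighbor search, as described in the abstract above, can be illustrated with a minimal sketch. It is not the authors' HEM-KD implementation: the codebook here is random and purely hypothetical (in the paper it is built from the CSI estimated by EM), codewords are assumed to be real-valued tuples (for example, stacked real and imaginary parts of the received vector), and the names `build_kdtree`, `nearest` and `brute_force` are illustrative.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(items, depth=0):
    """Build a KD-tree from (symbol_index, point) pairs, cycling the split axis."""
    if not items:
        return None
    axis = depth % len(items[0][1])
    items = sorted(items, key=lambda it: it[1][axis])
    mid = len(items) // 2
    return {
        "item": items[mid],
        "axis": axis,
        "left": build_kdtree(items[:mid], depth + 1),
        "right": build_kdtree(items[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (squared_distance, symbol_index) of the codeword closest to query."""
    if node is None:
        return best
    idx, point = node["item"]
    d = sq_dist(point, query)
    if best is None or d < best[0]:
        best = (d, idx)
    diff = query[node["axis"]] - point[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < best[0]:
        best = nearest(far, query, best)
    return best


def brute_force(codebook, query):
    """Exhaustive search over all codewords, used here as a reference."""
    return min((sq_dist(p, query), i) for i, p in enumerate(codebook))


# Hypothetical 4-dimensional codebook with 64 codewords (illustration only).
rng = random.Random(0)
codebook = [tuple(rng.gauss(0.0, 1.0) for _ in range(4)) for _ in range(64)]
tree = build_kdtree(list(enumerate(codebook)))

# A noisy observation of codeword 10 is decoded as the index of its nearest codeword.
received = tuple(c + rng.gauss(0.0, 0.05) for c in codebook[10])
detected = nearest(tree, received)[1]
```

The exhaustive search compares the received vector with every codeword, whereas the tree search prunes whole subtrees whose splitting plane lies farther away than the current best match. Both return the same nearest codeword, which is consistent with the abstract's claim that the complexity reduction comes without a loss in detection performance.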


Improving EEG based Continuous Speech Recognition

arXiv.org Machine Learning

Gautam Krishna, Co Tran, Mason Carnahan, Yan Han, Ahmed H Tewfik. Brain Machine Interface Lab, The University of Texas at Austin, Austin, Texas.

Abstract: In this paper we introduce various techniques to improve the performance of continuous speech recognition (CSR) systems based on electroencephalography (EEG) features. A connectionist temporal classification (CTC) based automatic speech recognition (ASR) system was implemented for performing recognition. We introduce techniques to initialize the weights of the recurrent layers in the encoder of the CTC model with more meaningful weights rather than with random weights, and we make use of an external language model to improve the beam search during decoding. Finally, we study the problem of predicting articulatory features from EEG features. ASR systems form the front end or back end of many state-of-the-art voice assistant systems such as Bixby, Alexa, Siri, and Cortana.


Histogram Transform Ensembles for Density Estimation

arXiv.org Machine Learning

We investigate an algorithm named the histogram transform ensembles (HTE) density estimator, whose effectiveness is supported by both solid theoretical analysis and significant experimental performance. On the theoretical side, by decomposing the error term into approximation error and estimation error, we are able to conduct the following analysis: First, we establish universal consistency under the $L_1(\mu)$-norm. Second, under the assumption that the underlying density function resides in the Hölder space $C^{0,\alpha}$, we prove almost optimal convergence rates for both single and ensemble density estimators under the $L_1(\mu)$-norm and the $L_{\infty}(\mu)$-norm for different tail distributions. In contrast, for its subspace $C^{1,\alpha}$ consisting of smoother functions, almost optimal convergence rates can only be established for the ensembles, and the lower bound for the single estimators illustrates the benefits of ensembles over single density estimators. In the experiments, we first carry out simulations to illustrate that histogram transform ensembles surpass single histogram transforms, which offers strong evidence in support of the theoretical results in the space $C^{1,\alpha}$. Moreover, to further improve the experimental performance, we propose an adaptive version of HTE and study its parameters by generating several synthetic datasets that vary in dimension and distribution. Last but not least, real-data experiments against other state-of-the-art density estimators demonstrate the accuracy of the adaptive HTE algorithm.


Multi-Component Graph Convolutional Collaborative Filtering

arXiv.org Machine Learning

Xiao Wang (1), Ruijia Wang (1), Chuan Shi (1), Guojie Song (2), Qingyong Li (3). (1) Beijing University of Posts and Telecommunications, (2) Peking University, (3) Beijing Jiaotong University. {xiaowang, wangruijia, shichuan}@bupt.edu.cn

Abstract: The interactions of users and items in a recommender system can be naturally modeled as a user-item bipartite graph. In recent years, we have witnessed an emerging research effort in exploring the user-item graph for collaborative filtering methods. Nevertheless, user-item interactions typically arise from highly complex latent purchasing motivations, such as high cost performance or eye-catching appearance, which are indistinguishably represented by the edges. Existing approaches leave the differences between various purchasing motivations unexplored and are therefore unable to capture fine-grained user preferences. Therefore, in this paper we propose a novel Multi-Component graph convolutional Collaborative Filtering (MCCF) approach to distinguish the latent purchasing motivations underneath the observed explicit user-item interactions. Specifically, there are two elaborately designed modules, a decomposer and a combiner, inside MCCF. The former first decomposes the edges in the user-item graph to identify the latent components that may cause the purchasing relationship; the latter then recombines these latent components automatically to obtain unified embeddings for prediction. Furthermore, a sparse regularizer and a weighted random sample strategy are utilized to alleviate the overfitting problem and accelerate the optimization.


Rethinking Softmax with Cross-Entropy: Neural Network Classifier as Mutual Information Estimator

arXiv.org Machine Learning

Mutual information is widely applied to learn latent representations of observations, whilst its implications in classification neural networks remain to be better explained. In this paper, we show that optimising the parameters of classification neural networks with softmax cross-entropy is equivalent to maximising the mutual information between inputs and labels under the balanced data assumption. Through experiments on synthetic and real datasets, we show that softmax cross-entropy can estimate mutual information approximately. When applied to image classification, this relation helps approximate the point-wise mutual information between an input image and a label without modifying the network structure. To this end, we propose infoCAM, an informative class activation map, which highlights the regions of the input image that are most relevant to a given label based on differences in information. The activation map helps localise the target object in an image. Through experiments on the semi-supervised object localisation task with two real-world datasets, we evaluate the effectiveness of the information-theoretic approach.
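One way to see the link between softmax cross-entropy and mutual information uses a standard identity rather than the paper's own derivation, which the abstract does not spell out. With K balanced classes, I(X;Y) = H(Y) - H(Y|X) = log K - H(Y|X), and the expected cross-entropy of any classifier is at least H(Y|X), so log K minus the cross-entropy is a lower bound on I(X;Y) that tightens as the cross-entropy is minimised. The sketch below computes the empirical version of that quantity; the function names are illustrative, and on a finite sample the value is only an estimate of the bound.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logit_rows, labels):
    """Mean negative log-likelihood (in nats) of the true labels."""
    nll = 0.0
    for logits, y in zip(logit_rows, labels):
        nll -= math.log(softmax(logits)[y])
    return nll / len(labels)


def mi_lower_bound(logit_rows, labels, num_classes):
    """Empirical log K - CE, which bounds I(X;Y) from below for balanced classes."""
    return math.log(num_classes) - cross_entropy(logit_rows, labels)


def pointwise_mi(logits, y, num_classes):
    """Estimate of log p(y|x) / p(y) under a uniform prior p(y) = 1 / K."""
    return math.log(softmax(logits)[y]) + math.log(num_classes)
```

A classifier that outputs uniform probabilities gives a bound of zero, while a classifier that is confident and correct on balanced data approaches log K, the largest value the mutual information can take in that setting.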


A Unified Deep Learning Approach for Prediction of Parkinson's Disease

arXiv.org Machine Learning

The paper presents a novel approach, based on deep learning, for diagnosis of Parkinson's disease through medical imaging. The approach includes analysis and use of the knowledge extracted by Deep Convolutional and Recurrent Neural Networks (DNNs) when trained with medical images, such as Magnetic Resonance Images and DaTscans. Internal representations of the trained DNNs constitute the extracted knowledge which is used in a transfer learning and domain adaptation manner, so as to create a unified framework for prediction of Parkinson's across different medical environments. A large experimental study is presented illustrating the ability of the proposed approach to effectively predict Parkinson's, using different medical image sets from real environments.


Trajectory growth lower bounds for random sparse deep ReLU networks

arXiv.org Machine Learning

Deep neural networks continue to set new benchmarks for machine learning accuracy across a wide range of tasks, and are the basis for many algorithms we use routinely and on a daily basis. One fundamental set of theoretical questions concerning deep networks relates to their expressivity. There remain different approaches to understanding and quantifying neural network expressivity. Some results take a classical approximation theory approach, focusing on the relationship between the architecture of the network and the classes of functions it can accurately approximate ([15, 3, 10]). Another more recent approach has been to apply persistent homology to characterise expressivity ([7]), while [18] focus on global curvature, and the ability of deep networks to disentangle manifolds. Other works concentrate specifically on networks with piecewise linear activation functions, using the number of linear regions ([17]) or the volume of the boundaries between linear regions ([9]) in input space. In 2017, [19] proposed trajectory length as a measure of expressivity; in particular, they consider the expected change in length of a one-dimensional trajectory as it is passed through Gaussian random neural networks (see Figure 1 for an illustration). Their primary theoretical result was that, in expectation, the length of a one-dimensional trajectory which is passed through a fully-connected, Gaussian network is lower bounded by a factor that is exponential with depth, but not with width.
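The trajectory-length measure attributed to [19] above can be explored numerically with a small sketch. This is an illustration under stated assumptions, not the experimental setup of either paper: the network is dense (not sparse), has no biases, and draws weights i.i.d. from N(0, sigma_w^2 / fan_in), which is an assumed scaling; the input trajectory is a unit circle sampled at 201 points; all function names are hypothetical.

```python
import math
import random


def trajectory_length(points):
    """Length of the piecewise-linear curve through consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def random_relu_layer(width_in, width_out, sigma_w, rng):
    """Dense ReLU layer with i.i.d. N(0, sigma_w^2 / width_in) weights and no bias."""
    std = sigma_w / math.sqrt(width_in)
    weights = [[rng.gauss(0.0, std) for _ in range(width_in)] for _ in range(width_out)]

    def layer(x):
        return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]

    return layer


def lengths_by_depth(points, depth, width, sigma_w, seed=0):
    """Trajectory length at the input and after each of `depth` random layers."""
    rng = random.Random(seed)
    lengths = [trajectory_length(points)]
    for _ in range(depth):
        layer = random_relu_layer(len(points[0]), width, sigma_w, rng)
        points = [layer(p) for p in points]
        lengths.append(trajectory_length(points))
    return lengths


# Unit circle in the plane, discretised into 200 segments.
circle = [(math.cos(2 * math.pi * t / 200), math.sin(2 * math.pi * t / 200))
          for t in range(201)]
```

Calling `lengths_by_depth(circle, depth=6, width=32, sigma_w=2.0)` returns one length per layer, so the ratio of successive entries gives an empirical per-layer growth factor for a single random draw. Averaging over several seeds would be needed to approximate the expectation that the quoted result refers to.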


DeepSmartFuzzer: Reward Guided Test Generation For Deep Learning

arXiv.org Machine Learning

Testing Deep Neural Network (DNN) models has become more important than ever with the increasing usage of DNN models in safety-critical domains such as autonomous cars. The traditional approach of testing DNNs is to create a test set, which is a random subset of the dataset about the problem of interest. This kind of approach is not enough for testing most of the real-world scenarios since these traditional test sets do not include corner cases, while a corner case input is generally considered to introduce erroneous behaviors. Recent works on adversarial input generation, data augmentation, and coverage-guided fuzzing (CGF) have provided new ways to extend traditional test sets. Among those, CGF aims to produce new test inputs by fuzzing existing ones to achieve high coverage on a test adequacy criterion (i.e. coverage criterion). Given that the subject test adequacy criterion is a well-established one, CGF can potentially find error inducing inputs for different underlying reasons. In this paper, we propose a novel CGF solution for structural testing of DNNs. The proposed fuzzer employs Monte Carlo Tree Search to drive the coverage-guided search in the pursuit of achieving high coverage. Our evaluation shows that the inputs generated by our method result in higher coverage than the inputs produced by the previously introduced coverage-guided fuzzing techniques.
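The basic coverage-guided fuzzing loop described above can be sketched in a few lines. This is a generic illustration, not DeepSmartFuzzer: seeds are picked uniformly at random and mutated with random noise instead of being chosen by Monte Carlo Tree Search, the coverage criterion is assumed to be plain neuron coverage, and `toy_activations` is a hypothetical stand-in for the hidden activations of a real DNN.

```python
import random


def toy_activations(x):
    """Stand-in for a DNN's hidden activations (hypothetical, illustration only)."""
    return [x[0] - x[1], x[0] + x[1] - 1.0, x[1] - 0.9, 0.1 - x[0]]


def covered_neurons(x, threshold=0.0):
    """Neuron coverage: indices of units whose activation exceeds the threshold."""
    return {i for i, a in enumerate(toy_activations(x)) if a > threshold}


def mutate(x, rng, step=0.2):
    """Perturb each feature with bounded uniform noise and clip to [0, 1]."""
    return [min(1.0, max(0.0, v + rng.uniform(-step, step))) for v in x]


def coverage_guided_fuzz(seeds, iterations=500, seed=0):
    """Keep a mutated input only if it activates a neuron not covered so far."""
    rng = random.Random(seed)
    corpus = [list(s) for s in seeds]
    covered = set()
    for s in corpus:
        covered |= covered_neurons(s)
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        new = covered_neurons(candidate) - covered
        if new:
            corpus.append(candidate)
            covered |= new
    return corpus, covered
```

Because an input is added to the corpus only when it increases coverage, later mutations start from inputs that already reach unusual regions of the model. A guided search strategy such as the one proposed in the paper replaces the uniform choice of seed and mutation made here.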