Joint DNN-Based Multichannel Reduction of Acoustic Echo, Reverberation and Noise

arXiv.org Machine Learning

We consider the problem of simultaneous reduction of acoustic echo, reverberation and noise. In real scenarios, these distortion sources may occur simultaneously, and reducing them implies combining the corresponding distortion-specific filters. As these filters interact with each other, they must be jointly optimized. We propose to model the target and residual signals after linear echo cancellation and dereverberation using a multichannel Gaussian modeling framework and to jointly represent their spectra by means of a neural network. We develop an iterative block-coordinate ascent algorithm to update all the filters. We evaluate our system on real recordings of acoustic echo, reverberation and noise acquired with a smart speaker in various situations. In terms of overall distortion, the proposed approach outperforms both a cascade of the individual approaches and a joint reduction approach that does not rely on a spectral model of the target and residual signals. Index Terms--Acoustic echo, reverberation, background noise, joint distortion reduction, expectation-maximization, recurrent neural network. The near-end speaker can be a few meters away from the microphones and the interactions can be subject to several distortion sources such as background noise, acoustic echo and near-end reverberation. Each of these distortion sources degrades speech quality, intelligibility and listening comfort, and must be reduced. Single- and multichannel filters have been used to reduce each of these distortion sources independently. They can be categorized into short nonlinear filters that vary quickly over time and long linear filters that are time-invariant (or slowly time-varying). Short nonlinear filters are generally used for noise reduction [1]. They are robust to the fluctuations and nonlinearities inherent to real signals. Long linear filters can be required for dereverberation [2] and echo reduction [3].


CNAK : Cluster Number Assisted K-means

arXiv.org Machine Learning

Determining the number of clusters present in a dataset is an important problem in cluster analysis. Conventional clustering techniques generally assume this parameter to be provided up front. In this paper, we propose a method that analyzes cluster stability to predict the cluster number. Under the same computational framework, the technique also finds representatives of the clusters. The method is apt for handling big data, as we design the algorithm using Monte Carlo simulation. We also explore a few pertinent issues related to clustering. Experiments reveal that the proposed method is capable of identifying a single cluster. It is robust in handling high-dimensional datasets and performs reasonably well on datasets with cluster imbalance. Moreover, it can indicate cluster hierarchy, if present. Overall, we have observed significant improvement in speed and quality for predicting cluster numbers as well as the composition of clusters in a large dataset. Keywords: k-means clustering, Bipartite graph, Perfect Matching, Kuhn-Munkres Algorithm, Monte Carlo simulation. 1. Introduction In cluster analysis, it is required to group a set of data points in a multidimensional space, so that data points in the same group are more similar to each other than to those in other groups. These groups are called clusters. Various distance functions may be used to compute the degree of similarity or dissimilarity among these data points. The Euclidean distance function is widely used in clustering. The aim of this unsupervised technique is to increase homogeneity within a group and heterogeneity between groups. Several clustering methods with different characteristics have been proposed for different purposes. Some well-known methods include partition-based clustering [26], hierarchical clustering [25], spectral clustering [27], and density-based clustering [12]. However, they require a priori knowledge of the cluster number for a given dataset [12, 21, 26, 27, 36].


Learning Generalized Quasi-Geostrophic Models Using Deep Neural Numerical Models

arXiv.org Machine Learning

We introduce a new strategy designed to help physicists discover hidden laws governing dynamical systems. We propose to use machine learning automatic differentiation libraries to develop hybrid numerical models that combine components based on prior physical knowledge with components based on neural networks. In these architectures, named Deep Neural Numerical Models (DNNMs), the neural network components are used as building blocks and then deployed for learning hidden variables of the underlying physical laws governing dynamical systems. In this paper, we illustrate an application of DNNMs to upper ocean dynamics, more precisely the dynamics of a sea surface tracer, the Sea Surface Height (SSH). We develop an advection-based, fully differentiable numerical scheme, where parts of the computations can be replaced with learnable ConvNets, and make connections with the single-layer Quasi-Geostrophic (QG) model, a baseline theory in physical oceanography developed decades ago.


A Fast Sampling Gradient Tree Boosting Framework

arXiv.org Machine Learning

As an adaptive, interpretable, robust, and accurate meta-algorithm for arbitrary differentiable loss functions, gradient tree boosting is one of the most popular machine learning techniques, though its computational cost severely limits its usage. Stochastic gradient boosting can be adopted to accelerate gradient boosting by uniformly sampling training instances, but its estimator can introduce high variance. This motivates us to optimize gradient tree boosting. We combine gradient tree boosting with importance sampling, which achieves better performance by reducing the stochastic variance. Furthermore, we use a regularizer to improve the diagonal approximation in the Newton step of gradient boosting. The theoretical analysis supports that our strategies achieve a linear convergence rate on logistic loss. Empirical results show that our algorithm achieves a 2.5x--18x acceleration on two different gradient boosting algorithms (LogitBoost and LambdaMART) without appreciable performance loss.
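The contrast between uniform sampling and importance sampling in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that instances are drawn with probability proportional to the magnitude of their per-instance gradient and reweighted so that the estimate of the gradient sum stays unbiased; the abstract does not state the sampling distribution the authors use, and the names `importance_sample` and `estimate_gradient_sum` are hypothetical.

```python
import random


def importance_sample(gradients, m, rng):
    """Draw m indices with probability proportional to |gradient|.

    Returns (index, weight) pairs. The weight 1 / (m * p_i) makes the
    weighted sum of sampled gradients an unbiased estimate of the full sum.
    Assumption: sampling by gradient magnitude is an illustrative choice,
    not necessarily the distribution used in the paper.
    """
    n = len(gradients)
    magnitudes = [abs(g) for g in gradients]
    total = sum(magnitudes)
    if total == 0.0:
        # All gradients are zero: fall back to uniform sampling.
        probs = [1.0 / n] * n
    else:
        probs = [a / total for a in magnitudes]
    indices = rng.choices(range(n), weights=probs, k=m)
    return [(i, 1.0 / (m * probs[i])) for i in indices]


def estimate_gradient_sum(gradients, samples):
    """Weighted sum over the sampled instances."""
    return sum(w * gradients[i] for i, w in samples)


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [0.1, 0.5, 2.0, 0.4]
    samples = importance_sample(grads, 3, rng)
    print(estimate_gradient_sum(grads, samples), sum(grads))
```

When all gradients share the same sign, every sampled term equals the total divided by `m`, so the estimate matches the full sum exactly. This is the zero-variance extreme that makes magnitude-proportional sampling attractive compared with uniform sampling.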


Black-box Combinatorial Optimization using Models with Integer-valued Minima

arXiv.org Machine Learning

When a black-box optimization objective can only be evaluated with costly or noisy measurements, most standard optimization algorithms are unsuited to find the optimal solution. Specialized algorithms that deal with exactly this situation make use of surrogate models. These models are usually continuous and smooth, which is beneficial for continuous optimization problems, but not necessarily for combinatorial problems. However, by choosing the basis functions of the surrogate model in a certain way, we show that it can be guaranteed that the optimal solution of the surrogate model is integer. This approach outperforms random search, simulated annealing and one Bayesian optimization algorithm on the problem of finding robust routes for a noise-perturbed traveling salesman benchmark problem, with similar performance as another Bayesian optimization algorithm, and outperforms all compared algorithms on a convex binary optimization problem with a large number of variables.


Object-based multi-temporal and multi-source land cover mapping leveraging hierarchical class relationships

arXiv.org Machine Learning

The European satellite missions Sentinel-1 (S1) and Sentinel-2 (S2) provide, at high spatial resolution and high revisit time, radar and optical images, respectively, that support a wide range of Earth surface monitoring tasks such as Land Use/Land Cover mapping. A long-standing challenge in the remote sensing community is how to efficiently exploit multiple sources of information and leverage their complementarity; in this particular case, how to get the most out of radar and optical satellite image time series (SITS). Here, we propose to deal with land cover mapping through a deep learning framework especially tailored to leverage the multi-source complementarity provided by radar and optical SITS. The proposed architecture is based on an extension of the Recurrent Neural Network (RNN) enriched via a customized attention mechanism able to fit the specificity of SITS data. In addition, we propose a new pretraining strategy that exploits domain expert knowledge to guide the model parameter initialization. Thorough experimental evaluations involving several machine learning competitors, on two contrasted study sites, have demonstrated the suitability of our new attention mechanism combined with the extended RNN model as well as the benefits and limits of injecting domain expert knowledge in the neural network training process.


On Node Features for Graph Neural Networks

arXiv.org Machine Learning

A graph neural network (GNN) is a deep model for graph representation learning. One advantage of graph neural networks is their ability to incorporate node features into the learning process. However, this prevents graph neural networks from being applied to featureless graphs. In this paper, we first analyze the effects of node features on the performance of graph neural networks. We show that GNNs work well if there is a strong correlation between node features and node labels. Based on these results, we propose new feature initialization methods that allow graph neural networks to be applied to non-attributed graphs. Our experimental results show that the artificial features are highly competitive with real features.


A Framework for End-to-End Deep Learning-Based Anomaly Detection in Transportation Networks

arXiv.org Machine Learning

Abstract--We develop an end-to-end deep learning-based anomaly detection model for temporal data in transportation networks. The proposed EVT-LSTM model is derived from the popular LSTM (Long Short-Term Memory) network and adopts an objective function that is based on fundamental results from EVT (Extreme Value Theory). We compare the EVT-LSTM model with some established statistical, machine learning, and hybrid deep learning baselines. Experiments on seven diverse real-world data sets demonstrate the superior anomaly detection performance of our proposed model over the other models considered in the comparison study. The increasing availability of large-scale traffic data sets provides an opportunity to explore them for knowledge discovery in ITS (Intelligent Transportation Systems). The avenues for exploration are numerous, ranging from uncovering traffic patterns [1], city dynamics [2], driving directions [3], discovering hot spots in a city [4], finding vacant taxis around a city [5], predicting taxi demand [6], taxi operation patterns [7], to detecting anomalies [8], among others. Various verticals of ITS have always received active research attention in the past. However, the recent emergence of deep learning techniques and their applicability in transportation systems has resulted in a heightened interest in this area [9]. Consequently, traditional machine learning models in many applications are now being replaced by deep learning techniques, which is reshaping the landscape of intelligent transport networks. Out of the several applications of ITS, the area of anomaly detection has benefited significantly from the application of deep learning-based techniques [10]. Anomaly detection aims to find those patterns which are not normally expected from the data. Typical observations from traffic data demonstrate strong spatiotemporal patterns, showing periodicity and strong correlations between adjacent observations.


Understanding Top-k Sparsification in Distributed Deep Learning

arXiv.org Machine Learning

Distributed stochastic gradient descent (SGD) algorithms are widely deployed in training large-scale deep learning models, while the communication overhead among workers becomes the new system bottleneck. Recently proposed gradient sparsification techniques, especially Top-$k$ sparsification with error compensation (TopK-SGD), can significantly reduce the communication traffic without an obvious impact on the model accuracy. Some theoretical studies have been carried out to analyze the convergence property of TopK-SGD. However, existing studies do not dive into the details of the Top-$k$ operator in gradient sparsification and use relaxed bounds (e.g., the exact bound of Random-$k$) for analysis; hence the derived results cannot accurately describe the real convergence performance of TopK-SGD. To this end, we first study the gradient distributions of TopK-SGD during the training process through extensive experiments. We then theoretically derive a tighter bound for the Top-$k$ operator. Finally, we exploit the property of the gradient distribution to propose an approximate top-$k$ selection algorithm, which is computationally efficient on GPUs, to improve the scaling efficiency of TopK-SGD by significantly reducing the computing overhead. Code is available at: https://github.com/hclhkbu/GaussianK-SGD.
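Top-$k$ sparsification with error compensation, as named in the abstract above, can be sketched for a single worker as follows. This is a minimal illustration of the general technique using exact top-$k$ selection on plain Python lists; it is not the approximate selection algorithm the authors propose, and the function names `top_k_sparsify` and `compress_with_error_feedback` are hypothetical.

```python
def top_k_sparsify(values, k):
    """Keep the k entries with the largest magnitude and zero out the rest."""
    if k <= 0:
        return [0.0] * len(values)
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


def compress_with_error_feedback(gradient, residual, k):
    """One step of Top-k compression with error compensation.

    The residual left over from earlier steps is added to the fresh gradient,
    the k largest-magnitude entries are selected for communication, and
    everything that was not sent becomes the new residual.
    """
    corrected = [g + r for g, r in zip(gradient, residual)]
    sparse = top_k_sparsify(corrected, k)
    new_residual = [c - s for c, s in zip(corrected, sparse)]
    return sparse, new_residual


if __name__ == "__main__":
    residual = [0.0, 0.0, 0.0, 0.0]
    sparse, residual = compress_with_error_feedback([0.5, -2.0, 0.1, 1.5], residual, k=2)
    print(sparse)    # only the two largest-magnitude entries are sent
    print(residual)  # the unsent entries are carried to the next step
```

Because unsent entries accumulate in the residual, a coordinate with a small but persistent gradient is eventually transmitted once its accumulated value enters the top $k$.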


Inspect Transfer Learning Architecture with Dilated Convolution

arXiv.org Machine Learning

There are many award-winning pre-trained Convolutional Neural Networks (CNNs), which share a common trend of increasing depth in their convolutional layers. However, I inspect the VGG network, one of the famous models submitted to ILSVRC-2014, to show that a slight modification of the basic architecture can improve accuracy on the image classification task. In this paper, I present two improved architectures of the pre-trained VGG-16 and VGG-19 networks that apply transfer learning when trained on a different dataset. I report a series of experimental results on various modifications of the primary VGG networks and achieve significantly better performance on the image classification task by: (1) freezing the first two blocks of convolutional layers to prevent over-fitting and (2) applying different combinations of dilation rates in the last three blocks of convolutional layers to reduce image resolution for feature extraction. Both proposed architectures achieve competitive results on the CIFAR-10 and CIFAR-100 datasets. Keywords -- CNN, VGG-16, VGG-19, Dilated Convolution, transfer learning. I. INTRODUCTION Convolutional networks (ConvNets) have achieved excellent success in large-scale image and video recognition, which has become feasible owing to large public image repositories such as ImageNet [1] and high-performance computing systems such as GPUs or large-scale distributed clusters. These advancements were largely motivated by strong baseline schemas, such as semantic segmentation [2], object recognition [3], image captioning [4], and human pose estimation [4].
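To make the notion of a dilation rate concrete, here is a minimal one-dimensional sketch of a dilated convolution (computed as cross-correlation, as deep learning frameworks do) with no padding. It is a toy illustration of the operation only, not the modified VGG architecture described above, and the function name `dilated_conv1d` is hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated cross-correlation.

    Kernel taps are placed `dilation` samples apart, so a kernel of length K
    covers (K - 1) * dilation + 1 input samples without adding parameters.
    """
    span = (len(kernel) - 1) * dilation + 1
    output = []
    for start in range(len(signal) - span + 1):
        value = sum(w * signal[start + j * dilation] for j, w in enumerate(kernel))
        output.append(value)
    return output


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7]
    difference_kernel = [1, 0, -1]
    print(dilated_conv1d(x, difference_kernel, dilation=1))  # taps 2 samples apart
    print(dilated_conv1d(x, difference_kernel, dilation=2))  # taps 4 samples apart
```

With a three-tap kernel, raising the dilation from 1 to 2 widens the covered input span from 3 to 5 samples, and the unpadded output shrinks from 5 to 3 values for a 7-sample input.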