New Potential-Based Bounds for the Geometric-Stopping Version of Prediction with Expert Advice

arXiv.org Machine Learning

This work addresses the classic machine learning problem of online prediction with expert advice. A potential-based framework for the fixed horizon version of this problem was previously developed using verification arguments from optimal control theory (Kobzar, Kohn and Wang, New Potential-Based Bounds for Prediction with Expert Advice (2019)). This paper extends this framework to the random (geometric) stopping version. Taking advantage of these ideas, we construct potentials for the geometric version of prediction with expert advice from potentials used for the fixed horizon version. This construction leads to new explicit lower and upper bounds associated with specific adversary and player strategies for the geometric problem. We identify regimes where these bounds are state of the art.
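
To make the geometric-stopping setting concrete, the sketch below simulates the game with a standard exponential-weights player against random binary losses. It is a generic baseline swapped in for illustration, not the potential-based strategies of the paper. The names `play_geometric`, `eta`, and `delta` are assumptions, with `delta` taken to be the per-round stopping probability.

```python
import math
import random

def play_geometric(n_experts=3, delta=0.05, eta=0.5, seed=0):
    """Simulate prediction with expert advice under geometric stopping.

    After every round the game ends with probability `delta` (an assumed
    parameterization). The player uses exponential weights, a standard
    strategy used here for illustration only.
    Returns (rounds_played, regret).
    """
    rng = random.Random(seed)
    cum_loss = [0.0] * n_experts   # cumulative loss of each expert
    player_loss = 0.0
    rounds = 0
    while True:
        # Player's distribution over experts: weights exp(-eta * loss),
        # shifted by the minimum loss for numerical stability.
        best = min(cum_loss)
        weights = [math.exp(-eta * (loss - best)) for loss in cum_loss]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Random 0/1 losses stand in for the adversary (an assumption).
        losses = [float(rng.random() < 0.5) for _ in range(n_experts)]
        player_loss += sum(p * l for p, l in zip(probs, losses))
        cum_loss = [c + l for c, l in zip(cum_loss, losses)]
        rounds += 1
        if rng.random() < delta:   # geometric stopping rule
            break
    regret = player_loss - min(cum_loss)
    return rounds, regret

rounds, regret = play_geometric()
```

The regret returned is the player's cumulative expected loss minus that of the best expert at the random stopping time, which is the quantity the lower and upper bounds in the abstract concern.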


Landscape Complexity for the Empirical Risk of Generalized Linear Models

arXiv.org Machine Learning

We present a method to obtain the average and the typical value of the number of critical points of the empirical risk landscape for generalized linear estimation problems and variants. This represents a substantial extension of previous applications of the Kac-Rice method, since it allows the analysis of critical points of high-dimensional non-Gaussian random functions. We obtain a rigorous explicit variational formula for the annealed complexity, which is the logarithm of the average number of critical points at a fixed value of the empirical risk. This result is simplified, and extended, using the non-rigorous replicated Kac-Rice method from theoretical physics. In this way we find an explicit variational formula for the quenched complexity, which is generally different from its annealed counterpart and gives the number of critical points for typical instances up to exponential accuracy.


Regression with Uncertainty Quantification in Large Scale Complex Data

arXiv.org Machine Learning

While several methods for predicting uncertainty on deep networks have been recently proposed, they do not readily translate to large and complex datasets. In this paper we utilize a simplified form of Mixture Density Networks (MDNs) to produce a one-shot approach to quantify uncertainty in regression problems. We show that our uncertainty bounds are on par with or better than those of other reported methods. When applied to standard regression benchmark datasets, we show an improvement in predictive log-likelihood and root-mean-square error when compared to existing state-of-the-art methods. We also demonstrate this method's efficacy on stochastic, highly volatile time-series data where stock prices are predicted for the next time interval. The resulting uncertainty graph summarizes significant anomalies in the stock price chart. Furthermore, we apply this method to the task of age estimation from the challenging IMDb-Wiki dataset of half a million face images. We successfully predict the uncertainties associated with the prediction and empirically analyze the underlying causes of the uncertainties. This uncertainty quantification can be used to pre-process low-quality datasets and further enable learning.


Why Should we Combine Training and Post-Training Methods for Out-of-Distribution Detection?

arXiv.org Machine Learning

Deep neural networks are known to achieve superior results in classification tasks. However, it has recently been shown that they are unable to detect examples generated by a distribution different from the one they were trained on, since they make overconfident predictions for Out-Of-Distribution (OOD) examples. OOD detection has attracted a lot of attention recently. In this paper, we review some of the most seminal recent algorithms in the OOD detection field, divide those methods into training and post-training methods, and experimentally show how combining the former with the latter can achieve state-of-the-art results in the OOD detection task. Since the seminal work of Krizhevsky et al. (2012), Deep Neural Networks (DNNs) have demonstrated great success in several applications, e.g.


Keyword Aware Influential Community Search in Large Attributed Graphs

arXiv.org Artificial Intelligence

We introduce a novel keyword-aware influential community query (KICQ) that finds the most influential communities in an attributed graph, where an influential community is defined as a closely connected group of vertices that has some dominance over other groups of vertices and whose expertise (a set of keywords) matches the query terms (words or phrases). We first design the KICQ so that users can issue an influential community search (CS) query intuitively by using a set of query terms and predicates (AND or OR). In this context, we propose a novel word-embedding based similarity model that enables semantic community search, which substantially alleviates the limitations of exact keyword based community search. Next, we propose a new influence measure for a community that considers both the cohesiveness and influence of the community and eliminates the need for specifying values of internal parameters of a network. Finally, we propose two efficient algorithms for searching influential communities in large attributed graphs. We present detailed experiments and a case study to demonstrate the effectiveness and efficiency of the proposed approaches.


An Automated Deep Learning Approach for Bacterial Image Classification

arXiv.org Machine Learning

Automated recognition and classification of bacteria species from microscopic images is of significant importance in clinical microbiology. Bacteria classification is usually carried out manually by biologists using the different shapes and morphologic characteristics of bacteria species. The manual taxonomy of bacteria types from microscopy images is time-consuming and challenging even for experienced biologists. In this study, an automated deep learning based classification approach is proposed to classify bacterial images into different categories. The ResNet-50 pre-trained CNN architecture is used to classify digital bacteria images into 33 categories. Transfer learning was employed to accelerate the training process of the network and improve its classification performance. The proposed method achieved an average classification accuracy of 99.2%. The experimental results demonstrate that the proposed technique surpasses state-of-the-art methods in the literature and can be used for any type of bacteria classification task.


SpaRCe: Sparse reservoir computing

arXiv.org Machine Learning

"Sparse" neural networks, in which relatively few neurons or connections are active, are common in both machine learning and neuroscience. Whereas in machine learning, "sparseness" is related to a penalty term which effectively leads to some connecting weights becoming small or zero, in biological brains, sparseness is often created when high spiking thresholds prevent neuronal activity. Inspired by neuroscience, here we introduce sparseness into a reservoir computing network via neuron-specific learnable thresholds of activity, allowing neurons with low thresholds to give output but silencing outputs from neurons with high thresholds. This approach, which we term "SpaRCe", optimises the sparseness level of the reservoir and applies the threshold mechanism to the information received by the read-out weights. Both the read-out weights and the thresholds are learned by a standard on-line gradient rule that minimises an error function on the outputs of the network. Threshold learning occurs by the balance of two opposing forces: reducing inter-neuronal correlations in the reservoir by deactivating redundant neurons, while increasing the activity of neurons participating in correct decisions. We test SpaRCe in a set of classification problems and find that introducing threshold learning improves performance compared to standard reservoir computing networks.

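The threshold mechanism described above can be illustrated with a minimal sketch. The exact functional form is an assumption here: each neuron keeps the sign of its activity and has its magnitude reduced by its own threshold, so neurons whose activity does not exceed the threshold are silenced. The function name `sparce_output` and the example values are hypothetical, and threshold learning itself is not shown.

```python
import math

def sparce_output(activity, thresholds):
    """Apply neuron-specific thresholds to reservoir activity.

    Assumed form: sign(v) * max(|v| - theta, 0) for each neuron, so a
    neuron is silenced when its magnitude does not exceed its threshold.
    """
    out = []
    for v, theta in zip(activity, thresholds):
        magnitude = max(abs(v) - theta, 0.0)
        # Restore the original sign; silenced neurons output exactly 0.0.
        out.append(math.copysign(magnitude, v) if magnitude > 0 else 0.0)
    return out

# Hypothetical reservoir activities and per-neuron thresholds.
activity = [1.0, -0.5, 0.25, -0.75]
thresholds = [0.5, 0.75, 0.125, 0.25]
sparse = sparce_output(activity, thresholds)
```

In this example the second neuron is silenced because its threshold is larger than its activity magnitude; the read-out weights would then see only the remaining non-zero entries.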

Detecting Hardly Visible Roads in Low-Resolution Satellite Time Series Data

arXiv.org Machine Learning

Massive amounts of satellite data have been gathered over time, holding the potential to unveil a spatiotemporal chronicle of the surface of Earth. These data allow scientists to investigate various important issues, such as land use changes, on a global scale. However, not all land-use phenomena are equally visible on satellite imagery. In particular, the creation of an inventory of the planet's road infrastructure remains a challenge, despite being crucial to analyze urbanization patterns and their impact. Towards this end, this work advances data-driven approaches for the automatic identification of roads based on open satellite data. Given the typical resolutions of these historical satellite data, we observe that there is inherent variation in the visibility of different road types. Based on this observation, we propose two deep learning frameworks that extend state-of-the-art deep learning methods by formalizing road detection as an ordinal classification task. In contrast to related schemes, one of the two models also resorts to satellite time series data that are potentially affected by missing data and cloud occlusion. Taking these time series data into account eliminates the need to manually curate datasets of high-quality image tiles, substantially simplifying the application of such models on a global scale. We evaluate our approaches on a dataset that is based on Sentinel-2 satellite imagery and OpenStreetMap vector data. Our results indicate that the proposed models can successfully identify large and medium-sized roads. We also discuss opportunities and challenges related to the detection of roads and other infrastructure on a global scale.
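
As an illustration of what treating road detection as ordinal classification can look like, the sketch below uses a common cumulative binary encoding of ordered classes. This is a generic encoding, not necessarily the one used in the paper, and the four visibility levels (0 = no road up to 3 = large road) are hypothetical.

```python
def ordinal_targets(label, n_classes):
    """Encode an ordinal label in 0..n_classes-1 as n_classes-1 binary
    targets, where the k-th target answers "is the label greater than k?"."""
    return [1 if label > k else 0 for k in range(n_classes - 1)]

def decode_ordinal(probs, cutoff=0.5):
    """Decode per-threshold probabilities back to an ordinal label by
    counting how many consecutive thresholds are passed."""
    level = 0
    for p in probs:
        if p < cutoff:
            break
        level += 1
    return level

# Hypothetical 4-level scale: 0 = no road, 1 = small, 2 = medium, 3 = large.
targets = ordinal_targets(2, 4)
predicted = decode_ordinal([0.9, 0.7, 0.2])
```

With this encoding, confusing a large road with a medium one changes a single binary target, whereas confusing it with "no road" changes all of them, which reflects the ordering of the classes.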


Cross-Language Aphasia Detection using Optimal Transport Domain Adaptation

arXiv.org Machine Learning

Multi-language speech datasets are scarce and often have small sample sizes in the medical domain. Robust transfer of linguistic features across languages could improve rates of early diagnosis and therapy for speakers of low-resource languages when detecting health conditions from speech. We utilize out-of-domain, unpaired, single-speaker, healthy speech data for training multiple Optimal Transport (OT) domain adaptation systems. We learn mappings from other languages to English and detect aphasia from linguistic characteristics of speech, and show that OT domain adaptation improves aphasia detection over unilingual baselines for French (6% increased F1) and Mandarin (5% increased F1). Further, we show that adding aphasic data to the domain adaptation system significantly increases performance for both French and Mandarin, increasing the F1 scores further (10% and 8% increase in F1 scores for French and Mandarin, respectively, over unilingual baselines).


Learning to Recommend via Meta Parameter Partition

arXiv.org Machine Learning

In this paper we propose to solve an important problem in recommendation, user cold start, using a meta learning method. Previous meta learning approaches finetune all parameters for each new user, which is expensive in both computation and storage. In contrast, we divide model parameters into fixed and adaptive parts and develop a two-stage meta learning algorithm to learn them separately. The fixed part, capturing user-invariant features, is shared by all users and is learned during the offline meta learning stage. The adaptive part, capturing user-specific features, is learned during the online meta learning stage. By decoupling user-invariant parameters from user-dependent parameters, the proposed approach is more efficient and requires less storage than previous methods. It also has the potential to deal with catastrophic forgetting while continually adapting to a stream of incoming users. Experiments on production data demonstrate that the proposed method converges faster and to better performance than baseline methods. Meta-training without online meta model finetuning increases the AUC from 72.24% to 74.72% (2.48% absolute improvement). Online meta training achieves a further gain of 2.46% absolute improvement compared with offline meta training.