Adaptive Communication Bounds for Distributed Online Learning
Kamp, Michael, Boley, Mario, Mock, Michael, Keren, Daniel, Schuster, Assaf, Sharfman, Izchak
We consider distributed online learning protocols that control the exchange of information between local learners in a round-based learning scenario. The learning performance of such a protocol is intuitively optimal if approximately the same loss is incurred as in a hypothetical serial setting. If a protocol accomplishes this, it is inherently impossible to achieve a strong communication bound at the same time. In the worst case, every input is essential for the learning performance, even for the serial setting, and thus needs to be exchanged between the local learners. However, it is reasonable to demand a bound that scales well with the hardness of the serialized prediction problem, as measured by the loss received by a serial online learning algorithm. We provide formal criteria based on this intuition and show that they hold for a simplified version of a previously published protocol.
Decoding Cosmological Information in Weak-Lensing Mass Maps with Generative Adversarial Networks
Shirasaki, Masato, Yoshida, Naoki, Ikeda, Shiro, Oogi, Taira, Nishimichi, Takahiro
Galaxy imaging surveys enable us to map the cosmic matter density field through weak gravitational lensing analysis. The density reconstruction is compromised by a variety of noise originating from observational conditions, galaxy number density fluctuations, and intrinsic galaxy properties. We propose a deep-learning approach based on generative adversarial networks (GANs) to reduce the noise in the weak lensing map under realistic conditions. We perform image-to-image translation using conditional GANs in order to produce noiseless lensing maps using the first-year data of the Subaru Hyper Suprime-Cam (HSC) survey. We train the conditional GANs by using 30000 sets of mock HSC catalogs that directly incorporate observational effects. We show that an ensemble learning method with GANs can reproduce the one-point probability distribution function (PDF) of the lensing convergence map within a $0.5-1\sigma$ level. We use the reconstructed PDFs to estimate a cosmological parameter $S_{8} = \sigma_{8}\sqrt{\Omega_{\rm m0}/0.3}$, where $\Omega_{\rm m0}$ and $\sigma_{8}$ represent the mean and the scatter in the cosmic matter density. The reconstructed PDFs place tighter constraints, with the statistical uncertainty in $S_8$ reduced by a factor of $2$ compared to the noisy PDF. This is equivalent to increasing the survey area by a factor of $4$ without denoising by GANs. Finally, we apply our denoising method to the first-year HSC data, to place $2\sigma$-level cosmological constraints of $S_{8} < 0.777 \, ({\rm stat}) + 0.105 \, ({\rm sys})$ and $S_{8} < 0.633 \, ({\rm stat}) + 0.114 \, ({\rm sys})$ for the noisy and denoised data, respectively.
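The arithmetic behind the $S_8$ definition and the survey-area equivalence can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and the area conversion assumes that the statistical uncertainty scales as the inverse square root of the survey area, which is a common rule of thumb rather than something stated in the abstract.

```python
import math


def s8(sigma8, omega_m0):
    """S_8 = sigma_8 * sqrt(Omega_m0 / 0.3), as defined in the abstract."""
    return sigma8 * math.sqrt(omega_m0 / 0.3)


def equivalent_area_factor(uncertainty_reduction):
    """Area increase that would give the same reduction in uncertainty.

    Assumption: uncertainty scales as 1 / sqrt(area), so cutting the
    uncertainty by a factor r corresponds to r**2 times the area.
    """
    return uncertainty_reduction ** 2


# Illustrative inputs only (not values from the paper).
example_s8 = s8(sigma8=0.8, omega_m0=0.3)
example_area = equivalent_area_factor(2.0)
```

Under that scaling assumption, a factor-of-2 reduction in the uncertainty matches the factor-of-4 area figure quoted above.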
Machine Learning for a Low-cost Air Pollution Network
Smith, Michael T., Ssematimba, Joel, Alvarez, Mauricio A., Bainomugisha, Engineer
Data collection in economically constrained countries often necessitates using approximate and biased measurements due to the low cost of the sensors used. This leads to potentially invalid predictions and poor policies or decision making. This is especially an issue if methods from resource-rich regions are applied without handling these additional constraints. In this paper we show, through the use of an air pollution network example, how using probabilistic machine learning can mitigate some of the technical constraints. Specifically, we experiment with modelling the calibration for individual sensors as either distributions or Gaussian processes over time, and discuss the wider issues around the decision process.
Self-attention with Functional Time Representation Learning
Xu, Da, Ruan, Chuanwei, Kumar, Sushant, Korpeoglu, Evren, Achan, Kannan
Sequential modelling with self-attention has achieved cutting-edge performance in natural language processing. With advantages in model flexibility, computational complexity and interpretability, self-attention is gradually becoming a key component in event sequence models. However, like most other sequence models, self-attention does not account for the time span between events and thus captures sequential signals rather than temporal patterns. Without relying on recurrent network structures, self-attention recognizes event orderings via positional encoding. To bridge the gap between modelling time-independent and time-dependent event sequences, we introduce a functional feature map that embeds time span into high-dimensional spaces. By constructing the associated translation-invariant time kernel function, we reveal the functional forms of the feature map under classic functional analysis results, namely Bochner's Theorem and Mercer's Theorem. We propose several models to learn the functional time representation and the interactions with event representation. These methods are evaluated on real-world datasets under various continuous-time event sequence prediction tasks. The experiments reveal that the proposed methods compare favorably to baseline models while also capturing useful time-event interactions.
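One way to picture a translation-invariant time kernel is the standard cosine/sine feature construction associated with Bochner's Theorem. The sketch below is an illustration under stated assumptions, not the paper's model: the frequencies in `FREQUENCIES` are fixed, hypothetical values (a real model would presumably learn or sample them), and the function names are invented here.

```python
import math

# Hypothetical fixed frequencies, chosen only for illustration.
FREQUENCIES = [0.1, 0.5, 1.0, 2.0]


def time_features(t, freqs=FREQUENCIES):
    """Map a timestamp t to [cos(w*t), sin(w*t)] pairs, scaled by sqrt(1/d)."""
    scale = math.sqrt(1.0 / len(freqs))
    feats = []
    for w in freqs:
        feats.append(scale * math.cos(w * t))
        feats.append(scale * math.sin(w * t))
    return feats


def time_kernel(t1, t2, freqs=FREQUENCIES):
    """Inner product of the two feature vectors.

    Because cos(a)cos(b) + sin(a)sin(b) = cos(a - b), this equals the
    average of cos(w * (t1 - t2)) over the frequencies, so it depends
    only on the time span t1 - t2.
    """
    f1 = time_features(t1, freqs)
    f2 = time_features(t2, freqs)
    return sum(a * b for a, b in zip(f1, f2))
```

Shifting both timestamps by the same amount leaves `time_kernel` unchanged, which is the translation invariance the abstract refers to.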
Sentiment Analysis On Indian Indigenous Languages: A Review On Multilingual Opinion Mining
Shah, Sonali Rajesh, Kaushik, Abhishek
An increase in the use of smartphones has led to greater use of the internet and social media platforms. The most commonly used social media platforms are Twitter, Facebook, WhatsApp and Instagram. People share their personal experiences, reviews and feedback on the web. The information available on the web is unstructured and enormous. Hence, there is huge scope for research on understanding the sentiment of the data available on the web. Sentiment Analysis (SA) can be carried out on the reviews, feedback and discussions available on the web. There has been extensive research on SA in the English language, but data on the web also contains other languages which should be analyzed. This paper aims to analyze, review and discuss the approaches, algorithms and challenges faced by researchers while carrying out SA on indigenous languages.
Analysis of Lower Bounds for Simple Policy Iteration
Consul, Sarthak, Dedhia, Bhishma, Ashutosh, Kumar, Khirwadkar, Parthasarathi
Policy iteration is a family of algorithms that are used to find an optimal policy for a given Markov Decision Problem (MDP). Simple policy iteration (SPI) is a type of policy iteration where the strategy is to change the policy at exactly one improvable state at every step. Melekopoglou and Condon [1990] showed an exponential lower bound on the number of iterations taken by SPI for a 2-action MDP. The results have not been generalized to $k$-action MDPs since. In this paper, we revisit the algorithm and the analysis done by Melekopoglou and Condon. We generalize the previous result and prove a novel exponential lower bound on the number of iterations taken by policy iteration for $N$-state, $k$-action MDPs. We construct a family of MDPs and give an index-based switching rule that yields a strong lower bound of $\mathcal{O}\big((3+k)2^{N/2-3}\big)$.
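The one-switch-per-step idea behind simple policy iteration can be sketched as below. This is a minimal sketch under assumptions, not the construction from the paper: it uses a discounted MDP with iterative policy evaluation, the rule "switch the highest-indexed improvable state to its best action" is only an illustrative index-based rule, and the two-state MDP (`P_EXAMPLE`, `R_EXAMPLE`) is a made-up toy.

```python
def evaluate(policy, P, R, gamma, tol=1e-12):
    """Iterative policy evaluation for a discounted MDP.

    P[s][a] is a list of transition probabilities over next states,
    R[s][a] is the immediate reward for taking action a in state s.
    """
    n = len(P)
    V = [0.0] * n
    while True:
        delta = 0.0
        for s in range(n):
            a = policy[s]
            v = R[s][a] + gamma * sum(p * V[t] for t, p in enumerate(P[s][a]))
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            return V


def simple_policy_iteration(P, R, gamma=0.9, eps=1e-9):
    """Change the policy at exactly one improvable state per step."""
    n, k = len(P), len(P[0])
    policy = [0] * n
    switches = 0
    while True:
        V = evaluate(policy, P, R, gamma)
        Q = [[R[s][a] + gamma * sum(p * V[t] for t, p in enumerate(P[s][a]))
              for a in range(k)] for s in range(n)]
        improvable = [s for s in range(n) if max(Q[s]) > V[s] + eps]
        if not improvable:
            return policy, V, switches
        s = max(improvable)  # illustrative index-based rule (an assumption)
        policy[s] = max(range(k), key=lambda a: Q[s][a])
        switches += 1


# Toy MDP (hypothetical): in state 0, action 1 moves to state 1;
# in state 1, action 1 stays put and earns reward 1.
P_EXAMPLE = [[[1.0, 0.0], [0.0, 1.0]],
             [[0.0, 1.0], [0.0, 1.0]]]
R_EXAMPLE = [[0.0, 0.0],
             [0.0, 1.0]]
```

Calling `simple_policy_iteration(P_EXAMPLE, R_EXAMPLE)` returns the final policy, its value estimates, and the number of single-state switches performed; the lower-bound analysis in the abstract concerns how large that count can become on adversarially constructed MDPs.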
Detection and Mitigation of Rare Subclasses in Neural Network Classifiers
Paterson, Colin, Calinescu, Radu
Regions of high-dimensional input spaces that are underrepresented in training datasets reduce machine-learnt classifier performance, and may lead to corner cases and unwanted bias for classifiers used in decision making systems. When these regions belong to otherwise well-represented classes, their presence and negative impact are very hard to identify. We propose an approach for the detection and mitigation of such rare subclasses in neural network classifiers. The new approach is underpinned by an easy-to-compute commonality metric that supports the detection of rare subclasses, and comprises methods for reducing their impact during both model training and model exploitation.
Effective Sub-clonal Cancer Representation to Predict Tumor Evolution
Akbar, Adnan, Dubourg-Felonneau, Geoffroy, Solovyev, Andrey, Cassidy, John W, Patel, Nirmesh, Clifford, Harry W
The majority of cancer treatments end in failure due to Intra-Tumor Heterogeneity (ITH). ITH in cancer is represented by clonal evolution where different sub-clones compete with each other for resources under conditions of Darwinian natural selection. Predicting the growth of these sub-clones within a tumor is among the key challenges of modern cancer research. Predicting tumor behavior enables the creation of risk profiles for patients and the optimisation of their treatment by therapeutically targeting sub-clones more likely to grow. Current research efforts in this space are focused on mathematical modelling of population genetics to quantify the selective advantage of sub-clones, thus enabling predictions of which sub-clones are more likely to grow. These tumor evolution models are based on assumptions which are not valid for the real-world tumor micro-environment. Furthermore, these models are often fit on a single instance of a tumor, and hence prediction models cannot be validated. This paper presents an alternative approach for predicting cancer evolution using a data-driven machine learning method. Our proposed method is based on the intuition that if we can capture the true characteristics of sub-clones within a tumor and represent them in the form of features, a sophisticated machine learning algorithm can be trained to predict their behavior. The work presented here provides a novel approach to predicting cancer evolution, utilizing a data-driven approach. We strongly believe that the accumulation of data from microbiologists, oncologists and machine learning researchers could be used to encapsulate the true essence of tumor sub-clones, and can play a vital role in selecting the best cancer treatments for patients.
Using VAEs and Normalizing Flows for One-shot Text-To-Speech Synthesis of Expressive Speech
Aggarwal, Vatsal, Cotescu, Marius, Prateek, Nishant, Lorenzo-Trueba, Jaime, Barra-Chicote, Roberto
We propose a Text-to-Speech method to create an unseen expressive style using one utterance of expressive speech of around one second. Specifically, we enhance the disentanglement capabilities of a state-of-the-art sequence-to-sequence based system with a Variational AutoEncoder (VAE) and a Householder Flow. The proposed system provides a 22% KL-divergence reduction while jointly improving perceptual metrics over state-of-the-art. At synthesis time we use one example of expressive style as a reference input to the encoder for generating any text in the desired style. Perceptual MUSHRA evaluations show that we can create a voice with a 9% relative naturalness improvement over standard Neural Text-to-Speech, while also improving the perceived emotional intensity (59 compared to the 55 of neutral speech).
Data-Driven Compression of Convolutional Neural Networks
Pahwa, Ramit, Arivazhagan, Manoj Ghuhan, Garg, Ankur, Krishnamoorthy, Siddarth, Saxena, Rohit, Choudhary, Sunav
Deploying trained convolutional neural networks (CNNs) to mobile devices is a challenging task because of the simultaneous requirements of the deployed model to be fast, lightweight and accurate. Designing and training a CNN architecture that does well on all three metrics is highly non-trivial and can be very time-consuming if done by hand. One way to solve this problem is to compress the trained CNN models before deploying to mobile devices. This work asks and answers three questions on compressing CNN models automatically: a) How to control the trade-off between speed, memory and accuracy during model compression? b) In practice, a deployed model may not see all classes and/or may not need to produce all class labels. Can this fact be used to improve the trade-off? c) How to scale the compression algorithm to execute within a reasonable amount of time for many deployments? The paper demonstrates that a model compression algorithm utilizing reinforcement learning with architecture search and knowledge distillation can answer these questions in the affirmative. Experimental results are provided for current state-of-the-art CNN model families for image feature extraction like VGG and ResNet with CIFAR datasets.