Machine Learning

Towards a Fast Steady-State Visual Evoked Potentials (SSVEP) Brain-Computer Interface (BCI) Machine Learning

The steady-state visual evoked potential (SSVEP) brain-computer interface (BCI) provides reliable responses leading to high accuracy and information throughput. However, achieving high accuracy typically requires a relatively long time window of one second or more. Various methods have been proposed to improve sub-second response accuracy through subject-specific training and calibration. They achieve substantial performance improvements, but the tedious calibration and subject-specific training cause user discomfort. We therefore propose a training-free method that combines spatial filtering and temporal alignment (CSTA) to recognize SSVEP responses in sub-second response time. CSTA exploits linear correlation and non-linear similarity between steady-state responses and stimulus templates with complementary fusion to achieve the desired performance improvements. We evaluated the performance of CSTA in terms of accuracy and Information Transfer Rate (ITR) against both training-based and training-free methods on two SSVEP data-sets. In offline analysis, CSTA achieves maximum mean accuracies of 97.43$\pm$2.26 % and 85.71$\pm$13.41 % on the four-class and forty-class SSVEP data-sets, respectively, in sub-second response time. CSTA yields significantly higher mean performance (p<0.001) than the training-free method on both data-sets. Compared with training-based methods, CSTA shows 29.33$\pm$19.65 % higher mean accuracy, with statistically significant differences, in time windows shorter than 0.5 s. In longer time windows, CSTA performs better than or comparably to training-based methods, though the differences are not statistically significant. We show that the proposed method brings the advantages of subject-independent SSVEP classification without requiring training while enabling high target recognition performance in sub-second response time.
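The abstract does not spell out the fusion rule, but the core idea of scoring a response against per-class stimulus templates with a linear correlation term and a non-linear similarity term can be illustrated with a toy sketch. All names here, the RBF-style similarity, and the weight `alpha` are hypothetical stand-ins, not the paper's actual spatial filters or temporal alignment:

```python
import numpy as np

def pearson_corr(x, y):
    """Linear correlation between a response and a stimulus template."""
    x, y = x - x.mean(), y - y.mean()
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def csta_score(response, template, alpha=0.5):
    """Hypothetical complementary fusion: weighted sum of a linear
    correlation score and a non-linear (RBF) similarity score."""
    linear = pearson_corr(response, template)
    nonlinear = float(np.exp(-np.linalg.norm(response - template) ** 2 / len(response)))
    return alpha * linear + (1 - alpha) * nonlinear

def classify(response, templates):
    """Pick the stimulus class whose template best matches the response."""
    return max(range(len(templates)), key=lambda i: csta_score(response, templates[i]))
```

With sinusoidal templates at different stimulus frequencies, a response close to one template is assigned to that class.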

Graph Representation Learning via Graphical Mutual Information Maximization Machine Learning

The richness in the content of various information networks, such as social networks and communication networks, provides unprecedented potential for learning high-quality expressive representations without external supervision. This paper investigates how to preserve and extract the abundant information from graph-structured data into embedding space in an unsupervised manner. To this end, we propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations. GMI generalizes the idea of conventional mutual information computation from vector space to the graph domain, where measuring mutual information from the two aspects of node features and topological structure is indispensable. GMI exhibits several benefits: first, it is invariant to isomorphic transformations of input graphs---an inevitable constraint in many existing graph representation learning algorithms; second, it can be efficiently estimated and maximized by current mutual information estimation methods such as MINE; finally, our theoretical analysis confirms its correctness and rationality. With the aid of GMI, we develop an unsupervised learning model trained by maximizing the GMI between the input and output of a graph neural encoder. Extensive experiments on transductive as well as inductive node classification and link prediction demonstrate that our method outperforms state-of-the-art unsupervised counterparts, and sometimes even exceeds the performance of supervised ones.
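To give the estimation step some concrete flavor: MINE-style estimators maximize a variational lower bound on mutual information computed from discriminator scores on "positive" (true feature/embedding) pairs versus "negative" (shuffled) pairs. Below is a minimal numpy sketch of the Jensen-Shannon-type bound commonly used in this family; it is illustrative only and not the paper's exact GMI decomposition:

```python
import numpy as np

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

def jsd_mi_lower_bound(pos_scores, neg_scores):
    """Jensen-Shannon MI lower bound from discriminator scores T:
    E_pos[-softplus(-T)] - E_neg[softplus(T)].
    Maximizing this over the encoder/discriminator pushes true pairs
    apart from shuffled pairs, tightening the MI estimate."""
    return float((-softplus(-pos_scores)).mean() - softplus(neg_scores).mean())
```

The bound ranges from $-2\log 2$ (an uninformative discriminator) up to 0 as the discriminator separates true pairs from shuffled ones.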

On Positive-Unlabeled Classification in GAN Machine Learning

This paper defines a positive and unlabeled (PU) classification problem for standard GANs, which then leads to a novel technique for stabilizing the training of the discriminator. Traditionally, real data are taken as positive while generated data are negative. This positive-negative classification criterion is kept fixed throughout the learning process of the discriminator, without accounting for the gradually improving quality of the generated data, even though generated samples can at times be more realistic than real ones. It is therefore more reasonable to treat the generated data as unlabeled, i.e., as either positive or negative depending on their quality. The discriminator is thus a classifier for this positive and unlabeled classification problem, and we derive a new Positive-Unlabeled GAN (PUGAN). We theoretically discuss the global optimality that the proposed model will achieve and the equivalent optimization goal. Empirically, we find that PUGAN can achieve comparable or even better performance than sophisticated discriminator stabilization methods.
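A standard way to instantiate such a PU objective is the non-negative PU risk estimator (in the spirit of Kiryo et al., 2017), treating real samples as positive and generated samples as unlabeled with an assumed positive-class prior `pi`. The sketch below shows that risk for a discriminator with logistic loss; the exact PUGAN objective may differ:

```python
import numpy as np

def logistic_loss(y, score):
    """l(y, s) = log(1 + exp(-y*s)), computed stably."""
    z = -y * score
    return np.log1p(np.exp(-np.abs(z))) + np.maximum(z, 0.0)

def nn_pu_discriminator_risk(real_scores, gen_scores, pi):
    """Non-negative PU risk: real data are positive, generated data are
    unlabeled with positive prior pi. The max(0, .) clamp prevents the
    estimated negative risk from going below zero."""
    r_pos = pi * logistic_loss(1.0, real_scores).mean()
    r_neg = (logistic_loss(-1.0, gen_scores).mean()
             - pi * logistic_loss(-1.0, real_scores).mean())
    return float(r_pos + max(0.0, r_neg))
```

With `pi = 0` the risk reduces to the usual "generated = negative" discriminator loss, which makes the PU view a strict generalization of the standard criterion.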

Decoupling Learning Rates Using Empirical Bayes Priors Machine Learning

In this work, we propose an Empirical Bayes approach to decouple the learning rates of first-order and second-order features (or any other feature grouping) in a Generalized Linear Model. Such needs arise in small-batch or low-traffic use-cases. As the first-order features are likely to have a more pronounced effect on the outcome, focusing on learning first-order weights first is likely to improve performance and convergence time. Our Empirical Bayes method clamps the features in each group together and uses the observed data for the deployed model to empirically compute a hierarchical prior in hindsight. We apply our method to a standard classification setting, as well as to a contextual bandit setting in an Amazon production system. In both simulations and live experiments, our method shows marked improvements, especially in cases of small traffic. Our findings are promising, as optimizing over sparse data is often a challenge. Furthermore, our approach can be applied to any problem instance modeled within a Bayesian framework.
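One simple way to realize "clamping features in each group together and computing a prior in hindsight" is to estimate a zero-mean Gaussian prior variance per group from the deployed model's observed weights; groups with larger empirical variance can then receive a larger effective learning rate. This is an illustrative sketch under that assumption, not the paper's exact hierarchical computation:

```python
import numpy as np

def empirical_bayes_group_variances(weights, groups):
    """For each feature group, estimate the variance of a zero-mean
    Gaussian prior from the observed (deployed-model) weights in that
    group. The per-group variance can then drive a decoupled learning
    rate: groups whose weights vary more get larger steps."""
    groups = np.asarray(groups)
    return {g: float(np.mean(weights[groups == g] ** 2))
            for g in set(groups.tolist())}
```

On typical data, first-order weights are larger in magnitude than second-order ones, so their empirical prior variance (and hence learning rate) comes out larger.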

Learning Task-Driven Control Policies via Information Bottlenecks Machine Learning

This paper presents a reinforcement learning approach to synthesizing task-driven control policies for robotic systems equipped with rich sensory modalities (e.g., vision or depth). Standard reinforcement learning algorithms typically produce policies that tightly couple control actions to the entirety of the system's state and rich sensor observations. As a consequence, the resulting policies can often be sensitive to changes in task-irrelevant portions of the state or observations (e.g., changing background colors). In contrast, the approach we present here learns to create a task-driven representation that is used to compute control actions. Formally, this is achieved by deriving a policy gradient-style algorithm that creates an information bottleneck between the states and the task-driven representation; this constrains actions to only depend on task-relevant information. We demonstrate our approach in a thorough set of simulation results on multiple examples including a grasping task that utilizes depth images and a ball-catching task that utilizes RGB images. Comparisons with a standard policy gradient approach demonstrate that the task-driven policies produced by our algorithm are often significantly more robust to sensor noise and task-irrelevant changes in the environment.
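The paper derives a policy gradient-style algorithm; a common way such a bottleneck is imposed in variational form is a KL penalty on a stochastic representation $z$, which bounds how much state information reaches the policy. The sketch below shows that generic penalty (assumed diagonal-Gaussian encoder, hypothetical `beta` weight), not the paper's exact objective:

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ): a standard variational
    information-bottleneck penalty limiting the bits that flow from the
    state into the task-driven representation z."""
    return 0.5 * float(np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar))

def ib_objective(task_loss, mu, logvar, beta=0.1):
    # beta trades task performance against information passed through z:
    # larger beta squeezes the bottleneck harder.
    return task_loss + beta * gaussian_kl(mu, logvar)
```

Constraining actions to depend on $z$ alone, with this penalty on $z$, is what discourages sensitivity to task-irrelevant observation changes.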

Apportioned Margin Approach for Cost Sensitive Large Margin Classifiers Machine Learning

We consider the problem of cost sensitive multiclass classification, where we would like to increase the sensitivity of an important class at the expense of a less important one. We adopt an {\em apportioned margin} framework to address this problem, which enables an efficient margin shift between classes that share the same boundary. The decision boundary between each pair of classes divides the margin between them in accordance with a given prioritization vector, which yields a tighter error bound for the important classes while also reducing the overall out-of-sample error. In addition to demonstrating an efficient implementation of our framework, we derive generalization bounds, demonstrate Fisher consistency, adapt the framework to Mercer kernels and to neural networks, and report promising empirical results on all accounts.
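For one pair of classes, the effect can be pictured as shifting the pairwise threshold so that the shared margin is split in proportion to the prioritization vector. This is a hypothetical simplification of the framework, reduced to a single pairwise decision:

```python
def apportioned_decision(score_i, score_j, p_i, p_j, margin=1.0):
    """Illustrative apportioned-margin decision for one class pair:
    the boundary is moved toward the less important class by an amount
    proportional to the priority gap, so the important class keeps the
    larger share of the margin. (Not the paper's full formulation.)"""
    shift = 0.5 * margin * (p_i - p_j) / (p_i + p_j)
    return 0 if score_i - score_j + shift > 0 else 1
```

Ties are broken in favor of the higher-priority class, and with equal priorities the rule degenerates to a plain arg-max.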

Bootstrapping a DQN Replay Memory with Synthetic Experiences Machine Learning

An important component of many Deep Reinforcement Learning algorithms is the Experience Replay, which serves as a storage mechanism, or memory, for the experiences an agent has made. These experiences are used for training and help the agent to stably find good trajectories through the problem space. The classic Experience Replay, however, only makes use of the experiences it has actually collected, even though the stored samples hold great potential in the form of extractable knowledge about the problem. We present an algorithm that creates synthetic experiences in a nondeterministic discrete environment to assist the learner. The resulting Interpolated Experience Replay is evaluated on the FrozenLake environment, and we show that it can help the agent learn faster and even better than the classic version.
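In a nondeterministic discrete environment such as FrozenLake, one plausible interpolation is to average the rewards observed so far for a given (state, action) pair into a synthetic transition. The sketch below follows that reading; the paper's actual scheme may differ:

```python
import random
from collections import defaultdict

class InterpolatedReplay:
    """Replay memory sketch for a nondeterministic discrete environment:
    alongside the real transitions, it can synthesize an experience for a
    (state, action) pair by averaging the rewards observed for that pair."""

    def __init__(self):
        self.real = []                  # all stored real transitions
        self.by_sa = defaultdict(list)  # (s, a) -> observed (r, s', done)

    def store(self, s, a, r, s_next, done):
        self.real.append((s, a, r, s_next, done))
        self.by_sa[(s, a)].append((r, s_next, done))

    def synthesize(self, s, a):
        """Build a synthetic transition for (s, a) from stored knowledge."""
        obs = self.by_sa[(s, a)]
        if not obs:
            return None
        avg_r = sum(r for r, _, _ in obs) / len(obs)
        _, s_next, done = random.choice(obs)  # keep an actually observed successor
        return (s, a, avg_r, s_next, done)
```

Mixing such synthetic samples into training batches is what lets the learner exploit knowledge the buffer already contains but has not explicitly experienced.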

Introduction to quasi-open set semi-supervised learning for big data analytics Machine Learning

State-of-the-art performance and low system complexity have made deep learning an increasingly attractive solution for big data analytics. However, limiting assumptions of end-to-end learning regimes hinder the use of neural networks on large application-grade datasets. This work addresses the assumption that output class-labels are defined for all classes in the domain. The data collected by modern-day sensors span an immense range of potential classes. Therefore, we propose a new learning regime where only some, but not all, classes of the training data are of interest to the classification system. The semi-supervised learning scenario in big data requires the assumption of a partial class mismatch between labelled and unlabelled training data. With classification systems required to classify source classes indicated by labelled samples while separating novel classes indicated by unlabelled samples, we find ourselves in an open-set case (vs. the closed-set case with only source classes). However, introducing samples from novel classes into the training set indicates a more relaxed open-set case. As such, our proposed regime of \textit{quasi-open set semi-supervised learning} is introduced. We propose a suitable method to train under quasi-open set semi-supervised learning that makes use of Wasserstein generative adversarial networks (WGANs). A trained classification certainty estimation within the discriminator (or critic) network is used to enable a reject option for the classifier. By placing a threshold on this certainty estimation, the reject option accepts classifications of source classes and rejects novel classes. Big data end-to-end training is promoted by developing models that recognize that input samples do not necessarily belong to output labels. We believe this is essential for big data analytics, and urge more work under quasi-open set semi-supervised learning.
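The reject option itself is a simple thresholding rule. In the paper the certainty estimate comes from the trained WGAN critic; in this sketch a plain score vector stands in for it:

```python
def classify_with_reject(certainties, threshold=0.5):
    """Reject-option sketch: accept the arg-max source class only when
    its certainty estimate clears the threshold; otherwise flag the
    sample as novel (returned as -1)."""
    best = max(range(len(certainties)), key=certainties.__getitem__)
    return best if certainties[best] >= threshold else -1
```

Raising the threshold trades more rejected (presumed novel) samples for higher precision on the accepted source classes.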

Exploring Structural Inductive Biases in Emergent Communication Machine Learning

Human language and thought are characterized by the ability to systematically generate a potentially infinite number of complex structures (e.g., sentences) from a finite set of familiar components (e.g., words). Recent works in emergent communication have discussed the propensity of artificial agents to develop a systematically compositional language through playing cooperative referential games. The degree of structure in the input data was found to affect the compositionality of the emerged communication protocols. Thus, we explore various structural priors in multi-agent communication and propose a novel graph referential game. We compare the effect of structural inductive bias (bag-of-words, sequences and graphs) on the emergence of compositional understanding of the input concepts measured by topographic similarity and generalization to unseen combinations of familiar properties. We empirically show that graph neural networks induce a better compositional language prior and a stronger generalization to out-of-domain data. We further perform ablation studies that show the robustness of the emerged protocol in graph referential games.
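Topographic similarity, the compositionality measure used above, is typically computed as the correlation between pairwise distances in meaning space and in message space. A small sketch (distance functions are supplied by the caller; ties in distances are not handled specially):

```python
import itertools
import numpy as np

def topographic_similarity(meanings, messages, dist_m, dist_s):
    """Spearman correlation between pairwise distances of meanings and
    pairwise distances of the corresponding messages. Values near 1 mean
    similar meanings get similar messages (a compositional protocol)."""
    pairs = list(itertools.combinations(range(len(meanings)), 2))
    dm = [dist_m(meanings[i], meanings[j]) for i, j in pairs]
    ds = [dist_s(messages[i], messages[j]) for i, j in pairs]
    # Spearman = Pearson correlation of the distance ranks
    rm = np.argsort(np.argsort(dm))
    rs = np.argsort(np.argsort(ds))
    return float(np.corrcoef(rm, rs)[0, 1])
```

When messages mirror meanings exactly, the measure reaches 1; a protocol that scrambles similarity structure drives it toward 0.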

Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise Machine Learning

The linear two-timescale stochastic approximation (SA) scheme is an important class of algorithms which has become popular in reinforcement learning (RL), particularly for the policy evaluation problem. Recently, a number of works have been devoted to establishing the finite-time analysis of the scheme, especially under the Markovian (non-i.i.d.) noise settings that are ubiquitous in practice. In this paper, we provide a finite-time analysis for linear two-timescale SA. Our bounds show that there is no discrepancy in the convergence rate between Markovian and martingale noise; only the constants are affected by the mixing time of the Markov chain. With an appropriate step size schedule, the transient term in the expected error bound is $o(1/k^{c})$ and the steady-state term is $O(1/k)$, where $c>1$ and $k$ is the iteration number. Furthermore, we present an asymptotic expansion of the expected error with a matching lower bound of $\Omega(1/k)$. A simple numerical experiment is presented to support our theory.

Keywords: stochastic approximation, reinforcement learning, GTD learning, Markovian noise

1. Introduction

Since its introduction close to 70 years ago, the stochastic approximation (SA) scheme (Robbins and Monro, 1951) has been a powerful tool for root finding when only noisy samples are available. During the past two decades, considerable progress in the practical and theoretical research of SA has been made; see (Benaïm, 1999; Kushner and Yin, 2003; Borkar, 2008) for an overview. Among others, linear SA schemes are popular in reinforcement learning (RL) as they lead to policy evaluation methods with linear function approximation; of particular importance is temporal difference (TD) learning (Sutton, 1988), for which finite-time analyses have been reported in (Srikant and Ying, 2019; Lakshminarayanan and Szepesvari, 2018; Bhandari et al., 2018; Dalal et al., 2018a).
The TD learning scheme based on classical (linear) SA is known to be inadequate for the off-policy learning paradigms in RL, where data samples are drawn from a behavior policy different from the policy being evaluated (Baird, 1995; Tsitsiklis and Van Roy, 1997). To circumvent this issue, two-timescale algorithms such as GTD learning have been proposed. These methods fall within the scope of the linear two-timescale SA scheme introduced by Borkar (1997):
$$\theta_{k+1} = \theta_k + \beta_k \big\{ b_1(X_{k+1}) - A_{11}(X_{k+1})\,\theta_k - A_{12}(X_{k+1})\,w_k \big\}, \quad (1)$$
$$w_{k+1} = w_k + \gamma_k \big\{ b_2(X_{k+1}) - A_{21}(X_{k+1})\,\theta_k - A_{22}(X_{k+1})\,w_k \big\}.$$
(Authors are listed in alphabetical order.)
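The coupled iterates (1) are easy to simulate on a toy instance. Below, scalar coefficients (illustrative choices satisfying the usual stability conditions) and i.i.d. noise standing in for the Markovian samples $X_{k+1}$ show the slow iterate $\theta_k$ and the fast iterate $w_k$ converging to the fixed point of the linear system:

```python
import numpy as np

# Illustrative scalar instance of the linear two-timescale iterates (1).
a11, a12, b1 = 1.0, 0.5, 1.0
a21, a22, b2 = 0.3, 1.0, 0.5

rng = np.random.default_rng(0)
theta, w = 0.0, 0.0
for k in range(1, 100001):
    beta_k = 1.0 / k             # slow timescale for theta
    gamma_k = 1.0 / k ** 0.66    # faster timescale for w (gamma_k >> beta_k)
    n1, n2 = 0.01 * rng.standard_normal(2)
    theta += beta_k * (b1 - a11 * theta - a12 * w + n1)
    w += gamma_k * (b2 - a21 * theta - a22 * w + n2)

# Fixed point solves b1 = a11*theta + a12*w and b2 = a21*theta + a22*w:
theta_star = (b1 - a12 * b2 / a22) / (a11 - a12 * a21 / a22)
w_star = (b2 - a21 * theta_star) / a22
```

Because $\gamma_k$ dominates $\beta_k$, $w_k$ tracks the minimizer of its own equation given the current $\theta_k$, while $\theta_k$ drifts slowly toward the joint fixed point.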