Country
Sentiment and Knowledge Based Algorithmic Trading with Deep Reinforcement Learning
Nan, Abhishek, Perumal, Anandh, Zaiane, Osmar R.
Algorithmic trading, due to its inherent nature, is a difficult problem to tackle; there are too many variables involved in the real world which make it almost impossible to have reliable algorithms for automated stock trading. The lack of reliable labelled data that considers physical and physiological factors that dictate the ups and downs of the market, has hindered the supervised learning attempts for dependable predictions. To learn a good policy for trading, we formulate an approach using reinforcement learning which uses traditional time series stock price data and combines it with news headline sentiments, while leveraging knowledge graphs for exploiting news about implicit relationships. Keywords: Reinforcement Learning ยท Trading ยท Stock Price Prediction ยท Sentiment Analysis ยท Knowledge Graph. 1 Introduction Machine learning is mainly about building predictive models from data. When the data are time series, models can also forecast sequences or outcomes.
Multi-task Learning for Voice Trigger Detection
Sigtia, Siddharth, Clark, Pascal, Haynes, Rob, Richards, Hywel, Bridle, John
We describe the design of a voice trigger detection system for smart speakers. In this study, we address two major challenges. The first is that the detectors are deployed in complex acoustic environments with external noise and loud playback by the device itself. Secondly, collecting training examples for a specific keyword or trigger phrase is challenging resulting in a scarcity of trigger phrase specific training data. We describe a two-stage cascaded architecture where a low-power detector is always running and listening for the trigger phrase. If a detection is made at this stage, the candidate audio segment is re-scored by larger, more complex models to verify that the segment contains the trigger phrase. In this study, we focus our attention on the architecture and design of these second-pass detectors. We start by training a general acoustic model that produces phonetic transcriptions given a large labelled training dataset. Next, we collect a much smaller dataset of examples that are challenging for the baseline system. We then use multi-task learning to train a model to simultaneously produce accurate phonetic transcriptions on the larger dataset \emph{and} discriminate between true and easily confusable examples using the smaller dataset. Our results demonstrate that the proposed model reduces errors by half compared to the baseline in a range of challenging test conditions \emph{without} requiring extra parameters.
Reproducibility Challenge NeurIPS 2019 Report on "Competitive Gradient Descent"
Authors suggest their method is a natural generalization of gradient descent to the two-player scenario where the update is given by the Nash equilibrium of a regularized bilinear local approximation of the underlying game. It avoids oscillatory and divergent behaviors seen in alternating gradient descent. The paper proposes several experiments to establish the robustness of their method. This project aims at replicating their results. The paper provides a detailed comparison to methods based on optimism and consensus on the properties of convergence and stability of various discussed methods using numerical experiments and rigorous analysis. In order to understand these terms, comparison and proposed method and examine the results of the experiments, next section gives a necessary background of the original paper. 2 Background The traditional optimization is concerned with a single agent trying to optimize a cost function. It can be seen as min x R m f ( x). The agent has a clear objective to find ("Good local") minimum of f . Gradeint Descent (and its varients) are reliable Algorithmic Baseline for this purpose.
Multi-task Learning for Speaker Verification and Voice Trigger Detection
Sigtia, Siddharth, Marchi, Erik, Kajarekar, Sachin, Naik, Devang, Bridle, John
Automatic speech transcription and speaker recognition are usually treated as separate tasks even though they are interdependent. In this study, we investigate training a single network to perform both tasks jointly. We train the network in a supervised multi-task learning setup, where the speech transcription branch of the network is trained to minimise a phonetic connectionist temporal classification (CTC) loss while the speaker recognition branch of the network is trained to label the input sequence with the correct label for the speaker. We present a large-scale empirical study where the model is trained using several thousand hours of labelled training data for each task. We evaluate the speech transcription branch of the network on a voice trigger detection task while the speaker recognition branch is evaluated on a speaker verification task. Results demonstrate that the network is able to encode both phonetic \emph{and} speaker information in its learnt representations while yielding accuracies at least as good as the baseline models for each task, with the same number of parameters as the independent models.
Naive Exploration is Optimal for Online LQR
Simchowitz, Max, Foster, Dylan J.
We consider the problem of online adaptive control of the linear quadratic regulator, where the true system parameters are unknown. We prove new upper and lower bounds demonstrating that the optimal regret scales as $\widetilde{\Theta}({\sqrt{d_{\mathbf{u}}^2 d_{\mathbf{x}} T}})$, where $T$ is the number of time steps, $d_{\mathbf{u}}$ is the dimension of the input space, and $d_{\mathbf{x}}$ is the dimension of the system state. Notably, our lower bounds rule out the possibility of a $\mathrm{poly}(\log{}T)$-regret algorithm, which has been conjectured due to the apparent strong convexity of the problem. Our upper bounds are attained by a simple variant of \emph{certainty equivalence control}, where the learner selects control inputs according to the optimal controller for their estimate of the system while injecting exploratory random noise. While this approach was shown to achieve $\sqrt{T}$-regret by Mania et al. 2019, we show that if the learner continually refines their estimates of the system matrices, the method attains optimal dimension dependence as well. Central to our upper and lower bounds is a new approach for controlling perturbations of Ricatti equations, which we call the \emph{self-bounding ODE method}. The approach enables regret upper bounds which hold for \emph{any stabilizable instance}, require no foreknowledge of the system except for a single stabilizing controller, and scale with natural control-theoretic quantities.
Heterogeneous Learning from Demonstration
Paleja, Rohan, Gombolay, Matthew
--The development of human-robot systems able to leverage the strengths of both humans and their robotic counterparts has been greatly sought after because of the foreseen, broad-ranging impact across industry and research. We believe the true potential of these systems cannot be reached unless the robot is able to act with a high level of autonomy, reducing the burden of manual tasking or teleoperation. T o achieve this level of autonomy, robots must be able to work fluidly with its human partners, inferring their needs without explicit commands. This inference requires the robot to be able to detect and classify the heterogeneity of its partners. We propose a framework for learning from heterogeneous demonstration based upon Bayesian inference and evaluate a suite of approaches on a real-world dataset of gameplay from StarCraft II. This evaluation provides evidence that our Bayesian approach can outperform conventional methods by up to 12.8 % . 1 Index T erms--Learning from Demonstration; Human-Robot Interaction; Human-Robot T eaming; Deep Learning I.
Comprehensive Analysis of Time Series Forecasting Using Neural Networks
Tadayon, Manie, Iwashita, Yumi
Time series forecasting has gained lots of attention recently; this is because many real-world phenomena can be modeled as time series. The massive volume of data and recent advancements in the processing power of the computers enable researchers to develop more sophisticated machine learning algorithms such as neural networks to forecast the time series data. In this paper, we propose various neural network architectures to forecast the time series data using the dynamic measurements; moreover, we introduce various architectures on how to combine static and dynamic measurements for forecasting. We also investigate the importance of performing techniques such as anomaly detection and clustering on forecasting accuracy. Our results indicate that clustering can improve the overall prediction time as well as improve the forecasting performance of the neural network. Furthermore, we show that feature-based clustering can outperform the distance-based clustering in terms of speed and efficiency. Finally, our results indicate that adding more predictors to forecast the target variable will not necessarily improve the forecasting accuracy.
Learning the Hypotheses Space from data Part I: Learning Space and U-curve Property
Marcondes, Diego, Simonis, Adilson, Barrera, Junior
The agnostic PAC learning model consists of: a Hypothesis Space $\mathcal{H}$, a probability distribution $P$, a sample complexity function $m_{\mathcal{H}}(\epsilon,\delta): [0,1]^{2} \mapsto \mathbb{Z}_{+}$ of precision $\epsilon$ and confidence $1 - \delta$, a finite i.i.d. sample $\mathcal{D}_{N}$, a cost function $\ell$ and a learning algorithm $\mathbb{A}(\mathcal{H},\mathcal{D}_{N})$, which estimates $\hat{h} \in \mathcal{H}$ that approximates a target function $h^{\star} \in \mathcal{H}$ seeking to minimize out-of-sample error. In this model, prior information is represented by $\mathcal{H}$ and $\ell$, while problem solution is performed through their instantiation in several applied learning models, with specific algebraic structures for $\mathcal{H}$ and corresponding learning algorithms. However, these applied models use additional important concepts not covered by the classic PAC learning theory: model selection and regularization. This paper presents an extension of this model which covers these concepts. The main principle added is the selection, based solely on data, of a subspace of $\mathcal{H}$ with a VC-dimension compatible with the available sample. In order to formalize this principle, the concept of Learning Space $\mathbb{L}(\mathcal{H})$, which is a poset of subsets of $\mathcal{H}$ that covers $\mathcal{H}$ and satisfies a property regarding the VC dimension of related subspaces, is presented as the natural search space for model selection algorithms. A remarkable result obtained on this new framework are conditions on $\mathbb{L}(\mathcal{H})$ and $\ell$ that lead to estimated out-of-sample error surfaces, which are true U-curves on $\mathbb{L}(\mathcal{H})$ chains, enabling a more efficient search on $\mathbb{L}(\mathcal{H})$. Hence, in this new framework, the U-curve optimization problem becomes a natural component of model selection algorithms.
Imperfect ImaGANation: Implications of GANs Exacerbating Biases on Facial Data Augmentation and Snapchat Selfie Lenses
Jain, Niharika, Olmo, Alberto, Sengupta, Sailik, Manikonda, Lydia, Kambhampati, Subbarao
Recently, the use of synthetic data generated by GANs has become a popular method to do data augmentation for many applications. While practitioners celebrate this as an economical way to obtain synthetic data for training data-hungry machine learning models, it is not clear that they recognize the perils of such an augmentation technique when applied to an already-biased dataset. Although one expects GANs to replicate the distribution of the original data, in real-world settings with limited data and finite network capacity, GANs suffer from mode collapse. Especially when this data is coming from online social media platforms or the web which are never balanced. In this paper, we show that in settings where data exhibits bias along some axes (eg. More often than not, this bias is unavoidable; we empirically demonstrate that given input of a dataset of headshots of engineering faculty collected from 47 online university directory webpages in the United States is biased toward white males, a state-of-the-art (unconditional variant of) GAN "imagines" faces of synthetic engineering professors that have masculine facial features and white skin color (inferred using human studies and a state-of- the-art gender recognition system). We also conduct a preliminary case study to highlight how Snapchat's explosively popular "female" filter (widely accepted to use a conditional variant of GAN), ends up consistently lightening the skin tones in women of color when trying to make face images appear more feminine. Our study is meant to serve as a cautionary tale for the lay practitioners who may unknowingly increase the bias in their training data by using GAN-based augmentation techniques with web data and to showcase the dangers of using biased datasets for facial applications. Introduction Breakthroughs in deep learning for image recognition have heralded significant progress in the field of computer vision, but one of the greatest limitations of the technology still remains: classifiers require massive amounts of training data to recognize meaningful patterns. As practitioners struggle to coax classifiers into generalizing to the underlying real-world distribution instead of overfitting to the sparse data, several data augmentation techniques have emerged as a useful form of regularization to increase training sample size and thereby, accuracy during test-time. The simplest of these augmentation techniques perform affine transformations on existing samples in the data, such as rotation, zooming, translation etc. (O'Gorman and Kasturi 1995; Bloice, Stocker, and Holzinger 2017). Ideally, the transformed samples should be representative of the same real-world distribution p dataas the original train and test sets.
Markov-Chain Monte Carlo Approximation of the Ideal Observer using Generative Adversarial Networks
Zhou, Weimin, Anastasio, Mark A.
The Ideal Observer (IO) performance has been advocated when optimizing medical imaging systems for signal detection tasks. However, analytical computation of the IO test statistic is generally intractable. To approximate the IO test statistic, sampling-based methods that employ Markov-Chain Monte Carlo (MCMC) techniques have been developed. However, current applications of MCMC techniques have been limited to several object models such as a lumpy object model and a binary texture model, and it remains unclear how MCMC methods can be implemented with other more sophisticated object models. Deep learning methods that employ generative adversarial networks (GANs) hold great promise to learn stochastic object models (SOMs) from image data. In this study, we described a method to approximate the IO by applying MCMC techniques to SOMs learned by use of GANs. The proposed method can be employed with arbitrary object models that can be learned by use of GANs, thereby the domain of applicability of MCMC techniques for approximating the IO performance is extended. In this study, both signal-known-exactly (SKE) and signal-known-statistically (SKS) binary signal detection tasks are considered. The IO performance computed by the proposed method is compared to that computed by the conventional MCMC method. The advantages of the proposed method are discussed.