Collaborating Authors

 Rajpal, Mohit


Hessian-Aware Bayesian Optimization for Decision Making Systems

arXiv.org Artificial Intelligence

Many approaches for optimizing decision-making systems rely on gradient-based methods requiring informative feedback from the environment. However, in the case where such feedback is sparse or uninformative, such approaches may result in poor performance. Derivative-free approaches such as Bayesian Optimization mitigate the dependency on the quality of gradient feedback, but are known to scale poorly in the high-dimensional setting of complex decision-making systems. This problem is exacerbated if the system requires interactions between several actors cooperating to accomplish a shared goal. To address the dimensionality challenge, we propose a compact multi-layered architecture modeling the dynamics of actor interactions through the concept of role. We introduce Hessian-aware Bayesian Optimization to efficiently optimize the multi-layered architecture parameterized by a large number of parameters, and give the first improved regret bound in additive high-dimensional Bayesian Optimization since Mutny & Krause (2018). Our approach shows strong empirical results under malformed or sparse reward.


A Unifying Framework of Bilinear LSTMs

arXiv.org Machine Learning

This paper presents a novel unifying framework of bilinear LSTMs that can represent and utilize the nonlinear interaction of the input features present in sequence datasets for achieving superior performance over a linear LSTM and yet not incur more parameters to be learned. To realize this, our unifying framework allows the expressivity of the linear vs. bilinear terms to be balanced by correspondingly trading off between the hidden state vector size vs. approximation quality of the weight matrix in the bilinear term so as to optimize the performance of our bilinear LSTM, while not incurring more parameters to be learned. We empirically evaluate the performance of our bilinear LSTM in several language-based sequence learning tasks to demonstrate its general applicability. Recurrent neural networks (RNNs) are popularized by their impressive performance in a wide variety of supervised and unsupervised sequence learning tasks, which include language modeling (Merity et al., 2018), statistical machine translation (Bahdanau et al., 2015), and coreference resolution (Lee et al., 2017). Different variants of RNNs such as long short-term memory (LSTM) networks (Hochreiter & Schmidhuber, 1997) and gated recurrent units (Cho et al., 2014) share a common architectural trait of being built by feedforward neural networks connected in a recurrent manner. Typically, a RNN is instantiated by linear neurons coupled with a nonlinear activation function, which constitute its basic building blocks; to be consistent with the literature (Park & Zhu, 1994), we refer to such neurons as linear. This should naturally affect the processing of adjacent words based on context in a nonlinear manner (see Table 2 in Section 4.2).
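
The trade-off between the size of a bilinear weight matrix and the quality of its approximation can be illustrated with a generic low-rank bilinear form. The sketch below is not the paper's formulation, which this abstract does not spell out; the function names, the dimensions, and the rank-r factorization into `U` and `V` are all assumptions chosen for illustration.

```python
import numpy as np

def bilinear_full(x, h, W):
    """Full bilinear term: output k is x^T W[k] h."""
    # W has shape (d_out, d_x, d_h): one d_x-by-d_h matrix per output unit.
    return np.einsum("i,kij,j->k", x, W, h)

def bilinear_low_rank(x, h, U, V):
    """Same term with W[k] factored as sum_r U[k, :, r] V[k, :, r]^T."""
    # U has shape (d_out, d_x, r); V has shape (d_out, d_h, r).
    xu = np.einsum("i,kir->kr", x, U)   # project the input onto r directions
    hv = np.einsum("j,kjr->kr", h, V)   # project the hidden state likewise
    return (xu * hv).sum(axis=1)

# Hypothetical sizes, chosen only to make the parameter counts easy to check.
rng = np.random.default_rng(0)
d_x, d_h, d_out, r = 8, 6, 4, 2
U = rng.normal(size=(d_out, d_x, r))
V = rng.normal(size=(d_out, d_h, r))
W = np.einsum("kir,kjr->kij", U, V)     # exactly rank-r weight tensor

x = rng.normal(size=d_x)
h = rng.normal(size=d_h)

full_params = W.size                    # d_out * d_x * d_h
low_rank_params = U.size + V.size       # d_out * r * (d_x + d_h)
```

Because `W` is built to be exactly rank r here, both functions return the same vector while the factored form stores fewer parameters. With a general `W`, a smaller r would save more parameters at the cost of a coarser approximation, and the saved budget could in principle go to a larger hidden state.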


Neural Networks for Efficient Bayesian Decoding of Natural Images from Retinal Neurons

Neural Information Processing Systems

Decoding sensory stimuli from neural signals can be used to reveal how we sense our physical environment, and is valuable for the design of brain-machine interfaces. However, existing linear techniques for neural decoding may not fully reveal or exploit the fidelity of the neural signal. Here we develop a new approximate Bayesian method for decoding natural images from the spiking activity of populations of retinal ganglion cells (RGCs). We sidestep known computational challenges with Bayesian inference by exploiting artificial neural networks developed for computer vision, enabling fast nonlinear decoding that incorporates natural scene statistics implicitly. We use a decoder architecture that first linearly reconstructs an image from RGC spikes, then applies a convolutional autoencoder to enhance the image. The resulting decoder, trained on natural images and simulated neural responses, significantly outperforms linear decoding, as well as simple point-wise nonlinear decoding. These results provide a tool for the assessment and optimization of retinal prosthesis technologies, and reveal that the retina may provide a more accurate representation of the visual scene than previously appreciated.
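
As a rough illustration of the first stage described above (linear reconstruction of an image from population responses), the following sketch fits a ridge-regularized linear decoder on simulated data. Everything in it is assumed for illustration: the Gaussian "responses" stand in for real spike counts, the sizes and the `ridge` value are arbitrary, and `fit_linear_decoder` is a hypothetical helper. The second stage, the convolutional autoencoder that enhances the linear reconstruction, is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_pixels, n_cells = 500, 16, 40   # hypothetical sizes

# Simulated stimuli and noisy linear "responses" (a stand-in for RGC spikes).
images = rng.normal(size=(n_samples, n_pixels))
encoding = rng.normal(size=(n_pixels, n_cells))
responses = images @ encoding + 0.1 * rng.normal(size=(n_samples, n_cells))

def fit_linear_decoder(responses, images, ridge=1e-3):
    """Ridge regression from responses to pixels: returns (n_cells, n_pixels)."""
    gram = responses.T @ responses + ridge * np.eye(responses.shape[1])
    return np.linalg.solve(gram, responses.T @ images)

decoder = fit_linear_decoder(responses, images)
reconstruction = responses @ decoder          # stage-one image estimate
mse = np.mean((reconstruction - images) ** 2)
```

In the two-stage design the abstract describes, an estimate like `reconstruction` would then be passed to a learned nonlinear model trained to map it closer to the original image.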