Single Headed Attention RNN: Stop Thinking With Your Head

arXiv.org Artificial Intelligence

The leading approaches in language modeling are all obsessed with TV shows of my youth - namely Transformers and Sesame Street. Transformers this, Transformers that, and over here a bonfire worth of GPU-TPU-neuromorphic wafer scale silicon. We opt for the lazy path of old and proven techniques with a fancy crypto inspired acronym: the Single Headed Attention RNN (SHA-RNN). The author's lone goal is to show that the entire field might have evolved in a different direction if we had instead been obsessed with a slightly different acronym and slightly different result. We take a previously strong language model based only on boring LSTMs and get it to within a stone's throw of a stone's throw of state-of-the-art byte-level language model results on enwik8. This work has undergone no intensive hyperparameter optimization and lived entirely on a commodity desktop machine that made the author's small studio apartment far too warm in the midst of a San Franciscan summer. The final results are achievable in plus or minus 24 hours on a single GPU as the author is impatient. The attention mechanism is also readily extended to large contexts with minimal computation. Take that, Sesame Street.


Attention Deep Model with Multi-Scale Deep Supervision for Person Re-Identification

arXiv.org Artificial Intelligence

In recent years, person re-identification (PReID) has become a hot topic in computer vision because it is an important part of intelligent surveillance. Many state-of-the-art PReID methods are attention-based or multi-scale feature learning deep models. However, introducing an attention mechanism may cause some important feature information to be lost. Besides, most multi-scale models embed the multi-scale feature learning block into the feature extraction deep network, which reduces the efficiency of the inference network. To address these issues, in this study we introduce an attention deep architecture with multi-scale deep supervision for PReID. Technically, we contribute a reverse attention block to complement the attention block, and a novel multi-scale layer with a deep supervision operator for training the backbone network. The proposed block and operator are used only for training and are discarded in the test phase. Experiments have been performed on the Market-1501, DukeMTMC-reID and CUHK03 datasets. All the experimental results show that the proposed model significantly outperforms the other competitive state-of-the-art methods.


A Fully Natural Gradient Scheme for Improving Inference of the Heterogeneous Multi-Output Gaussian Process Model

arXiv.org Artificial Intelligence

A recent novel extension of multi-output Gaussian processes handles heterogeneous outputs, assuming that each output has its own likelihood function. It uses a vector-valued Gaussian process prior to jointly model all likelihoods' parameters as latent functions drawn from a Gaussian process with a linear model of coregionalisation covariance. By means of an inducing points framework, the model is able to obtain tractable variational bounds amenable to stochastic variational inference. Nonetheless, the strong conditioning between the variational parameters and the hyper-parameters burdens the adaptive gradient optimisation methods used in the original approach. To overcome this issue, we borrow ideas from variational optimisation, introducing an exploratory distribution over the hyper-parameters that allows inference together with the variational parameters through a fully natural gradient optimisation scheme. We show that our optimisation scheme can reach better local optima, with higher test performance rates, than adaptive gradient methods or a hybrid strategy that partially uses natural gradients in cooperation with the Adam method. We compare the performance of the different methods on toy and real databases.


Context-aware Active Multi-Step Reinforcement Learning

arXiv.org Artificial Intelligence

Reinforcement learning has attracted great attention recently, especially policy gradient algorithms, which have been demonstrated on challenging decision making and control tasks. In this paper, we propose an active multi-step TD algorithm with adaptive stepsizes to learn the actor and critic. Specifically, our model consists of two components: active stepsize learning and an adaptive multi-step TD algorithm. First, we divide the time horizon into chunks and actively select states and actions inside each chunk. Then, given the selected samples, we propose the adaptive multi-step TD, which generalizes TD($\lambda$) but adaptively switches on or off the backups from future returns of different steps. In particular, the adaptive multi-step TD introduces a context-aware mechanism, here a binary classifier, which decides whether or not to turn on its future backups based on the context changes. Thus, our model is a kind of combination of active learning and a multi-step TD algorithm, which has the capacity for learning off-policy without the need for importance sampling. We evaluate our approach on both discrete and continuous space tasks in an off-policy setting, and demonstrate competitive results compared to other reinforcement learning baselines.
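
The idea of switching future backups on or off can be illustrated with a small sketch. This is not the paper's algorithm: it assumes the standard $\lambda$-return recursion with $\lambda$ replaced by a per-step binary gate, and it takes the gates as given inputs rather than producing them with a classifier. The function name `gated_multistep_targets` and its parameters are hypothetical.

```python
from typing import List, Sequence


def gated_multistep_targets(
    rewards: Sequence[float],
    next_values: Sequence[float],
    gates: Sequence[int],
    gamma: float = 0.99,
) -> List[float]:
    """Compute TD targets with a binary switch on each future backup.

    Assumed recursion (the lambda-return with lambda replaced by a gate g_t):
        G_t = r_t + gamma * (g_t * G_{t+1} + (1 - g_t) * V(s_{t+1}))

    rewards[t]     -- reward received at step t
    next_values[t] -- critic estimate V(s_{t+1})
    gates[t]       -- 1 to back up from the longer return, 0 to bootstrap
    The last step always bootstraps from next_values, since no later
    return is available inside the chunk.
    """
    n = len(rewards)
    if not (n == len(next_values) == len(gates)):
        raise ValueError("rewards, next_values and gates must have equal length")

    targets = [0.0] * n
    for t in reversed(range(n)):
        if t == n - 1:
            backup = next_values[t]
        else:
            backup = gates[t] * targets[t + 1] + (1 - gates[t]) * next_values[t]
        targets[t] = rewards[t] + gamma * backup
    return targets
```

With all gates set to 0 this reduces to one-step TD targets, and with all gates set to 1 it becomes the full multi-step return bootstrapped at the end of the chunk.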


Reading The Markets -- Machine Learning Versus The Financial News

#artificialintelligence

Suffice it to say that they are a form of non-linear regression tool whose underlying design found inspiration in a simplification of the basic architecture of the human brain. Many of the great advances that we have experienced in Machine Learning over the last few years make use of neural networks. The basic algorithm has been around for decades -- but it has come into its own as processing power and data availability have steadily increased. For this project we implemented our neural network in Python using the popular TensorFlow library from Google. The characteristics of our neural network, and in particular its complexity, were chosen to balance precision and generalization.


5 Ways FinTech Can Benefit from Machine Learning

#artificialintelligence

The FinTech industry is known to use artificial intelligence for a wide range of purposes. Digital enterprises use it for efficient chatbot response systems. Some businesses offer AI as an assistant for asset management and market analysis. The use cases of AI are widespread across the industry, and we can safely assume that the technology will be used even more. According to Mordor Intelligence, the AI market in fintech is projected to grow beyond $7 billion from only $1.2 billion in 2017.


New machine learning algorithms offer safety and fairness guarantees: New framework for fairer, safer algorithms

#artificialintelligence

Guaranteeing safe and fair machine behavior is still an issue today, says machine learning researcher and lead author Philip Thomas at the University of Massachusetts Amherst. "When someone applies a machine learning algorithm, it's hard to control its behavior," he points out. This risks undesirable outcomes from algorithms that direct everything from self-driving vehicles to insulin pumps to criminal sentencing, say he and co-authors. Writing in Science, Thomas and his colleagues Yuriy Brun, Andrew Barto and graduate student Stephen Giguere at UMass Amherst, Bruno Castro da Silva at the Federal University of Rio Grande do Sul, Brazil, and Emma Brunskill at Stanford University this week introduce a new framework for designing machine learning algorithms that makes it easier for users of the algorithm to specify safety and fairness constraints. "We call algorithms created with our new framework 'Seldonian' after Asimov's character Hari Seldon," Thomas explains.


AI Scalability for the Next Decade

#artificialintelligence

Can it do something meaningful for me today? Which AI opportunities are next? Expanding growth in cloud usage: GPU workstation, GPU server. How to improve? Replicated data; difficult to scale. Lack of security with open-source frameworks and applications introduces risk. Multiple Spark teams, each with dedicated servers: wasted capacity and high administrative overhead.


AI Curricula for K-12 Classrooms

#artificialintelligence

Schools like those in Pennsylvania's Montour School District have mandated AI in the grades 5-8 curriculum, and they are expanding the initiative to other grades as well. Educators have embedded artificial intelligence in STEM courses, and other subjects like Music, Computer Science and Media Arts also include AI in their curricula. Additionally, the district requires its students to take a stand-alone AI Ethics course that teaches students design and values.


How artificial intelligence could transform GI patient care: Dr. William Karnes of Docbot weighs in

#artificialintelligence

William Karnes, MD, is director of the high-risk program and colonoscopy quality at the UCI Health H.H. Chao Comprehensive Digestive Disease Center in Orange, Calif., and chief medical officer of Docbot, a technology that uses artificial intelligence to detect abnormalities from colonoscopy capsule video. Here, Dr. Karnes shares his thoughts with Becker's ASC Review on the future of AI in the gastroenterology specialty, and how the technology could help patients and physicians. Question: Can you tell me a little more about the Docbot technology and how you got involved? Dr. William Karnes: The story goes back to 2012 when I came to UCI and Dr. Chan brought me on to wipe out colon cancer in Orange County. It was a three-pronged approach but one of the most important ones.