Africa
On the Reduction of Variance and Overestimation of Deep Q-Learning
Sabry, Mohammed, Khalifa, Amr M. A.
The breakthrough of deep Q-Learning on different types of environments revolutionized the algorithmic design of Reinforcement Learning, motivating more stable and robust algorithms. To that end, many extensions to the deep Q-Learning algorithm have been proposed to reduce the variance of the target values and the overestimation phenomenon. In this paper, we examine a new methodology for addressing these issues: we propose using Dropout techniques in the deep Q-Learning algorithm as a way to reduce variance and overestimation. We further present experiments on some benchmark environments that demonstrate a significant improvement in the stability of performance and a reduction in variance and overestimation.
Emergent properties of the local geometry of neural loss landscapes
Fort, Stanislav, Ganguli, Surya
Stanford University, Stanford, CA, USA
Abstract: The local geometry of high dimensional neural network loss landscapes can both challenge our cherished theoretical intuitions as well as dramatically impact the practical success of neural network training. Indeed recent works have observed 4 striking local properties of neural loss landscapes on classification tasks: (1) the landscape exhibits exactly C directions of high positive curvature, where C is the number of classes; (2) gradient directions are largely confined to this extremely low dimensional subspace of positive Hessian curvature, leaving the vast majority of directions in weight space unexplored; (3) gradient descent transiently explores intermediate regions of higher positive curvature before eventually finding flatter minima; (4) training can be successful even when confined to low dimensional random affine hyperplanes, as long as these hyperplanes intersect a Goldilocks zone of higher than average curvature. We develop a simple theoretical model of gradients and Hessians, justified by numerical experiments on architectures and datasets used in practice, that simultaneously accounts for all 4 of these surprising and seemingly unrelated properties. Our unified model provides conceptual insights into the emergence of these properties and makes connections with diverse topics in neural networks, random matrix theory, and spin glasses, including the neural tangent kernel, BBP phase transitions, and Derrida's random energy model.
1 Introduction
The geometry of neural network loss landscapes and the implications of this geometry for both optimization and generalization have been subjects of intense interest in many works, ranging from studies on the lack of local minima at significantly higher loss than that of the global minimum [1, 2] to studies debating relations between the curvature of local minima and their generalization properties [3, 4, 5, 6]. Fundamentally, the neural network loss landscape is a scalar loss function over a very high D dimensional parameter space that could depend a priori in highly nontrivial ways on the very structure of real-world data itself as well as intricate properties of the neural network architecture. Moreover, the regions of this loss landscape explored by gradient descent could themselves have highly atypical geometric properties relative to randomly chosen points in the landscape.
Flood Detection On Low Cost Orbital Hardware
Mateo-Garcia, Gonzalo, Oprea, Silviu, Smith, Lewis, Veitch-Michaelis, Josh, Schumann, Guy, Gal, Yarin, Baydin, Atılım Güneş, Backes, Dietmar
Satellite imaging is a critical technology for monitoring and responding to natural disasters such as flooding. Despite the capabilities of modern satellites, there is still much to be desired from the perspective of first response organisations like UNICEF. Two main challenges are rapid access to data, and the ability to automatically identify flooded regions in images. We describe a prototypical flood segmentation system, identifying cloud, water and land, that could be deployed on a constellation of small satellites, performing processing on board to reduce downlink bandwidth by 2 orders of magnitude. We target PhiSat-1, part of the FSSCAT mission, which is planned to be launched by the European Space Agency (ESA) near the start of 2020 as a proof of concept for this new technology.
Disruptive Innovation - You Cannot Stop Progress
My wife's father was a perfect mix of Walter Matthau and Fred Flintstone. He didn't talk much, so when he did, you tended to listen. My wife's name is Kate, but he called her Kay. When we first met, in the mid-1970s, I remember Ivan asking her about our relationship: "Now Kay, I always raised you to make the right decisions; now are you sure that you want to marry this guy?" There was something to his statement because, at that time, I didn't have two nickels to rub together and my prospects didn't look good.
Artificial Intelligence: African Women In Tech Turn To Artificial Intelligence
Artificial intelligence took center stage as African female technology experts met at Women in Tech Week in Ghana to promote women's involvement in the field. When Lily Edinam Botsyoe was studying computer science at a university in Ghana, students wrote programming codes on a whiteboard because there were not enough computers. This made it difficult to apply the coding skills they were learning, she says, and the problem continues today. "We have students coming out of schools having the theoretical background -- which is very important because you can't actually appreciate something practical if you don't have the theory. But, the industry-ready skills is lacking because they didn't have the hands-on experience," Botsyoe said.
Global Cognitive Computing Market Remarkable Growth Factors, New Innovations Of Leading Players & Forecast Till 2028 - Market Newsmirror
The Cognitive Computing Market report covers the leading advancements and technological upgrades, helping readers make sound business decisions, define their priority growth plans, and implement the necessary actions. The global Cognitive Computing Market report also offers a detailed summary of key players and their manufacturing procedures, with statistical data and in-depth analysis of their products, contributions, and revenue. All information in the report is sourced and verified by our expert team and collated with precision. To give a broad overview of current global market trends and the strategies of key businesses, we present the information in graphical formats such as graphs and pie charts.
Deep Probabilistic Kernels for Sample-Efficient Learning
Mallick, Ankur, Dwivedi, Chaitanya, Kailkhura, Bhavya, Joshi, Gauri, Han, T. Yong-Jin
Gaussian Processes (GPs) with an appropriate kernel are known to provide accurate predictions and uncertainty estimates even with very small amounts of labeled data. However, GPs are generally unable to learn a good representation that can encode intricate structures in high dimensional data. The representation power of GPs depends heavily on kernel functions used to quantify the similarity between data points. Traditional GP kernels are not very effective at capturing similarity between high dimensional data points, while methods that use deep neural networks to learn a kernel are not sample-efficient. To overcome these drawbacks, we propose deep probabilistic kernels which use a probabilistic neural network to map high-dimensional data to a probability distribution in a low dimensional subspace, and leverage the rich work on kernels between distributions to capture the similarity between these distributions. Experiments on a variety of datasets show that building a GP using this covariance kernel solves the conflicting problems of representation learning and sample efficiency. Our model can be extended beyond GPs to other small-data paradigms such as few-shot classification where we show competitive performance with state-of-the-art models on the mini-Imagenet dataset.
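As a minimal illustration of what a kernel between distributions can look like, the sketch below computes the expected likelihood kernel (a probability product kernel) between two one-dimensional Gaussians, which has the closed form k(p, q) = N(mu_p - mu_q; 0, var_p + var_q). The choice of kernel, the function name, and the example values are assumptions made for illustration; the abstract does not say which distribution kernel the authors use.

```python
import math


def expected_likelihood_kernel(mu_p, var_p, mu_q, var_q):
    """Integral of p(x) * q(x) dx for two 1-D Gaussians p and q."""
    var_sum = var_p + var_q
    diff = mu_p - mu_q
    return math.exp(-0.5 * diff * diff / var_sum) / math.sqrt(2.0 * math.pi * var_sum)


# Hypothetical embeddings: each input is mapped to a (mean, variance) pair.
near = expected_likelihood_kernel(0.0, 1.0, 0.1, 1.0)
far = expected_likelihood_kernel(0.0, 1.0, 3.0, 1.0)
print(near > far)  # distributions that overlap more receive a larger kernel value
```

A Gram matrix built from such values could then serve as a GP covariance, since the kernel is an inner product between densities and is therefore positive semi-definite.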
If dropout limits trainable depth, does critical initialisation still matter? A large-scale statistical analysis on ReLU networks
Pretorius, Arnu, van Biljon, Elan, van Niekerk, Benjamin, Eloff, Ryan, Reynard, Matthew, James, Steve, Rosman, Benjamin, Kamper, Herman, Kroon, Steve
Recent work in signal propagation theory has shown that dropout limits the depth to which information can propagate through a neural network. In this paper, we investigate the effect of initialisation on training speed and generalisation for ReLU networks within this depth limit. We ask the following research question: given that critical initialisation is crucial for training at large depth, if dropout limits the depth at which networks are trainable, does initialising critically still matter? We conduct a large-scale controlled experiment, and perform a statistical analysis of over $12000$ trained networks. We find that (1) trainable networks show no statistically significant difference in performance over a wide range of non-critical initialisations; (2) for initialisations that show a statistically significant difference, the net effect on performance is small; (3) only extreme initialisations (very small or very large) perform worse than criticality. These findings also apply to standard ReLU networks of moderate depth as a special case of zero dropout. Our results therefore suggest that, in the shallow-to-moderate depth setting, critical initialisation provides zero performance gains when compared to off-critical initialisations and that searching for off-critical initialisations that might improve training speed or generalisation is likely to be a fruitless endeavour.
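For intuition about what "critical" means here, the sketch below iterates a standard mean-field variance recursion for a ReLU layer followed by inverted dropout. With zero biases, weights of variance sigma_w^2 / fan_in, and keep probability p, the pre-activation variance evolves as q_next = sigma_w^2 * q / (2 * p), so it is preserved only at sigma_w^2 = 2 * p. The recursion, the function name, and the example values are assumptions made for illustration and are not taken from the abstract.

```python
def propagate_variance(q0, sigma_w_sq, keep_prob, depth):
    """Mean-field pre-activation variance after `depth` ReLU + dropout layers.

    Assumes zero biases, zero-mean Gaussian pre-activations, weights with
    variance sigma_w_sq / fan_in, and inverted dropout (kept activations
    are rescaled by 1 / keep_prob).
    """
    q = q0
    for _ in range(depth):
        # E[relu(h)^2] = q / 2; dropout multiplies the second moment by 1 / keep_prob.
        q = sigma_w_sq * q / (2.0 * keep_prob)
    return q


keep_prob = 0.8
critical = 2.0 * keep_prob  # sigma_w^2 that keeps the variance fixed under these assumptions
for sigma_w_sq in (0.5 * critical, critical, 1.5 * critical):
    print(sigma_w_sq, propagate_variance(1.0, sigma_w_sq, keep_prob, depth=20))
```

Below the critical value the variance shrinks geometrically with depth, and above it the variance grows geometrically. At shallow depth the difference stays modest, which is the regime the abstract is concerned with.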
8 Platforms You Can Use To Build Mobile Deep Learning Solutions
Deep Learning has made several breakthroughs in recent years. Compared to traditional computation platforms, it has become more sophisticated and advanced than ever. Smart homes, intelligent personal assistants, and similar applications are some of the major breakthroughs of the present era. In this article, we list 8 platforms that can be used to build mobile deep learning solutions. Facebook's open-source deep learning framework, Caffe2, is a lightweight, modular, and scalable framework that provides an easy way to experiment with deep learning models and algorithms. The framework comes with native Python and C++ APIs that work interchangeably and integrates with Android Studio, Microsoft Visual Studio, or Xcode for mobile development.
AI Can Be Trained To Independently Make Scientific Predictions Based On Previous Knowledge
If you're a lover of coffee, it will come as unpleasant news that the price of coffee could potentially spike in the near future. Climate change and deforestation are threatening some of the biggest coffee species in the world, but AI could potentially help keep coffee relatively affordable. The combined forces of deforestation and climate change are threatening the production of many species of coffee, including the common Arabica species, which can be found in many of the most prolific blends and brews. Coffee farmers around the globe are having to deal with rising temperatures and the problems that are associated with them, such as periods of drought. One recent study published in the journals Global Change Biology and Science Advances found that there were substantial risks to many wild coffee species, with around 60% of 124 different wild coffee species being vulnerable to extinction.