Oceania
Stable Rank Normalization for Improved Generalization in Neural Networks and GANs
Sanyal, Amartya, Torr, Philip H. S., Dokania, Puneet K.
Exciting new work on generalization bounds for neural networks (NN), given by Neyshabur et al. and Bartlett et al., shows that these bounds depend closely on two parameter-dependent quantities: the Lipschitz constant upper-bound and the stable rank (a softer version of the rank operator). This leads to an interesting question of whether controlling these quantities might improve the generalization behaviour of NNs. To this end, we propose stable rank normalization (SRN), a novel, optimal, and computationally efficient weight-normalization scheme which minimizes the stable rank of a linear operator. Surprisingly, we find that SRN, in spite of being a non-convex problem, can be shown to have a unique optimal solution. Moreover, we show that SRN allows control of the data-dependent empirical Lipschitz constant, which, in contrast to the Lipschitz upper-bound, reflects the true behaviour of a model on a given dataset. We provide thorough analyses to show that SRN, when applied to the linear layers of a NN for classification, provides striking improvements, 11.3% on the generalization gap compared to the standard NN, along with a significant reduction in memorization. When applied to the discriminator of GANs (called SRN-GAN), it improves Inception, FID, and Neural divergence scores on the CIFAR 10/100 and CelebA datasets, while learning mappings with low empirical Lipschitz constants.
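As a point of reference, the stable rank of a matrix W is usually defined as the squared Frobenius norm divided by the squared spectral norm, which always lies between 1 and rank(W). The sketch below only computes that quantity; it is not the SRN procedure from the paper, which the abstract does not spell out. The function names, the power-iteration routine, and the iteration count are illustrative assumptions.

```python
import math
import random


def frobenius_sq(W):
    """Squared Frobenius norm: sum of squared entries."""
    return sum(x * x for row in W for x in row)


def spectral_norm_sq(W, iters=200, seed=0):
    """Estimate the largest eigenvalue of W^T W by power iteration.

    Assumption: a random start vector is not orthogonal to the top
    singular direction, which holds with probability one.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(W), len(W[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    lam = 0.0
    for _ in range(iters):
        u = [sum(w * x for w, x in zip(row, v)) for row in W]   # u = W v
        lam = sum(x * x for x in u)                             # Rayleigh quotient v^T W^T W v
        z = [sum(W[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # z = W^T u
        norm = math.sqrt(sum(x * x for x in z))
        if norm == 0.0:
            break
        v = [x / norm for x in z]
    return lam


def stable_rank(W):
    """Stable rank = ||W||_F^2 / ||W||_2^2 for a nonzero matrix W."""
    top = spectral_norm_sq(W)
    if top == 0.0:
        raise ValueError("stable rank is undefined for a zero matrix")
    return frobenius_sq(W) / top
```

For an identity matrix the stable rank equals the ordinary rank, while for a matrix dominated by one singular direction it approaches 1, which is the sense in which it is a "softer" rank.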
Macro-action Multi-timescale Dynamic Programming for Energy Management with Phase Change Materials
Rahimpour, Zahra, Verbic, Gregor, Chapman, Archie C.
This paper focuses on home energy management systems (HEMS) in buildings that have controllable HVAC systems and use phase change material (PCM) as an energy storage system. In this setting, optimally operating an HVAC system is a challenge because of the nonlinear and non-convex characteristics of the PCM, which make the corresponding optimization problem impractical to solve with methods commonly used in HEMS. Instead, we use dynamic programming (DP) to deal with the nonlinear features of PCM. However, DP suffers from the curse of dimensionality. Given this drawback, this paper proposes a novel methodology to reduce the computational burden of the DP algorithm in HEMS optimisation with PCM, while maintaining the quality of the solution. Specifically, the method incorporates approaches from sequential decision making in artificial intelligence, including macro-action and multi-time scale abstractions, coupled with an underlying state-space approximation to reduce state-space and action-space size. The method is demonstrated on an energy management problem for a typical residential building located in Sydney for four seasonal weather conditions. Our results demonstrate that the proposed method performs well with an attractive computational cost. In particular, it achieves a significant speed-up over directly applying DP to the problem, running up to 12,900 times faster.
Futuristic rifle with 'Google Maps for drones' software
A defence company has invented a new futuristic 'rifle' that stops rogue drones by hacking into them and forcing them to fly back to their pilots. DroneShield has developed software similar to 'Google Maps' for drones that instantly locates any drones and sends them back to their pilots. The firm has previously worked with the British Army and provided assistance to the 2018 Korean Winter Olympics, and its tech is in use at airports. CEO Oleg Vornik remains tight-lipped on the exact cost of the system, but confirmed it ranges from five to seven figures. Mr Vornik also says the system could be used to protect airports from drone incursions, such as the one that brought chaos to Gatwick Airport, bringing it to a standstill for 33 hours before Christmas.
Likelihood-free approximate Gibbs sampling
Rodrigues, G. S., Nott, D. J., Sisson, S. A.
Likelihood-free methods refer to procedures that perform likelihood-based statistical inference, but without direct evaluation of the likelihood function. This is attractive when the likelihood function is computationally prohibitive to evaluate due to dataset size or model complexity, or when the likelihood function is only known through a data generation process. Some classes of likelihood-free methods include pseudo-marginal methods (Beaumont 2003; Andrieu and Roberts 2009), indirect inference (Gourieroux et al. 1993) and approximate Bayesian computation (Sisson et al. 2018a). In particular, approximate Bayesian computation (ABC) methods form an approximation to the computationally intractable posterior distribution by firstly sampling parameter vectors from the prior, and conditional on these, generating synthetic datasets under the model. The parameter vectors are then weighted by how well a vector of summary statistics of the synthetic datasets matches the same summary statistics of the observed data. ABC methods have seen extensive application and development over the past 15 years.
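The ABC recipe described above (draw parameters from the prior, simulate synthetic data, weight each draw by how closely its summary statistics match the observed ones) can be sketched on a toy problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian model with unit variance, the uniform prior on [-5, 5], the sample mean as summary statistic, the Gaussian weighting kernel, and the bandwidth `eps`. It is plain ABC, not the approximate Gibbs sampler the title refers to.

```python
import math
import random


def abc_posterior_mean(observed, n_draws=20000, eps=0.1, seed=0):
    """Kernel-weighted ABC estimate of the posterior mean of theta.

    Toy model (assumed): observations are N(theta, 1), prior is U(-5, 5).
    """
    rng = random.Random(seed)
    n = len(observed)
    s_obs = sum(observed) / n                 # summary statistic of the observed data
    weight_sum = 0.0
    weighted_theta_sum = 0.0
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)        # 1. sample a parameter from the prior
        synthetic = [rng.gauss(theta, 1.0) for _ in range(n)]  # 2. simulate a dataset
        s_sim = sum(synthetic) / n            # 3. same summary statistic on synthetic data
        # 4. weight by closeness of summaries (Gaussian kernel, bandwidth eps)
        w = math.exp(-0.5 * ((s_sim - s_obs) / eps) ** 2)
        weight_sum += w
        weighted_theta_sum += w * theta
    return weighted_theta_sum / weight_sum
```

The likelihood is never evaluated; only the ability to simulate from the model is used. A smaller `eps` gives a closer approximation to the posterior but leaves fewer draws with non-negligible weight.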
SALT: Subspace Alignment as an Auxiliary Learning Task for Domain Adaptation
Thopalli, Kowshik, Thiagarajan, Jayaraman J., Anirudh, Rushil, Turaga, Pavan
Unsupervised domain adaptation aims to transfer and adapt knowledge learned from a labeled source domain to an unlabeled target domain. Key components of unsupervised domain adaptation include: (a) maximizing performance on the source, and (b) aligning the source and target domains. Traditionally, these tasks have either been considered as separate, or assumed to be implicitly addressed together with high-capacity feature extractors. In this paper, we advance a third broad approach, which we term SALT. The core idea is to consider alignment as an auxiliary task to the primary task of maximizing performance on the source. The auxiliary task is made rather simple by assuming a tractable data geometry in the form of subspaces. We synergistically allow certain parameters derived from the closed-form auxiliary solution to be affected by gradients from the primary task. The proposed approach represents a unique fusion of geometric and model-based alignment with gradient-flows from a data-driven primary task. SALT is simple, rooted in theory, and outperforms state-of-the-art on multiple standard benchmarks.
Weight Agnostic Neural Networks
Not all neural network architectures are created equal; some perform much better than others for certain tasks. But how important are the weight parameters of a neural network compared to its architecture? In this work, we question to what extent neural network architectures alone, without learning any weight parameters, can encode solutions for a given task. We propose a search method for neural network architectures that can already perform a task without any explicit weight training. To evaluate these networks, we populate the connections with a single shared weight parameter sampled from a uniform random distribution, and measure the expected performance. We demonstrate that our method can find minimal neural network architectures that can perform several reinforcement learning tasks without weight training. On a supervised learning domain, we find network architectures that achieve much higher than chance accuracy on MNIST using random weights.
DataLearner: A Data Mining and Knowledge Discovery Tool for Android Smartphones and Tablets
Yates, Darren, Islam, Md Zahidul, Gao, Junbin
Smartphones have become the ultimate 'personal' computer, yet despite this, general-purpose data mining and knowledge discovery tools for mobile devices are surprisingly rare. DataLearner is a new data mining application designed specifically for Android devices that imports the Weka data mining engine and augments it with algorithms developed by Charles Sturt University. Moreover, DataLearner can be expanded with additional algorithms. Combined, DataLearner delivers 40 classification, clustering and association rule mining algorithms for model training and evaluation without need for cloud computing resources or network connectivity. It provides the same classification accuracy as PCs and laptops, while doing so with acceptable processing speed and consuming negligible battery life. With its ability to provide easy-to-use data mining on a phone-size screen, DataLearner is a new portable, self-contained data mining tool for remote, personalised and educational applications alike. DataLearner features four elements: this paper, the app available on Google Play, the GPL3-licensed source code on GitHub and a short video on YouTube.
Four Things Everyone Should Know to Improve Batch Normalization
Summers, Cecilia, Dinneen, Michael J.
A key component of most neural network architectures is the use of normalization layers, such as Batch Normalization. Despite its common use and large utility in optimizing deep architectures that are otherwise intractable, it has been challenging both to generically improve upon Batch Normalization and to understand specific circumstances that lend themselves to other enhancements. In this paper, we identify four improvements to the generic form of Batch Normalization and the circumstances under which they work, yielding performance gains across all batch sizes while requiring no additional computation during training. These contributions include proposing a method for reasoning about the current example in inference normalization statistics which fixes a training vs. inference discrepancy; recognizing and validating the powerful regularization effect of Ghost Batch Normalization for small and medium batch sizes; examining the effect of weight decay regularization on the scaling and shifting parameters γ and β; and identifying a new normalization algorithm for very small batch sizes by combining the strengths of Batch and Group Normalization.
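Ghost Batch Normalization, as the technique is commonly described, computes normalization statistics over smaller "ghost" sub-batches rather than over the full batch. The sketch below illustrates only that idea for training-time statistics; the function name, the list-of-lists input layout, and the omission of the learned scale γ and shift β are simplifying assumptions, and the paper's exact formulation may differ.

```python
import math


def ghost_batch_norm(batch, ghost_size, eps=1e-5):
    """Normalize each feature with statistics from its own ghost batch.

    batch: list of examples, each a list of floats (one value per feature).
    ghost_size: number of examples per ghost batch; must divide len(batch).
    """
    if ghost_size <= 0 or len(batch) % ghost_size != 0:
        raise ValueError("ghost_size must be positive and divide the batch size")
    n_feat = len(batch[0])
    out = []
    for start in range(0, len(batch), ghost_size):
        chunk = batch[start:start + ghost_size]
        # Per-feature mean and (biased) variance over this ghost batch only.
        means = [sum(ex[j] for ex in chunk) / ghost_size for j in range(n_feat)]
        variances = [
            sum((ex[j] - means[j]) ** 2 for ex in chunk) / ghost_size
            for j in range(n_feat)
        ]
        for ex in chunk:
            out.append([
                (ex[j] - means[j]) / math.sqrt(variances[j] + eps)
                for j in range(n_feat)
            ])
    return out
```

Because each ghost batch sees noisier statistics than the full batch would, the normalization injects extra noise during training, which is the usual explanation for its regularizing effect.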
AI-Smartphone App 'Listens' to Cough to Diagnose Disease - Docwire News
A group of Australian researchers have recently developed an AI-powered smartphone app that can diagnose respiratory disorders by "listening" to the user's cough. This technology was developed by researchers at Curtin University and The University of Queensland, Australia, whose findings were published June 6 in the journal Respiratory Research. The researchers created an algorithm that can analyze coughs for features that are unique to five different diseases. This technique is similar to speech recognition technologies in that the software examines the auditory cough for characteristics specific to these conditions. This is typically done by a physician during a clinical exam, with a stethoscope being used to listen to sound produced while breathing or coughing (auscultation). The downside to this is that the patient must be in the presence of a trained professional to have their respiration sounds analyzed.
Multi-hop Reading Comprehension through Question Decomposition and Rescoring
Min, Sewon, Zhong, Victor, Zettlemoyer, Luke, Hajishirzi, Hannaneh
Multi-hop Reading Comprehension (RC) requires reasoning and aggregation across several paragraphs. We propose a system for multi-hop RC that decomposes a compositional question into simpler sub-questions that can be answered by off-the-shelf single-hop RC models. Since annotations for such decomposition are expensive, we recast sub-question generation as a span prediction problem and show that our method, trained using only 400 labeled examples, generates sub-questions that are as effective as human-authored sub-questions. We also introduce a new global rescoring approach that considers each decomposition (i.e. the sub-questions and their answers) to select the best final answer, greatly improving overall performance. Our experiments on HotpotQA show that this approach achieves state-of-the-art results, while providing explainable evidence for its decision making in the form of sub-questions.