Oceania
meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting
Sun, Xu, Ren, Xuancheng, Ma, Shuming, Wang, Houfeng
We propose a simple yet effective technique for neural network learning. The forward propagation is computed as usual. In back propagation, only a small subset of the full gradient is computed to update the model parameters. The gradient vectors are sparsified in such a way that only the top-$k$ elements (in terms of magnitude) are kept. As a result, only $k$ rows or columns (depending on the layout) of the weight matrix are modified, leading to a linear reduction ($k$ divided by the vector dimension) in the computational cost. Surprisingly, experimental results demonstrate that we can update only 1-4% of the weights at each back propagation pass. This does not result in a larger number of training iterations. More interestingly, the accuracy of the resulting models is actually improved rather than degraded, and a detailed analysis is given. The code is available at https://github.com/lancopku/meProp
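The top-$k$ idea in the abstract above can be illustrated for a single linear layer $y = Wx$. This is only a minimal pure-Python sketch and not the authors' implementation (see their repository for that). The function names `topk_indices` and `sparse_linear_backward` are hypothetical, and the sketch assumes the gradient with respect to $y$ is already available.

```python
def topk_indices(grad, k):
    """Indices of the k largest-magnitude entries of grad (hypothetical helper)."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    return sorted(order[:k])


def sparse_linear_backward(W, x, grad_y, k):
    """Backward pass for y = W x that uses only the top-k entries of grad_y.

    W is a list of rows (n_out x n_in), x has length n_in, and grad_y has
    length n_out. Returns (grad_W, grad_x). Rows of grad_W outside the
    top-k set stay zero, so only k rows of W would be updated.
    """
    kept = topk_indices(grad_y, k)
    grad_W = [[0.0] * len(x) for _ in W]
    grad_x = [0.0] * len(x)
    for i in kept:  # work scales with k rather than with n_out
        for j, xj in enumerate(x):
            grad_W[i][j] = grad_y[i] * xj      # outer product, kept rows only
            grad_x[j] += W[i][j] * grad_y[i]   # W^T times the sparse gradient
    return grad_W, grad_x


W = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0], [9.0, 10.0, 11.0]]
x = [1.0, 2.0, 3.0]
grad_y = [0.1, -2.0, 0.5, 3.0]
grad_W, grad_x = sparse_linear_backward(W, x, grad_y, k=2)
```

With `k=2`, only the rows for the two largest-magnitude gradient entries (indices 1 and 3) receive a nonzero update; the other rows of `grad_W` remain zero.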
A Hybrid GA-PSO Method for Evolving Architecture and Short Connections of Deep Convolutional Neural Networks
Wang, Bin, Sun, Yanan, Xue, Bing, Zhang, Mengjie
Image classification is a difficult machine learning task, where Convolutional Neural Networks (CNNs) have been applied for over 20 years in order to solve the problem. In recent years, instead of the traditional way of only connecting the current layer with its next layer, shortcut connections have been proposed to connect the current layer with its forward layers apart from its next layer, which has been shown to facilitate the training process of deep CNNs. However, there are various ways to build the shortcut connections, and it is hard to manually design the best shortcut connections when solving a particular problem, especially given that the design of the network architecture is already very challenging. In this paper, a hybrid evolutionary computation (EC) method is proposed to automatically evolve both the architecture of deep CNNs and the shortcut connections. Three major contributions of this work are: Firstly, a new encoding strategy is proposed to encode a CNN, where the architecture and the shortcut connections are encoded separately; Secondly, a hybrid two-level EC method, which combines particle swarm optimisation and genetic algorithms, is developed to search for the optimal CNNs; Lastly, an adjustable learning rate is introduced for the fitness evaluations, which provides a better learning rate for the training process given a fixed number of epochs. The proposed algorithm is evaluated on three widely used benchmark datasets of image classification and compared with 12 peer non-EC-based competitors and one EC-based competitor. The experimental results demonstrate that the proposed method outperforms all of the peer competitors in terms of classification accuracy.
Rectangular Bounding Process
Fan, Xuhui, Li, Bin, Sisson, Scott Anthony
Stochastic partition models divide a multi-dimensional space into a number of rectangular regions, such that the data within each region exhibit certain types of homogeneity. Due to the nature of their partition strategy, existing partition models may create many unnecessary divisions in sparse regions when trying to describe data in dense regions. To avoid this problem we introduce a new parsimonious partition model -- the Rectangular Bounding Process (RBP) -- to efficiently partition multi-dimensional spaces, by employing a bounding strategy to enclose data points within rectangular bounding boxes. Unlike existing approaches, the RBP possesses several attractive theoretical properties that make it a powerful nonparametric partition prior on a hypercube. In particular, the RBP is self-consistent and as such can be directly extended from a finite hypercube to infinite (unbounded) space. We apply the RBP to regression trees and relational models as a flexible partition prior. The experimental results validate the merit of the RBP in rich yet parsimonious expressiveness compared to the state-of-the-art methods.
Will AI bring gender equality closer?
Is the age of intelligent machines bringing gender equality nearer or turning back the clock? Gemma Lloyd, co-founder of Work180, an Australia-based international jobs network for women, is proud of her engineering team in which women outnumber men. She just wishes there were more female engineers generally. "If there aren't enough women in the mix, the products won't be as good as they could be, and they certainly won't be what society wants -- because women are 50 per cent of society," she says. The lack of female technologists -- only 22 per cent of artificial intelligence professionals globally are female, for instance -- is a frustration for many gender equality advocates.
New Zealand farmers are using drones to herd sheep
Lambeth's employer, Ben Crossley, confirmed that his fourth-generation farm is indeed using drones to control sheep. One favored model: the DJI Mavic Enterprise, which is already outfitted to play sounds -- such as barking -- over a speaker. The Washington Post noted that farmers are already using drones around the world for a variety of farming tasks, including surveying crops. Having the devices deal directly with animals is less common -- but it could be a vision of the future of agriculture.
Inductive Transfer for Neural Architecture Optimization
Wistuba, Martin, Pedapati, Tejaswini
The recent advent of automated neural network architecture search led to several methods that outperform state-of-the-art human-designed architectures. However, these approaches are computationally expensive, in extreme cases consuming GPU years. We propose two novel methods which aim to expedite this optimization problem by transferring knowledge acquired from previous tasks to new ones. First, we propose a novel neural architecture selection method which employs this knowledge to identify strong and weak characteristics of neural architectures across datasets. Thus, these characteristics do not need to be rediscovered in every search, a major weakness of current state-of-the-art searches. Second, we propose a method for learning curve extrapolation to determine if a training process can be terminated early. In contrast to existing work, we propose to learn from learning curves of architectures trained on other datasets to improve the prediction accuracy for novel datasets. On five different image classification benchmarks, we empirically demonstrate that both of our orthogonal contributions independently lead to an acceleration, without any significant loss in accuracy.
Generating Difficult SAT Instances by Preventing Triangles
Escamocher, Guillaume, O'Sullivan, Barry, Prestwich, Steven David
When creating benchmarks for SAT solvers, we need SAT instances that are easy to build but hard to solve. A recent development in the search for such methods has led to the Balanced SAT algorithm, which can create k-SAT instances with m clauses of high difficulty, for arbitrary k and m. In this paper we introduce the No-Triangle SAT algorithm, a SAT instance generator based on the cluster coefficient graph statistic. We empirically compare the two algorithms by fixing the arity and the number of variables, but varying the number of clauses. The hardest instances that we find are produced by No-Triangle SAT. Furthermore, difficult instances from No-Triangle SAT have a different number of clauses than difficult instances from Balanced SAT, potentially allowing a combination of the two methods to find hard SAT instances for a larger array of parameters.
Incorporating social practices in BDI agent systems
Cranefield, Stephen, Dignum, Frank
When agents interact with humans, either through embodied agents or because they are embedded in a robot, it would be easy if they could use fixed interaction protocols as they do with other agents. However, people do not keep fixed protocols in their day-to-day interactions and the environments are often dynamic, making it impossible to use fixed protocols. Deliberating about interactions from fundamentals is not very scalable either, because in that case all possible reactions of a user have to be considered in the plans. In this paper we argue that social practices can be used as an inspiration for designing flexible and scalable interaction mechanisms that are also robust. However, using social practices requires extending the traditional BDI deliberation cycle to monitor landmark states and perform expected actions by leveraging existing plans. We define and implement this mechanism in Jason using a periodically run meta-deliberation plan, supported by a metainterpreter, and illustrate its use in a realistic scenario.
SeizureNet: A Deep Convolutional Neural Network for Accurate Seizure Type Classification and Seizure Detection
Asif, Umar, Roy, Subhrajit, Tang, Jianbin, Harrer, Stefan
Automatic epileptic seizure analysis is important because the differentiation of neural patterns among different patients can be used to classify people with specific types of epilepsy. This could enable more efficient management of the disease. Automatic seizure type classification using clinical electroencephalograms (EEGs) is challenging due to factors such as low signal-to-noise ratios, signal artefacts, high variance in the seizure semiology among individual epileptic patients, and limited clinical data constraints. To overcome these challenges, in this paper, we present a deep learning based framework which uses a Convolutional Neural Network (CNN) with dense connections and learns highly robust features at different spatial and temporal resolutions of the EEG data spectrum for accurate cross-patient seizure type classification. We evaluate our framework for seizure type classification and seizure detection on the recently released TUH EEG Seizure Corpus, where our framework achieves overall weighted F1 scores of up to 0.90 and 0.88, thereby setting new benchmarks on the dataset.
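For readers unfamiliar with the metric, a weighted F1 score is commonly computed as the mean of per-class F1 scores weighted by each class's share of the true labels. The sketch below assumes that common definition; the abstract does not spell out the exact formula used, and the function name `weighted_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 (assumed definition)."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by class frequency
    return score


# Toy labels for illustration only; these are not data from the paper.
y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
score = weighted_f1(y_true, y_pred)
```

Weighting by support means frequent classes dominate the score, which matters when class frequencies are imbalanced.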
Learning Hierarchical Teaching in Cooperative Multiagent Reinforcement Learning
Kim, Dong Ki, Liu, Miao, Omidshafiei, Shayegan, Lopez-Cot, Sebastian, Riemer, Matthew, Habibi, Golnaz, Tesauro, Gerald, Mourad, Sami, Campbell, Murray, How, Jonathan P.
Heterogeneous knowledge naturally arises among different agents in cooperative multiagent reinforcement learning. As such, learning can be greatly improved if agents can effectively pass their knowledge on to other agents. Existing work has demonstrated that peer-to-peer knowledge transfer, a process referred to as action advising, improves team-wide learning. In contrast to previous frameworks that advise at the level of primitive actions, we aim to learn high-level teaching policies that decide when and what high-level action (e.g., sub-goal) to advise a teammate. We introduce a new learning to teach framework, called hierarchical multiagent teaching (HMAT). The proposed framework solves difficulties faced by prior work on multiagent teaching when operating in domains with long horizons, delayed rewards, and continuous states/actions by leveraging temporal abstraction and deep function approximation. Our empirical evaluations show that HMAT accelerates team-wide learning progress in difficult environments that are more complex than those explored in previous work. HMAT also learns teaching policies that can be transferred to different teammates/tasks and can even teach teammates with heterogeneous action spaces.