Overcoming Long-term Catastrophic Forgetting through Adversarial Neural Pruning and Synaptic Consolidation
Tang, Jian Peng Bo, Jiang, Hao, Li, Zhuo, Lei, Yinjie, Lin, Tao, Li, Haifeng
Enabling a neural network to learn multiple tasks sequentially is of great significance for expanding the applicability of neural networks in realistic human application scenarios. However, as the task sequence grows, the model quickly forgets previously learned skills; we refer to this loss of memory over long sequences as long-term catastrophic forgetting. There are two main reasons for long-term forgetting: first, as the number of tasks increases, the intersection of the low-error parameter subspaces satisfying these tasks becomes smaller and smaller, or even vanishes; second, errors accumulate in the process of protecting the knowledge of previous tasks. In this paper, we propose an adversarial mechanism in which neural pruning and synaptic consolidation are used to overcome long-term catastrophic forgetting. This mechanism distills task-related knowledge into a small number of parameters and retains old knowledge by consolidating those parameters, while sparing most parameters to learn follow-up tasks; this not only avoids forgetting but also allows a large number of tasks to be learned. Specifically, the neural pruning iteratively relaxes the parameter conditions of the current task to expand the common parameter subspace of the tasks. The modified synaptic consolidation strategy comprises two components: a novel measure that takes network structure information into account when calculating parameter importance, and an element-wise parameter updating strategy designed to prevent significant parameters from being overwritten in subsequent learning. We verified the method on image classification, and the results show that our proposed ANPSC approach outperforms state-of-the-art methods. A hyperparameter sensitivity test further demonstrates the robustness of our proposed approach.
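The interplay of pruning and element-wise consolidation described above can be illustrated with a minimal sketch. The abstract does not specify the importance measure or the update rule, so weight magnitude is used here as a stand-in for importance, and the names `top_k_mask` and `consolidated_update` are hypothetical. This is not the authors' ANPSC implementation.

```python
# Illustrative sketch only. Assumption: |w| stands in for the paper's
# structure-aware importance measure, which the abstract does not define.

def top_k_mask(importance, keep_ratio):
    """Return a 0/1 mask marking the most important parameters."""
    k = max(1, round(len(importance) * keep_ratio))
    ranked = sorted(range(len(importance)),
                    key=lambda i: importance[i], reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(importance))]


def consolidated_update(weights, grads, frozen_mask, lr=0.1):
    """Element-wise SGD step that leaves consolidated parameters untouched."""
    return [w if m else w - lr * g
            for w, g, m in zip(weights, grads, frozen_mask)]


# Weights after learning an earlier task (hypothetical values).
weights = [0.9, -0.05, 0.02, -1.2, 0.1]
importance = [abs(w) for w in weights]

# Consolidate the top 40% of parameters; the rest stay free for new tasks.
mask = top_k_mask(importance, keep_ratio=0.4)

# Hypothetical gradients coming from a new task.
grads = [1.0, 1.0, 1.0, 1.0, 1.0]
new_weights = consolidated_update(weights, grads, mask)
```

Only the unmasked parameters move, so knowledge stored in the few consolidated weights survives while the remaining capacity adapts to the new task.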
Global Deep Learning System Market Analysis by Market Key Player, Product Application & Geography
The Deep Learning System Market report offers a detailed analysis and a five-year forecast for the global Deep Learning System industry. The report delivers insights to shape your strategic planning as you assess geographic, product, or service expansion within the Deep Learning System industry. The Deep Learning System market accounted for $XX million in 2018 and is expected to reach $XX million by 2024, registering a CAGR of YY% from 2019 to 2024. The global Deep Learning System market is segmented by product, end user, and region. By region, it is analyzed across North America (U.S., Canada, and Mexico), Europe (Germany, UK, Italy, Spain, France, and rest of Europe), Asia-Pacific (Japan, China, Australia, India, South Korea, Taiwan, and rest of Asia-Pacific) and EMEA (Brazil, South Africa, Saudi Arabia, UAE, rest of EMEA). For more details or custom reports, contact our experts at https://www.proaxivereports.com/pre-order/12206. Other factors that contribute to the growth of the Deep Learning System market include favorable government initiatives related to the use of Deep Learning System.
Trintech Expands Artificial Intelligence Strategy to Support the Office of Finance
DALLAS, TX / ACCESSWIRE / December 17, 2019 / Trintech, a leading global provider of integrated Record to Report software solutions for the office of finance, today announced its newest Artificial Intelligence (AI) investments, AI Risk Rating for Journal Entries and Risk Intelligent Inspect powered by MindBridge Ai. Each of these investments leverages Financial Controls AI, a type of Artificial Intelligence developed specifically for the complex needs of the office of finance to identify errors and anomalies in financial data. It uses a risk-based approach to help financial professionals optimize global controls and automate workflow. "Artificial Intelligence is playing a powerful role in helping organizations analyze financial data, identify insights and ultimately remove risk in their balance sheet as far down as each individual transaction," said Michael Ross, Chief Product Officer at Trintech. "As the risk of fraudulent activity and misstatement continues to rise, we are continuing to invest in our AI strategy to better provide our customers with solutions that efficiently and effectively reduce risk throughout their financial close process."
A Comprehensive Review of Shepherding as a Bio-inspired Swarm-Robotics Guidance Approach
Long, Nathan K, Sammut, Karl, Sgarioto, Daniel, Garratt, Matthew, Abbass, Hussein
The simultaneous control of multiple coordinated robotic agents is an elaborate problem. If solved, however, the interaction between the agents can lead to solutions to sophisticated problems. The concept of swarming, inspired by nature, can be described as the emergence of complex system-level behaviors from the interactions of relatively elementary agents. Due to the effectiveness of solutions found in nature, bio-inspired swarming-based control techniques are receiving a lot of attention in robotics. One method, known as swarm shepherding, is founded on the sheep herding behavior exhibited by sheepdogs, where a swarm of relatively simple agents is governed by one or more shepherds responsible for high-level guidance and planning. Many studies have been conducted on shepherding as a control technique, ranging from the replication of sheep herding via simulation to the control of uninhabited vehicles and robots for a variety of applications. We present a comprehensive review of the literature on swarm shepherding to reveal the advantages and potential of applying the approach to a wide range of robotic systems in the future.
On the Metrics and Adaptation Methods for Domain Divergences of sEMG-based Gesture Recognition
Ketykó, István, Kovács, Ferenc
Machine Learning (ML) is widely used for several tasks with time-series and biosensor data, such as human activity recognition, predictions based on electronic health records (Ismail Fawaz et al., 2019), and real-time biosensor-based decisions. Various classification goals are addressed related to electrocardiography (ECG) (Jambukia et al., 2015), electroencephalography (EEG) (Craik et al., 2019; Dose et al., 2018), and electromyography (EMG) (Ketykó et al., 2019; Hu et al., 2018; Patricia et al., 2014; Du et al., 2017). Hand gestures can be sensed by means of wearables or by image or video analysis of hand or finger motion. Wearable-based detection can rely on measuring the acceleration and rotation of body parts (arms, hands, or fingers) with Inertial Measurement Unit (IMU) sensors, or on measuring the myoelectric signals generated by the muscles of the arms or fingers with EMG sensors. Surface EMG (sEMG) records muscle activity from the surface of the skin above the muscle being evaluated. The signal is collected via surface electrodes. We are interested in placing sEMG sensors on the forearm and performing hand gesture recognition with ML.
Lower Dimensional Kernels for Video Discriminators
Kahembwe, Emmanuel, Ramamoorthy, Subramanian
This work presents an analysis of the discriminators used in Generative Adversarial Networks (GANs) for video. We show that unconstrained video discriminator architectures induce a loss surface with high curvature, which makes optimisation difficult. We also show that this curvature becomes more extreme as the maximal kernel dimension of video discriminators increases. With these observations in hand, we propose a family of efficient Lower-Dimensional Video Discriminators for GANs (LDVD GANs). The proposed family of discriminators improves the performance of the video GAN models it is applied to and demonstrates good performance on complex and diverse datasets such as UCF-101. In particular, we show that these discriminators can double the performance of Temporal-GANs and provide state-of-the-art performance on a single GPU.
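To make "maximal kernel dimension" concrete, the sketch below compares the parameter count of a full 3D convolution with a factorized pair of lower-dimensional kernels (a 2D spatial kernel followed by a 1D temporal kernel). The abstract does not describe how the LDVD discriminators are built, so this factorization is only one illustrative way to lower the kernel dimension; the channel counts and the helper `conv_params` are assumptions.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights plus biases of a convolution with the given kernel shape."""
    size = 1
    for k in kernel:
        size *= k
    return in_ch * out_ch * size + out_ch


# Kernel shapes are (time, height, width); 64 channels is an arbitrary choice.
full_3d = conv_params(64, 64, (3, 3, 3))    # one 3D kernel
spatial = conv_params(64, 64, (1, 3, 3))    # 2D kernel, no temporal extent
temporal = conv_params(64, 64, (3, 1, 1))   # 1D kernel along time only
factorized = spatial + temporal

print(full_3d, factorized)
```

With these assumed sizes the factorized pair uses fewer than half the parameters of the full 3D kernel while still covering the same 3x3x3 receptive field.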
Analytic expressions for the output evolution of a deep neural network
Anastasia Borovykh, December 19, 2019
Abstract
We present a novel methodology based on a Taylor expansion of the network output for obtaining analytical expressions for the expected value of the network weights and output under stochastic training. Using these analytical expressions, we study the effects of the hyperparameters and the noise variance of the optimization algorithm on the performance of the deep neural network. In the early phases of training with a small noise coefficient, the output is equivalent to a linear model. In this case the network can generalize better because the noise prevents the output from fully converging on the train data; however, the noise does not result in any explicit regularization. In the later training stages, when higher-order approximations are required, the impact of the noise becomes more significant: in a model that is nonlinear in the weights, noise can regularize the output function, resulting in better generalization, as witnessed by its influence on the weight Hessian, a commonly used metric for generalization capabilities.
Keywords: deep learning; Taylor expansion; stochastic gradient descent; regularization; generalization
1 Introduction
With the large number of applications that nowadays use deep learning in some way, it is of significant value to gain insight into the output evolution of a deep neural network and the effects that the model architecture and optimization algorithm have on it. A deep neural network is a complex model due to the nonlinear dependencies and the large number of parameters in the model. Understanding the network output and its generalization capabilities, i.e. how well a model optimized on train data will be able to perform on unseen test data, is thus a complex task. One way of gaining insight into the network is by studying it in a large-parameter limit, a setting in which its dynamics become analytically tractable. Such limits have been considered in e.g.
The generalization capabilities and the definition of various quantities that measure these have been studied extensively. Previous work has shown that the norm [3], [27], [19], the width of a minimum in weight space [11], [34], the input sensitivity [28] and a model's compressibility [2] can be related (either theoretically or in practice) to the model's complexity and thus its ability to perform well on unseen data. Furthermore, it has been noted that the generalization capabilities can be influenced by the optimization algorithm used to train the model, e.g. it can be used to bias the model into configurations that are more robust to noise and have lower model complexity, see e.g. Furthermore, it has been observed that certain parameters of stochastic gradient descent (SGD) can be used to control the generalization error and the data fit, see e.g.
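The claim in the abstract that the output behaves like a linear model early in training, while higher-order terms matter later, can be illustrated with a first-order Taylor expansion of a toy network around its initial weights. The two-parameter model, the weight values, and the function names below are assumptions chosen for illustration; the sketch does not reproduce the paper's analysis of stochastic training.

```python
import math


def f(w, x):
    """Toy network: one tanh hidden unit with input weight w1, output weight w2."""
    w1, w2 = w
    return w2 * math.tanh(w1 * x)


def grad_f(w, x):
    """Analytic gradient of f with respect to (w1, w2)."""
    w1, w2 = w
    t = math.tanh(w1 * x)
    return (w2 * (1.0 - t * t) * x, t)


def f_linear(w, w0, x):
    """First-order Taylor expansion of f around the initial weights w0."""
    g = grad_f(w0, x)
    return f(w0, x) + sum(gi * (wi - w0i) for gi, wi, w0i in zip(g, w, w0))


w0 = (0.5, -0.3)        # hypothetical initial weights
x = 1.2                 # a single input
small = (0.51, -0.29)   # weights shortly after initialization
large = (1.5, 0.7)      # weights after they have moved far from w0

err_small = abs(f(small, x) - f_linear(small, w0, x))
err_large = abs(f(large, x) - f_linear(large, w0, x))
print(err_small, err_large)
```

Near the initial weights the linearization error is second order in the weight change and therefore tiny; once the weights have moved far, the linear model no longer describes the output and higher-order terms are needed.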
Feature engineering workflow for activity recognition from synchronized inertial measurement units
Kempa-Liehr, Andreas W., Oram, Jonty, Wong, Andrew, Finch, Mark, Besier, Thor
The ubiquitous availability of wearable sensors is responsible for driving the Internet-of-Things but is also making an impact on sport sciences and precision medicine. While human activity recognition from smartphone data or other types of inertial measurement units (IMU) has evolved into one of the most prominent daily-life examples of machine learning, the underlying process of time-series feature engineering still seems to be time-consuming. This lengthy process inhibits the development of IMU-based machine learning applications in sport science and precision medicine. This contribution discusses a feature engineering workflow that automates the extraction of time-series features based on the FRESH algorithm (FeatuRe Extraction based on Scalable Hypothesis tests) to identify statistically significant features from synchronized IMU sensors (IMeasureU Ltd, NZ). The feature engineering workflow has five main steps: time-series engineering, automated time-series feature extraction, optimized feature extraction, fitting of a specialized classifier, and deployment of the optimized machine learning pipeline. The workflow is discussed for the case of a user-specific running-walking classification, and the generalization to a multi-user multi-activity classification is demonstrated.
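The extraction and selection steps of such a workflow can be sketched on toy data as follows. This is a simplified stand-in rather than the FRESH algorithm: FRESH extracts a much larger feature set and applies proper hypothesis tests with multiple-testing control, whereas this sketch uses three hand-picked features and a plain Welch t-statistic with an arbitrary threshold. The sensor values, the threshold, and all function names are hypothetical.

```python
import statistics


def extract_features(window):
    """Compute a few summary features for one window of sensor samples."""
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),
        "range": max(window) - min(window),
    }


def welch_t(a, b):
    """Welch t-statistic for the difference in means of two samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.fmean(a) - statistics.fmean(b)) / (
        (va / len(a) + vb / len(b)) ** 0.5
    )


# Hypothetical acceleration windows for the two activities.
walking = [[0.9, 1.1, 1.0, 0.95], [1.0, 1.2, 0.9, 1.05], [0.95, 1.05, 1.1, 0.9]]
running = [[0.1, 1.9, 0.2, 1.8], [0.0, 2.1, 0.1, 1.9], [0.2, 1.8, 0.1, 1.85]]

walk_feats = [extract_features(w) for w in walking]
run_feats = [extract_features(w) for w in running]

# Keep only features whose values differ clearly between the two classes.
selected = []
for name in walk_feats[0]:
    t = welch_t([f[name] for f in walk_feats], [f[name] for f in run_feats])
    if abs(t) > 4.0:  # arbitrary threshold, not a calibrated significance level
        selected.append(name)

print(selected)
```

In this toy data both activities have nearly the same mean acceleration, so only the spread-based features survive selection; a classifier would then be fitted on the selected features alone.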
Deep Reinforcement Learning Designed RF Pulse: $DeepRF_{SLR}$
Shin, Dongmyung, Ji, Sooyeon, Lee, Doohee, Lee, Jieun, Oh, Se-Hong, Lee, Jongho
A novel approach that applies deep reinforcement learning to RF pulse design is introduced. This method, referred to as $DeepRF_{SLR}$, is designed to minimize the peak amplitude or, equivalently, the pulse duration of a multiband refocusing pulse generated by the Shinnar-Le Roux (SLR) algorithm. In the method, the root pattern of the SLR polynomial, which determines the RF pulse shape, is optimized by iterative application of deep reinforcement learning and greedy tree search. When tested on RF pulse designs with multiband factors of three and seven, $DeepRF_{SLR}$ demonstrated improved performance compared to conventional methods, generating shorter-duration RF pulses in shorter computational time. In the experiments, the RF pulse from $DeepRF_{SLR}$ produced a slice profile similar to that of the minimum-phase SLR RF pulse, and the profile matched that of the computer simulation. Our approach suggests a new way of designing an RF pulse by applying a machine learning algorithm, demonstrating a machine-designed MRI sequence.
Unsupervised Anomaly Detection in Stream Data with Online Evolving Spiking Neural Networks
Maciąg, Piotr S., Kryszkiewicz, Marzena, Bembenik, Robert, Lobo, Jesus L., Del Ser, Javier
In this work, we propose a novel OeSNN-UAD (Online evolving Spiking Neural Networks for Unsupervised Anomaly Detection) approach for online anomaly detection in univariate time series data. Our approach is based on evolving Spiking Neural Networks (eSNN). Its distinctive feature is that the proposed eSNN architecture learns in the process of classifying input values as anomalous or not. In fact, we offer an unsupervised learning method for eSNN, in which classification is carried out without prior training of the network on data with labeled anomalies. Unlike in a typical eSNN architecture, neurons in the output repository of our architecture are not divided into decision classes known a priori. Each output neuron is assigned its own output value, which is modified in the course of learning and classifying the incoming input values of the time series. To better adapt to the changing characteristics of the input data and to make their classification efficient, the number of output neurons is limited: older neurons are replaced with new neurons whose output values and synaptic weights are adjusted according to the current input values of the time series. The proposed OeSNN-UAD approach was experimentally compared to state-of-the-art unsupervised methods and algorithms for anomaly detection in stream data. The experiments were carried out on the Numenta Anomaly Benchmark and Yahoo Anomaly Datasets. According to the results of these experiments, our approach significantly outperforms other solutions provided in the literature in the case of the Numenta Anomaly Benchmark. In the real data files category of the Yahoo Anomaly Benchmark, OeSNN-UAD also outperforms the other selected algorithms, whereas on the Yahoo Anomaly Benchmark synthetic data files it provides results competitive with those recently reported in the literature.