Perceptrons
Application of multilayer perceptron with data augmentation in nuclear physics
Bahtiyar, Hüseyin, Soydaner, Derya, Yüksel, Esra
Neural networks have become popular in many fields of science since they serve as promising, reliable and powerful tools. In this work, we study the effect of data augmentation on the predictive power of neural network models for nuclear physics data. We present two different data augmentation techniques, and we conduct a detailed analysis in terms of different depths, optimizers, activation functions and random seed values to show the success and robustness of the model. Using the experimental uncertainties for data augmentation for the first time, the size of the training data set is artificially boosted and the changes in the root-mean-square error between the model predictions on the test set and the experimental data are investigated. Our results show that data augmentation decreases the prediction errors, stabilizes the model and prevents overfitting. The extrapolation capabilities of the multilayer perceptron (MLP) models are also tested for newly measured nuclei in the AME2020 mass table, and it is shown that the predictions are significantly improved by using data augmentation.
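As a rough illustration of augmenting a training set from experimental uncertainties, the sketch below adds noisy replicas of each measurement. It assumes each replica's target is drawn from a Gaussian centred on the measured value with the reported uncertainty as its standard deviation; the abstract does not spell out the sampling scheme, so this is an assumption rather than the authors' procedure. The function name `augment_with_uncertainties` and the toy numbers are hypothetical.

```python
import random


def augment_with_uncertainties(samples, copies=3, seed=0):
    """Return the original samples plus `copies` noisy replicas of each.

    Each sample is (features, value, sigma). A replica keeps the features
    and draws a new target from a Gaussian with mean `value` and standard
    deviation `sigma` (assumed sampling scheme, not taken from the paper).
    """
    rng = random.Random(seed)
    # Start with the unmodified measurements.
    augmented = [(features, value) for features, value, _ in samples]
    for features, value, sigma in samples:
        for _ in range(copies):
            augmented.append((features, rng.gauss(value, sigma)))
    return augmented


# Toy placeholder data: (proton number, neutron number), value, uncertainty.
data = [((8, 8), 10.0, 0.5), ((20, 28), 20.0, 0.0)]
train = augment_with_uncertainties(data, copies=3)
print(len(train))
```

With two measurements and three replicas each, the training set grows from two to eight entries. A measurement with zero uncertainty simply repeats its value.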
Supernova Light Curves Approximation based on Neural Network Models
Demianenko, Mariia, Samorodova, Ekaterina, Sysak, Mikhail, Shiriaev, Aleksandr, Malanchev, Konstantin, Derkach, Denis, Hushchyn, Mikhail
Photometric data-driven classification of supernovae has become a challenge with the advent of real-time processing of big data in astronomy. Recent studies have demonstrated the superior quality of solutions based on various machine learning models. These models learn to classify supernova types using their light curves as inputs. Preprocessing these curves is a crucial step that significantly affects the final quality. In this talk, we study the application of a multilayer perceptron (MLP), a Bayesian neural network (BNN), and normalizing flows (NF) to approximate observations for a single light curve. We use these approximations as inputs for supernova classification models and demonstrate that the proposed methods outperform the state of the art based on Gaussian processes when applied to the Zwicky Transient Facility Bright Transient Survey light curves. The MLP achieves quality similar to Gaussian processes while running faster. Normalizing flows also exceed Gaussian processes in approximation quality.
Overcoming the Spectral Bias of Neural Value Approximation
Yang, Ge, Ajay, Anurag, Agrawal, Pulkit
Value approximation using deep neural networks is at the heart of off-policy deep reinforcement learning, and is often the primary module that provides learning signals to the rest of the algorithm. While multi-layer perceptron networks are universal function approximators, recent works in neural kernel regression suggest the presence of a spectral bias, where fitting high-frequency components of the value function requires exponentially more gradient update steps than the low-frequency ones. In this work, we re-examine off-policy reinforcement learning through the lens of kernel regression and propose to overcome such bias via a composite neural tangent kernel. With just a single line-change, our approach, Fourier feature networks (FFN), produces state-of-the-art performance on challenging continuous control domains with only a fraction of the compute. Faster convergence and better off-policy stability also make it possible to remove the target network without suffering catastrophic divergences, which further reduces TD(0)'s estimation bias on a few tasks.
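To make the idea of a Fourier feature input layer concrete, here is a minimal sketch of a generic random Fourier feature encoding, which maps an input x to cos(2πBx) and sin(2πBx) for a fixed random matrix B. This is the common textbook form and is an assumption here; the abstract does not give the exact layer used in FFN. The names `make_fourier_features`, `n_freqs` and `scale` are hypothetical.

```python
import math
import random


def make_fourier_features(in_dim, n_freqs, scale=1.0, seed=0):
    """Build an encoder x -> [cos(2*pi*B x), sin(2*pi*B x)].

    B is an (n_freqs x in_dim) matrix of Gaussian samples. A larger `scale`
    yields higher-frequency features (illustrative choice, not from the paper).
    """
    rng = random.Random(seed)
    B = [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(n_freqs)]

    def encode(x):
        # Project the input onto each random frequency vector.
        proj = [2.0 * math.pi * sum(b * xi for b, xi in zip(row, x)) for row in B]
        return [math.cos(p) for p in proj] + [math.sin(p) for p in proj]

    return encode


encode = make_fourier_features(in_dim=3, n_freqs=4, scale=2.0)
features = encode([0.1, -0.3, 0.7])
print(len(features))
```

The encoded vector has twice as many entries as there are frequencies. It would be fed to an ordinary multi-layer perceptron in place of the raw input.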
Introduction to Machine Learning
This course will provide you with a foundational understanding of machine learning models (logistic regression, multilayer perceptrons, convolutional neural networks, natural language processing, etc.) as well as demonstrate how these models can solve complex problems in a variety of industries, from medical diagnostics to image recognition to text prediction. In addition, we have designed practice exercises that will give you hands-on experience implementing these data science models on data sets. These practice exercises will teach you how to implement machine learning algorithms with PyTorch, an open source library used by leading tech companies in the machine learning field (e.g., Google, NVIDIA, CocaCola, eBay, Snapchat, Uber and many more).
Perceptron: AI that feels pain and predicts players' movements – TechCrunch
Research in the field of machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone to read it all. This column, Perceptron (previously Deep Science), aims to collect some of the most relevant recent discoveries and papers -- particularly in, but not limited to, artificial intelligence -- and explain why they matter. This week in AI, a team of engineers at the University of Glasgow developed "artificial skin" that can learn to experience and react to simulated pain. Elsewhere, researchers at DeepMind developed a machine learning system that predicts where soccer players will run on a field, while groups from The Chinese University of Hong Kong (CUHK) and Tsinghua University created algorithms that can generate realistic photos -- and even videos -- of human models. According to a press release, the Glasgow team's artificial skin leveraged a new type of processing system based on "synaptic transistors" designed to mimic the brain's neural pathways.
Analysis of Catastrophic Forgetting for Random Orthogonal Transformation Tasks in the Overparameterized Regime
Overparameterization is known to permit strong generalization performance in neural networks. In this work, we provide an initial theoretical analysis of its effect on catastrophic forgetting in a continual learning setup. We show experimentally that in permuted MNIST image classification tasks, the generalization performance of multilayer perceptrons trained by vanilla stochastic gradient descent can be improved by overparameterization, and the extent of the performance increase achieved by overparameterization is comparable to that of state-of-the-art continual learning algorithms. We provide a theoretical explanation of this effect by studying a qualitatively similar two-task linear regression problem, where each task is related by a random orthogonal transformation. We show that when a model is trained on the two tasks in sequence without any additional regularization, the risk gain on the first task is small if the model is sufficiently overparameterized.
Perceptron: The risks of teleoperating robots and AI that beats Rocket League – TechCrunch
This week in AI, researchers discovered a method that could allow adversaries to track the movements of remotely controlled robots even when the robots' communications are encrypted end-to-end. The coauthors, who hail from the University of Strathclyde in Glasgow, said that their study shows adopting the best cybersecurity practices isn't enough to stop attacks on autonomous systems. Remote control, or teleoperation, promises to enable operators to guide one or several robots from afar in a range of environments.
Vectorization: Must-know Technique to Speed Up Operations 100x Faster
In current data science or machine learning applications, huge datasets and sophisticated networks are usually involved. Thus, code efficiency becomes really important when it comes to handling the computational workload. As an example, in a classic Multi-layer Perceptron (aka Feedforward Neural Network), the network typically contains multiple linear layers. Let's say the input layer contains 64 neurons while the first hidden layer contains 128 hidden neurons. Then, in order to calculate the output of the hidden layer given an input, the straightforward way would be to use the np.dot function provided by the NumPy library. In the article's measurement, this method takes 1.4 microseconds on average.
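The hidden-layer computation described above can be sketched as follows. This is not the article's original snippet. It is a minimal illustration that assumes random weights, a bias vector `b` and a single input vector, comparing an explicit Python loop with the equivalent `np.dot` call. No timings are claimed, since they depend on the machine.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)          # one input with 64 features
W = rng.standard_normal((128, 64))   # weights of the first hidden layer
b = rng.standard_normal(128)         # bias of the hidden layer (assumed)


def hidden_loop(W, x, b):
    """Compute W @ x + b with explicit Python loops (slow reference)."""
    out = np.empty(W.shape[0])
    for i in range(W.shape[0]):
        total = b[i]
        for j in range(W.shape[1]):
            total += W[i, j] * x[j]
        out[i] = total
    return out


def hidden_vectorized(W, x, b):
    """Compute the same pre-activation with one vectorized call."""
    return np.dot(W, x) + b


h_loop = hidden_loop(W, x, b)
h_vec = hidden_vectorized(W, x, b)
print(h_vec.shape, np.allclose(h_loop, h_vec))
```

Both functions return the same 128 pre-activations. Wrapping each call with the standard `timeit` module is one way to compare their speed on a given machine.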
Fair and Green Hyperparameter Optimization via Multi-objective and Multiple Information Source Bayesian Optimization
Candelieri, Antonio, Ponti, Andrea, Archetti, Francesco
There is a consensus that focusing only on accuracy in searching for optimal machine learning models amplifies biases contained in the data, leading to unfair predictions and decision support. Recently, multi-objective hyperparameter optimization has been proposed to search for machine learning models which offer equally Pareto-efficient trade-offs between accuracy and fairness. Although these approaches proved to be more versatile than fairness-aware machine learning algorithms -- which optimize accuracy constrained to some threshold on fairness -- they could drastically increase the energy consumption in the case of large datasets. In this paper we propose FanG-HPO, a Fair and Green Hyperparameter Optimization (HPO) approach based on both multi-objective and multiple information source Bayesian optimization. FanG-HPO uses subsets of the large dataset (aka information sources) to obtain cheap approximations of both accuracy and fairness, and multi-objective Bayesian Optimization to efficiently identify Pareto-efficient machine learning models. Experiments consider two benchmark (fairness) datasets and two machine learning algorithms (XGBoost and Multi-Layer Perceptron), and provide an assessment of FanG-HPO against both fairness-aware machine learning algorithms and hyperparameter optimization via a multi-objective single-source optimization algorithm in BoTorch, a state-of-the-art platform for Bayesian Optimization.
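For readers unfamiliar with the term, the sketch below shows what "Pareto-efficient" means for candidate models scored on accuracy (higher is better) and an unfairness measure (lower is better). It is a brute-force filter over made-up scores, intended only to illustrate the concept. It is not FanG-HPO and does not use Bayesian optimization. The function name `pareto_front` and the candidate values are hypothetical.

```python
def pareto_front(points):
    """Return the points that no other point dominates.

    Each point is (accuracy, unfairness). Point j dominates point i when it
    is at least as good on both objectives and strictly better on one.
    """
    front = []
    for i, (acc_i, unf_i) in enumerate(points):
        dominated = any(
            acc_j >= acc_i and unf_j <= unf_i and (acc_j > acc_i or unf_j < unf_i)
            for j, (acc_j, unf_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((acc_i, unf_i))
    return front


# Illustrative scores for five hypothetical hyperparameter configurations.
candidates = [(0.90, 0.20), (0.85, 0.05), (0.80, 0.10), (0.90, 0.25), (0.70, 0.02)]
front = pareto_front(candidates)
print(front)  # [(0.9, 0.2), (0.85, 0.05), (0.7, 0.02)]
```

The three surviving configurations are equally Pareto-efficient: none improves on another in one objective without giving ground in the other.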