Regression
A Strong Baseline for Weekly Time Series Forecasting
Godahewa, Rakshitha, Bergmeir, Christoph, Webb, Geoffrey I., Montero-Manso, Pablo
Many businesses and industries require accurate forecasts for weekly time series nowadays. The forecasting literature however does not currently provide easy-to-use, automatic, reproducible and accurate approaches dedicated to this task. We propose a forecasting method that can be used as a strong baseline in this domain, leveraging state-of-the-art forecasting techniques, forecast combination, and global modelling. Our approach uses four base forecasting models specifically suitable for forecasting weekly data: a global Recurrent Neural Network model, Theta, Trigonometric Box-Cox ARMA Trend Seasonal (TBATS), and Dynamic Harmonic Regression ARIMA (DHR-ARIMA). Those are then optimally combined using a lasso regression stacking approach. We evaluate the performance of our method against a set of state-of-the-art weekly forecasting models on six datasets. Across four evaluation metrics, we show that our method consistently outperforms the benchmark methods by a considerable margin with statistical significance. In particular, our model can produce the most accurate forecasts, in terms of mean sMAPE, for the M4 weekly dataset.
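The stacking step described above can be illustrated with a minimal sketch. It assumes the base-model forecasts are already available as rows of numbers and fits combination weights by plain coordinate-descent lasso with no intercept. The function names (`lasso_stack`, `combine`), the penalty `alpha`, and the absence of an intercept or any weight constraint are assumptions made for illustration, not details taken from the paper.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_stack(base_forecasts, actuals, alpha=0.001, n_sweeps=2000):
    """Fit combination weights by coordinate descent on
    (1 / 2n) * sum((y - forecasts . w)^2) + alpha * sum(|w|).

    base_forecasts: one row per time point, one column per base model
    (hypothetical layout). No intercept is fitted (an assumption).
    """
    n = len(actuals)
    k = len(base_forecasts[0])
    weights = [0.0] * k
    for _ in range(n_sweeps):
        for j in range(k):
            rho = 0.0     # correlation of column j with the partial residual
            col_sq = 0.0  # squared norm of column j
            for row, y in zip(base_forecasts, actuals):
                partial = sum(weights[i] * row[i] for i in range(k) if i != j)
                rho += row[j] * (y - partial)
                col_sq += row[j] ** 2
            if col_sq:
                weights[j] = soft_threshold(rho / n, alpha) / (col_sq / n)
            else:
                weights[j] = 0.0
    return weights


def combine(weights, forecasts):
    """Weighted sum of the base-model forecasts for one time point."""
    return sum(w * f for w, f in zip(weights, forecasts))
```

In such a setup the weights would be fitted on held-out forecasts of each base model and then applied to new forecasts with `combine`. The L1 penalty can drive the weight of an uninformative base model to exactly zero.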
Two Recent Developments in Machine Learning for Protein Engineering
Both articles in this post came out of George Church's lab at Harvard University. The first of them is Unified rational protein engineering with sequence-based deep representation learning. Here, the authors present a recurrent neural network (specifically, a type of mLSTM) that was trained on 24 million UniRef50 protein sequences with the objective of transforming each sequence into a fixed-length numerical vector (that is, a deep representation, which the authors call UniRep). These vectors make it possible to analyze and compare protein sequences with techniques borrowed from linear algebra, as opposed to traditional bioinformatics algorithms such as sequence alignment. Next, the authors show that UniRep vectors can be used as input to train a simpler "top" model (e.g. a linear regression) to predict the effect of single mutations.
A Theory of Hyperbolic Prototype Learning
We introduce Hyperbolic Prototype Learning, a type of supervised learning, where class labels are represented by ideal points (points at infinity) in hyperbolic space. Learning is achieved by minimizing the 'penalized Busemann loss', a new loss function based on the Busemann function of hyperbolic geometry. We discuss several theoretical features of this setup. In particular, Hyperbolic Prototype Learning becomes equivalent to logistic regression in the one-dimensional case.
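The one-dimensional equivalence can be checked numerically with a short sketch. The abstract does not give the loss formula, so the following are assumptions: the Poincare-ball form of the Busemann function, b_p(z) = log(|p - z|^2 / (1 - |z|^2)), a penalty term -phi * log(1 - |z|^2) with phi = 1, and an embedding z = tanh(u / 2) of a real-valued score u. Under these assumptions the penalized loss toward the ideal points +1 and -1 equals twice the logistic loss minus log 4.

```python
import math


def busemann_1d(z, ideal_point):
    """Busemann function on the Poincare interval (-1, 1).

    ideal_point is +1 or -1, the two points at infinity in one dimension.
    """
    return math.log((ideal_point - z) ** 2 / (1.0 - z * z))


def penalized_busemann_loss(z, ideal_point, phi=1.0):
    """Assumed penalized form: Busemann term plus a boundary penalty."""
    return busemann_1d(z, ideal_point) - phi * math.log(1.0 - z * z)


def logistic_loss(u, label):
    """Standard logistic loss for a score u and a label in {+1, -1}."""
    return math.log1p(math.exp(-label * u))


def embed(u):
    """Map a real score to the open interval (-1, 1) (assumed embedding)."""
    return math.tanh(u / 2.0)


if __name__ == "__main__":
    for u in (-2.0, 0.0, 1.5):
        for label in (1, -1):
            lhs = penalized_busemann_loss(embed(u), label)
            rhs = 2.0 * logistic_loss(u, label) - math.log(4.0)
            print(u, label, lhs, rhs)
```

Because the two losses differ only by a positive scale factor and an additive constant, they share the same minimizers, which is one way to read the equivalence stated in the abstract.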
Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks
Dar, Yehuda, Baraniuk, Richard G.
We study the transfer learning process between two linear regression problems. An important and timely special case is when the regressors are overparameterized and perfectly interpolate their training data. We examine a parameter transfer mechanism whereby a subset of the parameters of the target task solution are constrained to the values learned for a related source task. We analytically characterize the generalization error of the target task in terms of the salient factors in the transfer learning architecture, i.e., the number of examples available, the number of (free) parameters in each of the tasks, the number of parameters transferred from the source to target task, and the correlation between the two tasks. Our non-asymptotic analysis shows that the generalization error of the target task follows a two-dimensional double descent trend (with respect to the number of free parameters in each of the tasks) that is controlled by the transfer learning factors. Our analysis points to specific cases where the transfer of parameters is beneficial.
Machine Learning Regression Masterclass in Python
Build 8 Practical Projects and Master Machine Learning Regression Techniques Using Python, Scikit Learn and Keras. Created by Dr. Ryan Ahmed, Ph.D., MBA, Kirill Eremenko, Hadelin de Ponteves, SuperDataScience Team, Mitchell Bouchard. Description: The Artificial Intelligence (AI) revolution is here! The technology is progressing at a massive scale and is being widely adopted in the healthcare, defense, banking, gaming, transportation and robotics industries. Machine Learning is a subfield of Artificial Intelligence that enables machines to improve at a given task with experience. Machine Learning is an extremely hot topic; the demand for experienced machine learning engineers and data scientists has been steadily growing in the past 5 years. According to a report released by Research and Markets, the global AI and machine learning technology sectors are expected to grow from $1.4B to $8.8B by 2022, and it is predicted that the AI tech sector will create around 2.3 million jobs by 2020.
Interpretable Machine Learning with an Ensemble of Gradient Boosting Machines
Konstantinov, Andrei V., Utkin, Lev V.
A method for the local and global interpretation of a black-box model on the basis of the well-known generalized additive models is proposed. It can be viewed as an extension or a modification of the algorithm using the neural additive model. The method is based on using an ensemble of gradient boosting machines (GBMs) such that each GBM is learned on a single feature and produces a shape function of the feature. The ensemble is composed as a weighted sum of separate GBMs, resulting in a weighted sum of shape functions which form the generalized additive model. GBMs are built in parallel using randomized decision trees of depth 1, which provide a very simple architecture. Weights of GBMs as well as features are computed in each iteration of boosting by using the Lasso method and then updated by means of a specific smoothing procedure. In contrast to the neural additive model, the method provides weights of features in explicit form, and it is simple to train. Numerous numerical experiments with an algorithm implementing the proposed method on synthetic and real datasets demonstrate its efficiency and its properties for local and global interpretation.
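The idea of one boosted model per feature, each built from depth-1 trees and summed into an additive model, can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it fits stumps cyclically on shared residuals with unit weights, and it omits the Lasso weighting, the tree randomization, and the smoothing procedure. All function names and the `learning_rate` and `n_rounds` parameters are hypothetical.

```python
def fit_stump(xs, residuals, learning_rate):
    """Best depth-1 regression tree on one feature, with shrunken leaf values."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # constant feature: no split is possible
        mean = sum(residuals) / len(residuals)
        return (float("inf"), learning_rate * mean, learning_rate * mean)
    return (best[1], learning_rate * best[2], learning_rate * best[3])


def stump_value(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def shape_value(stumps, x):
    """Shape function of one feature: the sum of its stumps at x."""
    return sum(stump_value(s, x) for s in stumps)


def fit_additive_stumps(rows, targets, n_rounds=50, learning_rate=0.1):
    """Return (intercept, shape_functions), one list of stumps per feature."""
    n_features = len(rows[0])
    intercept = sum(targets) / len(targets)
    shapes = [[] for _ in range(n_features)]
    predictions = [intercept] * len(targets)
    for _ in range(n_rounds):
        for j in range(n_features):
            xs = [row[j] for row in rows]
            residuals = [t - p for t, p in zip(targets, predictions)]
            stump = fit_stump(xs, residuals, learning_rate)
            shapes[j].append(stump)
            predictions = [p + stump_value(stump, x)
                           for p, x in zip(predictions, xs)]
    return intercept, shapes


def predict(model, row):
    intercept, shapes = model
    return intercept + sum(shape_value(s, x) for s, x in zip(shapes, row))
```

Because every stump depends on a single feature, `shape_value` for feature j can be evaluated or plotted on its own, which is what makes an additive model of this kind interpretable.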
GPU-Accelerated Primal Learning for Extremely Fast Large-Scale Classification
Halloran, John T., Rocke, David M.
One of the most efficient methods to solve L2-regularized primal problems, such as logistic regression and linear support vector machine (SVM) classification, is the widely used trust region Newton algorithm, TRON. While TRON has recently been shown to enjoy substantial speedups on shared-memory multi-core systems, exploiting graphics processing units (GPUs) to speed up the method is significantly more difficult, owing to the highly complex and heavily sequential nature of the algorithm. In this work, we show that using judicious GPU-optimization principles, TRON training time for different losses and feature representations may be drastically reduced. For sparse feature sets, we show that using GPUs to train logistic regression classifiers in LIBLINEAR is up to an order of magnitude faster than solely using multithreading. For dense feature sets--which impose far more stringent memory constraints--we show that GPUs substantially reduce the lengthy SVM learning times required for state-of-the-art proteomics analysis, leading to dramatic improvements over recently proposed speedups. Furthermore, we show how GPU speedups may be mixed with multithreading to enable such speedups when the dataset is too large for GPU memory requirements; on a massive dense proteomics dataset of nearly a quarter-billion data instances, these mixed-architecture speedups reduce SVM analysis time from over half a week to less than a single day while using limited GPU memory.
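For orientation, the L2-regularized primal problem for logistic regression can be written down in a few lines. The sketch below assumes the common form f(w) = 0.5 * w.w + C * sum(log(1 + exp(-y_i * w.x_i))) with labels in {+1, -1}, and shows the three quantities a Newton-type solver evaluates: the objective, its gradient, and a Hessian-vector product. It is a plain CPU illustration with hypothetical function names, not TRON itself and not the GPU implementation described in the abstract.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def objective(w, X, y, C=1.0):
    """0.5 * ||w||^2 + C * sum of logistic losses (labels are +1 or -1)."""
    loss = sum(math.log1p(math.exp(-yi * dot(w, xi))) for xi, yi in zip(X, y))
    return 0.5 * dot(w, w) + C * loss


def gradient(w, X, y, C=1.0):
    """w + C * sum((sigmoid(y_i * w.x_i) - 1) * y_i * x_i)."""
    grad = list(w)
    for xi, yi in zip(X, y):
        coeff = C * (sigmoid(yi * dot(w, xi)) - 1.0) * yi
        for j, xij in enumerate(xi):
            grad[j] += coeff * xij
    return grad


def hessian_vector_product(w, v, X, y, C=1.0):
    """v + C * X^T D X v with D_ii = s_i * (1 - s_i), s_i = sigmoid(y_i * w.x_i).

    The Hessian is never formed explicitly; only its action on v is computed.
    """
    result = list(v)
    for xi, yi in zip(X, y):
        s = sigmoid(yi * dot(w, xi))
        coeff = C * s * (1.0 - s) * dot(xi, v)
        for j, xij in enumerate(xi):
            result[j] += coeff * xij
    return result
```

Each of these functions is a loop over data instances followed by a reduction, so the per-instance work is independent, while the outer optimization loop that calls them remains sequential.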
FairMixRep : Self-supervised Robust Representation Learning for Heterogeneous Data with Fairness constraints
Chakraborty, Souradip, Verma, Ekansh, Sahoo, Saswata, Datta, Jyotishka
Representation Learning in a heterogeneous space with mixed variables of numerical and categorical types has interesting challenges due to its complex feature manifold. Moreover, feature learning in an unsupervised setup, without class labels and a suitable learning loss function, adds to the problem complexity. Further, the learned representation and subsequent predictions should not reflect discriminatory behavior towards certain sensitive groups or attributes. The proposed feature map should preserve the maximum variation present in the data and needs to be fair with respect to the sensitive variables. We propose, in the first phase of our work, an efficient encoder-decoder framework to capture the mixed-domain information. The second phase of our work focuses on de-biasing the mixed space representations by adding relevant fairness constraints. This ensures minimal information loss between the representations before and after the fairness-preserving projections. Both the information content and the fairness aspect of the final learned representation have been validated through several metrics, on which it shows excellent performance. Our work (FairMixRep) addresses the problem of Mixed Space Fair Representation learning from an unsupervised perspective and learns a universal representation; it is a timely, unique, and novel research contribution.
Signal classification using weighted orthogonal regression method
In this paper, a new classifier based on the intrinsic properties of the data is proposed. Classification is an essential task in data mining-based applications. The classification problem becomes challenging when the size of the training set is small compared to the dimension of the problem. This paper proposes a new classification method that exploits the intrinsic structure of each class through the corresponding eigen-components. Each component contributes to the learned span of each class with a specific weight, which is determined by the associated eigenvalue. This approach results in reliable learning that is robust when the classification problem has limited training data. The proposed method uses the eigenvectors obtained by SVD of the data from each class to select the bases for each subspace. Moreover, it applies an efficient weighting in the decision-making criterion to discriminate between two classes. In addition to high performance on artificial data, this method has improved on the best result of an international competition.