Regression
NLS: an accurate and yet easy-to-interpret regression method
Coscrato, Victor, Inácio, Marco Henrique de Almeida, Botari, Tiago, Izbicki, Rafael
An important feature of successful supervised machine learning applications is the ability to explain the predictions given by the regression or classification model being used. However, most state-of-the-art models that have good predictive power lead to predictions that are hard to interpret. Thus, several model-agnostic interpreters have been developed recently as a way of explaining black-box classifiers. In practice, using these methods is a slow process because a new fit is required for each new testing instance, and several non-trivial choices must be made. We develop NLS (neural local smoother), a method that is complex enough to give good predictions, and yet gives solutions that are easy to interpret without the need for a separate interpreter. The key idea is to use a neural network that imposes a locally linear shape on the output layer. We show that NLS leads to predictive power that is comparable to state-of-the-art machine learning models, and yet is easier to interpret.
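As a rough illustration of the "locally linear output layer" idea, the sketch below has a small network produce an intercept and one slope per feature for each input x, and forms the prediction as intercept plus slopes times x. The architecture, the tanh activation, the random untrained weights, and the name nls_predict are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def nls_predict(x, W1, b1, W2, b2):
    """Forward pass of a small network whose output is locally linear in x.

    The network maps x to d + 1 numbers: an intercept theta0(x) and d slopes
    theta(x). The prediction is theta0(x) + theta(x) . x, so the slopes can be
    read as a local explanation of the prediction at x.
    (Hypothetical architecture, for illustration only.)
    """
    h = np.tanh(W1 @ x + b1)           # hidden representation of x
    theta = W2 @ h + b2                # shape (d + 1,): intercept, then slopes
    theta0, slopes = theta[0], theta[1:]
    return theta0 + slopes @ x, slopes

# Untrained random weights, only to show the shapes involved.
rng = np.random.default_rng(0)
d, hidden = 3, 8
W1, b1 = rng.normal(size=(hidden, d)), np.zeros(hidden)
W2, b2 = rng.normal(size=(d + 1, hidden)), np.zeros(d + 1)

x = np.array([0.5, -1.0, 2.0])
y_hat, slopes = nls_predict(x, W1, b1, W2, b2)
```

If the second layer's weights are all zero, the coefficients no longer depend on x and the model collapses to an ordinary global linear regression, which is one way to see why the per-instance slopes are directly interpretable.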
Sparse Reduced-Rank Regression for Simultaneous Rank and Variable Selection via Manifold Optimization
Yoshikawa, Kohei, Kawano, Shuichi
We consider the problem of constructing a reduced-rank regression model whose coefficient parameter is represented as a singular value decomposition with sparse singular vectors. The traditional estimation procedure for the coefficient parameter often fails when the true rank of the parameter is high. To overcome this issue, we develop an estimation algorithm with rank and variable selection via sparse regularization and manifold optimization, which enables us to obtain an accurate estimation of the coefficient parameter even if the true rank of the coefficient parameter is high. Using sparse regularization, we can also select an optimal value of the rank. We conduct Monte Carlo experiments and real data analysis to illustrate the effectiveness of our proposed method.
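For context, the classical (non-sparse) reduced-rank regression estimator can be sketched in a few lines: fit ordinary least squares, then project the coefficient matrix onto the leading right singular vectors of the fitted values. This is the textbook baseline with identity weighting, not the sparse manifold-optimization algorithm described above; the function name and the synthetic data are assumptions for illustration.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Classical reduced-rank regression (identity weighting).

    Returns a p x q coefficient matrix of rank at most `rank`.
    """
    # Unconstrained least-squares coefficients, shape (p, q).
    C_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Right singular vectors of the fitted values span the response directions
    # that the predictors explain best.
    _, _, Vt = np.linalg.svd(X @ C_ols, full_matrices=False)
    V_r = Vt[:rank].T                  # shape (q, rank)
    # Project the OLS coefficients onto the top-`rank` directions.
    return C_ols @ V_r @ V_r.T

# Synthetic example with a true rank-2 coefficient matrix.
rng = np.random.default_rng(0)
n, p, q, true_rank = 200, 6, 5, 2
C_true = rng.normal(size=(p, true_rank)) @ rng.normal(size=(true_rank, q))
X = rng.normal(size=(n, p))
Y = X @ C_true + 0.1 * rng.normal(size=(n, q))

C_hat = reduced_rank_regression(X, Y, rank=2)
```

Here the rank is supplied by hand; the abstract's point is that sparse regularization can select both the rank and the relevant variables from the data.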
Building Machine Learning Models to Solve Practical Problems - Simple Talk
Machine learning has been reshaping our lives for quite a while now. From small things such as unlocking your phone through Face Recognition to useful interactions with Siri, Alexa, Cortana, or Google using Speech Recognition, machine learning is everywhere! In this article, I am going to provide a brief overview of machine learning and data science. With a basic understanding of these concepts, you can dive deeper into the details of linear regression and how you can build a machine learning model that will help you to solve many practical problems. The article will focus on building a Linear Regression model for Movie Budget data using various modules in Python.
Deep Learning Prerequisites: Logistic Regression in Python - CouponED
This course is a lead-in to deep learning and neural networks - it covers a popular and fundamental technique used in machine learning, data science and statistics: logistic regression. We cover the theory from the ground up: derivation of the solution, and applications to real-world problems. We show you how one might code their own logistic regression module in Python. This course does not require any external materials. Everything needed (Python, and some Python libraries) can be obtained for free.
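To give a sense of what coding one's own logistic regression can look like, here is a minimal sketch that fits the model by batch gradient descent on the average log-loss. It is not taken from the course; the function names, learning rate, iteration count, and synthetic data are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Fit logistic regression by batch gradient descent on the mean log-loss."""
    Xb = np.column_stack([np.ones(len(X)), X])   # prepend a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)                      # predicted probabilities
        w -= lr * Xb.T @ (p - y) / len(y)        # gradient of the mean log-loss
    return w

def predict_proba(X, w):
    Xb = np.column_stack([np.ones(len(X)), X])
    return sigmoid(Xb @ w)

# Synthetic, linearly separable data (hypothetical example).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + 2 * X[:, 1] > 0).astype(float)

w = fit_logistic(X, y)
accuracy = np.mean((predict_proba(X, w) > 0.5) == y)
```

The update uses the fact that the gradient of the log-loss with respect to the weights is the feature matrix transposed times the difference between predicted probabilities and labels.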
Deep Structured Mixtures of Gaussian Processes
Trapp, Martin, Peharz, Robert, Pernkopf, Franz, Rasmussen, Carl E.
Gaussian Processes (GPs) are powerful non-parametric Bayesian regression models that allow exact posterior inference, but exhibit high computational and memory costs. In order to improve scalability of GPs, approximate posterior inference is frequently employed, where a prominent class of approximation techniques is based on local GP experts. However, the local-expert techniques proposed so far are either not well-principled, come with limited approximation guarantees, or lead to intractable models. In this paper, we introduce deep structured mixtures of GP experts, a stochastic process model which i) allows exact posterior inference, ii) has attractive computational and memory costs, and iii), when used as GP approximation, captures predictive uncertainties consistently better than previous approximations. In a variety of experiments, we show that deep structured mixtures have a low approximation error and outperform existing expert-based approaches.
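The cost that motivates such approximations is visible in a plain exact GP regression, sketched below with an RBF kernel: the Cholesky factorization of the n x n kernel matrix takes O(n^3) time and O(n^2) memory. This is the standard exact-inference baseline, not the deep structured mixture model from the paper; the kernel choice, noise level, function names, and data are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = (A[:, None, :] - B[None, :, :]) ** 2
    return np.exp(-0.5 * sq.sum(-1) / length_scale**2)

def gp_posterior(X_train, y_train, X_test, noise=1e-2):
    """Exact GP posterior mean and variance of the latent function."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    K_ss = rbf_kernel(X_test, X_test)
    L = np.linalg.cholesky(K)                    # O(n^3) time, O(n^2) memory
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, var

# Noisy samples of sin(x) on [0, 5]; one test point inside the data, one far away.
rng = np.random.default_rng(0)
X_train = np.linspace(0, 5, 30)[:, None]
y_train = np.sin(X_train[:, 0]) + 0.05 * rng.normal(size=30)
X_test = np.array([[2.5], [10.0]])

mean, var = gp_posterior(X_train, y_train, X_test)
```

The predictive variance is small near the training data and returns to the prior variance far from it, which is the kind of uncertainty behaviour an approximation should preserve.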
Using Machine Learning to Recommend Investments in P2P Lending
Peer-to-peer lending marketplaces like LendingClub and Prosper Marketplace are driven by what is essentially a broker's fee for connecting investors and borrowers. They are incentivized to increase the total number of transactions taking place on their platforms. Driven by ease-of-use, their off-the-shelf credit risk assessments are scored in grouped buckets. On a loan-by-loan basis, this is inefficient given each loan's uniqueness and the sheer amount of data collected from borrowers. Scoring risk on a more granular, continuous basis is not only possible but preferable over discrete, grouped buckets.
Estimating regression errors without ground truth values
Tiittanen, Henri, Oikarinen, Emilia, Henelius, Andreas, Puolamäki, Kai
Regression analysis is a standard supervised machine learning method used to model an outcome variable in terms of a set of predictor variables. In most real-world applications we do not know the true value of the outcome variable being predicted outside the training data, i.e., the ground truth is unknown. It is hence not straightforward to directly observe when the estimate from a model is potentially wrong due to phenomena such as overfitting and concept drift. In this paper we present an efficient framework for estimating the generalization error of regression functions, applicable to any family of regression functions when the ground truth is unknown. We present a theoretical derivation of the framework and empirically evaluate its strengths and limitations. We find that it performs robustly and is useful for detecting concept drift in datasets in several real-world domains.
Supervised feature selection with orthogonal regression and feature weighting
Wu, Xia, Xu, Xueyuan, Liu, Jianhong, Wang, Hailing, Hu, Bin, Nie, Feiping
Effective features can improve the performance of a model, which can thus help us understand the characteristics and underlying structure of complex data. Previous feature selection methods usually cannot preserve much local structure information. To address this defect, we propose a novel supervised orthogonal least squares regression model with feature weighting for feature selection. The optimization problem of the objective function can be solved by employing generalized power iteration (GPI) and augmented Lagrangian multiplier (ALM) methods. Experimental results show that the proposed method can more effectively reduce the feature dimensionality and obtain better classification results than traditional feature selection methods. The convergence of our iterative method is proved as well. Consequently, the effectiveness and superiority of the proposed method are verified both theoretically and experimentally.
How It Feels to Learn Data Science in 2019
So I just have to buy a Tableau license and I'm now a data scientist? Okay, let's just take that sales pitch with a grain of salt. I may be clueless, but I know there is more to data science than making pretty visualizations. I can do that in Excel. You got to admit it is slick marketing though. Charting data is the fun stage, and they leave out the painful and time-consuming parts of working with data: cleaning, wrangling, transforming, and loading it. Yes, and that is why I suspect there is value in learning to code. Maybe you can learn Alteryx. There's another software called Alteryx that allows you to clean, wrangle, transform, and load data.
Algorithmic Probability-guided Supervised Machine Learning on Non-differentiable Spaces
Hernández-Orozco, Santiago, Zenil, Hector, Riedel, Jürgen, Uccello, Adam, Kiani, Narsis A., Tegnér, Jesper
We show how complexity theory can be introduced in machine learning to help bring together apparently disparate areas of current research. We show that this new approach requires less training data and is more generalizable as it shows greater resilience to random attacks. We investigate the shape of the discrete algorithmic space when performing regression or classification using a loss function parametrized by algorithmic complexity, demonstrating that differentiability is not necessary to achieve results similar to those obtained using differentiable programming approaches such as deep learning. In doing so we use examples which enable the two approaches to be compared (small, given the computational power required for estimations of algorithmic complexity). We find and report that (i) machine learning can successfully be performed on a non-smooth surface using algorithmic complexity; (ii) parameter solutions can be found using an algorithmic-probability classifier, establishing a bridge between a fundamentally discrete theory of computability and a fundamentally continuous mathematical theory of optimization methods; (iii) a formulation of an algorithmically directed search technique in non-smooth manifolds can be defined and conducted; (iv) exploitation techniques and numerical methods for algorithmic search to navigate these discrete non-differentiable spaces can be performed; with applications to (a) the identification of generative rules from data observations; (b) solutions to image classification problems more resilient against pixel attacks compared to neural networks; (c) the identification of equation parameters from a small data-set in the presence of noise in a continuous ODE system problem; and (d) the classification of Boolean NK networks by (1) network topology, (2) underlying Boolean function, and (3) number of incoming edges.