Goto

Collaborating Authors

 Regression


AI dissonance will end when we ask the right questions in the boardroom

#artificialintelligence

There is a disturbing movement among technology companies today. Many claim to be using artificial intelligence in one way or another, but more often than not, these claims are a massive exaggeration. This may be hard to believe, especially in the age of the Elon Musk's warnings about a potential global apocalypse caused by AI. While Musk's warnings may be justified, they're hardly relevant -- AI is still playing around in the wading pool of what is and isn't possible. Do you have an AI strategy -- or hoping to get one?


Gaussian Discriminant Analysis an example of Generative Learning Algorithms

#artificialintelligence

Generative Learning Algorithms: In Linear Regression and Logistic Regression both we modelled conditional distribution of y given x, as follow. Algorithms that model p(y x) directly from the training set are called discriminative algorithms. There can be a different approach to the same problem, consider the same binary classification problem where we want learn to distinguish between two classes, class A (y 1) and class B (y 0) based on some features. Now we take all the examples of label A and try to learn the features and build a model for class A. Then we take all the examples labeled B and try to learn it's features and build a separate model for class B. Finally to classify a new element, we match it against each model and see which one fits better (generate high value for probability). In this approach we try to model p(x y) and p(y) as oppose to p(y x) we did earlier, it's called Generative Learning Algorithms.


30 Questions to test your understanding of Logistic Regression

@machinelearnbot

Logistic Regression is likely the most commonly used algorithm for solving all classification problems. It is also one of the first methods people get their hands dirty on. We saw the same spirit on the test we designed to assess people on Logistic Regression. More than 800 people took this test. This skill test is specially designed for you to test your knowledge on logistic regression and its nuances.


Toward Scalable Machine Learning and Data Mining: the Bioinformatics Case

arXiv.org Machine Learning

In an effort to overcome the data deluge in computational biology and bioinformatics and to facilitate bioinformatics research in the era of big data, we identify some of the most influential algorithms that have been widely used in the bioinformatics community. These top data mining and machine learning algorithms cover classification, clustering, regression, graphical model-based learning, and dimensionality reduction. The goal of this study is to guide the focus of scalable computing experts in the endeavor of applying new storage and scalable computation designs to bioinformatics algorithms that merit their attention most, following the engineering maxim of "optimize the common case".


Unsupervised Domain Adaptation with Copula Models

arXiv.org Machine Learning

We study the task of unsupervised domain adaptation, where no labeled data from the target domain is provided during training time. To deal with the potential discrepancy between the source and target distributions, both in features and labels, we exploit a copula-based regression framework. The benefits of this approach are two-fold: (a) it allows us to model a broader range of conditional predictive densities beyond the common exponential family, (b) we show how to leverage Sklar's theorem, the essence of the copula formulation relating the joint density to the copula dependency functions, to find effective feature mappings that mitigate the domain mismatch. By transforming the data to a copula domain, we show on a number of benchmark datasets (including human emotion estimation), and using different regression models for prediction, that we can achieve a more robust and accurate estimation of target labels, compared to recently proposed feature transformation (adaptation) methods.


Sparse Hierarchical Regression with Polynomials

arXiv.org Machine Learning

We present a novel method for exact hierarchical sparse polynomial regression. Our regressor is that degree $r$ polynomial which depends on at most $k$ inputs, counting at most $\ell$ monomial terms, which minimizes the sum of the squares of its prediction errors. The previous hierarchical sparse specification aligns well with modern big data settings where many inputs are not relevant for prediction purposes and the functional complexity of the regressor needs to be controlled as to avoid overfitting. We present a two-step approach to this hierarchical sparse regression problem. First, we discard irrelevant inputs using an extremely fast input ranking heuristic. Secondly, we take advantage of modern cutting plane methods for integer optimization to solve our resulting reduced hierarchical $(k, \ell)$-sparse problem exactly. The ability of our method to identify all $k$ relevant inputs and all $\ell$ monomial terms is shown empirically to experience a phase transition. Crucially, the same transition also presents itself in our ability to reject all irrelevant features and monomials as well. In the regime where our method is statistically powerful, its computational complexity is interestingly on par with Lasso based heuristics. The presented work fills a void in terms of a lack of powerful disciplined nonlinear sparse regression methods in high-dimensional settings. Our method is shown empirically to scale to regression problems with $n\approx 10,000$ observations for input dimension $p\approx 1,000$.


Sparse High-Dimensional Regression: Exact Scalable Algorithms and Phase Transitions

arXiv.org Machine Learning

We present a novel binary convex reformulation of the sparse regression problem that constitutes a new duality perspective. We devise a new cutting plane method and provide evidence that it can solve to provable optimality the sparse regression problem for sample sizes n and number of regressors p in the 100,000s, that is two orders of magnitude better than the current state of the art, in seconds. The ability to solve the problem for very high dimensions allows us to observe new phase transition phenomena. Contrary to traditional complexity theory which suggests that the difficulty of a problem increases with problem size, the sparse regression problem has the property that as the number of samples $n$ increases the problem becomes easier in that the solution recovers 100% of the true signal, and our approach solves the problem extremely fast (in fact faster than Lasso), while for small number of samples n, our approach takes a larger amount of time to solve the problem, but importantly the optimal solution provides a statistically more relevant regressor. We argue that our exact sparse regression approach presents a superior alternative over heuristic methods available at present.


Introduction to Deepnets

#artificialintelligence

We are proud to present Deepnets as the new resource brought to the BigML platform. On October 5, 2017, it will be available via the BigML Dashboard, API and WhizzML. Deepnets (an optimized version of Deep Neural Networks) are part of a broader family of classification and regression methods based on learning data representations from a wide variety of data types (e.g., numeric, categorical, text, image). Deepnets have been successfully used to solve many types of classification and regression problems in addition to social network filtering, machine translation, bioinformatics and similar problems in data-rich domains. In the spirit of making Machine Learning easy for everyone, we will provide new learning material for you to start with Deepnets from scratch and progressively become a power user.


Multi-way Interacting Regression via Factorization Machines

arXiv.org Machine Learning

We propose a Bayesian regression method that accounts for multi-way interactions of arbitrary orders among the predictor variables. Our model makes use of a factorization mechanism for representing the regression coefficients of interactions among the predictors, while the interaction selection is guided by a prior distribution on random hypergraphs, a construction which generalizes the Finite Feature Model. We present a posterior inference algorithm based on Gibbs sampling, and establish posterior consistency of our regression model. Our method is evaluated with extensive experiments on simulated data and demonstrated to be able to identify meaningful interactions in applications in genetics and retail demand forecasting.


Logistic Regression using Python (Sklearn, NumPy, MNIST, Handwriting Recognition, Matplotlib)

@machinelearnbot

One of the first models I learned when I started my data science journey was Logistic Regression. The name Logistic Regression is highly misleading. Logisitic regression actually is a classification algorithm and not a regression algorithm. Logistic regression can be used to solve problems like classifying images. The image above shows a bunch of training digits (observations) from the MNIST dataset whose category membership is known (labels 0–9).