 Regression


Ideas on interpreting machine learning

#artificialintelligence

For more on advances in machine learning, prediction, and technology, check out the Data science and advanced analytics sessions at Strata Hadoop World London, May 22-25, 2017. You've probably heard by now that machine learning algorithms can use big data to predict whether a donor will give to a charity, whether an infant in a NICU will develop sepsis, whether a customer will respond to an ad, and on and on. Machine learning can even drive cars and predict elections. I believe it can, but these recent high-profile hiccups should leave everyone who works with data (big or not) and machine learning algorithms asking themselves some very hard questions: do I understand my data? Do I understand the model and answers my machine learning algorithm is giving me? And do I trust these answers? Unfortunately, the complexity that bestows the extraordinary predictive abilities on machine learning algorithms also makes the answers the algorithms produce hard to understand, and maybe even hard to ...


MyChillNews: An NLP-driven guide for conscious consumption of stressful news

#artificialintelligence

Daniel Saunders participated in the Insight Health Data Science program in the Fall of 2016, and currently works as a Data Scientist at Wayfair. Previously, Daniel was a postdoctoral fellow at the Center for Mind/Brain Sciences of the University of Trento, and received his PhD in Psychology from Queen's University. While at Insight, Daniel built an NLP-driven engine to generate stress impact scores for newspaper front pages, trained on the reactions of Facebook users to news story headlines. In this blog post, he describes his creative process in developing this project. For my Insight Health Data Science project, I wanted to tackle a problem related to mental health, since my Ph.D. is in Psychology and my father is a mental health advocate in British Columbia.


A simple experiment in Machine Learning Studio

#artificialintelligence

If you've never used Azure Machine Learning Studio before, this tutorial is for you. In this tutorial, we'll walk through how to use Studio for the first time to create a machine learning experiment. The experiment will test an analytical model that predicts the price of an automobile based on different variables such as make and technical specifications. This tutorial shows you the basics of how to drag-and-drop modules onto your experiment, connect them together, run the experiment, and look at the results. We're not going to discuss the general topic of machine learning or how to select and use the 100 built-in algorithms and data manipulation modules included in Studio.


Determining Song Similarity via Machine Learning Techniques and Tagging Information

arXiv.org Machine Learning

The task of determining item similarity is a crucial one in a recommender system. This constitutes the base upon which the recommender system will work to determine which items are more likely to be enjoyed by a user, resulting in more user engagement. In this paper we tackle the problem of determining song similarity based solely on song metadata (such as the performer, and song title) and on tags contributed by users. We evaluate our approach under a series of different machine learning algorithms. We conclude that tf-idf achieves better results than Word2Vec to model the dataset to feature vectors. We also conclude that k-NN models have better performance than SVMs and Linear Regression for this problem.
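As a rough illustration of the tag-based idea in this abstract, the sketch below builds tf-idf vectors from user tags and looks up the most similar song by cosine similarity. It is not the paper's pipeline: the toy data, the function names (`tfidf_vectors`, `cosine`, `nearest_songs`), and the plain `ln(N / df)` idf weighting are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical toy data (not from the paper): song id -> user-contributed tags.
SONG_TAGS = {
    "song_a": ["rock", "guitar", "90s"],
    "song_b": ["rock", "guitar", "live"],
    "song_c": ["jazz", "piano", "instrumental"],
}


def tfidf_vectors(docs):
    """Return {doc_id: {term: weight}} with weight = tf * ln(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many songs each tag appears at least once.
    df = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for doc_id, terms in docs.items():
        tf = Counter(terms)
        vectors[doc_id] = {
            term: (count / len(terms)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_songs(query_id, vectors, k=1):
    """Return the ids of the k songs most similar to query_id."""
    scored = [
        (cosine(vectors[query_id], vec), other_id)
        for other_id, vec in vectors.items()
        if other_id != query_id
    ]
    scored.sort(reverse=True)
    return [other_id for _, other_id in scored[:k]]


VECTORS = tfidf_vectors(SONG_TAGS)
```

Calling `nearest_songs("song_a", VECTORS)` picks the song that shares the most distinctive tags with `song_a`. Tags that appear in every song get an idf of zero and so do not contribute to the similarity.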


A Brief Introduction to the Temporal Group LASSO and its Potential Applications in Healthcare

arXiv.org Machine Learning

The Temporal Group LASSO is an example of a multi-task, regularized regression approach for the prediction of response variables that vary over time. The aim of this work is to introduce the reader to the concepts behind the Temporal Group LASSO and its related methods, as well as to the type of potential applications in a healthcare setting that the method has. We argue that the method is attractive because of its ability to reduce overfitting, select predictors, learn smooth effect patterns over time, and finally, its simplicity.


A Brief Primer on Linear Regression – Part 1

#artificialintelligence

Prediction has always been a curious topic in life due to a key attribute – the extreme human desire to know what is coming next. Let's ponder over our thoughts to answer a simple question – "Where is prediction most relevant in your life today?" Predictions are central to every aspect of our life, whether we realize it or not. They range from predicting, during school days, what we would love to do in the future in order to choose a career path, to checking today's weather to decide how to dress, to evaluating inventory numbers for the next day, to the less important predictions made daily during our interactions with other people – like a student managing time and getting to classes, or dining, socializing, etc. A prediction, or forecast, is a statement about the future.


Optimal algorithms for smooth and strongly convex distributed optimization in networks

arXiv.org Machine Learning

In this paper, we determine the optimal convergence rates for strongly convex and smooth distributed optimization in two settings: centralized and decentralized communications over a network. For centralized (i.e. master/slave) algorithms, we show that distributing Nesterov's accelerated gradient descent is optimal and achieves a precision $\varepsilon > 0$ in time $O(\sqrt{\kappa_g}(1+\Delta\tau)\ln(1/\varepsilon))$, where $\kappa_g$ is the condition number of the (global) function to optimize, $\Delta$ is the diameter of the network, and $\tau$ (resp. $1$) is the time needed to communicate values between two neighbors (resp. perform local computations). For decentralized algorithms based on gossip, we provide the first optimal algorithm, called the multi-step dual accelerated (MSDA) method, that achieves a precision $\varepsilon > 0$ in time $O(\sqrt{\kappa_l}(1+\frac{\tau}{\sqrt{\gamma}})\ln(1/\varepsilon))$, where $\kappa_l$ is the condition number of the local functions and $\gamma$ is the (normalized) eigengap of the gossip matrix used for communication between nodes. We then verify the efficiency of MSDA against state-of-the-art methods for two problems: least-squares regression and classification by logistic regression.
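To get a feel for how the two bounds quoted in this abstract scale, the sketch below evaluates both expressions with the constant hidden by the $O(\cdot)$ notation set to 1. That choice is an assumption for illustration only, so the returned numbers are useful for comparing settings, not as actual running times. The function names are hypothetical.

```python
import math


def centralized_time(kappa_g, delta, tau, eps):
    """sqrt(kappa_g) * (1 + delta * tau) * ln(1 / eps), hidden constant assumed to be 1."""
    return math.sqrt(kappa_g) * (1.0 + delta * tau) * math.log(1.0 / eps)


def decentralized_time(kappa_l, gamma, tau, eps):
    """sqrt(kappa_l) * (1 + tau / sqrt(gamma)) * ln(1 / eps), hidden constant assumed to be 1."""
    return math.sqrt(kappa_l) * (1.0 + tau / math.sqrt(gamma)) * math.log(1.0 / eps)
```

For example, with $\kappa_g = \kappa_l = 100$, $\tau = 0.5$ and $\varepsilon = 10^{-3}$, a network of diameter $\Delta = 4$ gives a centralized factor of $10 \cdot 3 \cdot \ln(1000)$, while an eigengap $\gamma = 0.25$ gives a decentralized factor of $10 \cdot 2 \cdot \ln(1000)$.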


An Efficient Pseudo-likelihood Method for Sparse Binary Pairwise Markov Network Estimation

arXiv.org Machine Learning

The pseudo-likelihood method is one of the most popular algorithms for learning sparse binary pairwise Markov networks. In this paper, we formulate the $L_1$ regularized pseudo-likelihood problem as a sparse multiple logistic regression problem. In this way, many insights and optimization procedures for sparse logistic regression can be applied to the learning of discrete Markov networks. Specifically, we use the coordinate descent algorithm for generalized linear models with convex penalties, combined with strong screening rules, to solve the pseudo-likelihood problem with $L_1$ regularization. Therefore a substantial speedup without losing any accuracy can be achieved. Furthermore, this method is more stable than the node-wise logistic regression approach on unbalanced high-dimensional data when penalized by small regularization parameters. Thorough numerical experiments on simulated data and real world data demonstrate the advantages of the proposed method.


Detecting confounding in multivariate linear models via spectral analysis

arXiv.org Machine Learning

We study a model where one target variable Y is correlated with a vector X:=(X_1,...,X_d) of predictor variables being potential causes of Y. We describe a method that infers to what extent the statistical dependences between X and Y are due to the influence of X on Y and to what extent due to a hidden common cause (confounder) of X and Y. The method relies on concentration of measure results for large dimensions d and an independence assumption stating that, in the absence of confounding, the vector of regression coefficients describing the influence of each X on Y typically has 'generic orientation' relative to the eigenspaces of the covariance matrix of X. For the special case of a scalar confounder we show that confounding typically spoils this generic orientation in a characteristic way that can be used to quantitatively estimate the amount of confounding.


Vertica Machine Learning Series: Logistic Regression - ODBMS.org

#artificialintelligence

This blog post is based on a white paper authored by Maurizio Felici. Logistic regression is a popular machine learning algorithm used for binary classification. Logistic regression labels a sample with one of two possible classes, given a set of predictors in the sample. Optionally, the output can be the probability that a sample belongs to a given class. For example, suppose a researcher is interested in the factors that determine if a student will be accepted or rejected to graduate school.
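The two outputs described here, a class label or a class probability, can be sketched in a few lines of plain Python. This is a generic illustration of logistic regression scoring, not Vertica's API. The function names, the 0.5 threshold, and the coefficient values in the usage note are hypothetical and not taken from the white paper.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, intercept):
    """Probability that a sample belongs to the positive class."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(features, weights, intercept, threshold=0.5):
    """Return 1 (e.g. accepted) if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(features, weights, intercept) >= threshold else 0
```

With made-up coefficients such as `weights=[0.8, 0.002]` and `intercept=-4.0` for two predictors, `predict_proba` returns the estimated probability of acceptance and `predict_label` turns it into an accepted/rejected decision. In practice the coefficients would be fitted to training data rather than chosen by hand.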