 Regression


Explaining the Predictions of Any Image Classifier via Decision Trees

arXiv.org Artificial Intelligence

Despite their outstanding contribution to the progress of Artificial Intelligence (AI), deep learning models remain mostly black boxes that offer little insight into their reasoning process or prediction results. Explainability is not only a gateway between AI and society but also a powerful tool for detecting flaws in the model and biases in the data. Local Interpretable Model-agnostic Explanations (LIME) is a recent approach that uses a linear regression model to form a local explanation for an individual prediction result. However, because linear models are restrictive and often oversimplify relationships, they fail in situations where nonlinear associations and interactions exist among features and prediction results. This paper proposes an extended Decision Tree-based LIME (TLIME) approach, which uses a decision tree model to form an interpretable representation that is locally faithful to the original model. The new approach can capture nonlinear interactions among features in the data and create plausible explanations. Various experiments show that TLIME explanations of multiple black-box models achieve more reliable performance in terms of understandability, fidelity, and efficiency.
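The idea of a tree-based local surrogate can be sketched in a few lines. This is a minimal illustration, not the paper's TLIME implementation. It works on two numeric features rather than image superpixels, omits LIME's proximity weighting, and uses a depth-1 tree (a stump). The names `black_box` and `fit_stump` are hypothetical.

```python
import random

def black_box(x):
    # Hypothetical nonlinear model: a hard threshold on feature 0.
    return 1.0 if x[0] > 0.5 else 0.0

def fit_stump(samples, targets):
    """Fit a depth-1 regression tree by exhaustive search over splits."""
    best = None  # (sse, feature, threshold, left_mean, right_mean)
    for j in range(len(samples[0])):
        values = sorted(set(s[j] for s in samples))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for s, y in zip(samples, targets) if s[j] <= t]
            right = [y for s, y in zip(samples, targets) if s[j] > t]
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            sse = (sum((y - lm) ** 2 for y in left)
                   + sum((y - rm) ** 2 for y in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best

rng = random.Random(0)
instance = (0.5, 0.5)  # the prediction we want to explain
# Perturb the instance to sample its neighbourhood, then query the black box.
samples = [(instance[0] + rng.uniform(-0.5, 0.5),
            instance[1] + rng.uniform(-0.5, 0.5)) for _ in range(200)]
targets = [black_box(s) for s in samples]

sse, feature, threshold, left_mean, right_mean = fit_stump(samples, targets)
print(f"split on feature {feature} at {threshold:.3f}: "
      f"predict {left_mean:.2f} below, {right_mean:.2f} above")
```

A linear surrogate can only approximate this step-shaped response with a slope, whereas the stump reproduces the threshold directly, which is the kind of nonlinearity the abstract argues a tree can capture.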


The F-Test for Regression Analysis

#artificialintelligence

Suppose that, by means of some analysis, we were to deduce that today's value of the DJIA Closing Price may turn out to be a good predictor of tomorrow's Closing Price. To test this theory, we will develop a linear regression model consisting of a single regression variable: the time-lagged value of the time series. Adding the lagged column to the Data Frame leaves a NaN in the first row, because the first observation has no previous value, so we remove that row. Next, we create our training and test data sets and plot the model's performance against the test data set. At first glance, this model's performance looks much better than what we got from the mean model.
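The comparison between the lagged-variable model and the mean model can be made concrete with an F-statistic. The sketch below is a minimal illustration, not the article's code. It uses a synthetic random walk in place of the DJIA data (which is not reproduced here) and fits on the full series rather than a train/test split. The helper names `ols_fit` and `f_statistic` are hypothetical.

```python
import random

def ols_fit(x, y):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def f_statistic(series):
    """F-statistic comparing the mean model with the lag-1 regression."""
    y = series[1:]    # today's value
    x = series[:-1]   # yesterday's value (the lagged variable)
    n = len(y)
    mean_y = sum(y) / n
    # Restricted model: intercept only (1 parameter).
    rss_mean = sum((yi - mean_y) ** 2 for yi in y)
    # Unrestricted model: intercept plus lagged variable (2 parameters).
    intercept, slope = ols_fit(x, y)
    rss_lag = sum((yi - (intercept + slope * xi)) ** 2
                  for xi, yi in zip(x, y))
    extra_params = 1
    f = ((rss_mean - rss_lag) / extra_params) / (rss_lag / (n - 2))
    return f, rss_mean, rss_lag

# Synthetic stand-in for a closing-price series (an assumption).
rng = random.Random(42)
prices = [100.0]
for _ in range(250):
    prices.append(prices[-1] + rng.gauss(0, 1))

f_stat, rss_mean, rss_lag = f_statistic(prices)
print(f"RSS mean model: {rss_mean:.1f}, RSS lag model: {rss_lag:.1f}, "
      f"F = {f_stat:.1f}")
```

A large F-statistic indicates that the lagged variable reduces the residual sum of squares by more than chance alone would explain. Converting it to a p-value requires the F distribution with (1, n - 2) degrees of freedom, which a statistics library can supply.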


Iterative Algorithm for Discrete Structure Recovery

arXiv.org Machine Learning

We propose a general modeling and algorithmic framework for discrete structure recovery that can be applied to a wide range of problems. Under this framework, we are able to study the recovery of clustering labels, ranks of players, and signs of regression coefficients from a unified perspective. A simple iterative algorithm is proposed for discrete structure recovery, which generalizes methods including Lloyd's algorithm and the iterative feature matching algorithm. A linear convergence result for the proposed algorithm is established in this paper under appropriate abstract conditions on stochastic errors and initialization. We illustrate our general theory by applying it to three representative problems: clustering in the Gaussian mixture model, approximate ranking, and sign recovery in compressed sensing, and show that the minimax rate is achieved in each case.
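Lloyd's algorithm, which the abstract names as a special case, shows the alternating pattern in its simplest form. The sketch below is a one-dimensional illustration under assumed data and initial centers. It is not the paper's general algorithm, and the function name `lloyd_1d` is hypothetical.

```python
def lloyd_1d(points, initial_centers, max_iter=100):
    """Alternate between label assignment and center re-estimation."""
    centers = list(initial_centers)
    labels = None
    for _ in range(max_iter):
        # Discrete step: assign each point to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda k: (p - centers[k]) ** 2)
            for p in points
        ]
        if new_labels == labels:
            break  # labels stopped changing, so the iteration has converged
        labels = new_labels
        # Continuous step: re-estimate each center from its assigned points.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels, centers

points = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
labels, centers = lloyd_1d(points, initial_centers=[0.0, 6.0])
print(labels, centers)
```

The discrete structure here is the label vector. Each pass updates it using the current parameter estimates, then refits the parameters given the labels.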


Machine Learning Basics: Classification models in Python - CouponED

#artificialintelligence

What you will learn: understand how to interpret the results of a Logistic Regression model and translate them into actionable insight; learn how to solve real-life problems using different classification techniques; predict future outcomes based on past data by implementing a Machine Learning algorithm. The course contains an end-to-end DIY project to implement your learnings from the lectures. The course "Machine Learning Basics: Classification models in Python" teaches you all the steps of creating a Classification model to solve business problems. Below is a list of popular FAQs from students who want to start their Machine Learning journey. What is Machine Learning? Machine Learning is a field of computer science which gives the computer the ability to learn without being explicitly programmed. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention. Which classification techniques are taught in this course?


Case study: explaining credit modeling predictions with SHAP

#artificialintelligence

At Fiddler Labs, we are all about explaining machine learning models. One recent interesting explanation technology is SHAP (SHapley Additive exPlanations). To learn more about how SHAP works in practice, we applied it to predicting loan defaults in data from Lending Club. We built three models (random, logistic regression, and boosted trees), and looked at several feature explanation diagrams from each. We are not financial experts, so this blog post focuses on comparing algorithms, not insights into loan data. Hopefully, we can bring SHAP into sharper focus.
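SHAP attributions are built on Shapley values, which can be computed exactly for a tiny model by enumerating feature subsets. The sketch below is a brute-force illustration of that definition, not the SHAP library's estimators and not the post's loan models. The toy `model`, the instance, and the all-zero baseline are assumptions.

```python
from itertools import combinations
from math import factorial

def model(x):
    # Hypothetical model: one additive term and one interaction term.
    return 2 * x[0] + x[1] * x[2]

def shapley_values(f, instance, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(instance)

    def value(subset):
        mixed = [instance[i] if i in subset else baseline[i]
                 for i in range(n)]
        return f(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(model, instance, baseline)
print(phi, sum(phi), model(instance) - model(baseline))
```

For this toy model the attributions are 2, 3, and 3: the interaction term is split evenly between the two features that share it, and the values sum to the difference between the prediction and the baseline output (8). Exact enumeration grows exponentially with the number of features, which is why practical tools rely on approximations or model-specific shortcuts.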


What's New in the Splunk Machine Learning Toolkit 5.0

#artificialintelligence

This release was all about improving the toolkit's ability to provide insights into your data, including a brand new outlier detection assistant, an update to our Machine Learning examples showcase page, an upgrade from Python 2.x to Python 3.x and a new System Identification algorithm. Outlier detection is by far the most popular use case in the industry. We constantly seek ways to offer a simple, yet rich and accurate way of helping you find outliers in your data, evaluate them and deploy the detection in your Splunk environment. The new assistant is not only smart, in that it holds no prejudice against your data's statistical characteristics, but also charming, with a new set of custom visualizations available. With Python 2.7 coming to its end of life, Splunk 8.0 is migrating to Python 3.7 and so is the Splunk Machine Learning Toolkit.


Implementation of Linear Regression

#artificialintelligence

We're going to be implementing Linear Regression on the 'Boston Housing' dataset. The Boston data set contains information about different houses in Boston. There are 506 samples and 13 feature variables in this dataset. Our aim is to predict house prices using the given features. To get basic details about our Boston Housing dataset, such as null or missing values and data types, we can use .info().


Understanding Causal Inference

#artificialintelligence

This article covers causal relationships and includes a chapter excerpt from the book Machine Learning in Production: Developing and Optimizing Data Science Workflows and Applications by Andrew Kelleher and Adam Kelleher. A complementary Domino project is available. As data science work is experimental and probabilistic in nature, data scientists are often faced with making inferences. This may require a shift in mindset, particularly if moving from "traditional statistical analysis to causal analysis of multivariate data". As Domino is committed to providing the platform and tools data scientists need to accelerate their work, we reached out to Addison-Wesley Professional (AWP) Pearson for permission to excerpt "Causal Inference" from the book, Machine Learning in Production: Developing and Optimizing Data Science Workflows and Applications by Andrew Kelleher and Adam Kelleher. We appreciate the permissions to provide the chapter excerpt below as well as place the code within a complementary Domino project. We've introduced [in the book] a couple of machine-learning algorithms and suggested that they can be used to produce clear, interpretable results. You've seen that logistic regression coefficients can be used to say how much more likely an outcome will occur in conjunction with a feature (for binary features) or how much more likely an outcome is to occur per unit increase in a variable (for real-valued features). We'd like to make stronger statements. We'd like to say "If you increase a variable by a unit, then it will have the effect of making an outcome more likely." These two interpretations of a regression coefficient are so similar on the surface that you may have to read them a few times to take away the meaning. The key is that in the first case, we're describing what usually happens in a system that we observe. In the second case, we're saying what will happen if we intervene in that system and disrupt it from its normal operation. 
After we go through an example, we'll build up the mathematical and conceptual machinery to describe interventions. We'll cover how to go from a Bayesian network describing observational data to one that describes the effects of an intervention. We'll go through some classic approaches to estimating the effects of interventions, and finally we'll explain how to use machine-learning estimators to estimate the effects of interventions.
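The gap between the observational and interventional readings can be seen in a small simulation. The sketch below is an illustration under assumed probabilities, not code from the book. A hidden variable `z` drives both the feature `x` and the outcome `y`, while `x` itself has no effect on `y`. The function names `simulate` and `outcome_gap` are hypothetical.

```python
import random

def simulate(n, rng, intervene=False):
    """Draw (x, y) pairs from a hypothetical confounded system."""
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                      # hidden common cause
        if intervene:
            x = rng.random() < 0.5                  # x set by coin flip, ignoring z
        else:
            x = rng.random() < (0.8 if z else 0.2)  # x usually follows z
        y = rng.random() < (0.7 if z else 0.3)      # y depends on z only
        rows.append((x, y))
    return rows

def outcome_gap(rows):
    """Difference in outcome frequency between x = True and x = False."""
    with_x = [y for x, y in rows if x]
    without_x = [y for x, y in rows if not x]
    return sum(with_x) / len(with_x) - sum(without_x) / len(without_x)

rng = random.Random(0)
observational_gap = outcome_gap(simulate(20000, rng))
interventional_gap = outcome_gap(simulate(20000, rng, intervene=True))
print(f"observed: {observational_gap:.3f}, "
      f"under intervention: {interventional_gap:.3f}")
```

In the observed system the outcome is noticeably more frequent when `x` is present (a gap of 0.24 in expectation under these probabilities), yet setting `x` by intervention leaves the outcome frequency essentially unchanged. A regression fitted to the observational rows would describe the first quantity, not the second.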


On-Device Machine Learning: An Algorithms and Learning Theory Perspective

arXiv.org Machine Learning

The current paradigm for using machine learning models on a device is to train a model in the cloud and perform inference using the trained model on the device. However, with the increasing number of smart devices and improved hardware, there is interest in performing model training on the device. Given this surge in interest, a comprehensive survey of the field from a device-agnostic perspective sets the stage for both understanding the state-of-the-art and for identifying open challenges and future avenues of research. Since on-device learning is an expansive field with connections to a large number of related topics in AI and machine learning (including online learning, model adaptation, one/few-shot learning, etc), covering such a large number of topics in a single survey is impractical. Instead, this survey finds a middle ground by reformulating the problem of on-device learning as resource constrained learning where the resources are compute and memory. This reformulation allows tools, techniques, and algorithms from a wide variety of research areas to be compared equitably. In addition to summarizing the state of the art, the survey also identifies a number of challenges and next steps for both the algorithmic and theoretical aspects of on-device learning.


Machine Learning with Python: Simple Linear Regression - Price prediction

#artificialintelligence

In this article we will learn how to use linear regression to predict the resale price of a car. Don't worry if you don't have any clue what linear regression means; I will explain the basics as we go along. The first step, as always, is importing the required libraries. We have to import pandas for working with dataframes. The data required for car resale price prediction is available in CarResalePrice.csv.
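For readers new to the technique, simple linear regression fits a straight line through the data by least squares. The sketch below is a minimal illustration. It does not read CarResalePrice.csv, whose columns are not shown here; the ages and prices are made-up values, and the names `fit_line` and `predict` are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x_new):
    return intercept + slope * x_new

# Made-up example: car age in years against resale price.
ages = [1, 2, 3, 4, 5]
prices = [18500, 17000, 15500, 14000, 12500]

intercept, slope = fit_line(ages, prices)
predicted = predict(intercept, slope, 6)
print(f"price = {intercept:.0f} + ({slope:.0f}) * age; "
      f"predicted price at age 6: {predicted:.0f}")
```

Because the made-up points lie exactly on a line, the fit recovers an intercept of 20000 and a slope of -1500, giving a predicted price of 11000 for a six-year-old car. Real resale data will scatter around the fitted line.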