AITopics | Regression

Collaborating Authors

Regression

News Overviews Instructional Materials AI-Alerts Classics

Regression Versus Classification Machine Learning: What's the Difference?

#artificialintelligenceOct-4-2018, 05:47:27 GMT

The difference between regression machine learning algorithms and classification machine learning algorithms sometimes confuse most data scientists, which make them to implement wrong methodologies in solving their prediction problems. Andreybu, who is from Germany and has more than 5 years of machine learning experience, says that "understanding whether the machine learning task is a regression or classification problem is key for selecting the right algorithm to use." Let's start by talking about the similarities between the two techniques. Regression and classification are categorized under the same umbrella of supervised machine learning. Both share the same concept of utilizing known datasets (referred to as training datasets) to make predictions. In supervised learning, an algorithm is employed to learn the mapping function from the input variable (x) to the output variable (y); that is y f(X).

algorithm, artificial intelligence, machine learning, (12 more...)

#artificialintelligence

Country: Europe > Germany (0.26)

Industry: Education (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.59)

Add feedback

Projective Inference in High-dimensional Problems: Prediction and Feature Selection

Piironen, Juho, Paasiniemi, Markus, Vehtari, Aki

arXiv.org Machine LearningOct-4-2018

This paper discusses predictive inference and feature selection for generalized linear models with scarce but high-dimensional data. We argue that in many cases one can benefit from a decision theoretically justified two-stage approach: first, construct a possibly non-sparse model that predicts well, and then find a minimal subset of features that characterize the predictions. The model built in the first step is referred to as the \emph{reference model} and the operation during the latter step as predictive \emph{projection}. The key characteristic of this approach is that it finds an excellent tradeoff between sparsity and predictive accuracy, and the gain comes from utilizing all available information including prior and that coming from the left out features. We review several methods that follow this principle and provide novel methodological contributions. We present a new projection technique that unifies two existing techniques and is both accurate and fast to compute. We also propose a way of evaluating the feature selection process using fast leave-one-out cross-validation that allows for easy and intuitive model size selection. Furthermore, we prove a theorem that helps to understand the conditions under which the projective approach could be beneficial. The benefits are illustrated via several simulated and real world examples.

projection, reference model, selection, (14 more...)

arXiv.org Machine Learning

doi: 10.1214/20-EJS1711

1810.02406

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

Add feedback

5 Types of Regressions for your Machine Learning Toolbox

#artificialintelligenceOct-3-2018, 07:57:59 GMT

However, some seasoned techniques are here to stay. At the top of the list are regression techniques. As long as this number is as high, you will encounter regressions during your machine learning career. Even if you don't use them yourself, it is essential to be aware of the different flavors and which problems they tackle. In this post, I provide you with a quick overview of five different (groups of) regressions.

artificial intelligence, machine learning, regression, (8 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.64)

Add feedback

Agnostic Sample Compression for Linear Regression

Hanneke, Steve, Kontorovich, Aryeh, Sadigurschi, Menachem

arXiv.org Machine LearningOct-3-2018

We obtain the first positive results for bounded sample compression in the agnostic regression setting. We show that for p in {1,infinity}, agnostic linear regression with $\ell_p$ loss admits a bounded sample compression scheme. Specifically, we exhibit efficient sample compression schemes for agnostic linear regression in $R^d$ of size $d+1$ under the $\ell_1$ loss and size $d+2$ under the $\ell_\infty$ loss. We further show that for every other $\ell_p$ loss (1 < p < infinity), there does not exist an agnostic compression scheme of bounded size. This refines and generalizes a negative result of David, Moran, and Yehudayoff (2016) for the $\ell_2$ loss. We close by posing a general open question: for agnostic regression with $\ell_1$ loss, does every function class admit a compression scheme of size equal to its pseudo-dimension? This question generalizes Warmuth's classic sample compression conjecture for realizable-case classification (Warmuth, 2003).

artificial intelligence, compression scheme, machine learning, (16 more...)

arXiv.org Machine Learning

1810.01864

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.83)

Add feedback

AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias

Bellamy, Rachel K. E., Dey, Kuntal, Hind, Michael, Hoffman, Samuel C., Houde, Stephanie, Kannan, Kalapriya, Lohia, Pranay, Martino, Jacquelyn, Mehta, Sameep, Mojsilovic, Aleksandra, Nagar, Seema, Ramamurthy, Karthikeyan Natesan, Richards, John, Saha, Diptikalyan, Sattigeri, Prasanna, Singh, Moninder, Varshney, Kush R., Zhang, Yunfeng

arXiv.org Artificial IntelligenceOct-3-2018

We used Python's Flask framework for building the service and exposed a REST API that generates a bias report based on the following input parameters from a user: the dataset name, the protected attributes, the privileged and unprivileged groups, the chosen fairness metrics, and the chosen mitigation algorithm, if any. With these inputs, the back-end then runs a series of steps to 1) split the dataset into training, development, and validation sets; 2) train a logistic regression classifier on the training set; 3) run the bias-checking metrics on the classifier against the test dataset; 4) if a mitigation algorithm is chosen, run the mitigation algorithm with the appropriate pipeline (pre-processing, in-processing, or post-processing). The end result is then cached so that if the exact same inputs are provided, the result can be directly retrieved from cache and no additional computation is needed. The reason to truly use the toolkit code in serving the Web application rather than having a pre-computed lookup table of results is twofold: we want to make the app a real representation of the underlying capabilities (in fact, creating the Web app helped us debug a few items in the code), and we also avoid any issues of synchronizing updates to the metrics, explainers, and algorithms with the results shown: synchronization is automatic. Currently, the service is limited to three built-in datasets, but it can be expanded to support the user's own data upload. The service is also limited to building logistic regression classifiers, but again this can be expanded. Such expansions can be more easily implemented if this fairness service is integrated into a full AI suite that provides various classifier options and data storage solutions.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1810.01943

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Adaptive, Personalized Diversity for Visual Discovery

Teo, Choon Hui, Nassif, Houssam, Hill, Daniel, Srinavasan, Sriram, Goodman, Mitchell, Mohan, Vijai, Vishwanathan, SVN

arXiv.org Machine LearningOct-2-2018

Search queries are appropriate when users have explicit intent, but they perform poorly when the intent is difficult to express or if the user is simply looking to be inspired. Visual browsing systems allow e-commerce platforms to address these scenarios while offering the user an engaging shopping experience. Here we explore extensions in the direction of adaptive personalization and item diversification within Stream, a new form of visual browsing and discovery by Amazon. Our system presents the user with a diverse set of interesting items while adapting to user interactions. Our solution consists of three components (1) a Bayesian regression model for scoring the relevance of items while leveraging uncertainty, (2) a submodular diversification framework that re-ranks the top scoring items based on category, and (3) personalized category preferences learned from the user's behavior. When tested on live traffic, our algorithms show a strong lift in click-through-rate and session duration.

artificial intelligence, category, machine learning, (15 more...)

arXiv.org Machine Learning

doi: 10.1145/2959100.2959171

1810.01477

Country: North America > United States (0.14)

Genre: Research Report > Experimental Study (0.47)

Industry:

Information Technology > Services (0.34)
Retail (0.30)

Technology:

Information Technology > Information Management > Search (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Is Ordered Weighted $\ell_1$ Regularized Regression Robust to Adversarial Perturbation? A Case Study on OSCAR

Chen, Pin-Yu, Vinzamuri, Bhanukiran, Liu, Sijia

arXiv.org Machine LearningOct-2-2018

Many state-of-the-art machine learning models such as deep neural networks have recently shown to be vulnerable to adversarial perturbations, especially in classification tasks. Motivated by adversarial machine learning, in this paper we investigate the robustness of sparse regression models with strongly correlated covariates to adversarially designed measurement noises. Specifically, we consider the family of ordered weighted $\ell_1$ (OWL) regularized regression methods and study the case of OSCAR (octagonal shrinkage clustering algorithm for regression) in the adversarial setting. Under a norm-bounded threat model, we formulate the process of finding a maximally disruptive noise for OWL-regularized regression as an optimization problem and illustrate the steps towards finding such a noise in the case of OSCAR. Experimental results demonstrate that the regression performance of grouping strongly correlated features can be severely degraded under our adversarial setting, even when the noise budget is significantly smaller than the ground-truth signals.

artificial intelligence, machine learning, robustness, (13 more...)

arXiv.org Machine Learning

1809.08706

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)

Add feedback

Semi-supervised Text Regression with Conditional Generative Adversarial Networks

Li, Tao, Liu, Xudong, Su, Shihan

arXiv.org Artificial IntelligenceOct-2-2018

Enormous online textual information provides intriguing opportunities for understandings of social and economic semantics. In this paper, we propose a novel text regression model based on a conditional generative adversarial network (GAN), with an attempt to associate textual data and social outcomes in a semi-supervised manner. Besides promising potential of predicting capabilities, our superiorities are twofold: (i) the model works with unbalanced datasets of limited labelled data, which align with real-world scenarios; and (ii) predictions are obtained by an end-to-end framework, without explicitly selecting high-level representations. Finally we point out related datasets for experiments and future research directions.

computational linguistic, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

1810.01165

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.31)
Health & Medicine > Therapeutic Area > Immunology (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Add feedback

Modeling Online Discourse with Coupled Distributed Topics

Srivatsan, Akshay, Wojtowicz, Zachary, Berg-Kirkpatrick, Taylor

arXiv.org Machine LearningOct-1-2018

In this paper, we propose a deep, globally normalized topic model that incorporates structural relationships connecting documents in socially generated corpora, such as online forums. Our model (1) captures discursive interactions along observed reply links in addition to traditional topic information, and (2) incorporates latent distributed representations arranged in a deep architecture, which enables a GPU-based mean-field inference procedure that scales efficiently to large data. We apply our model to a new social media dataset consisting of 13M comments mined from the popular internet forum Reddit, a domain that poses significant challenges to models that do not account for relationships connecting user comments. We evaluate against existing methods across multiple metrics including perplexity and metadata prediction, and qualitatively analyze the learned interaction patterns.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1809.07282

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Media > News (0.37)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Neural Regression Trees

Memon, Shahan Ali, Zhao, Wenbo, Raj, Bhiksha, Singh, Rita

arXiv.org Machine LearningOct-1-2018

Regression-via-Classification (RvC) is the process of converting a regression problem to a classification one. Current approaches for RvC use ad-hoc discretization strategies and are suboptimal. We propose a neural regression tree model for RvC. In this model, we employ a joint optimization framework where we learn optimal discretization thresholds while simultaneously optimizing the features for each node in the tree. We empirically show the validity of our model by testing it on two challenging regression tasks where we establish the state of the art.

artificial intelligence, decision tree learning, machine learning, (17 more...)

arXiv.org Machine Learning

1810.00974

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)

Add feedback