AITopics | Regression

Regression

Price Suggestion for Online Second-hand Items with Texts and Images

Han, Liang, Yin, Zhaozheng, Xia, Zhurong, Tang, Mingqian, Jin, Rong

arXiv.org Artificial Intelligence, Dec-10-2020

This paper presents an intelligent price suggestion system for online second-hand listings based on their uploaded images and text descriptions. The goal of price prediction is to help sellers set effective and reasonable prices for their second-hand items using the images and text descriptions uploaded to the online platforms. Specifically, we design a multi-modal price suggestion system that takes as input the extracted visual and textual features along with some statistical item features collected from the second-hand item shopping platform. A binary classification model determines whether the image and text of an uploaded second-hand item are qualified for reasonable price suggestion, and a regression model provides price suggestions for items with qualified images and text descriptions. To satisfy different demands, two different constraints are added into the joint training of the classification model and the regression model. Moreover, a customized loss function is designed for optimizing the regression model so that its price suggestions not only maximize the gain of the sellers but also facilitate the online transaction. We also derive a set of metrics to better evaluate the proposed price suggestion system. Extensive experiments on a large real-world dataset demonstrate the effectiveness of the proposed multi-modal price suggestion system.

price suggestion, regression model, suggestion, (16 more...)

arXiv.org Artificial Intelligence

2012.06008

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
Asia > China (0.04)

Genre: Research Report (0.82)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.79)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Machine Learning With R: Logistic Regression

#artificialintelligence, Dec-9-2020, 16:37:20 GMT

Our little journey to machine learning with R continues! Today's topic is logistic regression – as an introduction to machine learning classification tasks. We'll cover data preparation, modeling, and evaluation of the well-known Titanic dataset. That's it for the introduction section – we have many things to cover, so let's jump right to it. Logistic regression is a great introductory algorithm for binary classification (two class values) borrowed from the field of statistics.
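
As a rough illustration of the binary classification idea described above, the following minimal sketch fits a one-feature logistic regression by gradient descent. The original article works in R on the Titanic dataset; this sketch uses Python and a small hypothetical toy dataset instead, and the names `sigmoid`, `fit_logistic`, and `predict` are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical toy data (not from the article): one feature, two class values.
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)


def predict(x):
    """Classify as 1 when the estimated probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0
```

The toy classes overlap on purpose, so the fitted weights stay finite; on perfectly separable data plain gradient descent would keep increasing `w` without bound.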

dataset, logistic function, logistic regression, (15 more...)

#artificialintelligence

Genre:

Research Report > Experimental Study (0.95)
Research Report > New Finding (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.84)

Consistent regression of biophysical parameters with kernel methods

Díaz, Emiliano, Pérez-Suay, Adrián, Laparra, Valero, Camps-Valls, Gustau

arXiv.org Machine Learning, Dec-9-2020

This paper introduces a novel statistical regression framework that allows the incorporation of consistency constraints. Linear and nonlinear (kernel-based) formulations are introduced, and both admit closed-form analytical solutions. The models exploit all the information from a set of drivers while being maximally independent of a set of auxiliary, protected variables. We successfully illustrate the performance in the estimation of chlorophyll content.

consistency, prediction, regression, (17 more...)

arXiv.org Machine Learning

2012.04922

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
Europe > North Macedonia > Skopje Statistical Region > Skopje Municipality > Skopje (0.04)

Genre: Research Report (0.40)

Industry: Energy (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Optimal Survival Trees

Bertsimas, Dimitris, Dunn, Jack, Gibson, Emma, Orfanoudaki, Agni

arXiv.org Machine Learning, Dec-8-2020

Survival analysis methods are required for censored data in which the outcome of interest is generally the time until an event (onset of disease, death, etc.), but the exact time of the event is unknown (censored) for some individuals. When a lower bound for these missing values is known (for example, a patient is known to be alive until at least time t) the data is said to be right-censored. A common survival analysis technique is Cox proportional hazards regression (Cox, 1972) which models the hazard rate for an event as a linear combination of covariate effects. Although this model is widely used and easily interpreted, its parametric nature makes it unable to identify nonlinear effects or interactions between covariates (Bou-Hamad et al., 2011). Recursive partitioning techniques (also referred to as trees) are a popular alternative to parametric models. When applied to survival data, survival tree algorithms partition the covariate space into smaller and smaller regions (nodes) containing observations with homogeneous survival outcomes.
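
For reference, the Cox proportional hazards model mentioned above is usually written as follows. In this standard textbook form the covariates enter linearly on the log-hazard scale; the symbols $h_0$ (baseline hazard), $\beta$ (coefficients), $x$ (covariates), and $p$ (number of covariates) are notation assumed here, not taken from the listing.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)
\quad\Longleftrightarrow\quad
\log \frac{h(t \mid x)}{h_0(t)} = \beta_1 x_1 + \dots + \beta_p x_p
```

Because the right-hand side is a fixed linear function of the covariates, the model has no way to represent a nonlinear effect or an interaction unless such a term is added by hand, which is the limitation that tree-based methods aim to address.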

ctree 0, ost 0, rpart 0, (15 more...)

arXiv.org Machine Learning

2012.04284

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Law > Civil Rights & Constitutional Law (0.78)
Health & Medicine > Therapeutic Area > Endocrinology (0.67)
Banking & Finance > Insurance (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)
Information Technology > Communications (0.93)
(2 more...)

Removing Spurious Features can Hurt Accuracy and Affect Groups Disproportionately

Khani, Fereshte, Liang, Percy

arXiv.org Machine Learning, Dec-7-2020

The presence of spurious features interferes with the goal of obtaining robust models that perform well across many groups within the population. A natural remedy is to remove spurious features from the model. However, in this work we show that removal of spurious features can decrease accuracy due to the inductive biases of overparameterized models. We completely characterize how the removal of spurious features affects accuracy across different groups (more generally, test distributions) in noiseless overparameterized linear regression. In addition, we show that removal of spurious features can decrease the accuracy even in balanced datasets -- each target co-occurs equally with each spurious feature; and it can inadvertently make the model more susceptible to other spurious features. Finally, we show that robust self-training can remove spurious features without affecting the overall accuracy. Experiments on the Toxic-Comment-Detection and CelebA datasets show that our results hold in non-linear models.

accuracy, removing spurious feature, spurious feature, (14 more...)

arXiv.org Machine Learning

2012.04104

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

A PAC-Bayesian Perspective on Structured Prediction with Implicit Loss Embeddings

Cantelobre, Théophile, Guedj, Benjamin, Pérez-Ortiz, María, Shawe-Taylor, John

arXiv.org Machine Learning, Dec-7-2020

Many practical machine learning tasks can be framed as structured prediction problems, where several output variables are predicted and considered interdependent. Recent theoretical advances in structured prediction have focused on obtaining fast-rate convergence guarantees, especially in the Implicit Loss Embedding (ILE) framework. PAC-Bayes has gained interest recently for its capacity to produce tight risk bounds for predictor distributions. This work proposes a novel PAC-Bayes perspective on the ILE structured prediction framework. We present two generalization bounds, on the risk and excess risk, which yield insights into the behavior of ILE predictors. Two learning algorithms are derived from these bounds.

algorithm, pac-bayesian structured prediction, predictor, (10 more...)

arXiv.org Machine Learning

2012.0378

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(9 more...)

Genre:

Overview (1.00)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
(2 more...)

Explainable Artificial Intelligence: How Subsets of the Training Data Affect a Prediction

Brandsæter, Andreas, Glad, Ingrid K.

arXiv.org Machine Learning, Dec-7-2020

There is an increasing interest in and demand for interpretations and explanations of machine learning models and predictions in various application areas. In this paper, we consider data-driven models which are already developed, implemented and trained. Our goal is to interpret the models and explain and understand their predictions. Since the predictions made by data-driven models rely heavily on the data used for training, we believe explanations should convey information about how the training data affects the predictions. To do this, we propose a novel methodology which we call Shapley values for training data subset importance. The Shapley value concept originates from coalitional game theory, developed to fairly distribute the payout among a set of cooperating players. We extend this to subset importance, where a prediction is explained by treating the subsets of the training data as players in a game where the predictions are the payouts. We describe and illustrate how the proposed method can be useful and demonstrate its capabilities on several examples. We show how the proposed explanations can be used to reveal biasedness in models and erroneous training data. Furthermore, we demonstrate that when predictions are accurately explained in a known situation, then explanations of predictions by simple models correspond to the intuitive explanations. We argue that the explanations enable us to perceive more of the inner workings of the algorithms, and illustrate how models producing similar predictions can be based on very different parts of the training data. Finally, we show how we can use Shapley values for subset importance to enhance our training data acquisition, and thereby reduce prediction error.
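
The game described above, with training-data subsets as players and the prediction as the payout, can be sketched with the classical exact Shapley formula. This is a minimal illustration and not the authors' implementation: the stand-in "model" simply predicts the mean of its training targets, the data and subset names are hypothetical, and the payout of 0.0 for the empty coalition is an assumption made here so that the game is well defined.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: weighted average marginal contribution of each player."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # k = size of the coalition that p joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Hypothetical training targets, split into three named subsets (the players).
subsets = {"A": [1.0, 2.0, 3.0], "B": [10.0, 12.0], "C": [2.0, 2.0]}


def prediction(coalition):
    """Payout: prediction of a stand-in model trained only on the given subsets.

    The stand-in model predicts the mean training target. The payout for the
    empty coalition is set to 0.0, which is an assumption of this sketch.
    """
    ys = [y for name in coalition for y in subsets[name]]
    return sum(ys) / len(ys) if ys else 0.0


phi = shapley_values(list(subsets), prediction)
```

By the efficiency property of Shapley values, the entries of `phi` sum to the full-data prediction minus the empty-coalition payout, so each value can be read as the share of the prediction attributed to one subset. The exact formula enumerates every coalition, which is only practical for a small number of subsets.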

prediction, shapley value, subset, (17 more...)

arXiv.org Machine Learning

2012.03625

Country:

Europe > Norway > Eastern Norway > Oslo (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Austria > Vienna (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (0.67)
Leisure & Entertainment (0.48)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Deep Learning Prerequisites: Linear Regression in Python

#artificialintelligence, Dec-6-2020, 15:42:59 GMT

This course teaches you about one popular technique used in machine learning, data science and statistics: linear regression. We cover the theory from

data science, linear regression, python, (6 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.73)

LOWESS Regression in Python: How to Discover Clear Patterns in Your Data?

#artificialintelligence, Dec-6-2020, 04:45:21 GMT

Machine Learning is making huge leaps forward, with an increasing number of algorithms enabling us to solve complex real-world problems. This story is part of a deep dive series explaining the mechanics of Machine Learning algorithms. In addition to giving you an understanding of how ML algorithms work, it also provides you with Python examples to build your own ML models. Locally Weighted Scatterplot Smoothing sits within the family of regression algorithms under the umbrella of Supervised Learning. This means that you need a set of labeled data with a numerical target variable to train your model.
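
To make the idea of Locally Weighted Scatterplot Smoothing concrete, here is a minimal sketch and not the article's code. It assumes the common formulation in which each fitted value comes from a weighted straight-line fit over the nearest fraction `frac` of the points, with tricube weights, and it omits the robustness iterations that full LOWESS implementations usually include. The function name `lowess` and the sample data are hypothetical.

```python
def lowess(xs, ys, frac=0.5):
    """Smooth ys over xs with local weighted linear fits (no robustness passes)."""
    n = len(xs)
    k = max(2, min(n, int(round(frac * n))))  # number of neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point (x0 itself included).
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        # Tricube weights: close to 1 near x0, falling to 0 at distance h and beyond.
        ws = [(1.0 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Evaluate the local weighted regression line at x0.
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical noisy samples around an upward trend.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
ys = [1.2, 2.8, 5.3, 6.9, 9.4, 10.8, 13.1, 15.2, 16.7, 19.3]
smooth = lowess(xs, ys, frac=0.5)
```

A smaller `frac` follows the data more closely, while a larger one gives a smoother curve. In practice a library routine, such as the LOWESS smoother in statsmodels, would be the more typical choice.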

algorithm, discover clear pattern, regression, (9 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)

Urban Crowdsensing using Social Media: An Empirical Study on Transformer and Recurrent Neural Networks

Heng, Jerome, Liu, Junhua, Lim, Kwan Hui

arXiv.org Artificial Intelligence, Dec-5-2020

An important aspect of urban planning is understanding crowd levels at various locations, which typically requires the use of physical sensors. Such sensors are potentially costly and time-consuming to implement on a large scale. To address this issue, we utilize publicly available social media datasets and use them as the basis for two urban sensing problems, namely event detection and crowd level prediction. One main contribution of this work is our collected dataset from Twitter and Flickr, alongside ground truth events. We demonstrate the usefulness of this dataset with two preliminary supervised learning approaches: first, a series of neural network models to determine whether a social media post is related to an event, and second, a regression model using social media post counts to predict actual crowd levels. We discuss preliminary results from these tasks and highlight some challenges.

dataset, pedestrian count, tweet, (10 more...)

arXiv.org Artificial Intelligence

2012.03057

Country:

Oceania > Australia > Victoria > Melbourne (0.05)
Asia > Singapore (0.05)

Genre: Research Report (0.65)

Industry: Information Technology > Services (0.52)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)