 Regression


aadhil96/Linear_Regression_python_Ecommerce_Project_ML01

#artificialintelligence

An ecommerce company based in New York City sells clothing online and also offers in-store style and clothing advice sessions. Customers come in to the store, have sessions with a personal stylist, then go home and order the clothes they want on either a mobile app or the website. The company is trying to decide whether to focus its efforts on the mobile app experience or the website.
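As a rough illustration of how a linear regression could inform this decision, the sketch below fits ordinary least squares to synthetic data and compares the coefficients for time spent on the app and on the website. The feature names, the generated numbers, and the `fit_ols` helper are all assumptions made for this example; none of them come from the project or its dataset.

```python
import random

def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

# Synthetic, hypothetical customers: minutes on the app, minutes on the
# website, and yearly spend generated from made-up coefficients plus noise.
random.seed(0)
rows, spend = [], []
for _ in range(500):
    app = random.gauss(12, 1)
    web = random.gauss(37, 1)
    rows.append((app, web))
    spend.append(100 + 38 * app + 0.5 * web + random.gauss(0, 5))

intercept, coef_app, coef_web = fit_ols(rows, spend)
print(f"spend per extra app minute:     {coef_app:.2f}")
print(f"spend per extra website minute: {coef_web:.2f}")
```

In this synthetic setup the app coefficient comes out much larger because the data were generated that way; on real data, the relative size of the fitted coefficients is what would feed the app-versus-website discussion.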


Multi-resolution neural networks for tracking seismic horizons from few training images

arXiv.org Machine Learning

Detecting a specific horizon in seismic images is a valuable tool for geological interpretation. Because hand-picking the locations of the horizon is a time-consuming process, automated computational methods were developed starting three decades ago. Older techniques for such picking include interpolation of control points; in recent years, however, neural networks have been used for this task. Until now, most networks have been trained on small patches from larger images. This limits the network's ability to learn from large-scale geologic structures. Moreover, currently available networks and training strategies require label patches that have full and continuous annotations, which are also time-consuming to generate. We propose a projected loss-function for training convolutional networks with a multi-resolution structure, including variants of the U-net. Our networks learn from a small number of large seismic images without creating patches. The projected loss-function enables training on labels with just a few annotated pixels and has no issue with the other unknown label pixels. Training uses all data without reserving some for validation. Only the labels are split into training/testing. Contrary to other work on horizon tracking, we train the network to perform non-linear regression, and not classification. As such, we propose labels as the convolution of a Gaussian kernel and the known horizon locations that indicate uncertainty in the labels. The network output is the probability of the horizon location. We demonstrate the proposed computational ingredients on two different datasets, for horizon extrapolation and interpolation. We show that the predictions of our methodology are accurate even in areas far from known horizon locations because our learning strategy exploits all data in large seismic images.
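The two label-related ideas in this abstract can be sketched in a few lines. The code below is one plausible reading, not the authors' formulation: it builds a regression label for each image column as a Gaussian bump centred on the picked horizon row (a Gaussian kernel convolved with a single pick), and evaluates a squared loss only on annotated columns so that unknown labels contribute nothing. The function names, the `sigma` value, and the column-wise mask are assumptions for illustration.

```python
import math

def gaussian_label(depth, horizon_row, sigma=2.0):
    """Label for one image column: a Gaussian bump centred on the picked row."""
    return [math.exp(-0.5 * ((r - horizon_row) / sigma) ** 2)
            for r in range(depth)]

def masked_squared_loss(pred, label, mask):
    """Sum of squared errors over annotated columns only (mask entry is 1)."""
    total = 0.0
    for pred_col, label_col, annotated in zip(pred, label, mask):
        if annotated:
            total += sum((p - l) ** 2 for p, l in zip(pred_col, label_col))
    return total

depth = 8
picks = [3, None, 5]                      # the middle column has no annotation
mask = [0 if p is None else 1 for p in picks]
labels = [gaussian_label(depth, p) if p is not None else [0.0] * depth
          for p in picks]

# A prediction that matches the annotated columns but is arbitrary elsewhere
pred = [list(col) for col in labels]
pred[1] = [0.7] * depth

loss = masked_squared_loss(pred, labels, mask)
print(loss)  # the unannotated column does not contribute
```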


Neural networks versus Logistic regression for 30 days all-cause readmission prediction

arXiv.org Machine Learning

Heart failure (HF) is one of the leading causes of hospital admissions in the US. Readmission within 30 days after a HF hospitalization is both a recognized indicator for disease progression and a source of considerable financial burden to the healthcare system. Consequently, the identification of patients at risk for readmission is a key step in improving disease management and patient outcome. In this work, we used a large administrative claims dataset to (1) explore the systematic application of neural network-based models versus logistic regression for predicting 30-day all-cause readmission after discharge from a HF admission, and (2) examine the additive value of patients' hospitalization timelines on prediction performance. Based on data from 272,778 (49% female) patients with a mean (SD) age of 73 years (14) and 343,328 HF admissions (67% of total admissions), we trained and tested our predictive readmission models following a stratified 5-fold cross-validation scheme. Among the deep learning approaches, a recurrent neural network (RNN) combined with conditional random fields (CRF) model (RNNCRF) achieved the best performance in readmission prediction with 0.642 AUC (95% CI, 0.640-0.645). Other models, such as those based on RNN, convolutional neural networks and CRF alone, had lower performance, with a non-timeline based model (MLP) performing worst. A competitive model based on logistic regression with LASSO achieved a performance of 0.643 AUC (95% CI, 0.640-0.646). We conclude that data from patient timelines improve 30-day readmission prediction for neural network-based models, that a logistic regression with LASSO has equal performance to the best neural network model and that the use of administrative data results in competitive performance compared to published approaches based on richer clinical datasets.
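The evaluation protocol mentioned here, stratified 5-fold cross-validation scored by AUC, can be illustrated with a small standard-library sketch. It shows only the fold assignment and the metric, not the paper's models or data; the helper names and the toy labels are assumptions for this example.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Assign indices to k folds so each fold keeps the overall class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal indices round-robin per class
    return folds

def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels: 20 readmissions (1) among 100 discharges (hypothetical numbers)
labels = [1] * 20 + [0] * 80
folds = stratified_folds(labels, k=5)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(positives_per_fold)

toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(toy_auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives some context for the reported values around 0.64.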


Classification of load forecasting studies by forecasting problem to select load forecasting techniques and methodologies

arXiv.org Machine Learning

This article proposes a two-dimensional classification methodology to select the relevant forecasting tools developed by the scientific community based on a classification of load forecasting studies. The inputs of the classifier are the articles of the literature and the outputs are articles classified into categories. The classification process relies on two couples of parameters that define a forecasting problem. The temporal couple is the forecasting horizon and the forecasting resolution. The system couple is the system size and the load resolution. Each article is classified with key information about the dataset used and the forecasting tools implemented: the forecasting techniques (probabilistic or deterministic) and methodologies, the data cleansing techniques and the error metrics. This process is illustrated by reviewing and classifying thirty-four articles.


An Integrated Transfer Learning and Multitask Learning Approach for Pharmacokinetic Parameter Prediction

arXiv.org Machine Learning

Background: Pharmacokinetic evaluation is one of the key processes in drug discovery and development. However, current absorption, distribution, metabolism, excretion prediction models still have limited accuracy. Aim: This study aims to construct an integrated transfer learning and multitask learning approach for developing quantitative structure-activity relationship models to predict four human pharmacokinetic parameters. Methods: A pharmacokinetic dataset included 1104 U.S. FDA approved small molecule drugs. The dataset included four human pharmacokinetic parameter subsets (oral bioavailability, plasma protein binding rate, apparent volume of distribution at steady-state and elimination half-life). The pre-trained model was trained on over 30 million bioactivity data. An integrated transfer learning and multitask learning approach was established to enhance the model generalization. Results: The pharmacokinetic dataset was split into three parts (60:20:20) for training, validation and test by the improved Maximum Dissimilarity algorithm with the representative initial set selection algorithm and the weighted distance function. The multitask learning techniques enhanced the model predictive ability. The integrated transfer learning and multitask learning model demonstrated the best accuracy because deep neural networks have a general feature extraction ability, while transfer learning and multitask learning improved the model generalization. Conclusions: The integrated transfer learning and multitask learning approach with the improved dataset splitting algorithm was introduced for the first time to predict the pharmacokinetic parameters. This method can be further employed in drug discovery and development.


Primal path algorithm for compositional data analysis

arXiv.org Machine Learning

In modern regression analysis, it is frequently observed that regression predictors consist of the proportions or relative ratios of certain values rather than absolute values. For example, in analyzing air pollution data, the percentages of chemicals in the air are considered relevant predictors to identify the source of a pollutant (Lee et al., 2007). These types of proportional data, typically called compositional data, are widely used in geoscience (Buccianti et al., 2006), microbiology (Montassier et al., 2016), and nutritional biochemistry (Leite, 2016). By the definition of compositional data, all compositional predictors lie on the simplex and are thus linearly dependent. Aitchison and Bacon-Shone (1984) proposed a regression model for compositional data as follows.


Statistical learning of geometric characteristics of wireless networks

arXiv.org Machine Learning

Motivated by the prediction of cell loads in cellular networks, we formulate the following new, fundamental problem of statistical learning of geometric marks of point processes: An unknown marking function, depending on the geometry of point patterns, produces characteristics (marks) of the points. One aims at learning this function from the examples of marked point patterns in order to predict the marks of new point patterns. To approximate (interpolate) the marking function, in our baseline approach, we build a statistical regression model of the marks with respect to some local point distance representation. In a more advanced approach, we use a global data representation via the scattering moments of random measures, which build an informative data representation that is stable to deformations and has already proven useful in image analysis and related application domains. In this case, the regression of the scattering moments of the marked point patterns with respect to the non-marked ones is combined with the numerical solution of the inverse problem, where the marks are recovered from the estimated scattering moments. Considering some simple, generic marks, often appearing in the modeling of wireless networks, such as the shot-noise values, nearest neighbour distance, and some characteristics of the Voronoi cells, we show that the scattering moments can capture similar geometry information as the baseline approach, and can reach even better performance, especially for non-local marking functions. Our results motivate further development of statistical learning tools for stochastic geometry and analysis of wireless networks, in particular to predict cell loads in cellular networks from the locations of base stations and traffic demand.
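To make the notion of a geometric mark concrete, the sketch below computes two of the marks named in the abstract for a small planar point pattern: the nearest neighbour distance and a shot-noise value. The function names and the power-law response function used for the shot noise are assumptions chosen for illustration; the paper may use different definitions.

```python
import math

def nearest_neighbour_marks(points):
    """Mark of each point: distance to its closest other point in the pattern."""
    return [min(math.hypot(xi - xj, yi - yj)
                for j, (xj, yj) in enumerate(points) if j != i)
            for i, (xi, yi) in enumerate(points)]

def shot_noise_marks(points, response=lambda r: (1.0 + r) ** -3):
    """Mark of each point: sum of a distance-based response over all other points.

    The response function here is a hypothetical power-law decay.
    """
    return [sum(response(math.hypot(xi - xj, yi - yj))
                for j, (xj, yj) in enumerate(points) if j != i)
            for i, (xi, yi) in enumerate(points)]

pattern = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
nn_marks = nearest_neighbour_marks(pattern)
sn_marks = shot_noise_marks(pattern)
print(nn_marks)
print(sn_marks)
```

The nearest neighbour distance depends only on a point's immediate surroundings, whereas the shot-noise value aggregates contributions from the whole pattern, which is the kind of local versus non-local distinction the abstract draws.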


A Novel Large-scale Ordinal Regression Model

arXiv.org Machine Learning

Ordinal regression (OR) is a special multiclass classification problem where an order relation exists among the labels. In recent years, people have shared their opinions and sentimental judgments conveniently through social networks and E-Commerce, so plentiful large-scale OR problems have arisen. However, few studies have focused on this kind of problem. Nonparallel Support Vector Ordinal Regression (NPSVOR) is an SVM-based OR model, which learns a hyperplane for each rank by solving a series of independent sub-optimization problems and then ensembles those learned hyperplanes to predict. Previous studies focused on its nonlinear case and achieved competitive testing performance, but its training is time consuming, particularly for large-scale data. In this paper, we consider NPSVOR's linear case and design an efficient training method based on the dual coordinate descent method (DCD). To utilize the order information among labels in prediction, a new prediction function is also proposed. Extensive comparative experiments on the text OR datasets indicate that the carefully implemented DCD is very suitable for training large data.


Unifying Topic, Sentiment & Preference in an HDP-Based Rating Regression Model for Online Reviews

arXiv.org Machine Learning

This paper proposes a new HDP-based online review rating regression model named Topic-Sentiment-Preference Regression Analysis (TSPRA). TSPRA combines topics (i.e. product aspects), word sentiment and user preference as regression factors, and is able to perform topic clustering, review rating prediction, sentiment analysis and what we introduce as "critical aspect" analysis all together in one framework. TSPRA extends sentiment approaches by taking into consideration the key concept of "user preference" from collaborative filtering (CF) models, while it is distinct from current CF models in decoupling "user preference" and "sentiment" as independent factors. Our experiments conducted on 22 Amazon datasets show overwhelmingly better performance in rating prediction against a state-of-the-art model, FLAME (2015), in terms of error, Pearson's Correlation and number of inverted pairs. For sentiment analysis, we compare the derived word sentiments against a public sentiment resource, SenticNet3, and our sentiment estimations clearly make more sense in the context of online reviews. Last, as a result of the de-correlation of "user preference" from "sentiment", TSPRA is able to evaluate a new concept, "critical aspects", defined as the product aspects that users care strongly about but comment on negatively in reviews. Improving such "critical aspects" could be the most effective way to enhance user experience.


Types of Machine Learning and Top 10 Algorithms Everyone Should Know

#artificialintelligence

From detecting skin cancer to sorting corn cobs to predicting early equipment maintenance, machine learning has granted computer systems entirely new abilities. Algorithms are the methods used to extract patterns from data for the purpose of granting computers the power to predict and draw inferences. It will be interesting to learn how machine learning really works under the hood. Let's walk through a few examples and use them as an excuse to talk about the process of getting answers from your data using machine learning. Here are the top 10 machine learning algorithms that everyone involved in Data Science, Machine Learning, and AI should know about.