Regression
Data Science Simplified Part 7: Log-Log Regression Models
In the last few blog posts of this series, we discussed the simple linear regression model, the multivariate regression model, and methods for selecting the right model. Fernando has now created a better model. This article will address that question. It elaborates on log-log regression models.
Decision Trees and Random Forests for Classification and Regression pt.1
Want to use something more interpretable, something that trains faster and performs pretty much just as well as the old Logistic Regression or even Neural Networks? You should consider Decision Trees for classification and regression. Decision Trees and their extension, Random Forests, are robust and easy-to-interpret machine learning algorithms for classification and regression tasks. Decision Trees and Decision Tree Learning together comprise a simple and fast way of learning a function that maps data x to outputs y, where x can be a mix of categorical and numeric variables and y can be categorical for classification or numeric for regression. Methods such as SVMs, Logistic Regression and Deep Neural Nets pretty much do the same thing.
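As a rough illustration of learning a function from x to a numeric y, here is a minimal sketch of a depth-1 regression tree (a "stump"). It assumes a single numeric feature and squared-error splitting. The names `fit_stump` and `predict_stump` and the toy data are hypothetical, and a full tree learner would apply the same split search recursively and also handle categorical features.

```python
from statistics import mean


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one threshold and one mean per side."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct x values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        # Sum of squared errors when each side predicts its own mean.
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct x values")
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    """Predict the mean of whichever side of the threshold x falls on."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


# Toy data (made up for illustration): two clearly separated groups.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]
stump = fit_stump(xs, ys)
print(stump, predict_stump(stump, 2.5))
```

The fitted model is just a threshold and two averages, which is what makes a single tree easy to read off and explain.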
Logistic Regression - General concepts
I am relatively new to predictive modeling techniques and would like to get a few concepts cleared up and discussed. I am currently in the process of building a logistic regression model using the Weight of Evidence (WOE) technique. I understand that the log odds and WOEs tend to have a linear relationship, which is a prerequisite for the model. In the case of categorical variables, WOEs can be used to make them continuous. But what if the log odds have a U-shaped relationship with the independent variable?
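One way to look at the U-shape question is to bin the variable and compute WOE per bin. The sketch below assumes the convention WOE = ln(share of events in the bin / share of non-events in the bin); some references use the inverse ratio, which only flips the sign. The function name `woe_table` and the counts are hypothetical. Under this convention each bin's log odds equals its WOE plus the overall log odds, so the log odds are linear in the WOE-coded variable by construction, even when they are U-shaped in the raw variable.

```python
import math


def woe_table(bins):
    """Map each bin label to its WOE, given (events, non_events) counts per bin."""
    total_events = sum(e for e, _ in bins.values())
    total_non_events = sum(n for _, n in bins.values())
    table = {}
    for label, (events, non_events) in bins.items():
        if events == 0 or non_events == 0:
            # WOE is undefined for an empty class; such bins are usually merged.
            raise ValueError(f"bin {label!r} has an empty class")
        event_share = events / total_events
        non_event_share = non_events / total_non_events
        table[label] = math.log(event_share / non_event_share)
    return table


# Hypothetical counts: the event rate is high at both ends and low in the middle.
bins = {"low": (40, 60), "mid": (10, 90), "high": (40, 60)}
woe = woe_table(bins)
overall_log_odds = math.log(90 / 210)  # total events / total non-events

for label, (events, non_events) in bins.items():
    bin_log_odds = math.log(events / non_events)
    # The difference is the same constant (the overall log odds) for every bin.
    print(label, round(woe[label], 4), round(bin_log_odds - woe[label], 4))
```

In this toy case the "low" and "high" bins receive the same WOE, so the U-shape in the raw variable becomes a straight line in the WOE-coded one.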
Frequentist coverage and sup-norm convergence rate in Gaussian process regression
Yang, Yun, Bhattacharya, Anirban, Pati, Debdeep
Gaussian process (GP) regression is a powerful interpolation technique due to its flexibility in capturing non-linearity. In this paper, we provide a general framework for understanding the frequentist coverage of point-wise and simultaneous Bayesian credible sets in GP regression. As an intermediate result, we develop a Bernstein-von Mises type result under the supremum norm in random design GP regression. Identifying both the mean and covariance function of the posterior distribution of the Gaussian process as regularized $M$-estimators, we show that the sampling distribution of the posterior mean function and the centered posterior distribution can be respectively approximated by two population-level GPs. By developing a comparison inequality between two GPs, we provide an exact characterization of the frequentist coverage probabilities of Bayesian point-wise credible intervals and simultaneous credible bands of the regression function. Our results show that inference based on GP regression tends to be conservative; when the prior is under-smoothed, the resulting credible intervals and bands have minimax-optimal sizes, with their frequentist coverage converging to a non-degenerate value between their nominal level and one. As a byproduct of our theory, we show that GP regression also yields a minimax-optimal posterior contraction rate relative to the supremum norm, which provides positive evidence on the long-standing problem of the optimal supremum norm contraction rate in GP regression.
Machine Learning for Survival Analysis: A Survey
Wang, Ping, Li, Yan, Reddy, Chandan K.
Accurately predicting the time of occurrence of an event of interest is a critical problem in longitudinal data analysis. One of the main challenges in this context is the presence of instances whose event outcomes become unobservable after a certain time point, or that do not experience any event during the monitoring period. Such a phenomenon is called censoring, which can be effectively handled using survival analysis techniques. Traditionally, statistical approaches have been widely developed in the literature to overcome this censoring issue. In addition, many machine learning algorithms have been adapted to effectively handle survival data and tackle other challenging problems that arise in real-world data. In this survey, we provide a comprehensive and structured review of the representative statistical methods along with the machine learning techniques used in survival analysis and provide a detailed taxonomy of the existing methods. We also discuss several topics that are closely related to survival analysis and illustrate several successful applications in various real-world application domains. We hope that this paper will provide a more thorough understanding of the recent advances in survival analysis and offer some guidelines on applying these approaches to solve new problems that arise in applications with censored data.
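To make censoring concrete, here is a minimal sketch of the Kaplan-Meier estimator, a standard statistical method for right-censored data. The abstract does not name a specific method, so this choice is an illustration and not the survey's own example. The function name `kaplan_meier` and the toy data are hypothetical. Each subject has a duration and a flag saying whether the event was observed (`True`) or the subject was censored (`False`).

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each distinct observed event time."""
    if len(durations) != len(observed):
        raise ValueError("durations and observed must have the same length")
    survival = 1.0
    curve = []
    # Only times at which an event was actually observed change the estimate.
    event_times = sorted({t for t, seen in zip(durations, observed) if seen})
    for time in event_times:
        # Subjects still under observation just before `time`, censored or not.
        at_risk = sum(1 for t in durations if t >= time)
        events = sum(1 for t, seen in zip(durations, observed) if seen and t == time)
        survival *= 1 - events / at_risk
        curve.append((time, survival))
    return curve


# Toy data (made up): two subjects are censored, at times 3 and 8.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
print(curve)
```

Censored subjects never count as events, but they stay in the at-risk set until their censoring time. That is how the estimator uses their partial information without discarding them.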
Data Science Simplified Part 6: Model Selection Methods
In the last article of this series, we discussed the multivariate linear regression model. Fernando created a model that estimates the price of a car based on five input parameters. Fernando indeed has a better model. Yet, he wanted to select the best set of variables for input. The idea behind model selection methods is intuitive. How is an optimal model defined?
Techniques to address very low event rate for Logistic Regression Model
Hi, I wish I could help in such a way. I myself am using the Link Model to observe and study repeated events. My sampling study was "Random or Causality" for drawing winning lottery numbers. The term regression is somehow a slow process of continuity of events, regarding the model that is used. I only observed activities of all celestial bodies that caused things to happen the way they happened.
Time Series Prediction for Graphs in Kernel and Dissimilarity Spaces
Paaßen, Benjamin, Göpfert, Christina, Hammer, Barbara
Graph models are relevant in many fields, such as distributed computing, intelligent tutoring systems or social network analysis. In many cases, such models need to take changes in the graph structure into account, i.e. a varying number of nodes or edges. Predicting such changes within graphs can be expected to yield important insight with respect to the underlying dynamics, e.g. with respect to user behaviour. However, predictive techniques in the past have almost exclusively focused on single edges or nodes. In this contribution, we attempt to predict the future state of a graph as a whole. We propose to phrase time series prediction as a regression problem and apply dissimilarity- or kernel-based regression techniques, such as 1-nearest neighbor, kernel regression and Gaussian process regression, which can be applied to graphs via graph kernels. The output of the regression is a point embedded in a pseudo-Euclidean space, which can be analyzed using subsequent dissimilarity- or kernel-based processing methods. We discuss strategies to speed up Gaussian process regression from cubic to linear time and evaluate our approach on two well-established theoretical models of graph evolution as well as two real data sets from the domain of intelligent tutoring systems. We find that simple regression methods, such as kernel regression, are sufficient to capture the dynamics in the theoretical models, but that Gaussian process regression significantly improves the prediction error for real-world data.
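The idea of phrasing time series prediction as regression can be sketched on a much simpler case. The example below is an assumption-laden stand-in: it uses plain numbers in place of graphs and a Gaussian kernel on numbers in place of a graph kernel, and the function names and data are hypothetical. Each observed state is treated as an input and its successor as the regression target, then 1-nearest neighbor and kernel regression (Nadaraya-Watson) predict the successor of a query state.

```python
import math


def gaussian_kernel(a, b, bandwidth):
    """Similarity between two scalar states (stand-in for a graph kernel)."""
    return math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2))


def training_pairs(series):
    """Pair every state x_t with its successor x_{t+1}."""
    pairs = list(zip(series, series[1:]))
    if not pairs:
        raise ValueError("need at least two time steps")
    return pairs


def one_nn_next(series, query):
    """1-nearest neighbor: return the successor of the closest past state."""
    _, successor = min(training_pairs(series), key=lambda p: abs(p[0] - query))
    return successor


def kernel_regression_next(series, query, bandwidth=1.0):
    """Kernel regression: similarity-weighted average of past successors."""
    pairs = training_pairs(series)
    weights = [gaussian_kernel(state, query, bandwidth) for state, _ in pairs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all training states")
    return sum(w * nxt for w, (_, nxt) in zip(weights, pairs)) / total


# Toy series (made up): the state grows by one at every step.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
nn_prediction = one_nn_next(series, 3.0)
kr_prediction = kernel_regression_next(series, 3.0, bandwidth=0.5)
print(nn_prediction, kr_prediction)
```

With graphs, the weighted average cannot be formed directly on the graphs themselves. This is consistent with the abstract describing the regression output as a point in a pseudo-Euclidean space that is analyzed with further dissimilarity- or kernel-based methods.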