Machine learning has many advantages, and it is the hot topic right now. For a trader or a fund manager, the pertinent question is "How can I apply this new tool to generate more alpha?" I will explore one such model that answers this question in a series of blogs.

Pathak, Kumarjit, Kapila, Jitin, Barvey, Aasheesh, Gawande, Nikit

In regression modelling, the main step is to fit the regression line as closely as possible to the target variable. In this process most algorithms try to fit all of the data with a single line, and hence fit all parts of the target variable in one go. It was observed that the error between the predicted and target variable usually behaves differently across the quantiles of the dependent variable, so a single-point diagnostic like MAPE is limited in how well it can describe the quality of fit across the distribution of Y (the dependent variable). To address this problem, the paper proposes a novel approach to regression fitting over the various quantiles of the target variable. Using this approach we have significantly improved the eccentric behavior of the distance (error) between the predicted and actual values of the regression. Our proposed solution is based on understanding the segmented behavior of the data with respect to its internal segments, and on retrospectively fitting the data based on the behavior of each quantile. We believe exploring and using this approach would help in achieving better and more explainable results in most real-world data modelling problems.
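To make the limitation of a single-point MAPE concrete, here is a minimal Python sketch that reports MAPE separately for each quantile bin of the target. It is only an illustration of the diagnostic idea, not the fitting procedure proposed in the paper; the helper names `mape` and `mape_by_quantile`, the choice of four bins, and the synthetic data (a constant absolute error of 5) are all assumptions made for this example.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, as a fraction (not multiplied by 100)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))

def mape_by_quantile(y_true, y_pred, n_bins=4):
    """MAPE computed separately within each quantile bin of the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Interior quantile edges of the target, e.g. 25%, 50%, 75% for n_bins=4.
    edges = np.quantile(y_true, np.linspace(0, 1, n_bins + 1)[1:-1])
    # Bin index in 0..n_bins-1 for every observation.
    bins = np.digitize(y_true, edges)
    result = []
    for b in range(n_bins):
        mask = bins == b
        # An empty bin (possible with heavily tied targets) is reported as NaN.
        result.append(mape(y_true[mask], y_pred[mask]) if mask.any() else float("nan"))
    return result

# Synthetic data (assumption): every prediction is off by the same absolute amount.
y_true = np.arange(1, 101, dtype=float)
y_pred = y_true + 5.0

overall = mape(y_true, y_pred)
per_bin = mape_by_quantile(y_true, y_pred, n_bins=4)

print("overall MAPE:", overall)
print("MAPE per quartile of Y:", per_bin)
```

With this data the overall figure hides the fact that the relative error is concentrated in the lowest quartile of Y and shrinks steadily toward the highest one, which is the kind of varying behavior the abstract describes.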

De Myttenaere, Arnaud, Golden, Boris, Grand, Bénédicte Le, Rossi, Fabrice

We study in this paper the consequences of using the Mean Absolute Percentage Error (MAPE) as a measure of quality for regression models. We show that finding the best model under the MAPE is equivalent to doing weighted Mean Absolute Error (MAE) regression. We show that universal consistency of Empirical Risk Minimization remains possible using the MAPE instead of the MAE.
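The equivalence between MAPE and weighted MAE can be checked numerically. The sketch below is an illustration under stated assumptions, not code from the paper: the function names `mape`, `weighted_mae`, and `weighted_median` and the three-point data set are hypothetical. It uses the weights 1/|y_i|, and it also uses the standard fact that the constant minimizing a weighted sum of absolute deviations is a weighted median.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, as a fraction."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))

def weighted_mae(y_true, y_pred, weights):
    """Mean of weighted absolute errors (weights are not normalized)."""
    return float(np.mean(weights * np.abs(y_true - y_pred)))

def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w)
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return float(v[idx])

# Hypothetical targets and predictions.
y = np.array([1.0, 2.0, 10.0])
p = np.array([1.5, 2.5, 8.0])
w = 1.0 / np.abs(y)  # MAPE weights each absolute error by 1/|y_i|

print("MAPE:        ", mape(y, p))
print("weighted MAE:", weighted_mae(y, p, w))

# Best constant prediction under MAPE: the weighted median with weights 1/|y_i|.
c_star = weighted_median(y, w)
print("MAPE-optimal constant:", c_star, "vs. plain median:", float(np.median(y)))
```

On this data the MAPE-optimal constant is the smallest target rather than the plain median, which shows how the 1/|y| weighting pulls the fit toward small values of the target.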

Someone recently asked on the statistics Stack Exchange why the squared error is used in statistics. This is something I'd been wondering about myself recently, so I decided to take a crack at answering it. The post below is adapted from that answer. It's true that one could choose to use, say, the absolute error instead of the squared error. In fact, the absolute error is often closer to what you "care about" when making predictions from your model.