Chandra, Suresh
GBSVR: Granular Ball Support Vector Regression
Rastogi, Reshma, Bisht, Ankush, Kumar, Sanjay, Chandra, Suresh
Support Vector Regression (SVR) and its variants are widely used for regression tasks; however, their solution requires solving an expensive quadratic programming problem, which limits their applicability, especially on large datasets. Additionally, SVR uses an epsilon-insensitive loss function that is sensitive to outliers and can therefore adversely affect its performance. We propose Granular Ball Support Vector Regression (GBSVR) to tackle the regression problem using the granular ball concept. Granular balls are useful for simplifying complex data spaces in machine learning tasks; however, to the best of our knowledge, they have not been sufficiently explored for regression problems. Granular balls group data points into balls based on their proximity and reduce the computational cost of SVR by replacing a large number of data points with far fewer granular balls. This work also suggests a discretization method for continuous-valued attributes to facilitate the construction of granular balls. The effectiveness of the proposed approach is evaluated on several benchmark datasets, where it outperforms existing state-of-the-art approaches.
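The ball construction itself is not reproduced here; as a rough sketch of the coarsening idea, one could group points by proximity and fit a standard SVR on the resulting centres, so the quadratic program scales with the number of balls rather than the number of points. The use of k-means as the grouping step and the helper names below are illustrative assumptions, not the authors' procedure.

    # Hypothetical sketch: coarsen a regression dataset into "balls" and fit
    # SVR on the ball centres. k-means stands in for the paper's
    # proximity-driven granular-ball construction.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import SVR

    def granular_ball_centres(X, y, n_balls=50, seed=0):
        """Group points by proximity; represent each ball by its centre and
        the mean target of its members (an assumed simplification)."""
        km = KMeans(n_clusters=n_balls, n_init=10, random_state=seed).fit(X)
        centres = km.cluster_centers_
        targets = np.array([y[km.labels_ == k].mean() for k in range(n_balls)])
        return centres, targets

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(5000, 1))
    y = np.sinc(X).ravel() + 0.1 * rng.normal(size=5000)

    Xb, yb = granular_ball_centres(X, y, n_balls=60)
    # The SVR quadratic program is now over 60 balls instead of 5000 points.
    model = SVR(kernel="rbf", C=10.0, epsilon=0.05).fit(Xb, yb)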
Tube Loss: A Novel Approach for Prediction Interval Estimation and Probabilistic Forecasting
Anand, Pritam, Bandyopadhyay, Tathagata, Chandra, Suresh
This paper proposes a novel loss function, called 'Tube Loss', for simultaneous estimation of the bounds of a Prediction Interval (PI) in the regression setup, and also for generating probabilistic forecasts from time series data, by solving a single optimization problem. The PIs obtained by minimizing the empirical risk based on the Tube Loss are shown to be of better quality than the PIs obtained by existing methods in the following sense. First, they attain the prespecified confidence level $t \in (0,1)$ asymptotically; a theoretical proof of this fact is given. Second, the user is allowed to move the interval up or down by controlling the value of a parameter. This helps the user choose a PI that captures denser regions of the probability distribution of the response variable inside the interval, thus sharpening its width. This is shown to be especially useful when the conditional distribution of the response variable is skewed. Further, the Tube Loss based PI estimation method can trade off coverage against average width by solving a single optimization problem, enabling further reduction of the average PI width through re-calibration. Also, unlike some existing PI estimation methods, the gradient descent (GD) method can be used to minimize the empirical risk. Finally, through extensive experimentation, we have shown the efficacy of Tube Loss based PI estimation in kernel machines, neural networks and deep networks, and also for probabilistic forecasting tasks. The codes of the experiments are available at https://github.com/ltpritamanand/Tube_loss
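The exact piecewise definition of the Tube Loss is given in the paper and the linked repository; the sketch below shows one plausible tube-style loss of this flavour, with weight $t$ outside the tube, weight $1-t$ inside it, and a parameter $r$ splitting the interior between the two bounds so the interval can be shifted up or down. All names and the precise piecewise form here are assumptions, not the authors' definition.

    # Illustrative tube-style loss (assumed form; see the linked repository
    # for the authors' implementation). f1/f2 are the lower/upper PI bounds,
    # t is the target coverage, r shifts the interval up or down.
    import torch

    def tube_style_loss(y, f1, f2, t=0.9, r=0.5):
        mid = r * f1 + (1.0 - r) * f2        # split point inside the tube
        above = y > f2                       # outside, above the tube
        below = y < f1                       # outside, below the tube
        upper_in = (~above) & (~below) & (y > mid)
        loss = torch.where(above, t * (y - f2),
               torch.where(below, t * (f1 - y),
               torch.where(upper_in, (1 - t) * (f2 - y),
                           (1 - t) * (y - f1))))
        return loss.mean()

Training a two-output network under such a loss with gradient descent yields both bounds from a single optimization problem, which matches the abstract's description of the approach.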
Improvement over Pinball Loss Support Vector Machine
Anand, Pritam, Rastogi, Reshma, Chandra, Suresh
Recently, several papers have discussed extensions of the Pinball loss Support Vector Machine (Pin-SVM) model originally proposed by Huang et al. [1][2]. The Pin-SVM classifier uses the pinball loss function, defined in terms of a parameter $\tau$ that can take values in $[-1,1]$. The existing Pin-SVM model solves the same optimization problem for all values of $\tau$ in $[-1,1]$. In this paper, we improve the existing Pin-SVM model for the binary classification task. First, we note that there is a major difficulty in the Pin-SVM model (Huang et al. [1]) for $-1 \leq \tau < 0$. Specifically, we show that the Pin-SVM model requires the solution of a different optimization problem for $-1 \leq \tau < 0$. We further propose a unified model, termed Unified Pin-SVM, which results in a single quadratic programming problem (QPP) valid for all $-1 \leq \tau \leq 1$ and is hence more convenient to use. The proposed Unified Pin-SVM model obtains a significant improvement in accuracy over the existing Pin-SVM model, as empirically justified by extensive numerical experiments on real-world datasets.
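For reference, the pinball loss of Huang et al. [1] acts on the margin $u = 1 - yf(x)$: the loss is $u$ for $u \geq 0$ and $-\tau u$ for $u < 0$, which for $\tau \in [0,1]$ equals $\max(u, -\tau u)$. A minimal numeric check (the function name is illustrative); the paper's point is that $\tau < 0$ breaks the original formulation and needs its own optimization problem.

    # Pinball loss for classification as in Huang et al. [1], with margin
    # u = 1 - y * f(x): loss is u for u >= 0 and -tau * u for u < 0.
    import numpy as np

    def pinball_margin_loss(y, fx, tau):
        u = 1.0 - y * fx
        return np.where(u >= 0, u, -tau * u)

    y  = np.array([+1, +1, -1, -1])
    fx = np.array([2.0, 0.5, -0.2, 1.5])
    print(pinball_margin_loss(y, fx, tau=0.5))   # [0.5, 0.5, 0.8, 2.5]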
A $\nu$-support vector quantile regression model with automatic accuracy control
Anand, Pritam, Rastogi, Reshma, Chandra, Suresh
The estimation of $f_\tau(x)$ is difficult but more informative than the estimation of the mean regression $f(x)$ alone. The estimation of $f_\tau(x)$ for different values of $\tau$ can describe different characteristics of the conditional distribution of $y|x$. In many real-world problems, the estimation of the mean regression $f(x)$ is not required or not enough; rather, the estimation of the quantile function $f_\tau(x)$ is needed. The study of the quantile regression problem was initiated in 1978 by Koenker and Bassett [1]. Later, it was discussed and described in detail by Koenker in his book (Koenker, [2]). Koenker and Bassett [1] proposed the pinball loss function for the estimation of the quantile function $f_\tau(x)$. For a given quantile $\tau \in (0,1)$, the pinball loss function is an asymmetric loss function suitable for quantile estimation, given by
$$P_\tau(u) = \begin{cases} \tau u & \text{if } u \geq 0, \\ (\tau - 1)u & \text{otherwise.} \end{cases}$$
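A minimal numeric illustration of this loss (all names and constants below are illustrative): minimizing the mean pinball loss by (sub)gradient descent over a single constant recovers the empirical $\tau$-quantile, which is exactly why $P_\tau$ is suited to quantile estimation.

    # Fit a constant tau-quantile estimate by subgradient descent on the
    # mean pinball loss; it should converge to the empirical quantile.
    import numpy as np

    def pinball(u, tau):
        return np.where(u >= 0, tau * u, (tau - 1) * u)

    rng = np.random.default_rng(1)
    y = rng.exponential(scale=1.0, size=10000)

    q, tau, lr = 0.0, 0.9, 0.01
    for _ in range(2000):
        u = y - q                                        # residuals
        grad = np.mean(np.where(u >= 0, -tau, 1 - tau))  # d/dq of mean pinball
        q -= lr * grad

    print(q, np.quantile(y, tau))  # both approximate the 0.9-quantile (~2.30)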
A new asymmetric $\epsilon$-insensitive pinball loss function based support vector quantile regression model
Anand, Pritam, Rastogi, Reshma, Chandra, Suresh
In this paper, we propose a novel asymmetric $\epsilon$-insensitive pinball loss function for quantile estimation. There exist some pinball loss functions that attempt to incorporate the $\epsilon$-insensitive zone approach, but they fail to extend the $\epsilon$-insensitive approach to quantile estimation in the true sense. The proposed asymmetric $\epsilon$-insensitive pinball loss function creates an asymmetric $\epsilon$-insensitive zone of fixed width around the data and divides it according to the $\tau$ value for the estimation of the $\tau$th quantile. The use of the proposed asymmetric $\epsilon$-insensitive pinball loss function in the Support Vector Quantile Regression (SVQR) model improves its prediction ability significantly. It also brings sparsity back into the SVQR model. Further, the numerical results obtained from several experiments carried out on artificial and real-world datasets empirically show the efficacy of the proposed `$\epsilon$-Support Vector Quantile Regression' ($\epsilon$-SVQR) model over other existing SVQR models.
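One plausible reading of such a loss (an assumption for illustration; the paper's exact division of the zone may differ) keeps the loss at zero inside an asymmetric zone of total width $\epsilon$, split by $\tau$, and applies the usual pinball slopes outside it:

    # Assumed form of an asymmetric eps-insensitive pinball loss: zero
    # inside [-(1-tau)*eps, tau*eps], pinball slopes tau / (1-tau) outside.
    import numpy as np

    def asym_eps_pinball(u, tau, eps):
        upper, lower = tau * eps, (1.0 - tau) * eps
        return np.where(u > upper, tau * (u - upper),
               np.where(u < -lower, (1.0 - tau) * (-u - lower), 0.0))

The flat zone is what restores sparsity: points falling inside it incur zero loss and hence do not become support vectors.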
Support Vector Regression via a Combined Reward Cum Penalty Loss Function
Anand, Pritam, Rastogi, Reshma, Chandra, Suresh
In this paper, we introduce a novel combined reward cum penalty loss function to handle the regression problem. The proposed loss function penalizes the data points that lie outside the $\epsilon$-tube of the regressor and assigns a reward to the data points that lie inside the $\epsilon$-tube. The resulting combined reward cum penalty loss based regression (RP-$\epsilon$-SVR) model has several interesting properties, which are investigated in this paper and supported with experimental results.
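A toy rendering of the idea (the weighting parameter rho and the exact inside-tube form are assumptions, not the paper's definition): the usual $\epsilon$-insensitive penalty outside the tube, and a negative, i.e. rewarding, term for points inside it.

    # Assumed reward-cum-penalty regression loss: eps-insensitive penalty
    # outside the tube, a rho-weighted negative (reward) term inside it.
    import numpy as np

    def reward_cum_penalty(u, eps=0.1, rho=0.2):
        outside = np.abs(u) > eps
        penalty = np.abs(u) - eps          # distance beyond the tube
        reward = -rho * (eps - np.abs(u))  # negative loss inside the tube
        return np.where(outside, penalty, reward)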
M-Unit EigenAnt: An Ant Algorithm to Find the M Best Solutions
Shah, Sameena (Indian Institute of Technology Delhi) | Jayadeva, Jayadeva (Indian Institute of Technology Delhi) | Kothari, Ravi (IBM India Research Laboratory) | Chandra, Suresh (Indian Institute of Technology Delhi)
In this paper, we shed light on how powerful congestion control based on local interactions may be obtained. We show how ants can use repellent pheromones and incorporate the effect of crowding to avoid traffic congestion on the optimal path. Based on these interactions, we propose an ant algorithm, the M-unit EigenAnt algorithm, that leads to the selection of the M shortest paths. The ratio of selection of each of these paths is also optimal and regulated by an optimal amount of pheromone on each of them. To the best of our knowledge, the M-unit EigenAnt algorithm is the first ant algorithm that explicitly ensures the selection of the M shortest paths and regulates the amount of pheromone on them so that it is asymptotically optimal. This is in contrast with most ant algorithms, which aim to discover just a single best path. We provide its convergence analysis and show that the steady-state distribution of pheromone aligns with the eigenvectors of the cost matrix, and is thus related to its measure of quality. We also provide analysis to show that this property ensues even when the food is moved or path lengths change during foraging. We show that this behavior is robust in the presence of fluctuations and quickly reflects changes in the M optimal solutions. This makes it suitable not only for distributed applications but also for dynamic ones. Finally, we provide simulation results for convergence to the optimal solution under different initial biases, dynamism in path lengths, and discovery of new paths.
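As a toy illustration of a selection-triggered pheromone update of this flavour (local decay plus quality-proportional reinforcement on the chosen path only; the constants and exact rule are assumptions, not the M-unit EigenAnt update):

    # Toy EigenAnt-flavoured simulation: decay and reinforcement are applied
    # only to the selected path, so pheromone settles in proportion to
    # path quality rather than collapsing onto a single path.
    import numpy as np

    rng = np.random.default_rng(0)
    lengths = np.array([1.0, 2.0, 4.0])   # path costs; shorter is better
    c = np.ones_like(lengths)             # pheromone concentrations
    alpha, beta = 0.1, 0.1                # decay and reinforcement rates

    for _ in range(20000):
        p = c / c.sum()                   # select proportionally to pheromone
        i = rng.choice(len(c), p=p)
        c[i] += -alpha * c[i] + beta / lengths[i]   # local decay + quality reward

    print(c / c.sum())  # pheromone concentrates on the shorter paths

In this sketch the selected path's pheromone equilibrates near beta / (alpha * length), so shorter paths hold more pheromone while longer ones retain a regulated share, echoing the multi-path behavior described above.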