loss-augmented inference
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Europe > Austria > Salzburg > Salzburg (0.04)
- Asia > Middle East > Jordan (0.04)
Efficient Non-greedy Optimization of Decision Trees
Norouzi, Mohammad, Collins, Maxwell, Johnson, Matthew A., Fleet, David J., Kohli, Pushmeet
Decision trees and randomized forests are widely used in computer vision and machine learning. Standard algorithms for decision tree induction optimize the split functions one node at a time according to some splitting criteria. This greedy procedure often leads to suboptimal trees. In this paper, we present an algorithm for optimizing the split functions at all levels of the tree jointly with the leaf parameters, based on a global objective. We show that the problem of finding optimal linear-combination (oblique) splits for decision trees is related to structured prediction with latent variables, and we formulate a convex-concave upper bound on the tree's empirical loss. Computing the gradient of the proposed surrogate objective with respect to each training exemplar is O(d^2), where d is the tree depth, and thus training deep trees is feasible. The use of stochastic gradient descent for optimization enables effective training with large datasets. Experiments on several classification benchmarks demonstrate that the resulting non-greedy decision trees outperform greedy decision tree baselines.
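To make the notion of a linear-combination (oblique) split concrete, the following minimal sketch routes one sample through a complete binary tree in which every internal node tests the sign of a weighted sum of the features. It illustrates inference only, not the non-greedy training procedure from the abstract. The function name `route_to_leaf`, the heap-style node numbering, and the "go right when the activation is positive" convention are assumptions made for this illustration.

```python
def route_to_leaf(x, weights, depth):
    """Return the index of the leaf reached by sample x.

    Assumed layout (hypothetical, for illustration only):
    internal nodes are numbered in heap order, so node i has
    children 2*i + 1 (left) and 2*i + 2 (right), and weights[i]
    is the weight vector of the oblique split at node i.
    """
    node = 0
    for _ in range(depth):
        # Oblique split: a linear combination of all features,
        # rather than a threshold on a single feature.
        activation = sum(w * v for w, v in zip(weights[node], x))
        go_right = 1 if activation > 0 else 0
        node = 2 * node + 1 + go_right
    # A complete tree of this depth has 2**depth - 1 internal nodes;
    # subtract them to get a leaf index in [0, 2**depth).
    return node - (2 ** depth - 1)


# Depth-2 tree with three internal nodes and two features.
example_weights = [[1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
leaf = route_to_leaf([1.0, 1.0], example_weights, depth=2)
```

Each sample visits exactly one split per level, so routing costs one dot product per level of the tree.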
Efficient Optimization for Average Precision SVM
Mohapatra, Pritish, Jawahar, C.V., Kumar, M. Pawan
The accuracy of information retrieval systems is often measured using average precision (AP). Given a set of positive (relevant) and negative (non-relevant) samples, the parameters of a retrieval system can be estimated using the AP-SVM framework, which minimizes a regularized convex upper bound on the empirical AP loss. However, the high computational complexity of loss-augmented inference, which is required for learning an AP-SVM, prohibits its use with large training datasets. To alleviate this deficiency, we propose three complementary approaches. The first approach guarantees an asymptotic decrease in the computational complexity of loss-augmented inference by exploiting the problem structure. The second approach takes advantage of the fact that we do not require a full ranking during loss-augmented inference. This helps us to avoid the expensive step of sorting the negative samples according to their individual scores. The third approach approximates the AP loss over all samples by the AP loss over difficult samples (for example, those that are incorrectly classified by a binary SVM), while ensuring the correct classification of the remaining samples. Using the PASCAL VOC action classification and object detection datasets, we show that our approaches provide significant speed-ups during training without degrading the test accuracy of AP-SVM.
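For readers unfamiliar with the metric, the sketch below computes average precision for a set of scored samples, together with the corresponding AP loss, taken here as 1 minus AP. It uses the standard definition of AP (the mean of the precision values at the rank of each positive sample) and a full sort of all samples. It is not the loss-augmented inference procedure of the paper, and the function names are chosen for this illustration.

```python
def average_precision(scores, labels):
    """AP of the ranking obtained by sorting samples by descending score.

    labels: 1 for a positive (relevant) sample, 0 for a negative one.
    """
    num_pos = sum(labels)
    if num_pos == 0:
        raise ValueError("AP is undefined without positive samples")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            # Precision at this rank: positives seen so far / rank.
            precision_sum += hits / rank
    return precision_sum / num_pos


def ap_loss(scores, labels):
    """AP loss, assumed here to be 1 - AP."""
    return 1.0 - average_precision(scores, labels)


# Positives at ranks 1 and 3: AP = (1/1 + 2/3) / 2.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

Because AP depends on the whole ranking rather than on each sample independently, the loss does not decompose over samples, which is what makes loss-augmented inference for AP-SVM costly.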