Support Vector Machines
Distributionally Robust Graphical Models
In many structured prediction problems, complex relationships between variables are compactly defined using graphical structures. The most prevalent graphical prediction methods---probabilistic graphical models and large margin methods---have their own distinct strengths but also possess significant drawbacks. Conditional random fields (CRFs) are Fisher consistent, but they do not permit integration of customized loss metrics into their learning process. Large-margin models, such as structured support vector machines (SSVMs), have the flexibility to incorporate customized loss metrics, but lack Fisher consistency guarantees. We present adversarial graphical models (AGM), a distributionally robust approach for constructing a predictor that performs robustly for a class of data distributions defined using a graphical structure. Our approach enjoys both the flexibility of incorporating customized loss metrics into its design as well as the statistical guarantee of Fisher consistency. We present exact learning and prediction algorithms for AGM with time complexity similar to existing graphical models and show the practical benefits of our approach with experiments.
A Smoother Way to Train Structured Prediction Models
We present a framework to train a structured prediction model by performing smoothing on the inference algorithm it builds upon. Smoothing overcomes the non-smoothness inherent to the maximum margin structured prediction objective, and paves the way for the use of fast primal gradient-based optimization algorithms. We illustrate the proposed framework by developing a novel primal incremental optimization algorithm for the structural support vector machine. The proposed algorithm blends an extrapolation scheme for acceleration and an adaptive smoothing scheme and builds upon the stochastic variance-reduced gradient algorithm. We establish its worst-case global complexity bound and study several practical variants. We present experimental results on two real-world problems, namely named entity recognition and visual object localization. The experimental results show that the proposed framework allows us to build upon efficient inference algorithms to develop large-scale optimization algorithms for structured prediction which can achieve competitive performance on the two real-world problems.
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.61)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.61)
But How Does It Work in Theory? Linear SVM with Random Features
We prove that, under low noise assumptions, the support vector machine with $N\ll m$ random features (RFSVM) can achieve the learning rate faster than $O(1/\sqrt{m})$ on a training set with $m$ samples when an optimized feature map is used. Our work extends the previous fast rate analysis of random features method from least square loss to 0-1 loss. We also show that the reweighted feature selection method, which approximates the optimized feature map, helps improve the performance of RFSVM in experiments on a synthetic data set.
- North America > United States > Illinois (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > China > Hubei Province > Wuhan (0.04)
- Transportation > Infrastructure & Services (0.46)
- Transportation > Air (0.45)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- North America > United States > Virginia (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- (2 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.92)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > Canada (0.04)
But How Does It Work in Theory? Linear SVM with Random Features
Yitong Sun, Anna Gilbert, Ambuj Tewari
The random features method, proposed by Rahimi and Recht [2008], maps the data to a finite dimensional feature space as a random approximation to the feature space of RBF kernels. With explicit finite dimensional feature vectors available, the original KSVM is converted to a linear support vector machine (LSVM), that can be trained by faster algorithms (Shalev-Shwartz et al.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Spain > Canary Islands (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > Canada > Alberta (0.04)
- North America > United States > Ohio (0.04)
- (3 more...)
- North America > United States > Texas > Brazos County > College Station (0.05)
- Asia > Middle East > Jordan (0.05)
- North America > Canada > Quebec > Montreal (0.04)