Support Vector Machines
Stability and Incentive Compatibility in a Kernel-Based Combinatorial Auction
Lahaie, Sebastien (Yahoo Research)
We present the design and analysis of an approximately incentive-compatible combinatorial auction. In just a single run, the auction is able to extract enough value information from bidders to compute approximate truth-inducing payments. This stands in contrast to current auction designs that need to repeat the allocation computation as many times as there are bidders to achieve incentive compatibility. The auction is formulated as a kernel method, which allows for flexibility in choosing the price structure via a kernel function. Our main result characterizes the extent to which our auction is incentive-compatible in terms of the complexity of the chosen kernel function. Our analysis of the auction's properties is based on novel insights connecting the notion of stability in statistical learning theory to that of universal competitive equilibrium in the auction literature.
Cost-Sensitive Semi-Supervised Support Vector Machine
Li, Yu-Feng (Nanjing University, China) | Kwok, James T. (Hong Kong University of Science and Technology) | Zhou, Zhi-Hua (Nanjing University, China)
In this paper, we study cost-sensitive semi-supervised learning where many of the training examples are unlabeled and different misclassification errors are associated with unequal costs. This scenario occurs in many real-world applications. For example, in some disease diagnosis, the cost of erroneously diagnosing a patient as healthy is much higher than that of diagnosing a healthy person as a patient. Also, the acquisition of labeled data requires medical diagnosis which is expensive, while the collection of unlabeled data such as basic health information is much cheaper. We propose the CS4VM (Cost-Sensitive Semi-Supervised Support Vector Machine) to address this problem. We show that the CS4VM, when given the label means of the unlabeled data, closely approximates the supervised cost-sensitive SVM that has access to the ground-truth labels of all the unlabeled data. This observation leads to an efficient algorithm which first estimates the label means and then trains the CS4VM with the plug-in label means by an efficient SVM solver. Experiments on a broad range of data sets show that the proposed method is capable of reducing the total cost and is computationally efficient.
Learning sparse gradients for variable selection and dimension reduction
Variable selection and dimension reduction are two commonly adopted approaches for high-dimensional data analysis, but have traditionally been treated separately. Here we propose an integrated approach, called sparse gradient learning (SGL), for variable selection and dimension reduction via learning the gradients of the prediction function directly from samples. By imposing a sparsity constraint on the gradients, variable selection is achieved by selecting variables corresponding to non-zero partial derivatives, and effective dimensions are extracted based on the eigenvectors of the derived sparse empirical gradient covariance matrix. An error analysis is given for the convergence of the estimated gradients to the true ones in both the Euclidean and the manifold setting. We also develop an efficient forward-backward splitting algorithm to solve the SGL problem, making the framework practically scalable for medium or large datasets. The utility of SGL for variable selection and feature extraction is explicitly given and illustrated on artificial data as well as real-world examples. The main advantages of our method include variable selection for both linear and nonlinear predictions, effective dimension reduction with sparse loadings, and an efficient algorithm for large p, small n problems.
Gaussian Processes for Machine Learning: Book webpage
The book deals with the supervised-learning problem for both regression and classification, and includes detailed algorithms. A wide variety of covariance (kernel) functions are presented and their properties discussed. Model selection is discussed both from a Bayesian and a classical perspective. Many connections to other well-known techniques from machine learning and statistics are discussed, including support-vector machines, neural networks, splines, regularization networks, relevance vector machines and others. Theoretical issues including learning curves and the PAC-Bayesian framework are treated, and several approximation methods for learning with large datasets are discussed.
Empirical learning aided by weak domain knowledge in the form of feature importance
Standard hybrid learners that use domain knowledge require stronger knowledge that is hard and expensive to acquire. However, weaker domain knowledge can benefit from prior knowledge while being cost effective. Weak knowledge in the form of feature relative importance (FRI) is presented and explained. Feature relative importance is a real valued approximation of a feature's importance provided by experts. Advantage of using this knowledge is demonstrated by IANN, a modified multilayer neural network algorithm. IANN is a very simple modification of standard neural network algorithm but attains significant performance gains. Experimental results in the field of molecular biology show higher performance over other empirical learning algorithms including standard backpropagation and support vector machines. IANN performance is even comparable to a theory refinement system KBANN that uses stronger domain knowledge. This shows Feature relative importance can improve performance of existing empirical learning algorithms significantly with minimal effort.
Using a Kernel Adatron for Object Classification with RCS Data
Byl, Marten F., Demers, James T., Rietman, Edward A.
Rapid identification of object from radar cross section (RCS) signals is important for many space and military applications. This identification is a problem in pattern recognition which either neural networks or support vector machines should prove to be high-speed. Bayesian networks would also provide value but require significant preprocessing of the signals. In this paper, we describe the use of a support vector machine for object identification from synthesized RCS data. Our best results are from data fusion of X-band and S-band signals, where we obtained 99.4%, 95.3%, 100% and 95.6% correct identification for cylinders, frusta, spheres, and polygons, respectively. We also compare our results with a Bayesian approach and show that the SVM is three orders of magnitude faster, as measured by the number of floating point operations.
Discovering Serendipitous Information from Wikipedia by Using Its Network Structure
Noda, Yohei (University of Tokyo) | Kiyota, Yoji (University of Tokyo) | Nakagawa, Hiroshi (University of Tokyo)
Many researchers conducted studies on extracting relevant information from web documents. However, there are few studies on extracting serendipitous information. We propose methods to discover unexpected information from Wikipedia by using its network structure, for example, the distance between two categories. We evaluated two methods: a classification-based method using support vector machines (SVMs), and a ranking-based method using regression. We demonstrate advantages of regression over classification.
ECG Feature Extraction Techniques - A Survey Approach
Karpagachelvi, S., Arthanari, M., Sivakumar, M.
ECG Feature Extraction plays a significant role in diagnosing most of the cardiac diseases. One cardiac cycle in an ECG signal consists of the P-QRS-T waves. This feature extraction scheme determines the amplitudes and intervals in the ECG signal for subsequent analysis. The amplitudes and intervals value of P-QRS-T segment determines the functioning of heart of every human. Recently, numerous research and techniques have been developed for analyzing the ECG signal. The proposed schemes were mostly based on Fuzzy Logic Methods, Artificial Neural Networks (ANN), Genetic Algorithm (GA), Support Vector Machines (SVM), and other Signal Analysis techniques. All these techniques and algorithms have their advantages and limitations. This proposed paper discusses various techniques and transformations proposed earlier in literature for extracting feature from an ECG signal. In addition this paper also provides a comparative study of various methods proposed by researchers in extracting the feature from ECG signal.
Choosing Path Replanning Strategies for Unmanned Aircraft Systems
Wzorek, Mariusz (Linköping University) | Kvarnström, Jonas (Linköping University) | Doherty, Patrick (Linköping University)
Unmanned aircraft systems use a variety of techniques to plan collision-free flight paths given a map of obstacles and no-fly zones. However, maps are not perfect and obstacles may change over time or be detected during flight, which may invalidate paths that the aircraft is already following. Thus, dynamic in-flight replanning is required. Numerous strategies can be used for replanning, where the time requirements and the plan quality associated with each strategy depend on the environment around the original flight path. In this paper, we investigate the use of machine learning techniques, in particular support vector machines, to choose the best possible replanning strategy depending on the amount of time available. The system has been implemented, integrated and tested in hardware-in-the-loop simulation with a Yamaha RMAX helicopter platform.
Linear Time Feature Selection for Regularized Least-Squares
Pahikkala, Tapio, Airola, Antti, Salakoski, Tapio
We propose a novel algorithm for greedy forward feature selection for regularized least-squares (RLS) regression and classification, also known as the least-squares support vector machine or ridge regression. The algorithm, which we call greedy RLS, starts from the empty feature set, and on each iteration adds the feature whose addition provides the best leave-one-out cross-validation performance. Our method is considerably faster than the previously proposed ones, since its time complexity is linear in the number of training examples, the number of features in the original data set, and the desired size of the set of selected features. Therefore, as a side effect we obtain a new training algorithm for learning sparse linear RLS predictors which can be used for large scale learning. This speed is possible due to matrix calculus based short-cuts for leave-one-out and feature addition. We experimentally demonstrate the scalability of our algorithm and its ability to find good quality feature sets.