EM-DD: An Improved Multiple-Instance Learning Technique
We present a new multiple-instance (MI) learning technique (EM-DD) that combines EM with the diverse density (DD) algorithm. EM-DD is a general-purpose MI algorithm that can be applied with boolean or real-value labels and makes real-value predictions. On the boolean Musk benchmarks, the EM-DD algorithm without any tuning significantly outperforms all previous algorithms. EM-DD is relatively insensitive to the number of relevant attributes in the data set and scales up well to large bag sizes. Furthermore, EM-DD provides a new framework for MI learning, in which the MI problem is converted to a single-instance setting by using EM to estimate the instance responsible for the label of the bag.
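The single-instance conversion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a hypothesis made of a target point `h` and per-feature `scales`, scores each instance with a Gaussian-like function in the style of diverse density, and uses hypothetical function names. Only the selection step is shown; re-fitting the hypothesis on the selected instances is omitted.

```python
import math


def instance_score(h, scales, x):
    """Assumed score in (0, 1]: how strongly instance x matches hypothesis (h, scales)."""
    d2 = sum((s * (xi - hi)) ** 2 for s, xi, hi in zip(scales, x, h))
    return math.exp(-d2)


def select_responsible_instances(h, scales, bags):
    """From each bag, pick the instance that best matches the current hypothesis.

    The result is one instance per bag, i.e. a single-instance data set.
    """
    return [max(bag, key=lambda x: instance_score(h, scales, x)) for bag in bags]


def predict_bag(h, scales, bag):
    """Real-valued bag prediction: the best instance score within the bag."""
    return max(instance_score(h, scales, x) for x in bag)


# Two toy bags of 2-D instances (illustrative data, not from any benchmark).
bags = [[(0.0, 0.0), (1.0, 1.0)], [(5.0, 5.0), (0.9, 1.1)]]
h, scales = (1.0, 1.0), (1.0, 1.0)

selected = select_responsible_instances(h, scales, bags)
print(selected)
print([predict_bag(h, scales, bag) for bag in bags])
```

Once one instance per bag has been selected, any single-instance procedure could in principle be used to update the hypothesis, and the selection can then be repeated with the updated hypothesis.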
Support Vector Machines for Multiple-Instance Learning
Andrews, Stuart, Tsochantaridis, Ioannis, Hofmann, Thomas
This paper presents two new formulations of multiple-instance learning as a maximum margin problem. The proposed extensions of the Support Vector Machine (SVM) learning approach lead to mixed integer quadratic programs that can be solved heuristically. Our generalization of SVMs makes a state-of-the-art classification technique, including nonlinear classification via kernels, available to an area that up to now has been largely dominated by special purpose methods. We present experimental results on a pharmaceutical data set and on applications in automated image indexing and document categorization.
1 Introduction
Multiple-instance learning (MIL) [4] is a generalization of supervised classification in which training class labels are associated with sets of patterns, or bags, instead of individual patterns. While every pattern may possess an associated true label, it is assumed that pattern labels are only indirectly accessible through labels attached to bags.
EM-DD: An Improved Multiple-Instance Learning Technique
In this model, each training example is a set (or bag) of instances along with a single label equal to the maximum label among all instances in the bag. The individual instances within the bag are not given labels. The goal is to learn to accurately predict the label of previously unseen bags. Standard supervised learning can be viewed as a special case of MI learning where each bag holds a single instance. The MI learning model was originally motivated by the drug activity prediction problem where each instance is a possible conformation (or shape) of a molecule and each bag contains all likely low-energy conformations for the molecule.
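The labeling rule described above can be illustrated with a short sketch. It assumes a bag is represented by a plain Python list of per-instance labels, and the function name `bag_label` is hypothetical. The instance labels appear here only for illustration; in the MI setting the learner never sees them.

```python
def bag_label(instance_labels):
    """Return the bag label: the maximum label among the bag's instances."""
    if not instance_labels:
        raise ValueError("a bag must contain at least one instance")
    return max(instance_labels)


# Boolean labels: a bag is positive exactly when some instance is positive.
print(bag_label([0, 0, 1]))        # 1
print(bag_label([0, 0, 0]))        # 0

# Real-valued labels: the bag takes the largest instance label.
print(bag_label([0.2, 0.7, 0.4]))  # 0.7

# A single-instance bag reduces to standard supervised learning:
# the bag label is just the instance label.
print(bag_label([1]))              # 1
```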
EM-DD: An Improved Multiple-Instance Learning Technique
1 Introduction
The multiple-instance (MI) learning model has received much attention.