Learning Multi-modal Similarity
arXiv.org Artificial Intelligence
In many applications involving multimedia data, the definition of similarity between items is integral to several key tasks, e.g., nearest-neighbor retrieval, classification, and recommendation. Data in such regimes typically exhibits multiple modalities, such as the acoustic and visual content of video. Integrating such heterogeneous data to form a holistic similarity space is therefore a key challenge in many real-world applications. We present a novel multiple kernel learning technique for integrating heterogeneous data into a single, unified similarity space. Our algorithm learns an optimal ensemble of kernel transformations that conform to measurements of human perceptual similarity, as expressed by relative comparisons. To cope with the ubiquitous problems of subjectivity and inconsistency in multimedia similarity, we develop graph-based techniques to filter similarity measurements, resulting in a simplified and robust training procedure.
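The graph-based filtering idea can be illustrated with a small sketch. One simple way to realize it (a greedy approximation, not necessarily the authors' exact procedure) is to treat each relative comparison "i is more similar to j than to k" as a directed edge between item pairs, and discard any comparison that would create a directed cycle, since a cycle means the accepted comparisons contradict one another. All names below (`filter_comparisons`, the triplet encoding) are illustrative assumptions, not identifiers from the paper.

```python
from collections import defaultdict

def filter_comparisons(triplets):
    """Greedily keep relative comparisons (i, j, k), each meaning
    "i is more similar to j than to k", and drop any comparison that
    would create a directed cycle among item pairs.

    Vertices are unordered pairs; an edge {i,j} -> {i,k} asserts
    d(i, j) < d(i, k). A cycle would make the accepted comparisons
    mutually contradictory, so the offending triplet is discarded.
    """
    graph = defaultdict(set)  # pair -> set of pairs asserted farther

    def reachable(src, dst):
        # Depth-first search: is dst reachable from src?
        stack, seen = [src], set()
        while stack:
            v = stack.pop()
            if v == dst:
                return True
            if v in seen:
                continue
            seen.add(v)
            stack.extend(graph[v])
        return False

    kept = []
    for (i, j, k) in triplets:
        near, far = frozenset((i, j)), frozenset((i, k))
        # Adding near -> far creates a cycle iff far already reaches near.
        if reachable(far, near):
            continue  # contradicts previously accepted comparisons
        graph[near].add(far)
        kept.append((i, j, k))
    return kept

# Example: the second comparison directly contradicts the first.
kept = filter_comparisons([("a", "b", "c"), ("a", "c", "b")])
# kept == [("a", "b", "c")]
```

Order matters in this greedy variant: earlier comparisons take priority over later contradictory ones, which is one reasonable heuristic when measurements arrive with varying reliability.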
Aug-30-2010