Learning Multi-modal Similarity
–arXiv.org Artificial Intelligence
In many applications involving multi-media data, the definition of similarity between items is integral to several key tasks, e.g., nearest-neighbor retrieval, classification, and recommendation. Data in such regimes typically exhibits multiple modalities, such as acoustic and visual content of video. Integrating such heterogeneous data to form a holistic similarity space is therefore a key challenge to be overcome in many real-world applications. We present a novel multiple kernel learning technique for integrating heterogeneous data into a single, unified similarity space. Our algorithm learns an optimal ensemble of kernel transformations which conform to measurements of human perceptual similarity, as expressed by relative comparisons. To cope with the ubiquitous problems of subjectivity and inconsistency in multi-media similarity, we develop graph-based techniques to filter similarity measurements, resulting in a simplified and robust training procedure.
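The abstract does not spell out how the graph-based filtering works, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes each relative comparison is a quadruple `(i, j, k, l)` meaning "items i and j are more similar than items k and l", treats each unordered pair as a graph node, and greedily keeps a comparison only if it does not close a directed cycle, since a cycle signals contradictory judgments. The names `filter_consistent` and `reaches` are hypothetical.

```python
import random


def reaches(graph, src, dst):
    """Return True if dst is reachable from src by following directed edges."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False


def filter_consistent(comparisons, seed=0):
    """Greedily keep comparisons so that the graph over item pairs stays acyclic.

    Each comparison (i, j, k, l) is read as "pair (i, j) is more similar than
    pair (k, l)" and becomes a directed edge from node {i, j} to node {k, l}.
    This encoding is an assumption made for illustration.
    """
    rng = random.Random(seed)
    order = list(comparisons)
    rng.shuffle(order)  # random visiting order, so no annotator is favored

    graph = {}  # node (frozenset pair) -> set of successor nodes
    kept = []
    for i, j, k, l in order:
        a, b = frozenset((i, j)), frozenset((k, l))
        # Drop self-comparisons and any edge that would close a cycle.
        if a == b or reaches(graph, b, a):
            continue
        graph.setdefault(a, set()).add(b)
        kept.append((i, j, k, l))
    return kept


if __name__ == "__main__":
    votes = [
        ("a", "b", "c", "d"),
        ("c", "d", "e", "f"),
        ("e", "f", "a", "b"),  # contradicts the two comparisons above
    ]
    print(filter_consistent(votes))
```

Which comparison is discarded depends on the random visiting order; a greedy pass like this yields an acyclic subset in which no rejected comparison can be added back, but it is not guaranteed to be the largest such subset.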
Aug-30-2010
- Country:
  - North America > United States
    - Pennsylvania > Philadelphia County
      - Philadelphia (0.04)
    - New York > New York County
      - New York City (0.04)
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
    - California > San Diego County
      - San Diego (0.04)
  - Europe
    - United Kingdom > England
      - Oxfordshire > Oxford (0.04)
      - Cambridgeshire > Cambridge (0.04)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Netherlands > North Holland
      - Amsterdam (0.04)
  - Asia > Middle East
    - Jordan (0.04)
- Genre:
  - Research Report (0.81)
- Industry:
  - Media > Music (1.00)
  - Leisure & Entertainment (1.00)
- Technology: