Cross-Domain Matching for Bag-of-Words Data via Kernel Embeddings of Latent Distributions
Yoshikawa, Yuya, Iwata, Tomoharu, Sawada, Hiroshi, Yamada, Takeshi
Neural Information Processing Systems
We propose a kernel-based method for finding matches between instances across different domains, such as multilingual documents and images with annotations. Each instance is assumed to be represented as a multiset of features, e.g., a bag-of-words representation for documents. The major difficulty in finding cross-domain relationships is that the similarity between instances in different domains cannot be measured directly. To overcome this difficulty, the proposed method embeds all the features of the different domains in a shared latent space, and regards each instance as a distribution of its own features in that space. To represent the distributions efficiently and nonparametrically, we employ the framework of kernel embeddings of distributions.
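The idea of treating an instance as a distribution over latent feature vectors can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernel, the `gamma` parameter, the function names, and the randomly drawn latent vectors are all assumptions made for illustration, and the abstract does not say how the method obtains the latent vectors. Each bag of words is mapped to the empirical mean of its features' kernel embeddings, and two instances are compared through the squared distance between those means (the biased squared maximum mean discrepancy).

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    """Pairwise Gaussian kernel exp(-gamma/2 * ||x - y||^2) between rows of X and Y."""
    sq_dists = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-0.5 * gamma * np.maximum(sq_dists, 0.0))

def embedding_inner_product(X, Y, gamma=1.0):
    """Inner product of the empirical kernel mean embeddings of two feature sets."""
    return gaussian_kernel(X, Y, gamma).mean()

def squared_mmd(X, Y, gamma=1.0):
    """Squared distance between the kernel mean embeddings of X and Y (biased estimate)."""
    return (embedding_inner_product(X, X, gamma)
            + embedding_inner_product(Y, Y, gamma)
            - 2.0 * embedding_inner_product(X, Y, gamma))

def best_match(source_bag, target_bags, source_latent, target_latent, gamma=1.0):
    """Index of the target instance whose embedding is closest to the source instance."""
    X = source_latent[source_bag]  # one row per word occurrence (multiset)
    distances = [squared_mmd(X, target_latent[bag], gamma) for bag in target_bags]
    return int(np.argmin(distances))

# Hypothetical setup: random placeholder latent vectors for two vocabularies
# that live in one shared 3-dimensional latent space.
rng = np.random.default_rng(0)
source_latent = rng.normal(size=(5, 3))  # 5 source-domain features
target_latent = rng.normal(size=(6, 3))  # 6 target-domain features

source_bag = np.array([0, 0, 2, 4])                    # word ids, repeats allowed
target_bags = [np.array([1, 3]), np.array([0, 5, 5])]  # two candidate instances
print(best_match(source_bag, target_bags, source_latent, target_latent))
```

Because the two domains never share feature ids, the comparison only makes sense once both vocabularies are placed in the same latent space; with random placeholders, as here, the returned match carries no meaning beyond showing the mechanics.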
Feb-14-2020, 08:59:00 GMT