Compositional Embeddings for Multi-Label One-Shot Learning

Li, Zeqian, Mozer, Michael C., Whitehill, Jacob

arXiv.org Machine Learning 

We explore the idea of compositional set embeddings that can be used to infer not just a single class per input (e.g., image, video, audio signal), but a collection of classes, in the setting of one-shot learning. Class compositionality is useful in tasks such as multi-object detection in images and multi-speaker diarization in audio. Specifically, we devise and implement two novel models consisting of (1) an embedding function f trained jointly with a "composite" function g that computes set union operations between the classes encoded in two embedding vectors; and (2) an embedding function f trained jointly with a "query" function h that computes whether the classes encoded in one embedding subsume the classes encoded in another embedding. In contrast to previously developed methods, these models must both determine the classes associated with the input examples and encode the relationships between different class label sets. In experiments conducted on simulated data and on the OmniGlot, LibriSpeech, and Open Images datasets, the proposed composite embedding models outperform baselines based on traditional embedding methods.
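
As a rough illustration of the roles the abstract assigns to f, g, and h, the toy sketch below replaces the learned networks with hand-written functions over multi-hot label vectors. Everything in it is an assumption made for illustration: the vocabulary size `NUM_CLASSES`, the multi-hot encoding, and the fact that this `f` takes a label set directly (in the paper, f is a trained model that embeds a raw input such as an image or audio clip, and g and h are trained jointly with it). It shows only the set semantics the models are meant to capture, not the authors' implementation.

```python
from typing import FrozenSet, Tuple

NUM_CLASSES = 5  # hypothetical size of the label vocabulary

Embedding = Tuple[int, ...]


def f(labels: FrozenSet[int]) -> Embedding:
    """Toy stand-in for the embedding function: a multi-hot vector of the label set.

    Assumption: the real f is learned and sees the raw input, not its labels.
    """
    return tuple(1 if c in labels else 0 for c in range(NUM_CLASSES))


def g(a: Embedding, b: Embedding) -> Embedding:
    """Toy 'composite' function: elementwise max, which is set union on multi-hot vectors."""
    return tuple(max(x, y) for x, y in zip(a, b))


def h(a: Embedding, b: Embedding) -> bool:
    """Toy 'query' function: True if the classes encoded in a subsume those in b."""
    return all(x >= y for x, y in zip(a, b))


# Two single-class examples and one example containing both classes.
speaker_1 = f(frozenset({0}))
speaker_2 = f(frozenset({3}))
mixture = f(frozenset({0, 3}))

# Composing the two single-class embeddings should match the two-class embedding.
union_ok = g(speaker_1, speaker_2) == mixture

# The mixture subsumes each single speaker, but not the other way around.
subsumes = h(mixture, speaker_1)
not_subsumed = h(speaker_1, mixture)
```

In the one-shot setting, the single-class embeddings would come from one reference example per class, and a test input would be matched against compositions of those references (via g) or queried against them directly (via h).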
