Reviews: Learning Deep Embeddings with Histogram Loss

Neural Information Processing Systems 

The authors propose a new loss function for learning embeddings in deep networks, called histogram loss. The loss is based on pairwise classification: whether two samples belong to the same class or not. In particular, the authors suggest looking at the distribution of similarities between embeddings on the unit sphere (all embeddings are L2-normalized). The idea is to estimate the similarity distribution of the positive pairs (same class) and that of the negative pairs (different classes), and to minimize the probability that a positive pair has a smaller similarity score than a negative pair. After reviewing previous work in the area (Section 2), in Section 3 they develop a method for estimating the histogram loss.
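
To make the quantity being minimized concrete, the following is a minimal NumPy sketch of a histogram-style estimate of the probability that a positive pair scores no higher than a negative pair. It is an illustration under assumptions, not the authors' implementation: the function name `histogram_loss`, the parameter `num_bins`, the use of cosine similarity as the score, and the triangular (linear-interpolation) soft binning are all choices made here for the example.

```python
import numpy as np

def histogram_loss(embeddings, labels, num_bins=51):
    """Histogram estimate of P(positive-pair similarity <= negative-pair similarity).

    Hypothetical helper for illustration. Assumes the batch contains at least
    one positive pair and at least one negative pair.
    """
    # L2-normalize so that dot products are cosine similarities in [-1, 1].
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = x @ x.T

    # Keep each unordered pair once (upper triangle, excluding the diagonal).
    iu = np.triu_indices(len(labels), k=1)
    s = np.clip(sims[iu], -1.0, 1.0)
    same = (labels[:, None] == labels[None, :])[iu]

    # Histogram nodes spread uniformly over [-1, 1].
    nodes = np.linspace(-1.0, 1.0, num_bins)
    delta = nodes[1] - nodes[0]

    def soft_hist(values):
        # Assumed soft binning: each value is split between its two
        # neighbouring nodes with triangular weights that sum to 1.
        w = np.clip(1.0 - np.abs(values[:, None] - nodes[None, :]) / delta, 0.0, None)
        return w.sum(axis=0) / len(values)

    h_pos = soft_hist(s[same])    # similarity histogram of positive pairs
    h_neg = soft_hist(s[~same])   # similarity histogram of negative pairs

    # For each bin of the negative histogram, accumulate the positive mass
    # lying at or below that bin, then sum over bins.
    cdf_pos = np.cumsum(h_pos)
    return float(np.sum(h_neg * cdf_pos))
```

With this sketch, a batch whose same-class embeddings coincide and whose different-class embeddings are orthogonal gives a value near zero, while a batch with the labels arranged the other way around gives a value near one. Because ties within a bin are counted, the estimate is an upper-biased proxy for the strict "smaller than" probability described above.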