vp-tree
Variational Structured Semantic Inference for Diverse Image Captioning
Fuhai Chen, Rongrong Ji, Jiayi Ji, Xiaoshuai Sun, Baochang Zhang, Xuri Ge, Yongjian Wu, Feiyue Huang, Yan Wang
Despite the exciting progress in image captioning, generating diverse captions for a given image remains an open problem. Existing methods typically apply generative models such as the Variational Auto-Encoder to diversify the captions, which however neglect two key factors of diverse expression, i.e., the lexical diversity and the syntactic diversity. To model these two inherent diversities in image captioning, we propose a Variational Structured Semantic Inferring model (termed VSSI-cap) executed in a novel structured encoder-inferer-decoder schema. VSSI-cap mainly innovates in a novel structure, i.e., the Variational Multi-modal Inferring tree (termed VarMI-tree). In particular, conditioned on the visual-textual features from the encoder, the VarMI-tree models the lexical and syntactic diversities by inferring their latent variables (with variations) in an approximate posterior inference guided by a visual semantic prior. Then, a reconstruction loss and the posterior-prior KL-divergence are jointly estimated to optimize the VSSI-cap model. Finally, diverse captions are generated upon the visual features and the latent variables from this structured encoder-inferer-decoder model. Experiments on the benchmark dataset show that the proposed VSSI-cap achieves significant improvements over the state of the art.
Learning to Prune in Metric and Non-Metric Spaces
Our focus is on approximate nearest neighbor retrieval in metric and non-metric spaces. We employ a VP-tree and explore two simple yet effective learning-to-prune approaches: density estimation through sampling and "stretching" of the triangle inequality. Both methods are evaluated using data sets with metric (Euclidean) and non-metric (KL-divergence and Itakura-Saito) distance functions. Conditions on spaces where the VP-tree is applicable are discussed. The VP-tree with a learned pruner is compared against the recently proposed state-of-the-art approaches: the bbtree, the multi-probe locality sensitive hashing (LSH), and permutation methods. Our method was competitive against state-of-the-art methods and, in most cases, was more efficient for the same rank approximation quality.
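To make the idea of "stretching" the triangle inequality concrete, here is a minimal sketch of a VP-tree with a tunable pruning rule. It is an illustration under stated assumptions, not the authors' implementation: the names `Node`, `build`, and `nearest`, the single stretching factor `alpha`, and the random pivot choice are all hypothetical. With `alpha = 1.0` the rule is the standard metric-space pruning test (exact search in a metric space); values above 1 prune more aggressively and may return an approximate neighbor. The paper's learned pruner and its density-estimation variant are not reproduced here.

```python
import math
import random


class Node:
    """One VP-tree node: a pivot, a partition radius, and two subtrees."""

    def __init__(self, pivot, radius, inside, outside):
        self.pivot = pivot
        self.radius = radius
        self.inside = inside    # points with dist(pivot, p) <= radius
        self.outside = outside  # points with dist(pivot, p) >= radius


def build(points, dist, rng):
    """Recursively build a VP-tree using a randomly chosen pivot (an assumption)."""
    if not points:
        return None
    points = list(points)
    pivot = points.pop(rng.randrange(len(points)))
    if not points:
        return Node(pivot, 0.0, None, None)
    by_dist = sorted(((dist(pivot, p), p) for p in points), key=lambda t: t[0])
    mid = len(by_dist) // 2
    radius = by_dist[mid][0]  # median distance to the pivot
    inside = [p for _, p in by_dist[:mid]]
    outside = [p for _, p in by_dist[mid:]]
    return Node(pivot, radius, build(inside, dist, rng), build(outside, dist, rng))


def nearest(node, query, dist, alpha=1.0, best=None):
    """Return (distance, point) for the nearest neighbor found.

    alpha is a hypothetical stretching factor: the far subtree is visited
    only if best_distance >= alpha * |radius - d|.
    """
    if node is None:
        return best
    d = dist(query, node.pivot)
    if best is None or d < best[0]:
        best = (d, node.pivot)
    # Descend into the side that contains the query first.
    if d < node.radius:
        near, far = node.inside, node.outside
    else:
        near, far = node.outside, node.inside
    best = nearest(near, query, dist, alpha, best)
    # Stretched pruning test: skip the far side when the query ball
    # is judged not to cross the partition boundary.
    if best[0] >= alpha * abs(node.radius - d):
        best = nearest(far, query, dist, alpha, best)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.random(), rng.random()) for _ in range(1000)]
    tree = build(pts, math.dist, rng)
    q = (0.5, 0.5)
    print(nearest(tree, q, math.dist))             # exact for a metric
    print(nearest(tree, q, math.dist, alpha=2.0))  # faster, possibly approximate
```

Because the distance function is passed in as `dist`, the same sketch can be tried with a non-metric function, in which case the pruning test carries no correctness guarantee and `alpha` becomes purely an empirical knob.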
fc8001f834f6a5f0561080d134d53d29-Reviews.html
Summary: The paper presents a method that learns a pruning algorithm for a VP-tree in non-metric spaces. The idea is to estimate the decision function of the approximate nearest neighbor search in the VP-tree by sampling, and to approximate it with a piecewise linear function. The learning-to-prune method is validated for search efficiency against relevant pruning baselines, and outperforms them substantially when the intrinsic dimensionality of the data is small. Clarity: The paper is mostly clearly written, but it sometimes does not explain the implementation details and the choice of some parameters (for example, why choose K = 100, m = 7, rho = 8 and the bucket size 10 5? Lines 185, 227, 315). Originality: Learning to approximate the approximate nearest neighbor classification on a VP-tree is, to the extent of my knowledge, the first work that 'learns to prune'. Significance: The nearest neighbor method is a very fundamental topic in search and classification; thus this learning-to-prune method, which approximates the nearest neighbor search with a non-linear function, would be of some interest to a wide audience. However, the datasets chosen for validation seem rather simple and low-dimensional, which is far from realistic.
Building an Image Hashing Search Engine with VP-Trees and OpenCV - PyImageSearch
In this tutorial, you will learn how to build a scalable image hashing search engine using OpenCV, Python, and VP-Trees. Back in 2017, I wrote a tutorial on image hashing with OpenCV and Python (which is required reading for this tutorial). That guide showed you how to find identical/duplicate images in a given dataset. However, there was a scalability problem with that original tutorial -- namely that it did not scale! To find near-duplicate images, our original image hashing method would require us to perform a linear search, comparing the query hash to each individual image hash in our dataset. In a practical, real-world application that's far too slow -- we need to find a way to reduce that search to sub-linear time complexity. But how can we reduce search time so dramatically?
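The linear scan described above, which is the baseline a VP-tree is meant to replace, can be sketched as follows. This is a minimal illustration, not the tutorial's code: the function names `hamming` and `linear_search`, the dictionary layout mapping file names to integer hashes, and the `max_distance` threshold are all assumptions made for the example.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer image hashes."""
    return bin(a ^ b).count("1")


def linear_search(hashes: dict, query: int, max_distance: int) -> list:
    """Compare the query hash against every stored hash: O(n) per query."""
    matches = []
    for name, h in hashes.items():
        d = hamming(h, query)
        if d <= max_distance:
            matches.append((name, d))
    return matches


if __name__ == "__main__":
    # Toy 4-bit hashes; real perceptual hashes are typically much longer.
    stored = {"a.jpg": 0b1011, "b.jpg": 0b1000, "c.jpg": 0b0100}
    print(linear_search(stored, 0b1010, max_distance=1))
```

Every query touches all stored hashes, which is the scalability problem the tutorial points to. Hamming distance is a metric, so a VP-tree can index these hashes and skip whole subtrees during search.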
Learning to Prune in Metric and Non-Metric Spaces
Boytsov, Leonid, Naidan, Bilegsaikhan
Our focus is on approximate nearest neighbor retrieval in metric and non-metric spaces. We employ a VP-tree and explore two simple yet effective learning-to-prune approaches: density estimation through sampling and “stretching” of the triangle inequality. Both methods are evaluated using data sets with metric (Euclidean) and non-metric (KL-divergence and Itakura-Saito) distance functions. Conditions on spaces where the VP-tree is applicable are discussed. The VP-tree with a learned pruner is compared against the recently proposed state-of-the-art approaches: the bbtree, the multi-probe locality sensitive hashing (LSH), and permutation methods. Our method was competitive against state-of-the-art methods and, in most cases, was more efficient for the same rank approximation quality.