Goto

Collaborating Authors

Learning Compact Visual Descriptors for Low Bit Rate Mobile Landmark Search

AI Magazine

Along with the ever-growing computational power of mobile devices, mobile visual search has undergone an evolution in techniques and applications. A significant trend is low bit rate visual search, where compact visual descriptors are extracted directly over a mobile and delivered as queries rather than raw images to reduce the query transmission latency. In this article, we introduce our work on low bit rate mobile landmark search, in which a compact yet discriminative landmark image descriptor is extracted by using a location context such as GPS, crowd-sourced hotspot WLAN, and cell tower locations. The compactness originates from the bag-of-words image representation, with offline learning from geotagged photos from online photosharing websites including Flickr and Panoramio. The learning process involves segmenting the landmark photo collection by discrete geographical regions using a Gaussian mixture model and then boosting a ranking-sensitive vocabulary within each region, with "entropy"-based feedback on the compactness of the descriptor to refine both phases iteratively.


Learning Compact Visual Descriptors for Low Bit Rate Mobile Landmark Search

AI Magazine

Coming with the ever growing computational power of mobile devices, mobile visual search have undergone an evolution in techniques and applications. A significant trend is low bit rate visual search, where compact visual descriptors are extracted directly over a mobile and delivered as queries rather than raw images to reduce the query transmission latency. In this article, we introduce our work on low bit rate mobile landmark search, in which a compact yet discriminative landmark image descriptor is extracted by using location context such as GPS, crowd-sourced hotspot WLAN, and cell tower locations. The compactness originates from the bag-of-words image representation, with an offline learning from geotagged photos from online photo sharing websites including Flickr and Panoramio. The learning process involves segmenting the landmark photo collection by discrete geographical regions using Gaussian mixture model, and then boosting a ranking sensitive vocabulary within each region, with an “entropy” based descriptor compactness feedback to refine both phases iteratively. In online search, when entering a geographical region, the codebook in a mobile device are downstream adapted to generate extremely compact descriptors with promising discriminative ability. We have deployed landmark search apps to both HTC and iPhone mobile phones, working over the database of million scale images in typical areas like Beijing, New York, and Barcelona, and others. Our descriptor outperforms alternative compact descriptors (Chen et al. 2009; Chen et al., 2010; Chandrasekhar et al. 2009a; Chandrasekhar et al. 2009b) with significant margins. Beyond landmark search, this article will summarize the MPEG standarization progress of compact descriptor for visual search (CDVS) (Yuri et al. 2010; Yuri et al. 2011) towards application interoperability.


Learning Compact Visual Descriptor for Low Bit Rate Mobile Landmark Search

AAAI Conferences

In this paper, we propose to extract a compact yet discriminative visual descriptor directly on the mobile device, which tackles the wireless query transmission latency in mobile landmark search. This descriptor is offline learnt from the location contexts of geo-tagged Web photos from both Flickr and Panoramio with two phrases: First, we segment the landmark photo collections into discrete geographical regions using a Gaussian Mixture Model [Stauffer et al., 2000]. Second, a ranking sensitive vocabulary boosting is introduced to learn a compact codebook within each region. To tackle the locally optimal descriptor learning caused by imprecise geographical segmentation, we further iterate above phrases by feedback an “entropy” based descriptor compactness into a prior distribution to constrain the Gaussian mixture modeling. Consequently, when entering a specific geographical region, the codebook in the mobile device is downstream adapted, which ensures efficient extraction of compact descriptor, its low bit rate transmission, as well as promising discrimination ability. We deploy our descriptor within both HTC and iPhone mobile phones, testing landmark search in typical areas included Beijing, New York, and Barcelona containing one million images. Our learning descriptor outperforms alternative compact descriptors [Chen et al., 2009][Chen et al., 2010][Chandrasekhar et al., 2009a][Chandrasekhar et al., 2009b] with a large margin.


Learning Compact Visual Descriptors for Low Bit Rate Mobile Landmark Search

AI Magazine

Coming with the ever growing computational power of mobile devices, mobile visual search have undergone an evolution in techniques and applications. A significant trend is low bit rate visual search, where compact visual descriptors are extracted directly over a mobile and delivered as queries rather than raw images to reduce the query transmission latency. In this article, we introduce our work on low bit rate mobile landmark search, in which a compact yet discriminative landmark image descriptor is extracted by using location context such as GPS, crowd-sourced hotspot WLAN, and cell tower locations. Beyond landmark search, this article will summarize the MPEG standarization progress of compact descriptor for visual search (CDVS) (Yuri et al.


Ji

AAAI Conferences

In this paper, we propose to extract a compact yet discriminative visual descriptor directly on the mobile device, which tackles the wireless query transmission latency in mobile landmark search. This descriptor is offline learnt from the location contexts of geo-tagged Web photos from both Flickr and Panoramio with two phrases: First, we segment the landmark photo collections into discrete geographical regions using a Gaussian Mixture Model [Stauffer et al., 2000]. Second, a ranking sensitive vocabulary boosting is introduced to learn a compact codebook within each region. To tackle the locally optimal descriptor learning caused by imprecise geographical segmentation, we further iterate above phrases by feedback an "entropy" based descriptor compactness into a prior distribution to constrain the Gaussian mixture modeling. Consequently, when entering a specific geographical region, the codebook in the mobile device is downstream adapted, which ensures efficient extraction of compact descriptor, its low bit rate transmission, as well as promising discrimination ability. We deploy our descriptor within both HTC and iPhone mobile phones, testing landmark search in typical areas included Beijing, New York, and Barcelona containing one million images. Our learning descriptor outperforms alternative compact descriptors [Chen et al., 2009][Chen et al., 2010][Chandrasekhar et al., 2009a][Chandrasekhar et al., 2009b] with a large margin.