Appendix A Limitations

Neural Information Processing Systems 

However, this drawback is inherited from the underlying model class and is not a property of our retrieval-based approach. However, the choice of the underlying dataset as well as the overall construction strategy of this database is not further investigated. This would be an interesting direction for future work, as we already observe that a model trained only on ImageNet acquires strong zero-shot capabilities, see e.g. For our model, this concerns the data used in training and inference, as the retrieval database can be considered as a part of the model. That is in contrast to the image database used for the retrieval algorithm: Here, retrieved images have a discernible effect on the output, and the database used during inference may only consist of relatively few high-quality images.