FlexCap: Describe Anything in Images in Controllable Detail

Neural Information Processing Systems 

We demonstrate FlexCap's effectiveness in several applications. First, it achieves strong performance on dense captioning tasks on the Visual Genome dataset. Second, we show how FlexCap's localized