Goto

Collaborating Authors

 image collection summarization


Learning Mixtures of Submodular Functions for Image Collection Summarization

Neural Information Processing Systems

We address the problem of image collection summarization by learning mixtures of submodular functions. We argue that submodularity is very natural to this problem, and we show that a number of previously used scoring functions are submodular -- a property not explicitly mentioned in these publications. We provide classes of submodular functions capturing the necessary properties of summaries, namely coverage, likelihood, and diversity. To learn mixtures of these submodular functions as scoring functions, we formulate summarization as a supervised learning problem using large-margin structured prediction. Furthermore, we introduce a novel evaluation metric, which we call V-ROUGE, for automatic summary scoring. While a similar metric called ROUGE has been successfully applied to document summarization [14], no such metric was known for quantifying the quality of image collection summaries. We provide a new dataset consisting of 14 real-world image collections along with many human-generated ground truth summaries collected using mechanical turk. We also extensively compare our method with previously explored methods for this problem and show that our learning approach outperforms all competitors on this new dataset. This paper provides, to our knowledge, the first systematic approach for quantifying the problem of image collection summarization, along with a new dataset of image collections and human summaries.


Learning Mixtures of Submodular Functions for Image Collection Summarization

Sebastian Tschiatschek, Rishabh K. Iyer, Haochen Wei, Jeff A. Bilmes

Neural Information Processing Systems

We address the problem of image collection summarization by learning mixtures of submodular functions. Submodularity is useful for this problem since it naturally represents characteristics such as fidelity and diversity, desirable for any summary. Several previously proposed image summarization scoring methodologies, in fact, instinctively arrived at submodularity. We provide classes of submodular component functions (including some which are instantiated via a deep neural network) over which mixtures may be learnt. We formulate the learning of such mixtures as a supervised problem via large-margin structured prediction.


Learning Mixtures of Submodular Functions for Image Collection Summarization

Neural Information Processing Systems

We address the problem of image collection summarization by learning mixtures of submodular functions. We argue that submodularity is very natural to this problem, and we show that a number of previously used scoring functions are submodular -- a property not explicitly mentioned in these publications. We provide classes of submodular functions capturing the necessary properties of summaries, namely coverage, likelihood, and diversity. To learn mixtures of these submodular functions as scoring functions, we formulate summarization as a supervised learning problem using large-margin structured prediction. Furthermore, we introduce a novel evaluation metric, which we call V-ROUGE, for automatic summary scoring. While a similar metric called ROUGE has been successfully applied to document summarization [14], no such metric was known for quantifying the quality of image collection summaries.


Learning Mixtures of Submodular Functions for Image Collection Summarization

Neural Information Processing Systems

We address the problem of image collection summarization by learning mixtures of submodular functions. Submodularity is useful for this problem since it naturally represents characteristics such as fidelity and diversity, desirable for any summary. Several previously proposed image summarization scoring methodologies, in fact, instinctively arrived at submodularity. We provide classes of submodular component functions (including some which are instantiated via a deep neural network) over which mixtures may be learnt. We formulate the learning of such mixtures as a supervised problem via large-margin structured prediction.


Tell Me What Is Good About This Property: Leveraging Reviews For Segment-Personalized Image Collection Summarization

Wysoczanska, Monika, Beladev, Moran, Assaraf, Karen Lastmann, Wang, Fengjun, Kleinfeld, Ofri, Amsalem, Gil, Boker, Hadas Harush

arXiv.org Artificial Intelligence

Image collection summarization techniques aim to present a compact representation of an image gallery through a carefully selected subset of images that captures its semantic content. When it comes to web content, however, the ideal selection can vary based on the user's specific intentions and preferences. This is particularly relevant at Booking.com, where presenting properties and their visual summaries that align with users' expectations is crucial. To address this challenge, we consider user intentions in the summarization of property visuals by analyzing property reviews and extracting the most significant aspects mentioned by users. By incorporating the insights from reviews in our visual summaries, we enhance the summaries by presenting the relevant content to a user. Moreover, we achieve it without the need for costly annotations. Our experiments, including human perceptual studies, demonstrate the superiority of our cross-modal approach, which we coin as CrossSummarizer over the no-personalization and image-based clustering baselines.


Learning Mixtures of Submodular Functions for Image Collection Summarization

Tschiatschek, Sebastian, Iyer, Rishabh K., Wei, Haochen, Bilmes, Jeff A.

Neural Information Processing Systems

We address the problem of image collection summarization by learning mixtures of submodular functions. We argue that submodularity is very natural to this problem, and we show that a number of previously used scoring functions are submodular -- a property not explicitly mentioned in these publications. We provide classes of submodular functions capturing the necessary properties of summaries, namely coverage, likelihood, and diversity. To learn mixtures of these submodular functions as scoring functions, we formulate summarization as a supervised learning problem using large-margin structured prediction. Furthermore, we introduce a novel evaluation metric, which we call V-ROUGE, for automatic summary scoring.


Differentiable Submodular Maximization

Tschiatschek, Sebastian, Sahin, Aytunc, Krause, Andreas

arXiv.org Machine Learning

We consider learning of submodular functions from data. These functions are important in machine learning and have a wide range of applications, e.g. data summarization, feature selection and active learning. Despite their combinatorial nature, submodular functions can be maximized approximately with strong theoretical guarantees in polynomial time. Typically, learning the submodular function and optimization of that function are treated separately, i.e. the function is first learned using a proxy objective and subsequently maximized. In contrast, we show how to perform learning and optimization jointly. By interpreting the output of greedy maximization algorithms as distributions over sequences of items and smoothening these distributions, we obtain a differentiable objective. In this way, we can differentiate through the maximization algorithms and optimize the model to work well with the optimization algorithm. We theoretically characterize the error made by our approach, yielding insights into the trade-off of smoothness and accuracy. We demonstrate the effectiveness of our approach for jointly learning and optimizing on synthetic maxcut data, and on a real world product recommendation application.


Learning Mixtures of Submodular Functions for Image Collection Summarization

Tschiatschek, Sebastian, Iyer, Rishabh K., Wei, Haochen, Bilmes, Jeff A.

Neural Information Processing Systems

We address the problem of image collection summarization by learning mixtures of submodular functions. We argue that submodularity is very natural to this problem, and we show that a number of previously used scoring functions are submodular — a property not explicitly mentioned in these publications. We provide classes of submodular functions capturing the necessary properties of summaries, namely coverage, likelihood, and diversity. To learn mixtures of these submodular functions as scoring functions, we formulate summarization as a supervised learning problem using large-margin structured prediction. Furthermore, we introduce a novel evaluation metric, which we call V-ROUGE, for automatic summary scoring. While a similar metric called ROUGE has been successfully applied to document summarization [14], no such metric was known for quantifying the quality of image collection summaries. We provide a new dataset consisting of 14 real-world image collections along with many human-generated ground truth summaries collected using mechanical turk. We also extensively compare our method with previously explored methods for this problem and show that our learning approach outperforms all competitors on this new dataset. This paper provides, to our knowledge, the first systematic approach for quantifying the problem of image collection summarization, along with a new dataset of image collections and human summaries.