Do Less, Get More: Streaming Submodular Maximization with Subsampling

Neural Information Processing Systems

In this paper, we develop the first one-pass streaming algorithm for submodular maximization that does not evaluate the entire stream even once. By carefully subsampling each element of the data stream, our algorithm enjoys the tightest approximation guarantees in various settings while having the smallest memory footprint and requiring the lowest number of function evaluations. More specifically, for a monotone submodular function and a $p$-matchoid constraint, our randomized algorithm achieves a $4p$ approximation ratio (in expectation) with $O(k)$ memory and $O(km/p)$ queries per element ($k$ is the size of the largest feasible solution and $m$ is the number of matroids used to define the constraint). For the non-monotone case, our approximation ratio increases only slightly to $4p+2-o(1)$. To the best of our knowledge, our algorithm is the first that combines the benefits of streaming and subsampling in a novel way in order to truly scale submodular maximization to massive machine learning problems. To showcase its practicality, we empirically evaluated the performance of our algorithm on a video summarization application and observed that it outperforms the state-of-the-art algorithm by up to fifty-fold while maintaining practically the same utility. We also evaluated the scalability of our algorithm on a large dataset of Uber pickup locations.
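As a rough illustration of the subsampling idea, the sketch below handles only the simplest setting: a cardinality constraint of size `k` with a toy coverage objective. The function names, the sampling probability `q`, and the strict-improvement swap rule are assumptions made for this example. It is not the algorithm analyzed in the paper and carries none of its guarantees.

```python
import random


def coverage(sets, chosen):
    """Toy monotone submodular objective: number of items covered."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def subsampled_stream(stream, f, k, q, rng):
    """Single pass over `stream`; each element is inspected with probability q.

    Hypothetical simplification: an inspected element is appended while there
    is room, and otherwise replaces the stored element whose removal gives the
    largest strict improvement (if any). Assumes f of the empty set is 0.
    """
    solution, current, evaluations = [], 0, 0
    for u in stream:
        if rng.random() >= q:
            continue  # skipped: no function evaluation is spent on u
        if len(solution) < k:
            candidates = [solution + [u]]
        else:
            candidates = [solution[:i] + solution[i + 1:] + [u] for i in range(k)]
        for cand in candidates:
            value = f(cand)
            evaluations += 1
            if value > current:
                solution, current = cand, value
    return solution, evaluations


# Hypothetical toy data: element id -> set of covered items.
toy_sets = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 2, 3, 4}, 4: {5}}
picked, n_evals = subsampled_stream(
    range(5), lambda c: coverage(toy_sets, c), k=2, q=0.5, rng=random.Random(0)
)
print(picked, n_evals)
```

Elements that fail the coin flip are discarded before any call to `f`, which is where the savings in function evaluations come from; memory stays at `k` stored elements.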

Submodular Maximization via Gradient Ascent: The Case of Deep Submodular Functions

Neural Information Processing Systems

We study the problem of maximizing deep submodular functions (DSFs) subject to a matroid constraint. DSFs are an expressive class of submodular functions that include, as strict subfamilies, the facility location, weighted coverage, and sums of concave composed with modular functions. We use a strategy similar to the continuous greedy approach, but we show that the multilinear extension of any DSF has a natural and computationally attainable concave relaxation that we can optimize using gradient ascent. Our results show a guarantee of $\max_{0<\delta<1}(1-\epsilon-\delta-e^{-\delta^2\Omega(k)})$ with a running time of $O(n^2/\epsilon^2)$ plus time for pipage rounding to recover a discrete solution, where $k$ is the rank of the matroid constraint. This bound is often better than the standard $1-1/e$ guarantee of the continuous greedy algorithm, and our method runs much faster.

Stochastic Submodular Maximization: The Case of Coverage Functions

Neural Information Processing Systems

Stochastic optimization of continuous objectives is at the heart of modern machine learning. However, many important problems are discrete in nature and often involve submodular objectives. We seek to bring the power of stochastic continuous optimization, namely stochastic gradient descent and its variants, to such discrete problems. We first introduce the problem of stochastic submodular optimization, where one needs to optimize a submodular objective which is given as an expectation. Our model captures situations where the discrete objective arises as an empirical risk (e.g., in the case of exemplar-based clustering), or is given as an explicit stochastic model (e.g., in the case of influence maximization in social networks). By exploiting that common extensions act linearly on the class of submodular functions, we employ projected stochastic gradient ascent and its variants in the continuous domain, and perform rounding to obtain discrete solutions. We focus on the rich and widely used family of weighted coverage functions. We show that our approach yields solutions that are guaranteed to match the optimal approximation guarantees, while reducing the computational cost by several orders of magnitude, as we demonstrate empirically.
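To make the continuous view of a weighted coverage function concrete, the sketch below evaluates its multilinear extension $F(x) = \sum_u w_u \bigl(1 - \prod_{i: u \in S_i} (1 - x_i)\bigr)$ together with the standard concave surrogate $\sum_u w_u \min\{1, \sum_{i: u \in S_i} x_i\}$. The data, names, and the choice to show this particular surrogate are assumptions for illustration; the abstract does not spell out the relaxation it optimizes.

```python
import math


def coverage_value(sets, weights, chosen):
    """Discrete weighted coverage: total weight of items covered by `chosen`."""
    covered = set().union(*(sets[i] for i in chosen))
    return sum(weights[u] for u in covered)


def multilinear(sets, weights, x):
    """Expected coverage when set i is included independently with prob. x[i]."""
    total = 0.0
    for u, w in weights.items():
        miss = 1.0  # probability that no chosen set covers u
        for i, s in enumerate(sets):
            if u in s:
                miss *= 1.0 - x[i]
        total += w * (1.0 - miss)
    return total


def concave_surrogate(sets, weights, x):
    """Concave upper bound: each item is covered to extent min(1, sum of x_i)."""
    return sum(
        w * min(1.0, sum(x[i] for i, s in enumerate(sets) if u in s))
        for u, w in weights.items()
    )


# Hypothetical toy instance.
toy_sets = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
toy_weights = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
x = [0.5, 0.5, 0.5]
print(multilinear(toy_sets, toy_weights, x), concave_surrogate(toy_sets, toy_weights, x))
```

At 0/1 vectors both functions coincide with the discrete objective, and for fractional points in $[0,1]^n$ the extension lies between $(1 - 1/e)$ times the surrogate and the surrogate itself, which is why a concave function of this kind is a natural target for gradient-based methods.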

Weakly Submodular Maximization Beyond Cardinality Constraints: Does Randomization Help Greedy?

Machine Learning

Submodular functions are a broad class of set functions, which naturally arise in diverse areas. Many algorithms have been suggested for the maximization of these functions. Unfortunately, once the function deviates from submodularity, the known algorithms may perform arbitrarily poorly. Amending this issue, by obtaining approximation results for set functions generalizing submodular functions, has been the focus of recent works. One such class, known as weakly submodular functions, has received a lot of attention. A key result proved by Das and Kempe (2011) showed that the approximation ratio of the greedy algorithm for weakly submodular maximization subject to a cardinality constraint degrades smoothly with the distance from submodularity. However, no results have been obtained for maximization subject to constraints beyond cardinality. In particular, it is not known whether the greedy algorithm achieves any non-trivial approximation ratio for such constraints. In this paper, we prove that a randomized version of the greedy algorithm (previously used by Buchbinder et al. (2014) for a different problem) achieves an approximation ratio of $(1 + 1/\gamma)^{-2}$ for the maximization of a weakly submodular function subject to a general matroid constraint, where $\gamma$ is a parameter measuring the distance of the function from submodularity. Moreover, we experimentally compare the performance of this version of the greedy algorithm on real-world problems against natural benchmarks, and show that the algorithm we study also performs well in practice. To the best of our knowledge, this is the first algorithm with a non-trivial approximation guarantee for maximizing a weakly submodular function subject to a constraint other than the simple cardinality constraint. In particular, it is the first algorithm with such a guarantee for the important and broad class of matroid constraints.
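To get a feel for how the stated ratio $(1 + 1/\gamma)^{-2}$ behaves, the short sketch below tabulates it for a few values of $\gamma$. It assumes $\gamma$ is a submodularity-ratio-style parameter in $(0, 1]$, with $\gamma = 1$ corresponding to a submodular function; the helper name is hypothetical.

```python
import math


def matroid_ratio(gamma):
    """Evaluate (1 + 1/gamma)^(-2); gamma is assumed to lie in (0, 1]."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma is assumed to lie in (0, 1]")
    return (1.0 + 1.0 / gamma) ** -2


for gamma in (1.0, 0.5, 0.25):
    print(f"gamma={gamma:.2f}  ratio={matroid_ratio(gamma):.4f}")
```

Under this assumption the guarantee equals 1/4 at $\gamma = 1$ and shrinks smoothly toward 0 as $\gamma$ decreases, for example to 1/9 at $\gamma = 0.5$.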

Submodular Maximization Through Barrier Functions

Machine Learning

In constrained continuous optimization, barrier functions are usually used to impose an increasingly large cost on a feasible point as it approaches the boundary of the feasible region [32]. In effect, barrier functions replace constraints with a penalty term in the primal objective function so that the solution stays away from the boundary of the feasible region. This is an attempt to approximate a constrained optimization problem with an unconstrained one and to later apply standard optimization techniques. While the benefits of barrier functions are studied extensively in the continuous domain [32], their use in discrete optimization is not very well understood. In this paper, we show how discrete barrier functions manifest themselves in constrained submodular maximization. Submodular functions formalize the intuitive diminishing returns condition, a property that not only makes optimization tractable but also appears in many machine learning applications, including video, image, and text summarization [7, 12, 23, 28, 35], active set selection in nonparametric learning [26], sequential decision making [27, 29], sensor placement, information gathering [10], and privacy and fairness [16].
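The diminishing returns condition mentioned above says that for all sets $A \subseteq B$ and any element $e \notin B$, $f(A \cup \{e\}) - f(A) \ge f(B \cup \{e\}) - f(B)$. The brute-force check below is a minimal sketch for tiny ground sets; the function names and toy objectives are assumptions for illustration, and the exhaustive enumeration is exponential in the ground set size.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def has_diminishing_returns(f, ground):
    """Exhaustively test f(A+e) - f(A) >= f(B+e) - f(B) for all A <= B, e not in B."""
    ground = frozenset(ground)
    for B in subsets(ground):
        for A in subsets(B):
            for e in ground - B:
                if f(A | {e}) - f(A) < f(B | {e}) - f(B):
                    return False
    return True


# Hypothetical toy objectives on the ground set {0, 1, 2}.
toy_sets = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c"}}


def toy_coverage(chosen):
    return len(set().union(*(toy_sets[i] for i in chosen)))


def squared_size(chosen):
    return len(chosen) ** 2


print(has_diminishing_returns(toy_coverage, toy_sets))  # coverage objective
print(has_diminishing_returns(squared_size, toy_sets))  # squared cardinality
```

Coverage passes the check because an element can only add items that were not already covered, whereas squared cardinality fails it: adding an element to a larger set yields a larger gain.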