Goto

Collaborating Authors

 sublinear


On Coresets for Logistic Regression

Neural Information Processing Systems

Coresets are one of the central methods to facilitate the analysis of large data. We continue a recent line of research applying the theory of coresets to logistic regression. First, we show the negative result that no strongly sublinear sized coresets exist for logistic regression. To deal with intractable worst-case instances we introduce a complexity measure $\mu(X)$, which quantifies the hardness of compressing a data set for logistic regression.




On Coresets for Logistic Regression

Neural Information Processing Systems

Coresets are one of the central methods to facilitate the analysis of large data. We continue a recent line of research applying the theory of coresets to logistic regression. First, we show the negative result that no strongly sublinear sized coresets exist for logistic regression. To deal with intractable worst-case instances we introduce a complexity measure $\mu(X)$, which quantifies the hardness of compressing a data set for logistic regression.



Review for NeurIPS paper: Information theoretic limits of learning a sparse rule

Neural Information Processing Systems

Summary and Contributions: The paper considers the generalized linear model, in the high-dimensional sparse setting. In this setting the authors prove the existence of an'all-or-nothing phenomenon' (see GZ, RXZ) wherein there exists a function m_*(n) (sublinear in the ambient dimension n) such that - if the # samples m (1 eps) m_*, the error in estimating the signal vanishes; and - if m (1-eps) m_*, the error converges to the'trivial' error of estimating from the prior. Classical statistical or learning theory typically demonstrates a'graceful' decay of error as the number of samples grows large. The current work is in a line of work demonstrating a'sharp cutoff' phenomenon instead. This was initiated by GZ, RXZ in linear regression setting.


Improved Algorithms for Allen's Interval Algebra by Dynamic Programming with Sublinear Partitioning

arXiv.org Artificial Intelligence

Allen's interval algebra is one of the most well-known calculi in qualitative temporal reasoning with numerous applications in artificial intelligence. Recently, there has been a surge of improvements in the fine-grained complexity of NP-hard reasoning tasks, improving the running time from the naive $2^{O(n^2)}$ to $O^*((1.0615n)^{n})$, with even faster algorithms for unit intervals a bounded number of overlapping intervals (the $O^*(\cdot)$ notation suppresses polynomial factors). Despite these improvements the best known lower bound is still only $2^{o(n)}$ (under the exponential-time hypothesis) and major improvements in either direction seemingly require fundamental advances in computational complexity. In this paper we propose a novel framework for solving NP-hard qualitative reasoning problems which we refer to as dynamic programming with sublinear partitioning. Using this technique we obtain a major improvement of $O^*((\frac{cn}{\log{n}})^{n})$ for Allen's interval algebra. To demonstrate that the technique is applicable to more domains we apply it to a problem in qualitative spatial reasoning, the cardinal direction point algebra, and solve it in $O^*((\frac{cn}{\log{n}})^{2n/3})$ time. Hence, not only do we significantly advance the state-of-the-art for NP-hard qualitative reasoning problems, but obtain a novel algorithmic technique that is likely applicable to many problems where $2^{O(n)}$ time algorithms are unlikely.


Particle-based Online Bayesian Sampling

arXiv.org Artificial Intelligence

Online learning has gained increasing interest due Online optimization methods can directly be applied to update to its capability of tracking real-world streaming models that are fully specified by a certain value of its data. Although it has been widely studied in the parameters. Beyond such models, there is another class of setting of frequentist statistics, few works have models known as Bayesian models that treat the parameters considered online learning with the Bayesian sampling as random variables, thus giving an output also as a random problem. In this paper, we study an Online variable (often the expectation is taken as the final output on Particle-based Variational Inference (OPVI) algorithm par with the conventional case). The stochasticity enables that updates a set of particles to gradually Bayesian models to provide diverse outputs, characterize approximate the Bayesian posterior. To reduce prediction uncertainty, and be more robust to adversarial the gradient error caused by the use of stochastic attacks (Hernández-Lobato and Adams, 2015; Li and Gal, approximation, we include a sublinear increasing 2017; Yoon et al., 2018; Zhang et al., 2019; Tolpin et al., batch-size method to reduce the variance.


On Coresets for Logistic Regression

Neural Information Processing Systems

Coresets are one of the central methods to facilitate the analysis of large data. We continue a recent line of research applying the theory of coresets to logistic regression. First, we show the negative result that no strongly sublinear sized coresets exist for logistic regression. To deal with intractable worst-case instances we introduce a complexity measure $\mu(X)$, which quantifies the hardness of compressing a data set for logistic regression. For data sets with bounded $\mu(X)$-complexity, we show that a novel sensitivity sampling scheme produces the first provably sublinear $(1\pm\eps)$-coreset.


Approximate K-Means++ in Sublinear Time

AAAI Conferences

The quality of K-Means clustering is extremely sensitive to proper initialization. The classic remedy is to apply k-means++ to obtain an initial set of centers that is provably competitive with the optimal solution. Unfortunately, k-means++ requires k full passes over the data which limits its applicability to massive datasets. We address this problem by proposing a simple and efficient seeding algorithm for K-Means clustering. The main idea is to replace the exact D2-sampling step in k-means++ with a substantially faster approximation based on Markov Chain Monte Carlo sampling. We prove that, under natural assumptions on the data, the proposed algorithm retains the full theoretical guarantees of k-means++ while its computational complexity is only sublinear in the number of data points. For such datasets, one can thus obtain a provably good clustering in sublinear time. Extensive experiments confirm that the proposed method is competitive with k-means++ on a variety of real-world, large-scale datasets while offering a reduction in runtime of several orders of magnitude.