AITopics | sublinear

On Coresets for Logistic Regression

Neural Information Processing SystemsMar-16-2026, 21:27:48 GMT

Coresets are one of the central methods to facilitate the analysis of large data. We continue a recent line of research applying the theory of coresets to logistic regression. First, we show the negative result that no strongly sublinear sized coresets exist for logistic regression. To deal with intractable worst-case instances we introduce a complexity measure $\mu(X)$, which quantifies the hardness of compressing a data set for logistic regression.

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.96)

Add feedback

On Coresets for Logistic Regression

Alexander Munteanu, Chris Schwiegelshohn, Christian Sohler, David Woodruff

Neural Information Processing SystemsFeb-12-2026, 22:57:43 GMT

First, we show the negativeresult that no strongly sublinear sized coresets existforlogisticregression.

artificial intelligence, machine learning, regression, (16 more...)

Neural Information Processing Systems

Country:

Europe > Germany (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Italy (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.50)

Add feedback

a19744e268754fb0148b017647355b7b-Paper.pdf

Neural Information Processing SystemsFeb-10-2026, 08:59:13 GMT

algorithm, service time, time step, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)
Africa > Ethiopia (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

On Coresets for Logistic Regression

Neural Information Processing SystemsNov-20-2025, 22:18:26 GMT

Coresets are one of the central methods to facilitate the analysis of large data. We continue a recent line of research applying the theory of coresets to logistic regression. First, we show the negative result that no strongly sublinear sized coresets exist for logistic regression. To deal with intractable worst-case instances we introduce a complexity measure $\mu(X)$, which quantifies the hardness of compressing a data set for logistic regression.

coreset, logistic regression, name change, (3 more...)

Neural Information Processing Systems

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.96)

Add feedback

Scheduling jobs with stochastic holding costs

Neural Information Processing SystemsAug-16-2025, 12:26:07 GMT

This question is motivated by several applications.

artificial intelligence, machine learning, service time, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)
Africa > Ethiopia (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Add feedback

Review for NeurIPS paper: Information theoretic limits of learning a sparse rule

Neural Information Processing SystemsJan-25-2025, 16:18:00 GMT

Summary and Contributions: The paper considers the generalized linear model, in the high-dimensional sparse setting. In this setting the authors prove the existence of an'all-or-nothing phenomenon' (see GZ, RXZ) wherein there exists a function m_*(n) (sublinear in the ambient dimension n) such that - if the # samples m (1 eps) m_*, the error in estimating the signal vanishes; and - if m (1-eps) m_*, the error converges to the'trivial' error of estimating from the prior. Classical statistical or learning theory typically demonstrates a'graceful' decay of error as the number of samples grows large. The current work is in a line of work demonstrating a'sharp cutoff' phenomenon instead. This was initiated by GZ, RXZ in linear regression setting.

information theoretic limit, mutual information, neurips paper, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Improved Algorithms for Allen's Interval Algebra by Dynamic Programming with Sublinear Partitioning

Eriksson, Leif, Lagerkvist, Victor

arXiv.org Artificial IntelligenceMay-25-2023

Allen's interval algebra is one of the most well-known calculi in qualitative temporal reasoning with numerous applications in artificial intelligence. Recently, there has been a surge of improvements in the fine-grained complexity of NP-hard reasoning tasks, improving the running time from the naive $2^{O(n^2)}$ to $O^*((1.0615n)^{n})$, with even faster algorithms for unit intervals a bounded number of overlapping intervals (the $O^*(\cdot)$ notation suppresses polynomial factors). Despite these improvements the best known lower bound is still only $2^{o(n)}$ (under the exponential-time hypothesis) and major improvements in either direction seemingly require fundamental advances in computational complexity. In this paper we propose a novel framework for solving NP-hard qualitative reasoning problems which we refer to as dynamic programming with sublinear partitioning. Using this technique we obtain a major improvement of $O^*((\frac{cn}{\log{n}})^{n})$ for Allen's interval algebra. To demonstrate that the technique is applicable to more domains we apply it to a problem in qualitative spatial reasoning, the cardinal direction point algebra, and solve it in $O^*((\frac{cn}{\log{n}})^{2n/3})$ time. Hence, not only do we significantly advance the state-of-the-art for NP-hard qualitative reasoning problems, but obtain a novel algorithmic technique that is likely applicable to many problems where $2^{O(n)}$ time algorithms are unlikely.

algorithm, artificial intelligence, qualitative reasoning, (15 more...)

arXiv.org Artificial Intelligence

2305.1595

Country: Europe > Sweden (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Qualitative Reasoning (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.63)

Add feedback

Particle-based Online Bayesian Sampling

Yang, Yifan, Liu, Chang, Zhang, Zheng

arXiv.org Artificial IntelligenceFeb-28-2023

Online learning has gained increasing interest due Online optimization methods can directly be applied to update to its capability of tracking real-world streaming models that are fully specified by a certain value of its data. Although it has been widely studied in the parameters. Beyond such models, there is another class of setting of frequentist statistics, few works have models known as Bayesian models that treat the parameters considered online learning with the Bayesian sampling as random variables, thus giving an output also as a random problem. In this paper, we study an Online variable (often the expectation is taken as the final output on Particle-based Variational Inference (OPVI) algorithm par with the conventional case). The stochasticity enables that updates a set of particles to gradually Bayesian models to provide diverse outputs, characterize approximate the Bayesian posterior. To reduce prediction uncertainty, and be more robust to adversarial the gradient error caused by the use of stochastic attacks (Hernández-Lobato and Adams, 2015; Li and Gal, approximation, we include a sublinear increasing 2017; Yoon et al., 2018; Zhang et al., 2019; Tolpin et al., batch-size method to reduce the variance.

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2302.14796

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
(3 more...)

Add feedback

On Coresets for Logistic Regression

Munteanu, Alexander, Schwiegelshohn, Chris, Sohler, Christian, Woodruff, David

Neural Information Processing SystemsFeb-14-2020, 18:42:24 GMT

Coresets are one of the central methods to facilitate the analysis of large data. We continue a recent line of research applying the theory of coresets to logistic regression. First, we show the negative result that no strongly sublinear sized coresets exist for logistic regression. To deal with intractable worst-case instances we introduce a complexity measure $\mu(X)$, which quantifies the hardness of compressing a data set for logistic regression. For data sets with bounded $\mu(X)$-complexity, we show that a novel sensitivity sampling scheme produces the first provably sublinear $(1\pm\eps)$-coreset.

coreset, logistic regression, sublinear

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Approximate K-Means++ in Sublinear Time

Bachem, Olivier (Eidgenössische Technische Hochschule Zürich (ETH Zurich)) | Lucic, Mario (Eidgenössische Technische Hochschule Zürich (ETH Zurich)) | Hassani, S. Hamed (Eidgenössische Technische Hochschule Zürich (ETH Zurich)) | Krause, Andreas (Eidgenössische Technische Hochschule Zürich (ETH Zurich))

AAAI ConferencesApr-19-2016

The quality of K-Means clustering is extremely sensitive to proper initialization. The classic remedy is to apply k-means++ to obtain an initial set of centers that is provably competitive with the optimal solution. Unfortunately, k-means++ requires k full passes over the data which limits its applicability to massive datasets. We address this problem by proposing a simple and efficient seeding algorithm for K-Means clustering. The main idea is to replace the exact D2-sampling step in k-means++ with a substantially faster approximation based on Markov Chain Monte Carlo sampling. We prove that, under natural assumptions on the data, the proposed algorithm retains the full theoretical guarantees of k-means++ while its computational complexity is only sublinear in the number of data points. For such datasets, one can thus obtain a provably good clustering in sublinear time. Extensive experiments confirm that the proposed method is competitive with k-means++ on a variety of real-world, large-scale datasets while offering a reduction in runtime of several orders of magnitude.

artificial intelligence, machine learning, mc 2, (18 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.29)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback