AITopics | bit complexity

Collaborating Authors

bit complexity

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Forster Decomposition and Learning Halfspaces with Noise

Neural Information Processing SystemsApr-25-2026, 14:24:29 GMT

AForster transform is an operation that turns a distribution into one with good anticoncentration properties. While a Forster transform does not always exist, we show that any distribution can be efficiently decomposed as a disjoint mixture of few distributions for which a Forster transform exists and can be computed efficiently. As the main application of this result, we obtain the first polynomial-time algorithm for distribution-independent PAC learning of halfspaces in the Massart noise model with strongly polynomial sample complexity, i.e., independent of the bit complexity of the examples. Previous algorithms for this learning problem incurred sample complexity scaling polynomially with the bit complexity, even though such a dependence is not information-theoretically necessary.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.46)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

ForsterDecompositionandLearningHalfspaceswith Noise

Neural Information Processing SystemsFeb-8-2026, 08:46:37 GMT

Recentwork[DGT19]obtained thefirstcomputationally efficient learning algorithm with non-trivial error guarantee for this problem.

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Forster Decomposition and Learning Halfspaces with Noise

Neural Information Processing SystemsDec-24-2025, 00:51:49 GMT

A Forster transform is an operation that turns a multivariate distribution into one with good anti-concentration properties. While a Forster transform does not always exist, we show that any distribution can be efficiently decomposed as a disjoint mixture of few distributions for which a Forster transform exists and can be computed efficiently. As the main application of this result, we obtain the first polynomial-time algorithm for distribution-independent PAC learning of halfspaces in the Massart noise model with strongly polynomial sample complexity, i.e., independent of the bit complexity of the examples. Previous algorithms for this learning problem incurred sample complexity scaling polynomially with the bit complexity, even though such a dependence is not information-theoretically necessary.

complexity, forster decomposition and learning halfspace, name change, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)

Add feedback

The Cost of Robustness: Tighter Bounds on Parameter Complexity for Robust Memorization in ReLU Nets

Kim, Yujun, Moon, Chaewon, Yun, Chulhee

arXiv.org Artificial IntelligenceOct-29-2025

We study the parameter complexity of robust memorization for $\mathrm{ReLU}$ networks: the number of parameters required to interpolate any given dataset with $ε$-separation between differently labeled points, while ensuring predictions remain consistent within a $μ$-ball around each training sample. We establish upper and lower bounds on the parameter count as a function of the robustness ratio $ρ= μ/ ε$. Unlike prior work, we provide a fine-grained analysis across the entire range $ρ\in (0,1)$ and obtain tighter upper and lower bounds that improve upon existing results. Our findings reveal that the parameter complexity of robust memorization matches that of non-robust memorization when $ρ$ is small, but grows with increasing $ρ$.

artificial intelligence, machine learning, proj, (16 more...)

arXiv.org Artificial Intelligence

2510.24643

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Add feedback

Active Learning of Classifiers with Label and Seed Queries (Supplementary Material)

Neural Information Processing SystemsAug-18-2025, 20:51:13 GMT

A.3 Pseudocode of CPLearn and full proof of Theorem 10 We present CPLearn and prove Theorem 10. There are two main obstacles in implementing this process. Let us now turn to the invariants. As shown in Bressan et al. [2021a], this implies Now let us turn to CPLearn. This proves the second invariant.

artificial intelligence, machine learning, query, (18 more...)

Neural Information Processing Systems

Country:

Europe > Italy > Lombardy > Milan (0.04)
Europe > Austria (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.82)

Add feedback

Forster Decomposition and Learning Halfspaces with Noise

Neural Information Processing SystemsOct-10-2024, 03:21:34 GMT

complexity, forster decomposition and learning halfspace, forster transform, (4 more...)

Neural Information Processing Systems

Industry: Education > Focused Education > Special Education (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.89)

Add feedback

Optimal Memorization Capacity of Transformers

Kajitsuka, Tokio, Sato, Issei

arXiv.org Artificial IntelligenceSep-26-2024

In recent years, the Transformer architecture (Vaswani et al., 2017) has played a pivotal role in the field of machine learning, becoming indispensable for a variety of models in the community. In addition to the original breakthroughs in natural language processing, such as the GPT series (Brown et al., 2020; Radford et al., 2018, 2019), it has been observed that in numerous applications, higher accuracy can be achieved by replacing existing models with Transformers. Specifically, models such as the Vision Transformer (Dosovitskiy et al., 2021) in image processing and the Diffusion Transformer (Peebles & Xie, 2023) in generative tasks have demonstrated exceptional performances in a wide variety of tasks. These examples demonstrate how effective and versatile Transformers are for a diverse range of purposes. Although the high performance of Transformers has led to their widespread use in practice, there are ongoing attempts to theoretically analyze what exactly contributes to their superior performance.

input sequence, sequence, transformer, (15 more...)

arXiv.org Artificial Intelligence

2409.17677

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.43)

Add feedback

A Strongly Polynomial Algorithm for Approximate Forster Transforms and its Application to Halfspace Learning

Diakonikolas, Ilias, Tzamos, Christos, Kane, Daniel M.

arXiv.org Artificial IntelligenceDec-6-2022

The Forster transform is a method of regularizing a dataset X (in particular, by placing it in radial isotropic position) while maintaining some of its essential properties. Forster transforms have been an essential tool in a diverse range of settings, including functional analysis [Bar98, GGdOW17], communication complexity [For02], coding theory [DSW17], mixed determinant/volume approximation [GS02], learning theory [HM13, HKLM20, DKT21, DPT21] and the Paulsen problem in frame theory [KLLR18, HM19]. The reader is referred to [AKS20] for a more detailed discussion. Known algorithms for computing (approximate) Forster transforms [HM13, AKS20, DKT21] rely on black-box convex optimization (e.g., the ellipsoid algorithm) and consequently have weakly polynomial runtimes. Here we study the question of whether Forster transforms can be computed in strongly polynomial time. We then leverage Forster transforms for the problem of PAC learning halfspaces (both in the realizable setting and in the presence of semi-random label noise). Intuitively speaking, a Forster transform is a mapping that turns a dataset into one with good anti-concentration properties.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2212.03008

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

List-Decodable Covariance Estimation

Ivkov, Misha, Kothari, Pravesh K.

arXiv.org Machine LearningJun-22-2022

We give the first polynomial time algorithm for \emph{list-decodable covariance estimation}. For any $\alpha > 0$, our algorithm takes input a sample $Y \subseteq \mathbb{R}^d$ of size $n\geq d^{\mathsf{poly}(1/\alpha)}$ obtained by adversarially corrupting an $(1-\alpha)n$ points in an i.i.d. sample $X$ of size $n$ from the Gaussian distribution with unknown mean $\mu_*$ and covariance $\Sigma_*$. In $n^{\mathsf{poly}(1/\alpha)}$ time, it outputs a constant-size list of $k = k(\alpha)= (1/\alpha)^{\mathsf{poly}(1/\alpha)}$ candidate parameters that, with high probability, contains a $(\hat{\mu},\hat{\Sigma})$ such that the total variation distance $TV(\mathcal{N}(\mu_*,\Sigma_*),\mathcal{N}(\hat{\mu},\hat{\Sigma}))<1-O_{\alpha}(1)$. This is the statistically strongest notion of distance and implies multiplicative spectral and relative Frobenius distance approximation for parameters with dimension independent error. Our algorithm works more generally for $(1-\alpha)$-corruptions of any distribution $D$ that possesses low-degree sum-of-squares certificates of two natural analytic properties: 1) anti-concentration of one-dimensional marginals and 2) hypercontractivity of degree 2 polynomials. Prior to our work, the only known results for estimating covariance in the list-decodable setting were for the special cases of list-decodable linear regression and subspace recovery due to Karmarkar, Klivans, and Kothari (2019), Raghavendra and Yau (2019 and 2020) and Bakshi and Kothari (2020). These results need superpolynomial time for obtaining any subconstant error in the underlying dimension. Our result implies the first polynomial-time \emph{exact} algorithm for list-decodable linear regression and subspace recovery that allows, in particular, to obtain $2^{-\mathsf{poly}(d)}$ error in polynomial-time. Our result also implies an improved algorithm for clustering non-spherical mixtures.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

doi: 10.1145/3519935.3520006

2206.10942

Country:

North America > United States > New York (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)
(12 more...)

Genre:

Research Report (0.90)
Workflow (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Private Robust Estimation by Stabilizing Convex Relaxations

Kothari, Pravesh K., Manurangsi, Pasin, Velingker, Ameya

arXiv.org Machine LearningDec-7-2021

We give the first polynomial time and sample $(\epsilon, \delta)$-differentially private (DP) algorithm to estimate the mean, covariance and higher moments in the presence of a constant fraction of adversarial outliers. Our algorithm succeeds for families of distributions that satisfy two well-studied properties in prior works on robust estimation: certifiable subgaussianity of directional moments and certifiable hypercontractivity of degree 2 polynomials. Our recovery guarantees hold in the "right affine-invariant norms": Mahalanobis distance for mean, multiplicative spectral and relative Frobenius distance guarantees for covariance and injective norms for higher moments. Prior works obtained private robust algorithms for mean estimation of subgaussian distributions with bounded covariance. For covariance estimation, ours is the first efficient algorithm (even in the absence of outliers) that succeeds without any condition-number assumptions. Our algorithms arise from a new framework that provides a general blueprint for modifying convex relaxations for robust estimation to satisfy strong worst-case stability guarantees in the appropriate parameter norms whenever the algorithms produce witnesses of correctness in their run. We verify such guarantees for a modification of standard sum-of-squares (SoS) semidefinite programming relaxations for robust estimation. Our privacy guarantees are obtained by combining stability guarantees with a new "estimate dependent" noise injection mechanism in which noise scales with the eigenvalues of the estimated covariance. We believe this framework will be useful more generally in obtaining DP counterparts of robust estimators. Independently of our work, Ashtiani and Liaw [AL21] also obtained a polynomial time and sample private robust estimation algorithm for Gaussian distributions.

algorithm, estimation, estimation algorithm, (15 more...)

arXiv.org Machine Learning

2112.03548

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)
(15 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback