AITopics

2607.01545

Country: North America > United States > Colorado (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.66)

Manoj, Naren Sarayu, Patel, Kumar Kshitij

Distributionally Robust Linear Regression With Block Lewis Weights

arXiv.org Machine Learning Jul-2-2026

Machine learning algorithms and their training datasets have grown substantially in both size and complexity over the past decade. This increased model complexity has made it challenging to interpret and predict their behavior in unobserved scenarios. Hence, many applications that involve societal decisions still rely on simple, interpretable models like linear regression, often after feature engineering. Examples of such applications include predicting national housing prices, estimating wages across industries, forecasting loan amounts across banks, predicting life insurance premiums across groups, and projecting energy consumption across communities [CGKMN24]. A shared safety and sometimes legal concern across the above applications is the potential for wildly different model qualities for different distributions, i.e., outputting a notably worse model for some source data distributions [Dat14; BS16; HPS16; VVB18; SBFVV19; BHJKR21; CGNSG23; Cho16; KLMR18; ADW19; CGKMN24; SVWZ24].

artificial intelligence, lemma 7, machine learning, (19 more...)

2607.00252

Country:

Europe (0.92)
North America > United States > California (0.27)

Genre:

Research Report (0.64)
Workflow (0.45)

Industry: Banking & Finance > Insurance (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.60)

Schliserman, Matan, Buzaglo, Gon, Evron, Itay, Soudry, Daniel

Convergence of Continual Learning in Homogeneous Deep Networks

arXiv.org Machine Learning Jun-30-2026

We characterize weakly regularized continual classification in homogeneous models as sequential projections onto task margin sets. This result generalizes prior analyses restricted to either stationary (single-task) deep models or continual linear models. We show that global convergence generally fails, even for simple models linear in data but nonlinear in parameters. Nevertheless, by leveraging results from nonconvex projection theory, we identify regularity properties of homogeneous deep networks that guarantee local linear convergence under random and cyclic task sequences. Finally, we extend our analysis to continual regression, unifying the framework for homogeneous models.
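The idea of sequential projections can be illustrated with a toy sketch. It covers only the simplest convex special case, not the homogeneous deep networks the abstract studies: each task's margin set is assumed to be a single halfspace {w : a·w ≥ 1}, so the projection has a closed form. The names `project_halfspace` and `cyclic_projections`, and the two example tasks, are hypothetical.

```python
import numpy as np

def project_halfspace(w, a, margin=1.0):
    """Euclidean projection of w onto the halfspace {w : a.w >= margin}."""
    gap = margin - a @ w
    if gap <= 0:
        return w  # constraint already satisfied, nothing to do
    return w + (gap / (a @ a)) * a

def cyclic_projections(constraints, w0, cycles=50):
    """Visit the tasks in a fixed cyclic order, projecting onto each set in turn."""
    w = w0.copy()
    for _ in range(cycles):
        for a in constraints:
            w = project_halfspace(w, a)
    return w

# Two toy "tasks", each reduced to one margin constraint (an assumption).
tasks = [np.array([1.0, 0.0]), np.array([0.6, 0.8])]
w_final = cyclic_projections(tasks, w0=np.zeros(2))
margins = np.array([a @ w_final for a in tasks])
```

In this convex toy case the iterates reach a point that satisfies every task's margin constraint. The abstract's point is that such global convergence generally fails once the model is nonlinear in its parameters.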

artificial intelligence, continual learning, machine learning, (16 more...)

2606.30559

Country: Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.40)

Genre:

Research Report (0.64)
Workflow (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Liu, Shixiang, Yang, Hanming

Adversarial Contamination Meets Hard Thresholding: An Iterative Algorithm with Signal Adaptivity and Minimax Optimality

arXiv.org Machine Learning Jun-29-2026

Pervasive data contamination -- stemming from measurement errors, outliers, or adversarial corruption -- has motivated the development of robust statistical methods. In this context, we propose a two-stage Adversarial Contamination-resistant Iterative Hard Thresholding (AC-IHT) algorithm for high-dimensional regression with contamination. Our nonconvex algorithm achieves minimax near-optimal (up to logarithmic terms) estimation by iteratively updating the coefficient vector and the contamination vector with different thresholding scales. We further demonstrate that our AC-IHT estimator is signal-adaptive: under proper signal conditions, it adaptively attains a sharper estimation rate and more accurate support recovery. Moreover, it enjoys the strong oracle property, laying a theoretical foundation for asymptotic inference. Numerical experiments confirm its superior finite-sample performance. Finally, we discuss theoretical extensions of the proposed procedure to generalized linear models and to heavy-tailed noise settings.
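The alternating update of a coefficient vector and a contamination vector can be sketched as follows. This is a generic alternating hard-thresholding scheme, not the authors' AC-IHT procedure. It assumes a model y = Xβ + γ + noise with s-sparse β and k-sparse γ, and uses top-k thresholding where the paper uses its own thresholding scales. All names and the synthetic data are hypothetical.

```python
import numpy as np

def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def alternating_iht(X, y, s, k, step=0.5, iters=200):
    """Alternate between a contamination update and a sparse gradient step."""
    n, d = X.shape
    beta = np.zeros(d)
    gamma = np.zeros(n)
    for _ in range(iters):
        # Contamination update: absorb the k largest residuals into gamma.
        gamma = hard_threshold(y - X @ beta, k)
        # Coefficient update: gradient step on the remaining residual,
        # followed by hard thresholding to s nonzero entries.
        grad = X.T @ (X @ beta + gamma - y) / n
        beta = hard_threshold(beta - step * grad, s)
    return beta, gamma

# Synthetic example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, d, s, k = 500, 20, 3, 25
beta_star = np.zeros(d)
beta_star[:3] = [3.0, -4.0, 5.0]
X = rng.standard_normal((n, d))
gamma_star = np.zeros(n)
outliers = rng.choice(n, size=k, replace=False)
gamma_star[outliers] = 20.0 * rng.choice([-1.0, 1.0], size=k)
y = X @ beta_star + gamma_star + 0.1 * rng.standard_normal(n)

beta_hat, gamma_hat = alternating_iht(X, y, s, k)
err = np.linalg.norm(beta_hat - beta_star)
```

Here the sparsity levels s and k are passed in as known quantities, which a practical method would have to tune or estimate.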

artificial intelligence, data mining, machine learning, (19 more...)

2606.27685

Genre: Research Report > New Finding (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.62)

Neural Information Processing Systems Jun-23-2026, 06:27:09 GMT

New Parallel and Streaming Algorithms for Directed Densest Subgraph

Finding dense subgraphs is a fundamental problem with applications to community detection, clustering, and data mining. Our work focuses on finding approximate densest subgraphs in directed graphs in computational models for processing massive data. We consider two such models: Massively Parallel Computation (MPC) and semi-streaming. We show how to find a (2+ε)-approximation in O( logn) MPC rounds with sublinear memory per machine.

artificial intelligence, data mining, machine learning, (19 more...)

Country: North America > United States > California (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.93)

Neural Information Processing Systems Jun-23-2026, 04:34:30 GMT

Minimax Adaptive Online Nonparametric Regression over Besov spaces

This adaptive mechanism adjusts the resolution of the predictions over both time and space, yielding refined regret bounds in terms of local regularity. Consequently, in heterogeneous environments, our adaptive guarantees can significantly surpass those obtained by standard global methods.

artificial intelligence, coefficient, machine learning, (19 more...)

Country: Europe > France (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.67)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.41)

Neural Information Processing Systems Jun-23-2026, 04:10:53 GMT

A Cramér-von Mises Approach to Incentivizing Truthful Data Sharing

Modern data marketplaces and data sharing consortia increasingly rely on incentive mechanisms to encourage agents to contribute data. However, schemes that reward agents based on the quantity of submitted data are vulnerable to manipulation, as agents may submit fabricated or low-quality data to inflate their rewards. Prior work has proposed comparing each agent's data against others' to promote honesty: when others contribute genuine data, the best way to minimize discrepancy is to do the same. Yet prior implementations of this idea rely on very strong assumptions about the data distribution (e.g.

artificial intelligence, machine learning, natural language, (21 more...)

Country:

North America > United States (0.28)
Europe (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Information Technology (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.87)
(3 more...)

Neural Information Processing Systems Jun-23-2026, 00:54:35 GMT

Inverse Q-Learning Done Right: Offline Imitation Learning in Qπ-Realizable MDPs

We study the problem of offline imitation learning in Markov decision processes (MDPs), where the goal is to learn a well-performing policy given a dataset of state-action pairs generated by an expert policy. Complementing a recent line of work on this topic that assumes the expert belongs to a tractable class of known policies, we approach this problem from a new angle and leverage a different type of structural assumption about the environment. Specifically, for the class of linear Qπ-realizable MDPs, we introduce a new algorithm called saddle-point offline imitation learning (SPOIL), which is guaranteed to match the performance of any expert up to an additive error ε with access to O(ε^{-2}) samples. Moreover, we extend this result to possibly nonlinear Qπ-realizable MDPs at the cost of a worse sample complexity of order O(ε^{-4}). Finally, our analysis suggests a new loss function for training critic networks from expert data in deep imitation learning. Empirical evaluations on standard benchmarks demonstrate that the neural net implementation of SPOIL is superior to behavior cloning and competitive with state-of-the-art algorithms.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Neural Information Processing Systems Jun-23-2026, 00:37:23 GMT

A Private Approximation of the 2nd-Moment Matrix of Any Subsamplable Input

We study the problem of differentially private second moment estimation and present a new algorithm that achieves strong privacy-utility trade-offs even for worst-case inputs under subsamplability assumptions on the data. We call an input (m,α,β)-subsamplable if a random subsample of size m (or larger) preserves, w.p. at least 1−β, the spectral structure of the original second moment matrix up to a multiplicative factor of 1±α. Building upon subsamplability, we give a recursive algorithmic framework similar to Kamath et al. (2019) that abides by zero-Concentrated Differential Privacy (zCDP) while preserving w.h.p. the accuracy of the second moment estimation up to an arbitrary factor of (1±γ). We then show how to apply our algorithm to approximate the second moment matrix of a distribution D, even when a noticeable fraction of the input are outliers.

algorithm, artificial intelligence, machine learning, (18 more...)

Country: North America > United States (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Security & Privacy (0.93)

Chen, Junren, Mazumdar, Arya

Finite-Sample Performance of Gradient Descent in Logistic Regression with Gaussian Design

arXiv.org Machine Learning Jun-23-2026

We consider the parameter estimation problem in logistic regression with Gaussian design: the estimation of a fixed unknown parameter $θ^*\in \mathbb{R}^d$ ($\|θ^*\|_2\ge 1$) from $n$ i.i.d. samples $\{(x_i,y_i)\}_{i=1}^n$, where $x_i\sim N(0,I_d)$ and $y_i|x_i \sim {\rm Bernoulli}(1/(1+\exp(-x_i^\top θ^*)))$. Our main aim is to characterize the finite-sample estimation performance and convergence behavior of gradient descent (GD) on the maximum likelihood objective (i.e., the logistic loss). Under small $O(1)$ stepsize and $0$ initialization, we show that GD linearly converges to a small neighborhood of $θ^*$ achieving an $\ell_2$ error of order $O(\sqrt{\|θ^*\|_2^5d/n})$. This substantially goes beyond existing theoretical results that lack non-asymptotic estimation error rate and exhibit much slower parameter convergence. We also establish a faster local linear convergence to the same statistical error under a large $Θ(\|θ^*\|_2)$ stepsize. The main technical component is to show that the gradient of the logistic loss satisfies a certain approximate invertibility condition (AIC). To that end, we uniformly control the deviation of the gradient from its population counterpart by covering and peeling arguments, and then show that the population GD is a contraction by a delicate analysis based on the eigenvalues of population Hessian matrices. Finally, we build upon the recent work Matsumoto and Mazumdar (2025) and devise a novel efficient estimator that attains a sharper rate in high dimensions. This indicates that the existing non-asymptotic guarantees exhibit sub-optimal dependence on $\|θ^*\|_2$, and that in many regimes $Θ(\sqrt{\|θ^*\|_2d/n})$ is the tight estimation error rate. Numerical examples are provided to corroborate our theoretical results.
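The setup in this abstract can be simulated with a minimal sketch: Gaussian design, Bernoulli labels from the logistic model, and gradient descent on the averaged logistic loss from zero initialization. The step size 0.5, the iteration count, the problem sizes, and the function names are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sample_logistic_data(n, theta_star, rng):
    """Draw x_i ~ N(0, I_d) and y_i ~ Bernoulli(sigmoid(x_i^T theta_star))."""
    X = rng.standard_normal((n, theta_star.size))
    probs = 1.0 / (1.0 + np.exp(-X @ theta_star))
    y = (rng.random(n) < probs).astype(float)
    return X, y

def logistic_gd(X, y, step=0.5, iters=500):
    """Plain gradient descent on the averaged logistic loss, starting at 0."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(iters):
        probs = 1.0 / (1.0 + np.exp(-X @ theta))
        grad = X.T @ (probs - y) / n  # gradient of the mean negative log-likelihood
        theta -= step * grad
    return theta

rng = np.random.default_rng(0)
d, n = 5, 20000
theta_star = 1.5 * np.ones(d) / np.sqrt(d)  # ||theta_star||_2 = 1.5 >= 1
X, y = sample_logistic_data(n, theta_star, rng)
theta_hat = logistic_gd(X, y)
err = np.linalg.norm(theta_hat - theta_star)
```

Rerunning with different values of n and d gives a rough empirical view of how the ℓ2 error scales with d/n, although a simulation like this cannot confirm the stated rates.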

artificial intelligence, error rate, machine learning, (17 more...)

2606.21683

Genre:

Research Report > New Finding (0.71)
Research Report > Experimental Study (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.61)