AITopics

Country:

Europe (1.00)
North America > United States (0.93)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)
Workflow (0.67)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Neural Information Processing Systems, Apr-24-2026, 23:50:23 GMT

Learning with little mixing

We study square loss in a realizable time-series framework with martingale difference noise. Our main result is a fast rate excess risk bound which shows that whenever a trajectory hypercontractivity condition holds, the risk of the least-squares estimator on dependent data matches the iid rate order-wise after a burn-in time. In comparison, many existing results in learning from dependent data have rates where the effective sample size is deflated by a factor of the mixing-time of the underlying process, even after the burn-in time. Furthermore, our results allow the covariate process to exhibit long range correlations which are substantially weaker than geometric ergodicity. We call this phenomenon learning with little mixing, and present several examples for when it occurs: bounded function classes for which the L2 and L2+ε norms are equivalent, ergodic finite state Markov chains, various parametric models, and a broad family of infinite dimensional ℓ2(N) ellipsoids. By instantiating our main result to system identification of nonlinear dynamics with generalized linear model transitions, we obtain a nearly minimax optimal excess risk bound after only a polynomial burn-in time.

artificial intelligence, machine learning, theorem 4, (18 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
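As a rough illustration of the setting in the "Learning with little mixing" abstract above, the sketch below fits a least-squares estimator to a single dependent trajectory. It assumes the simplest possible instance, a scalar linear system x_{t+1} = a x_t + w_t with i.i.d. Gaussian noise (which is a martingale difference sequence); the function names, the value a = 0.5, and the trajectory length are illustrative choices, not taken from the paper.

```python
import random


def simulate_ar1(a, n, noise_std=1.0, seed=0):
    """Return a trajectory x_0..x_n of x_{t+1} = a * x_t + w_t with Gaussian w_t."""
    rng = random.Random(seed)
    xs = [0.0]
    for _ in range(n):
        xs.append(a * xs[-1] + rng.gauss(0.0, noise_std))
    return xs


def least_squares_ar1(xs):
    """Least-squares estimate of a from consecutive pairs (x_t, x_{t+1})."""
    num = sum(x * y for x, y in zip(xs[:-1], xs[1:]))
    den = sum(x * x for x in xs[:-1])
    return num / den


# Hypothetical ground truth and sample size, chosen only for the demonstration.
true_a = 0.5
trajectory = simulate_ar1(true_a, n=5000)
a_hat = least_squares_ar1(trajectory)
print(f"estimate of a from one dependent trajectory: {a_hat:.3f}")
```

The covariates here are correlated across time, yet the estimate is computed exactly as it would be for independent samples; the abstract's claim concerns how fast such an estimator's risk shrinks, which this sketch does not attempt to verify.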

Neural Information Processing Systems, Feb-13-2026, 13:55:30 GMT

6b8dfb8c0c12e6fafc6c256cb08a5ca7-Paper-Conference.pdf

large language model, machine learning, natural language, (22 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Workflow (0.51)

Industry:

Leisure & Entertainment > Games (0.72)
Materials > Metals & Mining > Iron (0.31)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Neural Information Processing Systems, Feb-7-2026, 19:25:44 GMT

Learning with little mixing

We study square loss in a realizable time-series framework with martingale difference noise.

artificial intelligence, dep, machine learning, (18 more...)

Country: Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Gordaliza, Pedro M., Molchanova, Nataliia, Banus, Jaume, Sanchez, Thomas, Cuadra, Meritxell Bach

Causal Attribution of Model Performance Gaps in Medical Imaging Under Distribution Shifts

arXiv.org Artificial Intelligence, Dec-11-2025

Deep learning models for medical image segmentation suffer significant performance drops due to distribution shifts, but the causal mechanisms behind these drops remain poorly understood. We extend causal attribution frameworks to high-dimensional segmentation tasks, quantifying how acquisition protocols and annotation variability independently contribute to performance degradation. We model the data-generating process through a causal graph and employ Shapley values to fairly attribute performance changes to individual mechanisms. Our framework addresses unique challenges in medical imaging: high-dimensional outputs, limited samples, and complex mechanism interactions. Validation on multiple sclerosis (MS) lesion segmentation across 4 centers and 7 annotators reveals context-dependent failure modes: annotation protocol shifts dominate when crossing annotators (7.4% $\pm$ 8.9% DSC attribution), while acquisition shifts dominate when crossing imaging centers (6.5% $\pm$ 9.1%). This mechanism-specific quantification enables practitioners to prioritize targeted interventions based on deployment context.

artificial intelligence, machine learning, mechanism, (15 more...)

2512.09094

Country: Europe > Switzerland (0.16)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology > Multiple Sclerosis (0.35)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
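The Shapley-value attribution mentioned in the medical imaging abstract above can be sketched for the two mechanisms it names, acquisition and annotation. This is a minimal sketch of exact Shapley values in general, not the authors' framework: the performance-drop table is made up for illustration, and the function and variable names are hypothetical.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}


# Hypothetical Dice-score drops (percentage points) when a given set of
# mechanisms is switched from source to target. Invented numbers, not results.
drop = {
    frozenset(): 0.0,
    frozenset({"acquisition"}): 4.0,
    frozenset({"annotation"}): 6.0,
    frozenset({"acquisition", "annotation"}): 12.0,
}

phi = shapley_values(["acquisition", "annotation"], lambda s: drop[frozenset(s)])
print(phi)
```

With this table the interaction between the two shifts (12 is more than 4 + 6) is split evenly, and the attributions sum to the total drop, which is the sense in which Shapley values attribute a change "fairly".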

arXiv.org Artificial Intelligence, Nov-21-2025

Operon: Incremental Construction of Ragged Data via Named Dimensions

Moon, Sungbin, Park, Jiho, Hwang, Suyoung, Koh, Donghyun, Moon, Seunghyun, Lee, Minhyeong

Modern data processing workflows frequently encounter ragged data: collections with variable-length elements that arise naturally in domains like natural language processing, scientific measurements, and autonomous AI agents. Existing workflow engines lack native support for tracking the shapes and dependencies inherent to ragged data, forcing users to manage complex indexing and dependency bookkeeping manually. We present Operon, a Rust-based workflow engine that addresses these challenges through a novel formalism of named dimensions with explicit dependency relations. Operon provides a domain-specific language where users declare pipelines with dimension annotations that are statically verified for correctness, while the runtime system dynamically schedules tasks as data shapes are incrementally discovered during execution. We formalize the mathematical foundation for reasoning about partial shapes and prove that Operon's incremental construction algorithm guarantees deterministic and confluent execution in parallel settings. The system's explicit modeling of partially-known states enables robust persistence and recovery mechanisms, while its per-task multi-queue architecture achieves efficient parallelism across heterogeneous task types. Empirical evaluation demonstrates that Operon outperforms an existing workflow engine with 14.94x baseline overhead reduction while maintaining near-linear end-to-end output rates as workloads scale, making it particularly suitable for large-scale data generation pipelines in machine learning applications.

artificial intelligence, machine learning, natural language, (19 more...)

2511.1608

Country:

North America > United States (0.28)
Europe > Austria (0.28)
North America > Canada (0.28)

Genre: Workflow (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.66)

Neural Information Processing Systems, Oct-8-2025, 20:44:47 GMT

6b8dfb8c0c12e6fafc6c256cb08a5ca7-Paper-Conference.pdf

large language model, machine learning, natural language, (22 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > China > Beijing > Beijing (0.04)

Industry:

Materials > Metals & Mining (0.96)
Leisure & Entertainment > Games (0.72)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.93)

Konkathi, Bala Rajesh, Tangirala, Arun K.

Causal discovery in deterministic discrete LTI-DAE systems

arXiv.org Artificial Intelligence, Jun-26-2025

Discovering pure causes, or driver variables, in deterministic LTI systems is of vital importance in the data-driven reconstruction of causal networks. Kathari and Tangirala (2022) formulated causal discovery as a constraint identification problem, in which the constraints are identified using a dynamic iterative PCA (DIPCA)-based approach for dynamical systems corrupted with Gaussian measurement errors. The DIPCA-based method works efficiently for dynamical systems devoid of any algebraic relations. However, several dynamical systems operate under feedback control and/or are coupled with conservation laws, leading to differential-algebraic (DAE) or mixed causal systems. In this work, a method called the partition of variables (PoV) is proposed for causal discovery in LTI-DAE systems. This method is superior to that of Kathari and Tangirala (2022), as PoV also works for pure dynamical systems, which are devoid of algebraic equations. The proposed method identifies the causal drivers up to a minimal subset. PoV first deploys DIPCA to determine the number of algebraic relations ($n_a$), the number of dynamical relations ($n_d$), and the constraint matrix. Subsequently, the subsets are identified through an admissible partitioning of the constraint matrix, found by computing its condition number. Case studies are presented to demonstrate the effectiveness of the proposed method.

artificial intelligence, constraint-based reasoning, pure source, (17 more...)

2506.20169

Country:

North America (0.14)
Asia > India > Tamil Nadu > Chennai (0.04)
Asia > India > Maharashtra > Mumbai (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.55)

Gahlawat, Harmender, Zehavi, Meirav

Learning Small Decision Trees with Few Outliers: A Parameterized Perspective

arXiv.org Artificial Intelligence, May-22-2025

Decision trees are a fundamental tool in machine learning for representing, classifying, and generalizing data. It is desirable to construct "small" decision trees, by minimizing either the size ($s$) or the depth ($d$) of the decision tree (DT). Recently, the parameterized complexity of Decision Tree Learning has attracted a lot of attention. We consider a generalization of Decision Tree Learning where given a classification instance $E$ and an integer $t$, the task is to find a "small" DT that disagrees with $E$ in at most $t$ examples. We consider two problems: DTSO and DTDO, where the goal is to construct a DT minimizing $s$ and $d$, respectively. We first establish that both DTSO and DTDO are W[1]-hard when parameterized by $s+\delta_{\max}$ and $d+\delta_{\max}$, respectively, where $\delta_{\max}$ is the maximum number of features in which two differently labeled examples can differ. We complement this result by showing that these problems become FPT if we include the parameter $t$. We also consider the kernelization complexity of these problems and establish several positive and negative results for both DTSO and DTDO.

artificial intelligence, machine learning, test node, (17 more...)

2505.15648

Country: Europe (0.67)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
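The quantities in the decision tree abstract above (size, depth, number of disagreements with the instance, and the maximum feature difference between differently labeled examples) can be made concrete with a small sketch. It assumes binary features, a nested-tuple tree encoding, and size counted as the number of test nodes; all of these are illustrative assumptions, as are the function names and the toy instance.

```python
def classify(tree, example):
    """Follow tests from the root until a leaf label is reached."""
    while tree[0] == "test":
        _, feature, if_zero, if_one = tree
        tree = if_one if example[feature] else if_zero
    return tree[1]


def size(tree):
    """Number of test nodes (assumed meaning of the parameter s)."""
    return 0 if tree[0] == "leaf" else 1 + size(tree[2]) + size(tree[3])


def depth(tree):
    """Longest root-to-leaf path, counted in test nodes (the parameter d)."""
    return 0 if tree[0] == "leaf" else 1 + max(depth(tree[2]), depth(tree[3]))


def disagreements(tree, instance):
    """Number of examples the tree labels differently from the instance."""
    return sum(1 for feats, label in instance if classify(tree, feats) != label)


def delta_max(instance):
    """Max number of features in which two differently labeled examples differ."""
    best = 0
    for i, (fa, la) in enumerate(instance):
        for fb, lb in instance[i + 1:]:
            if la != lb:
                best = max(best, sum(x != y for x, y in zip(fa, fb)))
    return best


# Toy classification instance E: (feature vector, label) pairs.
E = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 0)]
# One test on feature 0: predict 0 if it is 0, else predict 1.
tree = ("test", 0, ("leaf", 0), ("leaf", 1))
t = 1  # allowed number of outliers
print(size(tree), depth(tree), disagreements(tree, E) <= t, delta_max(E))
```

Here a single test node suffices once one outlier is tolerated, which is the kind of trade-off between tree size and the budget $t$ that the abstract parameterizes.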

arXiv.org Artificial Intelligence, May-19-2025

Towards Robust Spiking Neural Networks: Mitigating Heterogeneous Training Vulnerability via Dominant Eigencomponent Projection

Zhang, Desong, Hu, Jia, Min, Geyong

Spiking Neural Networks (SNNs) process information via discrete spikes, enabling them to operate at remarkably low energy levels. However, our experimental observations reveal a striking vulnerability when SNNs are trained using the mainstream method--direct encoding combined with backpropagation through time (BPTT): even a single backward pass on data drawn from a slightly different distribution can lead to catastrophic network collapse. Our theoretical analysis attributes this vulnerability to the repeated inputs inherent in direct encoding and the gradient accumulation characteristic of BPTT, which together produce an exceptionally large Hessian spectral radius. To address this challenge, we develop a hyperparameter-free method called Dominant Eigencomponent Projection (DEP). By orthogonally projecting gradients to precisely remove their dominant components, DEP effectively reduces the Hessian spectral radius, thereby preventing SNNs from settling into sharp minima. Extensive experiments demonstrate that DEP not only mitigates the vulnerability of SNNs to heterogeneous data poisoning, but also significantly enhances overall robustness compared to key baselines, providing strong support for safer and more reliable SNN deployment.

artificial intelligence, arxivpreprintarxiv, machine learning, (15 more...)

2505.11134

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
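The orthogonal projection step described in the spiking neural network abstract above can be sketched as removing one direction from a gradient vector. The abstract does not say how DEP identifies the dominant component, so the sketch simply takes that direction as given; the helper names and the example vectors are hypothetical and this is not the authors' implementation.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def remove_component(grad, direction):
    """Project grad onto the orthogonal complement of the given direction."""
    norm = math.sqrt(dot(direction, direction))
    unit = [d / norm for d in direction]
    coeff = dot(grad, unit)
    return [g - coeff * u for g, u in zip(grad, unit)]


# Assumed inputs: a flattened gradient and a dominant direction supplied by
# some other procedure (for example a leading singular vector).
grad = [3.0, 4.0, 0.0]
dominant = [1.0, 0.0, 0.0]
projected = remove_component(grad, dominant)
print(projected)
```

After the projection the gradient has no component along the removed direction, while everything orthogonal to it is left unchanged.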