AITopics | col

Collaborating Authors

col

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On the Spectral Structure and Objective Equivalence of Orthogonal Multilabel Fisher Discriminants

Keith-Norambuena, Brian, Bekios-Calfa, Juan

arXiv.org Machine LearningMay-6-2026

We provide a unified theoretical analysis of Linear Discriminant Analysis with simultaneous multilabel scatter matrix formulations and Stiefel orthogonality constraints. Our contributions span both algebraic structure and statistical guarantees. On the algebraic side, we characterize the rank of the multilabel between-class scatter matrix, showing that the effective discriminant dimensionality can strictly exceed the classical single-label bound of $C-1$; we establish a multilabel partition of variance and prove that all four Fisher objectives are equivalent under the $W^\top S_t^{ML} W = I_r$ constraint while characterizing their divergence under the Stiefel constraint; and we prove a two-sided label-distance preservation bound relating projected distances to Hamming distances in label space. On the statistical side, we establish a finite-sample $O(k_{\max}\sqrt{d\log d/n}/gap_r)$ bound on the subspace estimation error under sub-Gaussian noise with a matching $Ω(σ^2 d/(n\,gap_r))$ minimax lower bound, establishing a near-minimax-optimal rate (matching up to logarithmic and $k_{\max}$ factors) for multilabel discriminant subspace estimation. We further provide high-probability distance concentration, robustness guarantees under label interactions, and a regularization analysis preserving the spectral structure when $d \gg n$. All results are verified numerically on synthetic data generated from the linear label-effect model, covering both the algebraic identities and the multilabel-specific quantities ($k_{\max}$, $κ(S_t^{ML})$, $\|Γ/n\|_2$, $Δ_r$) that govern the statistical bounds. The numerical experiments are designed as a sanity check for the theorems rather than as an empirical benchmark; evaluation on real multilabel datasets is left to future work targeting application-oriented venues.

data mining, machine learning, smlt, (19 more...)

arXiv.org Machine Learning

2605.03283

Country: North America > United States (0.67)

Genre: Research Report (0.49)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Training Details and Model

Neural Information Processing SystemsApr-25-2026, 16:03:12 GMT

We set the patch size to be 8. Our model is optimized by AdamW optimizer [3] with a learning rate2 of 0.0004, 250k training steps, linearly warm-up of 5000 steps and an exponentially weight-decaying3 schedule. The gradient norm is clipped at 1. We use Pytorch automatic mixed-precision and data4 paralleling for training acceleration. All models are trained on 4 Nvidia RTXA5000 GPUs with a5 total batch size of 128.

artificial intelligence, dataset, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > Canada (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Add feedback

Object centric Cyclic Walks between Parts and Whole

Neural Information Processing SystemsApr-25-2026, 16:03:09 GMT

Learning object-centric representations from complex natural environments enables both humans and machines with reasoning abilities from low-level perceptual features. To capture compositional entities of the scene, we proposed cyclic walks between perceptual features extracted from vision transformers and object entities. First, a slot-attention module interfaces with these perceptual features and produces a finite set of slot representations. These slots can bind to any object entities in the scene via inter-slot competitions for attention. Next, we establish entity-feature correspondence with cyclic walks along high transition probability based on the pairwise similarity between perceptual features (aka "parts") and slot-binded object representations (aka "whole").

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
(2 more...)

Add feedback

FedAvgwithFineTuning: LocalUpdatesLeadto RepresentationLearning

Neural Information Processing SystemsFeb-19-2026, 02:07:11 GMT

Federated Learning (FL) [1]provides acommunication-efficient andprivacypreserving means to learn from data distributed across clients such as cell phones, autonomous vehicles, and hospitals. FL aims for each client to benefit from collaborating in the learning process without sacrificing data privacy or paying a substantial communication cost. Federated Averaging (FedAvg) [1] is the predominant FL algorithm.

artificial intelligence, arxivpreprintarxiv, machine learning, (16 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Industry: Information Technology > Security & Privacy (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

Mesh-TensorFlow: Deep Learning for Supercomputers

Noam Shazeer, Youlong Cheng, Niki Parmar, Dustin Tran, Ashish Vaswani, Penporn Koanantakool, Peter Hawkins, HyoukJoong Lee, Mingsheng Hong, Cliff Young, Ryan Sepassi, Blake Hechtman

Neural Information Processing SystemsFeb-12-2026, 15:42:17 GMT

However,batch-splitting suffers from problems including the inability to train very large models (due to memory constraints), high latency, and inefficiency at small batch sizes. All of these can be solved by more general distribution strategies (model-parallelism). Unfortunately,efficient model-parallel algorithms tend tobe complicated todiscover, describe, and to implement, particularly on large clusters.

artificial intelligence, dimension, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Nevada > Washoe County > Reno (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Sample Complexity of Interventional Causal Representation Learning

Neural Information Processing SystemsFeb-12-2026, 03:51:17 GMT

Consider a data-generation process that transforms low-dimensional latent causally-related variables to high-dimensional observed variables. Causal representation learning (CRL) is the process of using the observed data to recover the latent causal variables and the causal structure among them.

artificial intelligence, machine learning, recovery, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Data Science (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Auxiliary Losses for Learning Generalizable Concept-based Models

Neural Information Processing SystemsFeb-11-2026, 23:48:59 GMT

Bottleneck Models (CBMs) have gained popularity since their introduction.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
(2 more...)

Add feedback

Provable Acceleration of Nesterov's Accelerated Gradient for Rectangular Matrix Factorization and Linear Neural Networks

Neural Information Processing SystemsFeb-11-2026, 11:17:50 GMT

We study the convergence rate of first-order methods for rectangular matrix factorization, which is a canonical nonconvex optimization problem.

artificial intelligence, initialization, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Africa > Senegal > Kolda Region > Kolda (0.05)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.65)

Add feedback

Global Convergence in Training Large-Scale Transformers

Neural Information Processing SystemsFeb-10-2026, 22:47:36 GMT

Despite the widespread success of Transformers across various domains, their optimization guarantees in large-scale model settings are not well-understood.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: