AITopics

2510.01588

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

arXiv.org Artificial Intelligence Oct-3-2025

SSTAG: Structure-Aware Self-Supervised Learning Method for Text-Attributed Graphs

Liu, Ruyue, Yin, Rong, Bo, Xiangzhen, Hao, Xiaoshuai, Liu, Yong, Zhong, Jinwen, Ma, Can, Wang, Weiping

Large-scale pretrained models have revolutionized Natural Language Processing (NLP) and Computer Vision (CV), showcasing remarkable cross-domain generalization abilities. However, in graph learning, models are typically trained on individual graph datasets, limiting their capacity to transfer knowledge across different graphs and tasks. This approach also relies heavily on large volumes of annotated data, which presents a significant challenge in resource-constrained settings. Unlike NLP and CV, graph-structured data presents unique challenges due to its inherent heterogeneity, including domain-specific feature spaces and structural diversity across various applications. To address these challenges, we propose a novel structure-aware self-supervised learning method for Text-Attributed Graphs (SSTAG). By leveraging text as a unified representation medium for graph learning, SSTAG bridges the gap between the semantic reasoning of Large Language Models (LLMs) and the structural modeling capabilities of Graph Neural Networks (GNNs). Our approach introduces a dual knowledge distillation framework that co-distills both LLMs and GNNs into structure-aware multilayer perceptrons (MLPs), enhancing the scalability of large-scale TAGs. Additionally, we introduce an in-memory mechanism that stores typical graph representations, aligning them with memory anchors in an in-memory repository to integrate invariant knowledge, thereby improving the model's generalization ability. Extensive experiments demonstrate that SSTAG outperforms state-of-the-art models on cross-domain transfer learning tasks, achieves exceptional scalability, and reduces inference costs while maintaining competitive performance.

large language model, machine learning, natural language, (21 more...)

2510.01248

Country:

Asia > China (0.28)
North America > Mexico (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
(2 more...)

Neural Information Processing Systems Oct-2-2025, 23:03:45 GMT

Reviewer

We thank the reviewers for their kind and thoughtful comments on our work. Below, we respond to reviewer-specific comments. Our claim that a standard multilayer perceptron "fails to learn high frequencies in theory" is based on the theoretical ... For example, in the abstract, modifying "a standard MLP fails to learn high frequencies both ... We directly extend the 1D experiment in Figure 4 to a two-dimensional setting in Section 1.4 of the supplement, and ...

artificial intelligence, eigenvalue, machine learning, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.57)

Neural Information Processing Systems Oct-2-2025, 20:01:44 GMT

Multi-View Perceptron: a Deep Model for Learning Face Identity and View Representations

Zhenyao Zhu, Ping Luo, Xiaogang Wang, Xiaoou Tang

Neural Information Processing Systems http://nips.cc/

face identity and view representation, learning face identity, multi-view perceptron, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.40)

Yuki Yoshida, Masato Okada

Data-Dependence of Plateau Phenomenon in Learning with Neural Network --- Statistical Mechanical Analysis

Neural Information Processing Systems Oct-2-2025, 10:12:18 GMT

artificial intelligence, machine learning, order parameter, (14 more...)

Country: Asia > Japan (0.15)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.31)

Prateek Jain, Nagarajan Natarajan, Ambuj Tewari

Predtron: A Family of Online Algorithms for General Prediction Problems

Neural Information Processing Systems Oct-2-2025, 08:18:05 GMT

Modern prediction problems arising in multilabel learning and learning to rank pose unique challenges to the classical theory of supervised learning. These problems have large prediction and label spaces of a combinatorial nature and involve sophisticated loss functions. We offer a general framework to derive mistake-driven online algorithms and associated loss bounds. The key ingredients in our framework are a general loss function, a general vector space representation of predictions, and a notion of margin with respect to a general norm. Our general algorithm, Predtron, yields the perceptron algorithm and its variants when instantiated on classic problems such as binary classification, multiclass classification, ordinal regression, and multilabel classification. For multilabel ranking and subset ranking, we derive novel algorithms, notions of margins, and loss bounds. A simulation study confirms the behavior predicted by our bounds and demonstrates the flexibility of the design choices in our framework.
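For orientation, the classic binary perceptron that the abstract names as a special case can be sketched as follows. This is the textbook mistake-driven update, not the Predtron algorithm itself; the function name `perceptron_train`, the `epochs` parameter, and the toy data are illustrative assumptions.

```python
def perceptron_train(examples, epochs=10):
    """Classic binary perceptron (textbook version, not Predtron).

    examples: list of (features, label) pairs with label in {-1, +1}.
    Returns the weight vector and the total number of mistakes made.
    """
    dim = len(examples[0][0])
    w = [0.0] * dim
    mistakes = 0
    for _ in range(epochs):
        clean_pass = True
        for x, y in examples:
            score = sum(wi * xi for wi, xi in zip(w, x))
            # Mistake-driven: the weights change only when the current
            # prediction is wrong (or sits exactly on the boundary).
            if y * score <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
                clean_pass = False
        if clean_pass:
            break  # a full pass without mistakes: the data is separated
    return w, mistakes


# Hypothetical, linearly separable toy data.
data = [
    ([1.0, 1.0], 1),
    ([2.0, 0.5], 1),
    ([-1.0, -1.0], -1),
    ([-0.5, -2.0], -1),
]
w, mistakes = perceptron_train(data)
```

On separable data the number of updates is bounded in terms of the margin, which is the kind of loss bound the abstract says the framework generalizes to other prediction problems.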

artificial intelligence, inductive learning, machine learning, (18 more...)

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Asia > India (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.59)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Dan Rosenbaum, Yair Weiss

The Return of the Gating Network: Combining Generative Models and Discriminative Training in Natural Image Priors

Neural Information Processing Systems Oct-2-2025, 07:13:10 GMT

artificial intelligence, machine learning, restoration, (16 more...)

Country:

Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.05)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.46)

Neural Information Processing Systems Oct-2-2025, 02:21:12 GMT

latent space components, which traditionally assume a Euclidean metric over the latent space, by their hyperbolic

We thank the reviewers for their time, helpful feedback, and advice. We thank them for their kind words, and hope to address any remaining concerns below. We agree and propose the following replacement: "We show that replacing VAE ... We will improve that for the next version. In more detail, we compared three decoders: (i) a standard "vanilla" multilayer perceptron (implicitly relying on the ... This ablation study shows that linearising the Poincaré ball through the logarithm map (i.e. ... The analogy is not limited to the two-dimensional case.

artificial intelligence, decoder, machine learning, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.56)

arXiv.org Artificial Intelligence Oct-2-2025

Combating Noisy Labels via Dynamic Connection Masking

Zhang, Xinlei, Liu, Fan, Zhang, Chuanyi, Cheng, Fan, Zheng, Yuhui

Noisy labels are inevitable in real-world scenarios. Due to the strong capacity of deep neural networks to memorize corrupted labels, these noisy labels can cause significant performance degradation. Existing research on mitigating the negative effects of noisy labels has mainly focused on robust loss functions and sample selection, with comparatively limited exploration of regularization in model architecture. Inspired by the sparsity regularization used in Kolmogorov-Arnold Networks (KANs), we propose a Dynamic Connection Masking (DCM) mechanism for both Multi-Layer Perceptron Networks (MLPs) and KANs to enhance the robustness of classifiers against noisy labels. The mechanism can adaptively mask less important edges during training by evaluating their information-carrying capacity. Through theoretical analysis, we demonstrate its efficiency in reducing gradient error. Our approach can be seamlessly integrated into various noise-robust training methods to build more robust deep networks, including robust loss functions, sample selection strategies, and regularization techniques. Extensive experiments on both synthetic and real-world benchmarks demonstrate that our method consistently outperforms state-of-the-art (SOTA) approaches. Furthermore, we are also the first to investigate KANs as classifiers against noisy labels, revealing their superior noise robustness over MLPs in real-world noisy scenarios. Our code will soon be publicly available.
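The general idea of masking low-importance edges can be illustrated with a rough sketch. The abstract does not specify how DCM scores an edge's information-carrying capacity, so the code below swaps in plain weight magnitude as a stand-in score; `mask_weakest_edges`, `apply_mask`, and `drop_fraction` are hypothetical names, and this is not the authors' method.

```python
def mask_weakest_edges(weights, drop_fraction):
    """Build a 0/1 mask that drops the lowest-scoring fraction of edges.

    ASSUMPTION: |weight| is used as the importance score. The paper's own
    criterion (information-carrying capacity) is not given in the abstract.
    """
    edges = [
        (abs(w), i, j)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    ]
    edges.sort()  # ascending: least important edges come first
    n_drop = int(len(edges) * drop_fraction)
    dropped = {(i, j) for _, i, j in edges[:n_drop]}
    return [
        [0 if (i, j) in dropped else 1 for j in range(len(row))]
        for i, row in enumerate(weights)
    ]


def apply_mask(weights, mask):
    """Zero out masked connections; kept connections are unchanged."""
    return [
        [w * m for w, m in zip(w_row, m_row)]
        for w_row, m_row in zip(weights, mask)
    ]


# Hypothetical 2x3 layer: recompute the mask at each training step so the
# set of masked edges can change as the weights evolve.
layer = [[0.9, -0.05, 0.4], [0.01, -0.7, 0.2]]
mask = mask_weakest_edges(layer, drop_fraction=0.5)
masked_layer = apply_mask(layer, mask)
```

Recomputing the mask during training, rather than pruning once, is what would make such a scheme "dynamic" in the sense the abstract describes.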

artificial intelligence, machine learning, noisy label, (16 more...)

2508.09697

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Jha, Nandan Kumar, Reagen, Brandon

Spectral Scaling Laws in Language Models: How Effectively Do Feed-Forward Networks Use Their Latent Space?

arXiv.org Artificial Intelligence Oct-2-2025

As large language models (LLMs) scale, the question is not only how large they become, but how much of their capacity is effectively utilized. Existing scaling laws relate model size to loss, yet overlook how components exploit their latent space. We study feed-forward networks (FFNs) and recast width selection as a spectral utilization problem. Using a lightweight diagnostic suite -- Hard Rank (participation ratio), Soft Rank (Shannon rank), Spectral Concentration, and the composite Spectral Utilization Index (SUI) -- we quantify how many latent directions are meaningfully activated across LLaMA, GPT-2, and nGPT families. Our key finding is an asymmetric spectral scaling law: soft rank follows an almost perfect power law with FFN width, while hard rank grows only sublinearly and with high variance. This asymmetry suggests that widening FFNs mostly adds low-energy tail directions, while dominant-mode subspaces saturate early. Moreover, at larger widths, variance further collapses into a narrow subspace, leaving much of the latent space under-utilized. These results recast FFN width selection as a principled trade-off between tail capacity and dominant-mode capacity, offering concrete guidance for inference-efficient LLM design.
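The two rank measures named in the abstract have common textbook definitions, sketched below for a spectrum of non-negative eigenvalues (for example, of an activation covariance matrix). The paper's exact normalization and the definition of SUI are not given here, so treat the formulas and the function names `participation_ratio` and `shannon_rank` as assumptions.

```python
import math


def participation_ratio(eigenvalues):
    """Participation ratio: (sum of eigenvalues)^2 / (sum of squares).

    Dominated by the largest eigenvalues, so it tracks the number of
    dominant modes ("hard rank" in the abstract's terminology).
    """
    total = sum(eigenvalues)
    sum_sq = sum(v * v for v in eigenvalues)
    return total * total / sum_sq


def shannon_rank(eigenvalues):
    """Shannon rank: exp of the entropy of the normalized spectrum.

    More sensitive to small tail eigenvalues than the participation
    ratio ("soft rank" in the abstract's terminology).
    """
    total = sum(eigenvalues)
    probs = [v / total for v in eigenvalues if v > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


# Hypothetical spectra: a flat one uses every direction equally, while a
# skewed one concentrates energy in a single dominant mode.
flat = [1.0, 1.0, 1.0, 1.0]
skewed = [4.0, 1.0, 1.0]
flat_hard, flat_soft = participation_ratio(flat), shannon_rank(flat)
skew_hard, skew_soft = participation_ratio(skewed), shannon_rank(skewed)
```

Both measures equal the dimension for a flat spectrum and fall toward 1 as energy concentrates in one direction; the soft rank is never smaller than the hard rank, which is consistent with the tail-versus-dominant-mode distinction drawn in the abstract.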

large language model, machine learning, natural language, (17 more...)

2510.00537

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)