AITopics | attention coefficient

6c7297baffe5c85ea1d9e1ccb1222ab8-Paper-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 15:25:51 GMT

learning, module, shgp, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (0.68)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.93)

Simple and Efficient Heterogeneous Temporal Graph Neural Network

Wang, Yili, Huang, Tairan, He, Changlong, Li, Qiutong, Gao, Jianliang

arXiv.org Artificial Intelligence | Oct-22-2025

Heterogeneous temporal graphs (HTGs) are ubiquitous data structures in the real world. Recently, to enhance representation learning on HTGs, numerous attention-based neural networks have been proposed. Despite these successes, existing methods rely on a decoupled temporal and spatial learning paradigm, which weakens interactions of spatio-temporal information and leads to high model complexity. To bridge this gap, we propose a novel learning paradigm for HTGs called Simple and Efficient Heterogeneous Temporal Graph Neural Network (SE-HTGNN). Specifically, we innovatively integrate temporal modeling into spatial learning via a novel dynamic attention mechanism, which retains attention information from historical graph snapshots to guide subsequent attention computation, thereby improving the overall discriminative representation learning of HTGs. Additionally, to comprehensively and adaptively understand HTGs, we leverage large language models to prompt SE-HTGNN, enabling the model to capture the implicit properties of node types as prior knowledge. Extensive experiments demonstrate that SE-HTGNN achieves up to 10x speed-up over the latest state-of-the-art baseline while maintaining the best forecasting accuracy.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.18467

Country: Asia > China (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Information Technology (0.67)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering

Neural Information Processing Systems | Aug-15-2025, 15:36:57 GMT

Recent self-supervised pre-training methods on Heterogeneous Information Networks (HINs) have shown promising competitiveness over traditional semi-supervised Heterogeneous Graph Neural Networks (HGNNs).

learning, module, shgp, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (0.68)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Graph Collaborative Attention Network for Link Prediction in Knowledge Graphs

Hoang-Minh, Thanh

arXiv.org Artificial Intelligence | Jul-31-2025

Knowledge graphs offer a structured representation of real-world entities and their relationships, enabling a wide range of applications from information retrieval to automated reasoning. In this paper, we conduct a systematic comparison between traditional rule-based approaches and modern deep learning methods for link prediction. We focus on KBGAT, a graph neural network model that leverages multi-head attention to jointly encode both entity and relation features within local neighborhood structures. To advance this line of research, we introduce GCAT (Graph Collaborative Attention Network), a refined model that enhances context aggregation and interaction between heterogeneous nodes. Experimental results on four widely-used benchmark datasets demonstrate that GCAT not only consistently outperforms rule-based methods but also achieves competitive or superior performance compared to existing neural embedding models. Our findings highlight the advantages of attention-based architectures in capturing complex relational patterns for knowledge graph completion tasks.

artificial intelligence, graph, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2507.03947

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SEMA: a Scalable and Efficient Mamba like Attention via Token Localization and Averaging

Tran, Nhat Thanh, Xue, Fanghui, Zhang, Shuai, Lyu, Jiancheng, Zheng, Yunling, Qi, Yingyong, Xin, Jack

arXiv.org Artificial Intelligence | Jun-11-2025

Attention is the critical component of a transformer. Yet the quadratic computational complexity of vanilla full attention in the input size and the inability of its linear attention variant to focus have been challenges for computer vision tasks. We provide a mathematical definition of generalized attention and formulate both vanilla softmax attention and linear attention within the general framework. We prove that generalized attention disperses, that is, as the number of keys tends to infinity, the query assigns equal weights to all keys. Motivated by the dispersion property and recent developments in the Mamba form of attention, we design Scalable and Efficient Mamba like Attention (SEMA), which utilizes token localization to avoid dispersion and maintain focusing, complemented by theoretically consistent arithmetic averaging to capture the global aspect of attention. We support our approach on ImageNet-1k, where classification results show that SEMA is a scalable and effective alternative beyond linear attention, outperforming recent vision Mamba models on increasingly larger scales of images at similar model parameter sizes.
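The dispersion property mentioned in this abstract can be illustrated numerically for the vanilla softmax case. The sketch below is not the paper's proof or its generalized-attention framework. It assumes attention scores bounded in [-bound, bound] (drawn uniformly at random here purely for illustration), in which case every softmax weight is at most exp(2 * bound) / n, so the largest weight shrinks toward zero as the number of keys n grows. The function names are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def max_attention_weight(num_keys, bound=1.0, seed=0):
    """Largest softmax weight for scores drawn uniformly from [-bound, bound].

    The uniform draw is an illustrative assumption; only the boundedness
    of the scores matters for the exp(2 * bound) / num_keys upper limit.
    """
    rng = random.Random(seed)
    scores = [rng.uniform(-bound, bound) for _ in range(num_keys)]
    return max(softmax(scores))


for n in (10, 100, 1000, 10000):
    w = max_attention_weight(n)
    # Each weight is e^{s_i} / sum_j e^{s_j} <= e^{bound} / (n * e^{-bound}).
    print(n, w, "upper limit:", math.exp(2.0) / n)
```

With bounded scores, no single key can hold the query's focus once n is large, which is the motivation the abstract gives for localizing tokens.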

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2506.08297

Country: North America > United States > California (0.28)

Genre:

Research Report (0.70)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

IMPA-HGAE:Intra-Meta-Path Augmented Heterogeneous Graph Autoencoder

Lin, Di, Ren, Wanjing, Li, Xuanbin, Zhang, Rui

arXiv.org Artificial Intelligence | Jun-10-2025

Self-supervised learning (SSL) methods have been increasingly applied to diverse downstream tasks due to their superior generalization capabilities and low annotation costs. However, most existing heterogeneous graph SSL models convert heterogeneous graphs into homogeneous ones via meta-paths for training, which only leverages information from nodes at both ends of meta-paths while under-utilizing the heterogeneous node information along the meta-paths. To address this limitation, this paper proposes a novel framework named IMPA-HGAE to enhance target node embeddings by fully exploiting internal node information along meta-paths. Experimental results validate that IMPA-HGAE achieves superior performance on heterogeneous datasets. Furthermore, this paper introduces innovative masking strategies to strengthen the representational capacity of generative SSL models on heterogeneous graph data. Additionally, this paper discusses the interpretability of the proposed method and potential future directions for generative self-supervised learning in heterogeneous graphs. This work provides insights into leveraging meta-path-guided structural semantics for robust representation learning in complex graph scenarios.

artificial intelligence, information, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2506.06809

Country: North America > United States (0.15)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Uncovering Issues in the Radio Access Network by Looking at the Neighbors

Suárez-Varela, José, Lutu, Andra

arXiv.org Artificial Intelligence | Apr-22-2025

Mobile network operators (MNOs) manage Radio Access Networks (RANs) with massive amounts of cells over multiple radio generations (2G-5G). To handle such complexity, operations teams rely on monitoring systems, including anomaly detection tools that identify unexpected behaviors. In this paper, we present c-ANEMON, a Contextual ANomaly dEtection MONitor for the RAN based on Graph Neural Networks (GNNs). Our solution captures spatio-temporal variations by analyzing the behavior of individual cells in relation to their local neighborhoods, enabling the detection of anomalies that are independent of external mobility factors. This, in turn, allows focusing on anomalies associated with network issues (e.g., misconfigurations, equipment failures). We evaluate c-ANEMON using real-world data from a large European metropolitan area (7,890 cells; 3 months). First, we show that the GNN model within our solution generalizes effectively to cells from previously unseen areas, suggesting the possibility of using a single model across extensive deployment regions. Then, we analyze the anomalies detected by c-ANEMON through manual inspection and define several categories of long-lasting anomalies (6+ hours). Notably, 45.95% of these anomalies fall into a category that is more likely to require intervention by operations teams.

anomaly, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2504.14686

Country: Europe (0.46)

Genre: Research Report (0.82)

Industry:

Information Technology > Networks (0.48)
Telecommunications > Networks (0.48)
Water & Waste Management > Water Management (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Weighted Graph Structure Learning with Attention Denoising for Node Classification

Wang, Tingting, Su, Jiaxin, Liu, Haobing, Jiang, Ruobing

arXiv.org Artificial Intelligence | Mar-15-2025

Node classification in graphs aims to predict the categories of unlabeled nodes using a small set of labeled nodes. However, weighted graphs often contain noisy edges and anomalous edge weights, which can distort fine-grained relationships between nodes and hinder accurate classification. We propose the Edge Weight-aware Graph Structure Learning (EWGSL) method, which combines weight learning and graph structure learning to address these issues. EWGSL improves node classification by redefining attention coefficients in graph attention networks to incorporate node features and edge weights. It also applies graph structure learning to sparsify attention coefficients and uses a modified InfoNCE loss function to enhance performance by adapting to denoised graph weights. Extensive experimental results show that EWGSL has an average Micro-F1 improvement of 17.8% compared to the best baseline.
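To make the idea of edge-weight-aware attention coefficients concrete, here is a minimal sketch in plain Python. It is not EWGSL's formulation, which the abstract does not spell out. It starts from the standard graph attention score, LeakyReLU(a_src . h_i + a_dst . h_j) over already-projected features, and then makes two illustrative assumptions: the positive edge weight enters as an additive log(w_ij) term, so the coefficient is proportional to w_ij * exp(score), and sparsification is a simple top-k selection followed by renormalization. All names and example values are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def attention_coefficients(h, neighbors, edge_weight, a_src, a_dst, i):
    """GAT-style coefficients of node i over its neighbors.

    Assumption: the edge weight w_ij > 0 is folded in as log(w_ij),
    which is an illustrative choice, not the EWGSL definition.
    """
    scores = {}
    for j in neighbors[i]:
        e = leaky_relu(dot(a_src, h[i]) + dot(a_dst, h[j]))
        scores[j] = e + math.log(edge_weight[(i, j)])
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {j: math.exp(s - m) for j, s in scores.items()}
    z = sum(exps.values())
    return {j: v / z for j, v in exps.items()}


def sparsify_top_k(alpha, k):
    """Keep the k largest coefficients and renormalize them to sum to 1."""
    kept = sorted(alpha.items(), key=lambda kv: kv[1], reverse=True)[:k]
    z = sum(v for _, v in kept)
    return {j: v / z for j, v in kept}


# Toy graph: node 0 with three neighbors; edge (0, 3) has a tiny weight.
h = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.5, -0.5]}
neighbors = {0: [1, 2, 3]}
edge_weight = {(0, 1): 1.0, (0, 2): 2.0, (0, 3): 0.01}
a_src, a_dst = [0.3, -0.1], [0.2, 0.4]

alpha = attention_coefficients(h, neighbors, edge_weight, a_src, a_dst, 0)
sparse = sparsify_top_k(alpha, k=2)
print(alpha)
print(sparse)
```

In this toy setup the low-weight edge to node 3 receives a very small coefficient and is dropped by the top-k step, which mirrors the denoising intent described in the abstract without claiming to reproduce it.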

artificial intelligence, graph structure, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2503.12157

Country: Asia > China > Shandong Province > Qingdao (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Reviews: Understanding Attention and Generalization in Graph Neural Networks

Neural Information Processing Systems | Jan-23-2025, 12:53:21 GMT

UPDATE: I have increased the score to 6, provided the authors revise the paper as promised in their responses. This paper discusses more than one topic. The first part talks mostly about the attention mechanism, the second section introduces a new model, ChebyGIN, and the third section proposes a weakly-supervised attention training approach. Overall, the paper is not all about its title "Understanding Attention in Graph Neural Networks". In 2.3 the paper says "the performance of both GCNs and GINs is quite poor and, consequently, it is also hard for the attention subnetwork to learn", thus it proposes ChebyGIN as a stronger model.

attention and generalization, graph neural network, initialization, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.91)
