gin
- Asia > South Korea > Seoul > Seoul (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Generalized Independent Noise Condition for Estimating Latent Variable Causal Graphs
Causal discovery aims to recover causal structures or models underlying the observed data. Despite its success in certain domains, most existing methods focus on causal relations between observed variables, while in many scenarios the observed ones may not be the underlying causal variables (e.g., image pixels), but are generated by latent causal variables or confounders that are causally related. To this end, in this paper, we consider Linear, Non-Gaussian Latent variable Models (LiNGLaMs), in which latent confounders are also causally related, and propose a Generalized Independent Noise (GIN) condition to estimate such latent variable graphs. Specifically, for two observed random vectors $\mathbf{Y}$ and $\mathbf{Z}$, GIN holds if and only if $\omega^{\intercal}\mathbf{Y}$ and $\mathbf{Z}$ are statistically independent, where $\omega$ is a parameter vector characterized from the cross-covariance between $\mathbf{Y}$ and $\mathbf{Z}$. From the graphical view, roughly speaking, GIN implies that causally earlier latent common causes of variables in $\mathbf{Y}$ d-separate $\mathbf{Y}$ from $\mathbf{Z}$. Interestingly, we find that the independent noise condition, i.e., if there is no confounder, causes are independent from the error of regressing the effect on the causes, can be seen as a special case of GIN. Moreover, we show that GIN helps locate latent variables and identify their causal structure, including causal directions. We further develop a recursive learning algorithm to achieve these goals. Experimental results on synthetic and real-world data demonstrate the effectiveness of our method.
GPU-Initiated Networking for NCCL
Hamidouche, Khaled, Bachan, John, Markthub, Pak, Gootzen, Peter-Jan, Agostini, Elena, Jeaugey, Sylvain, Shafi, Aamir, Theodorakis, Georgios, Venkata, Manjunath Gorentla
Modern AI workloads, especially Mixture-of-Experts (MoE) architectures, increasingly demand low-latency, fine-grained GPU-to-GPU communication with device-side control. Traditional GPU communication follows a host-initiated model, where the CPU orchestrates all communication operations - a characteristic of the CUDA runtime. Although robust for collective operations, applications requiring tight integration of computation and communication can benefit from device-initiated communication that eliminates CPU coordination overhead. NCCL 2.28 introduces the Device API with three operation modes: Load/Store Accessible (LSA) for NVLink/PCIe, Multimem for NVLink SHARP, and GPU-Initiated Networking (GIN) for network RDMA. This paper presents the GIN architecture, design, semantics, and highlights its impact on MoE communication. GIN builds on a three-layer architecture: i) NCCL Core host-side APIs for device communicator setup and collective memory window registration; ii) Device-side APIs for remote memory operations callable from CUDA kernels; and iii) A network plugin architecture with dual semantics (GPUDirect Async Kernel-Initiated and Proxy) for broad hardware support. The GPUDirect Async Kernel-Initiated backend leverages DOCA GPUNetIO for direct GPU-to-NIC communication, while the Proxy backend provides equivalent functionality via lock-free GPU-to-CPU queues over standard RDMA networks. We demonstrate GIN's practicality through integration with DeepEP, an MoE communication library. Comprehensive benchmarking shows that GIN provides device-initiated communication within NCCL's unified runtime, combining low-latency operations with NCCL's collective algorithms and production infrastructure.
- North America > United States > Tennessee (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- Asia > India > Rajasthan > Jaipur (0.04)
- Information Technology > Hardware (1.00)
- Information Technology > Graphics (1.00)
- Information Technology > Communications > Networks (0.68)
- (2 more...)
- Asia > South Korea > Seoul > Seoul (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > California > Los Angeles County > Santa Monica (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Multi-Channel Graph Neural Network for Financial Risk Prediction of NEEQ Enterprises
With the continuous evolution of China's multi-level capital market, the National Equities Exchange and Quotations (NEEQ), also known as the "New Third Board," has become a critical financing platform for small and medium-sized enterprises (SMEs). However, due to their limited scale and financial resilience, many NEEQ-listed companies face elevated risks of financial distress. To address this issue, we propose a multi-channel deep learning framework that integrates structured financial indicators, textual disclosures, and enterprise relationship data for comprehensive financial risk prediction. Specifically, we design a Triple-Channel Graph Isomorphism Network (GIN) that processes numeric, textual, and graph-based inputs separately. These modality-specific representations are fused using an attention-based mechanism followed by a gating unit to enhance robustness and prediction accuracy. Experimental results on data from 7,731 real-world NEEQ companies demonstrate that our model significantly outperforms traditional machine learning methods and single-modality baselines in terms of AUC, Precision, Recall, and F1 Score. This work provides theoretical and practical insights into risk modeling for SMEs and offers a data-driven tool to support financial regulators and investors.
Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers
We present an approach to modifying Transformer architectures by integrating graph-aware relational reasoning into the attention mechanism, merging concepts from graph neural networks and language modeling. Building on the inherent connection between attention and graph theory, we reformulate the Transformer's attention mechanism as a graph operation and propose Graph-Aware Isomorphic Attention. This method leverages advanced graph modeling strategies, including Graph Isomorphism Networks (GIN) and Principal Neighborhood Aggregation (PNA), to enrich the representation of relational structures. Our approach captures complex dependencies and generalizes across tasks, as evidenced by a reduced generalization gap and improved learning performance. Additionally, we expand the concept of graph-aware attention to introduce Sparse GIN-Attention, a fine-tuning approach that employs sparse GINs. By interpreting attention matrices as sparse adjacency graphs, this technique enhances the adaptability of pre-trained foundational models with minimal computational overhead, endowing them with graph-aware capabilities. Sparse GIN-Attention fine-tuning achieves improved training dynamics and better generalization compared to alternative methods like low-rank adaption (LoRA). We discuss latent graph-like structures within traditional attention mechanisms, offering a new lens through which Transformers can be understood. By evolving Transformers as hierarchical GIN models for relational reasoning. This perspective suggests profound implications for foundational model development, enabling the design of architectures that dynamically adapt to both local and global dependencies. Applications in bioinformatics, materials science, language modeling, and beyond could benefit from this synthesis of relational and sequential data modeling, setting the stage for interpretable and generalizable modeling strategies.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > Netherlands (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.93)