

Robust Causal Directionality Inference in Quantum Inference under MNAR Observation and High-Dimensional Noise

Kang, Joonsung

arXiv.org Machine Learning

In quantum mechanics, observation actively shapes the system, paralleling the statistical notion of Missing Not At Random (MNAR). This study introduces a unified framework for robust causal directionality inference in quantum engineering, determining whether relations are system→observation, observation→system, or bidirectional. The method integrates CVAE-based latent constraints, MNAR-aware selection models, GEE-stabilized regression, penalized empirical likelihood, and Bayesian optimization. It jointly addresses quantum and classical noise while uncovering causal directionality, with theoretical guarantees for double robustness, perturbation stability, and oracle inequalities. Simulation and real-data analyses (TCGA gene expression, proteomics) show that the proposed MNAR-stabilized CVAE+GEE+AIPW+PEL framework achieves lower bias and variance, near-nominal coverage, and superior quantum-specific diagnostics. This establishes robust causal directionality inference as a key methodological advance for reliable quantum engineering.
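The double robustness claimed for the AIPW component can be illustrated with a standard toy example. The sketch below is not the paper's estimator: it uses a simple MAR simulation (the paper's MNAR selection models are richer) to show that the augmented inverse-probability-weighted mean stays consistent when either the selection model or the outcome model is misspecified, while the naive complete-case mean is biased.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Toy data: covariate X, outcome Y, missingness indicator R.
X = rng.normal(size=n)
Y = 2.0 + 1.5 * X + rng.normal(size=n)        # true E[Y] = 2.0
pi = 1 / (1 + np.exp(-(0.5 + X)))             # true observation probability
R = rng.random(n) < pi

def aipw(pi_hat, m_hat):
    """Augmented inverse-probability-weighted estimate of E[Y]."""
    return np.mean(R * Y / pi_hat - (R / pi_hat - 1) * m_hat)

m_true = 2.0 + 1.5 * X                        # correct outcome model E[Y|X]
pi_wrong = np.full(n, R.mean())               # misspecified selection model
m_wrong = np.full(n, Y[R].mean())             # misspecified outcome model

# Double robustness: the estimate stays near 2.0 if EITHER model is correct,
# whereas the complete-case mean over-represents large-X (large-Y) units.
print(aipw(pi, m_wrong))        # correct selection, wrong outcome model
print(aipw(pi_wrong, m_true))   # wrong selection, correct outcome model
print(Y[R].mean())              # naive complete-case mean: biased upward
```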


Detailed balance in large language model-driven agents

Song, Zhuo-Yang, Cao, Qing-Hong, Luo, Ming-xing, Zhu, Hua Xing

arXiv.org Artificial Intelligence

Large language model (LLM)-driven agents are emerging as a powerful new paradigm for solving complex problems. Despite the empirical success of these practices, a theoretical framework to understand and unify their macroscopic dynamics remains lacking. This Letter proposes a method based on the least action principle to estimate the underlying generative directionality of LLMs embedded within agents. By experimentally measuring the transition probabilities between LLM-generated states, we statistically discover a detailed balance in LLM-generated transitions, indicating that LLM generation may not be achieved by generally learning rule sets and strategies, but rather by implicitly learning a class of underlying potential functions that may transcend different LLM architectures and prompt templates. To our knowledge, this is the first discovery of a macroscopic physical law in LLM generative dynamics that does not depend on specific model details. This work is an attempt to establish a macroscopic dynamics theory of complex AI systems, aiming to elevate the study of AI agents from a collection of engineering practices to a science built on effective measurements that are predictable and quantifiable.
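The detailed balance condition the Letter tests, pi(i) P(i→j) = pi(j) P(j→i), is exactly the property of Markov chains generated by an underlying potential. A minimal sketch (hypothetical potential values, not measured LLM transitions) builds Metropolis-style transition probabilities from a potential V and verifies that detailed balance, and hence stationarity of the Boltzmann distribution, holds:

```python
import numpy as np

# Hypothetical potential over 4 discrete "generation states"; the Letter
# infers such potentials from measured LLM transition probabilities.
V = np.array([0.0, 1.0, 0.5, 2.0])
pi = np.exp(-V) / np.exp(-V).sum()        # Boltzmann stationary distribution

n = len(V)
# Metropolis transitions: propose uniformly among the other states,
# accept with probability min(1, exp(-(V_j - V_i))).
P = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            P[i, j] = (1 / (n - 1)) * min(1.0, np.exp(V[i] - V[j]))
    P[i, i] = 1.0 - P[i].sum()

# Detailed balance: pi_i * P(i->j) == pi_j * P(j->i) for every pair,
# i.e. the probability-flux matrix is symmetric.
flux = pi[:, None] * P
assert np.allclose(flux, flux.T)
# Detailed balance implies pi is stationary: pi P = pi.
assert np.allclose(pi @ P, pi)
print("detailed balance holds")
```

Measuring empirical transition frequencies and testing the symmetry of the flux matrix is, in spirit, the statistical check the Letter performs at scale.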


Reversal Invariance in Autoregressive Language Models

Sahasrabudhe, Mihir

arXiv.org Artificial Intelligence

We formalize a structural property of the causal (autoregressive) language modeling (CLM) objective: reversal invariance. Formally, the next-token prediction loss assigns identical likelihood to a corpus and its reversal, implying that standard CLM pretraining is direction-blind. This symmetry explains why models trained on reversed text can achieve comparable performance to those trained on forward text, despite the inherently time-asymmetric nature of human language and reasoning. We argue that this invariance represents a limitation of current pretraining objectives rather than a benign artifact. If natural language encodes directional dependencies - phonological, morphological, or causal - a symmetric objective may fail to capture them. We therefore propose viewing pretraining through the lens of temporal asymmetry, motivating future work on loss functions and architectures that explicitly model the arrow of language while retaining standard language modeling capacity.
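The invariance can be checked directly on a toy corpus: the optimal next-token model for a corpus and the optimal next-token model for its reversal achieve identical total likelihood, because both chain-rule factorizations telescope to the same joint probabilities. A minimal sketch (toy data, not from the paper), using empirical conditionals as the loss-minimizing model:

```python
import math
from collections import Counter

# Toy corpus of fixed-length "sentences" over a two-token vocabulary.
corpus = ["aab", "aba", "aab", "bba", "abb", "aab"]

def chain_nll(seqs):
    """Total NLL of seqs under the *optimal* next-token model for seqs,
    i.e. conditionals taken from the empirical distribution itself."""
    prefix, full = Counter(), Counter()
    for s in seqs:
        for t in range(len(s)):
            prefix[s[:t]] += 1       # count of each prefix
            full[s[:t + 1]] += 1     # count of prefix + next token
    return -sum(math.log(full[s[:t + 1]] / prefix[s[:t]])
                for s in seqs for t in range(len(s)))

forward = chain_nll(corpus)
backward = chain_nll([s[::-1] for s in corpus])
# The CLM objective is direction-blind: both totals are identical.
assert math.isclose(forward, backward)
print(forward, backward)
```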


Toward General Digraph Contrastive Learning: A Dual Spatial Perspective

Su, Daohan, Zhang, Yang, Li, Xunkai, Li, Rong-Hua, Wang, Guoren

arXiv.org Artificial Intelligence

Graph Contrastive Learning (GCL) has emerged as a powerful tool for extracting consistent representations from graphs, independent of labeled information. However, existing methods predominantly focus on undirected graphs, disregarding the pivotal directional information that is fundamental and indispensable in real-world networks (e.g., social networks and recommendation systems). In this paper, we introduce S2-DiGCL, a novel framework that emphasizes spatial insights from complex- and real-domain perspectives for directed graph (digraph) contrastive learning. From the complex-domain perspective, S2-DiGCL introduces personalized perturbations into the magnetic Laplacian to adaptively modulate edge phases and directional semantics. From the real-domain perspective, it employs a path-based subgraph augmentation strategy to capture fine-grained local asymmetries and topological dependencies. Extensive experiments on 7 real-world digraph datasets demonstrate the superiority of our approach, achieving SOTA performance with 4.41% improvement in node classification and 4.34% in link prediction under both supervised and unsupervised settings. Graphs have become a fundamental data structure for modeling pairwise relationships across diverse domains, such as social interactions [1], [2], transportation networks [3], [4], and recommendation systems [5], [6]. This widespread use has spurred the rapid development of GNNs [7], [8], which effectively capture topological dependencies and node interactions. Despite these advancements, conventional supervised GNNs face inherent limitations due to their reliance on extensive labeled data, posing a critical bottleneck as the volume of real-world graphs continues to grow while annotated data remain scarce and expensive to obtain.
To mitigate this limitation, GCL [9], [10] has emerged as a promising self-supervised paradigm that learns robust and transferable node representations by enforcing consistency across multiple augmented graph views. While current GCL methodologies have demonstrated remarkable success on undirected graphs, their applicability to digraphs remains largely unexplored.
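The magnetic Laplacian underlying the complex-domain view encodes edge direction as a complex phase on symmetrized edge weights. The sketch below shows the standard construction (not S2-DiGCL's personalized perturbation scheme) and checks its key properties: the operator is Hermitian with real, nonnegative eigenvalues, so spectral machinery for undirected graphs carries over while directionality is retained in the phases.

```python
import numpy as np

def magnetic_laplacian(A, q=0.25):
    """Magnetic Laplacian of a digraph with adjacency A and charge q.

    Direction is encoded as a complex phase on the symmetrized weights,
    so the result is Hermitian with real, nonnegative eigenvalues.
    """
    A_s = 0.5 * (A + A.T)                 # symmetrized weights
    theta = 2 * np.pi * q * (A - A.T)     # antisymmetric phase matrix
    H = A_s * np.exp(1j * theta)
    D = np.diag(A_s.sum(axis=1))
    return D - H

# Small 4-node digraph: 0->1->2->3, plus a reciprocal pair 3<->0.
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 3)]:
    A[u, v] = 1.0

L = magnetic_laplacian(A, q=0.25)
assert np.allclose(L, L.conj().T)         # Hermitian
eigs = np.linalg.eigvalsh(L)
assert (eigs > -1e-9).all()               # positive semidefinite
print(np.round(eigs, 3))
```

Note how the reciprocal edge 0<->3 gets phase zero (it carries no net direction), while one-way edges get a phase of pi/2 at q = 0.25; perturbing q or the phases is where direction-aware augmentation can act.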


Comparing Human and Language Models Sentence Processing Difficulties on Complex Structures

Amouyal, Samuel Joseph, Meltzer-Asscher, Aya, Berant, Jonathan

arXiv.org Artificial Intelligence

Large language models (LLMs) that fluently converse with humans are a reality - but do LLMs experience human-like processing difficulties? We systematically compare human and LLM sentence comprehension across seven challenging linguistic structures. We collect sentence comprehension data from humans and five families of state-of-the-art LLMs, varying in size and training procedure, in a unified experimental framework. Our results show LLMs overall struggle on the target structures, but especially on garden path (GP) sentences. Indeed, while the strongest models achieve near-perfect accuracy on non-GP structures (93.7% for GPT-5), they struggle on GP structures (46.8% for GPT-5). Additionally, when ranking structures based on average performance, rank correlation between humans and models increases with parameter count. For each target structure, we also collect data for a matched baseline without the difficult structure. Comparing performance on the target vs. baseline sentences, the performance gap observed in humans holds for LLMs, with two exceptions: for models that are too weak, performance is uniformly low across both sentence types, and for models that are too strong, performance is uniformly high. Together, these results reveal convergence and divergence in human and LLM sentence comprehension, offering new insights into the similarity of humans and LLMs.


Directional Sheaf Hypergraph Networks: Unifying Learning on Directed and Undirected Hypergraphs

Mule, Emanuele, Fiorini, Stefano, Purificato, Antonio, Siciliano, Federico, Coniglio, Stefano, Silvestri, Fabrizio

arXiv.org Artificial Intelligence

Hypergraphs provide a natural way to represent higher-order interactions among multiple entities. While undirected hypergraphs have been extensively studied, the case of directed hypergraphs, which can model oriented group interactions, remains largely under-explored despite its relevance for many applications. Recent approaches in this direction often exhibit an implicit bias toward homophily, which limits their effectiveness in heterophilic settings. Rooted in the algebraic topology notion of Cellular Sheaves, Sheaf Neural Networks (SNNs) were introduced as an effective solution to circumvent such a drawback. While a generalization to hypergraphs is known, it is only suitable for undirected hypergraphs, failing to tackle the directed case. In this work, we introduce Directional Sheaf Hypergraph Networks (DSHN), a framework integrating sheaf theory with a principled treatment of asymmetric relations within a hypergraph. From it, we construct the Directed Sheaf Hypergraph Laplacian, a complex-valued operator by which we unify and generalize many existing Laplacian matrices proposed in the graph- and hypergraph-learning literature. Across 7 real-world datasets and against 13 baselines, DSHN achieves relative accuracy gains from 2% up to 20%, showing how a principled treatment of directionality in hypergraphs, combined with the expressive power of sheaves, can substantially improve performance.


Deriving Strategic Market Insights with Large Language Models: A Benchmark for Forward Counterfactual Generation

Ong, Keane, Mao, Rui, Varshney, Deeksha, Liang, Paul Pu, Cambria, Erik, Mengaldo, Gianmarco

arXiv.org Artificial Intelligence

Counterfactual reasoning typically involves considering alternatives to actual events. While often applied to understand past events, a distinct form, forward counterfactual reasoning, focuses on anticipating plausible future developments. This type of reasoning is invaluable in dynamic financial markets, where anticipating market developments can powerfully unveil potential risks and opportunities for stakeholders, guiding their decision-making. However, performing this at scale is challenging due to the cognitive demands involved, underscoring the need for automated solutions. LLMs offer promise, but remain unexplored for this application. To address this gap, we introduce a novel benchmark, FIN-FORCE (FINancial FORward Counterfactual Evaluation). By curating financial news headlines and providing structured evaluation, FIN-FORCE supports LLM-based forward counterfactual generation. This paves the way for scalable and automated solutions for exploring and anticipating future market developments, thereby providing structured insights for decision-making. Through experiments on FIN-FORCE, we evaluate state-of-the-art LLMs and counterfactual generation methods, analyzing their limitations and proposing insights for future research. We release the benchmark, supplementary data and all experimental codes at the following link: https://github.com/keanepotato/fin_force


Pruning Increases Orderedness in Recurrent Computation

Song, Yiding

arXiv.org Artificial Intelligence

Inspired by the prevalence of recurrent circuits in biological brains, we investigate the degree to which directionality is a helpful inductive bias for artificial neural networks. Taking directionality as topologically-ordered information flow between neurons, we formalise a perceptron layer with all-to-all connections (mathematically equivalent to a weight-tied recurrent neural network) and demonstrate that directionality, a hallmark of modern feed-forward networks, can be induced rather than hard-wired by applying appropriate pruning techniques. Across different random seeds our pruning schemes successfully induce greater topological ordering in information flow between neurons without compromising performance, suggesting that directionality is not a prerequisite for learning, but may be an advantageous inductive bias discoverable by gradient descent and sparsification.
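The notion of directionality as topological ordering can be made concrete: if pruning leaves a weight matrix that is strictly triangular under some neuron ordering, the weight-tied recurrent update is nilpotent and settles in at most n steps, i.e. the layer behaves like a feed-forward pass. A linear sketch of this ordering idea (not the paper's pruning schemes):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# All-to-all recurrent update (linear, weight-tied): h <- W h + x.
# Pruning W to strictly lower-triangular imposes a topological order:
# neuron i only receives input from neurons j < i.
W = np.tril(rng.normal(size=(n, n)), k=-1)
x = rng.normal(size=n)

h = np.zeros(n)
states = []
for _ in range(n + 2):
    h = W @ h + x
    states.append(h.copy())

# A strictly triangular W is nilpotent (W^n = 0), so the recurrence
# settles after at most n steps at the fixed point (I - W)^{-1} x.
assert np.allclose(np.linalg.matrix_power(W, n), 0)
assert np.allclose(states[n - 1], states[n])
assert np.allclose(states[n], np.linalg.solve(np.eye(n) - W, x))
```

A dense (unordered) W would generally keep oscillating or diverge under the same iteration, which is the sense in which induced orderedness turns recurrence into feed-forward computation.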


SCC-recursiveness in infinite argumentation (extended version)

Andrews, Uri, Mauro, Luca San

arXiv.org Artificial Intelligence

Argumentation frameworks (AFs) are a foundational tool in artificial intelligence for modeling structured reasoning and conflict. SCC-recursiveness is a well-known design principle in which the evaluation of arguments is decomposed according to the strongly connected components (SCCs) of the attack graph, proceeding recursively from "higher" to "lower" components. While SCC-recursive semantics such as cf2 and stg2 have proven effective for finite AFs, Baumann and Spanring showed the failure of SCC-recursive semantics to generalize reliably to infinite AFs due to issues with well-foundedness. We propose two approaches to extending SCC-recursiveness to the infinite setting. We systematically evaluate these semantics using Baroni and Giacomin's established criteria, showing in particular that directionality fails in general. We then examine these semantics' behavior in finitary frameworks, where we find that some of our semantics satisfy directionality. These results advance the theory of infinite argumentation and lay the groundwork for reasoning systems capable of handling unbounded or evolving domains.
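The decomposition step behind SCC-recursive semantics, computing the SCCs of the attack graph and visiting them from "higher" (attacking) to "lower" (attacked) components, can be sketched for the finite case with Kosaraju's algorithm; the semantics layer itself, and the infinite-setting extensions, are beyond this sketch.

```python
from collections import defaultdict

def sccs(nodes, edges):
    """Kosaraju's algorithm: SCCs of a digraph, in topological order
    of the condensation (source components first)."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    seen, order = set(), []
    def dfs1(u):                       # first pass: record finish order
        seen.add(u)
        for v in fwd[u]:
            if v not in seen:
                dfs1(v)
        order.append(u)
    for u in nodes:
        if u not in seen:
            dfs1(u)

    comp = {}
    def dfs2(u, c):                    # second pass: on the reversed graph
        comp[u] = c
        for v in rev[u]:
            if v not in comp:
                dfs2(v, c)
    components = []
    for u in reversed(order):
        if u not in comp:
            components.append([])
            dfs2(u, len(components) - 1)
    for u in nodes:
        components[comp[u]].append(u)
    return components                  # "higher" to "lower" SCCs

# Attack graph: a and b attack each other; both attack c; c attacks d.
attacks = [("a", "b"), ("b", "a"), ("a", "c"), ("b", "c"), ("c", "d")]
print(sccs("abcd", attacks))           # [['a', 'b'], ['c'], ['d']]
```

An SCC-recursive semantics evaluates the mutually attacking pair {a, b} first, then propagates the result downward to c and d; the well-foundedness of this ordering is precisely what breaks in general infinite AFs.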


Ignoring Directionality Leads to Compromised Graph Neural Network Explanations

Sun, Changsheng, Li, Xinke, Dong, Jin Song

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have emerged as a powerful tool for modeling relational data in applications such as financial fraud detection [1], [2] and social network analysis [3]. As GNNs are increasingly deployed in safety-critical domains where their decisions impact human lives and societal well-being [4], [5], ensuring their trustworthiness has become essential. Unlike traditional software systems, where correctness can often be ensured through formal verification [6], [7], deep learning models, including GNNs, function as black boxes, making it difficult to validate their decisions. To address this, explainability has become essential for deploying GNNs in real-world decision-making pipelines. Recently, post-hoc explanation methods such as GNNExplainer [8] and PGExplainer [9] have become widely used to enhance user trust, facilitate model debugging for developers, and provide external validation for regulatory compliance for these black-box GNN models. A useful analogy can be drawn between explaining convolutional neural networks (CNNs) and GNNs. As shown in Figure 1, the CNN explainability method Grad-CAM [10] highlights key image regions influencing a prediction, e.g., focusing on a dog's face to classify it as "dog." Similarly, GNN explainers identify critical subgraph structures affecting predictions.
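Why discarding direction corrupts downstream explanations can be seen in a single round of message passing: symmetrizing the adjacency lets information flow along edges the directed model never had, so attributions computed on the undirected view can credit influence paths that do not exist. A minimal sketch (toy graph and sum aggregation; hypothetical, not the paper's setup):

```python
import numpy as np

# Directed graph: 0 -> 1 -> 2. Node features are scalars.
A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])
x = np.array([1.0, 2.0, 3.0])

def propagate(adj, x):
    """One round of sum aggregation over in-neighbors (adj[u, v]: u->v)."""
    return adj.T @ x

directed = propagate(A, x)
undirected = propagate(np.maximum(A, A.T), x)   # directionality discarded

print(directed)     # node 2 receives only from node 1
print(undirected)   # node 1 now also aggregates from node 2: a spurious path
assert not np.allclose(directed, undirected)
```

Any explainer scoring edges on the symmetrized graph can therefore attribute a prediction at node 1 to node 2, an influence direction absent from the actual digraph, which is the failure mode the paper's title names.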