pag
Decoding Causal Structure: End-to-End Mediation Pathways Inference
Causal mediation analysis is crucial for deconstructing complex mechanisms of action. However, in current mediation analysis, complex structures derived from causal discovery lack direct interpretation of mediation pathways, while traditional mediation analysis and effect estimation are limited by their reliance on pre-specified pathways, leading to a disconnection between structure discovery and causal mechanism understanding. A unified framework integrating structure discovery, pathway identification, and effect estimation is therefore needed to systematically quantify mediation pathways under structural uncertainty and to enable automated identification and inference of mediation pathways. To this end, we propose Structure-Informed Guided Mediation Analysis (SIGMA), which guides automated mediation pathway identification through probabilistic causal structure discovery and uncertainty quantification, enabling end-to-end propagation of structural uncertainty from structure learning to effect estimation. Specifically, SIGMA employs differentiable Flow-Structural Equation Models to learn structural posteriors, generating diverse Directed Acyclic Graphs (DAGs) to quantify structural uncertainty. Based on these DAGs, we introduce the Path Stability Score to evaluate the marginal probability of pathways, identifying high-confidence mediation paths. For identified mediation pathways, we integrate Efficient Influence Functions with Bayesian model averaging to fuse within-structure estimation uncertainty and between-structure effect variation, propagating uncertainty to the final effect estimates. In synthetic data experiments, SIGMA achieves state-of-the-art performance in pathway identification accuracy and effect quantification precision under structural uncertainty, concurrent multiple pathways, and nonlinear scenarios. In real-world applications using Human Phenotype Project data, SIGMA identifies mediation effects of sleep quality on cardiovascular health through inflammatory and metabolic pathways, uncovering previously unspecified multiple mediation paths.
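The abstract describes the Path Stability Score only as the marginal probability of a pathway over sampled DAGs. A minimal sketch of that idea, assuming each sampled DAG is given as a set of directed edges and that the score is the plain fraction of samples containing the path (the function names, the node labels S, I, C and the sample set are hypothetical, not taken from the paper):

```python
def has_directed_path(edges, path):
    """Return True if every consecutive pair of nodes in `path` is an edge."""
    return all((u, v) in edges for u, v in zip(path, path[1:]))


def path_stability(dag_samples, path):
    """Fraction of sampled DAGs that contain `path` as a chain of directed edges."""
    if not dag_samples:
        raise ValueError("need at least one sampled DAG")
    hits = sum(has_directed_path(edges, path) for edges in dag_samples)
    return hits / len(dag_samples)


# Hypothetical posterior samples over three variables:
# S (sleep quality), I (inflammation), C (cardiovascular health).
dag_samples = [
    {("S", "I"), ("I", "C"), ("S", "C")},
    {("S", "I"), ("I", "C")},
    {("S", "C")},
    {("S", "I"), ("I", "C"), ("S", "C")},
]

mediated = path_stability(dag_samples, ("S", "I", "C"))
direct = path_stability(dag_samples, ("S", "C"))
```

A pathway would then be kept as high-confidence when its score exceeds a chosen threshold; the threshold and any weighting by posterior probability are design choices the abstract does not specify.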
Structural Causal Bandits under Markov Equivalence
In decision-making processes, an intelligent agent with causal knowledge can optimize action spaces to avoid unnecessary exploration. A structural causal bandit framework provides guidance on how to prune actions that are unable to maximize reward by leveraging prior knowledge of the underlying causal structure among actions. A key assumption of this framework is that the agent has access to a fully-specified causal diagram representing the target system. In this paper, we extend the structural causal bandits to scenarios where the agent leverages a Markov equivalence class. In such cases, the causal structure is provided to the agent in the form of a partial ancestral graph (PAG). We propose a generalized framework for identifying potentially optimal actions within this graph structure, thereby broadening the applicability of structural causal bandits.
Causal Identification under Markov equivalence: Calculus, Algorithm, and Completeness
One common task in many data science applications is to answer questions about the effect of new interventions, such as: 'what would happen to Y if we make X equal to x while observing covariates Z = z?'. Formally, this is known as conditional effect identification, where the goal is to determine whether a post-interventional distribution is computable from the combination of an observational distribution and assumptions about the underlying domain represented by a causal diagram. Many methods have been developed for solving this problem, including the celebrated do-calculus [Pearl, 1995]. In practice, these results are not always applicable since they require a fully specified causal diagram as input, which is usually not available. In this paper, we assume as the input of the task a less informative structure known as a partial ancestral graph (PAG), which represents a Markov equivalence class of causal diagrams, learnable from observational data.
Supplementary Material: Iterative Causal Discovery in the Possible Presence of Latent Confounders and Selection Bias
In this section we provide a detailed proof of the correctness and completeness of the ICD algorithm. For easier reference, we describe ICD in Algorithm 1 and state the ICD-Sep conditions. A set $\mathbf{Z}$ is a subset of ICD-Sep$(A,B)$ given $r \in \{0,\ldots,|\mathbf{O}|-2\}$ if and only if: 1. $|\mathbf{Z}| = r$; 2. $\forall Z \in \mathbf{Z}$, there exists a PDS-path $\Pi_B(A,Z)$ such that (a) $|\Pi_B(A,Z)| \le r$ and (b) every node on $\Pi_B(A,Z)$ is in $\mathbf{Z}$; and 3. $\forall Z \in \mathbf{Z}$, node $Z$ is a possible ancestor of $A$ or $B$ (not a necessary condition). Denote by $A, B$ a pair of nodes from $\mathbf{O}$ that are connected in $G$ and disconnected in $D$, and such that $A$ is not an ancestor of $B$ in $D$. If $A \perp\!\!\!\perp B \mid [\mathbf{Z}'] \cup \mathbf{S}$, where $\mathbf{Z}' \subset \mathbf{O}$ is a minimal separating set having size $n+1$, then there exists a subset $\mathbf{Z} \subset \mathbf{O}$ having the same size $n+1$ such that $A \perp\!\!\!\perp B \mid \mathbf{Z} \cup \mathbf{S}$, and for every node $Z \in \mathbf{Z}$ there exists a PDS-path $\Pi_B(A,Z)$ in $G$ such that every node $V$ on the PDS-path is also in $\mathbf{Z}$. Proof. It was previously shown that a minimal separating set for $A$ and $B$, where $A$ is not an ancestor of $B$, is a subset of D-Sep$(A,B)$ (Spirtes et al., 2000, page 134 and Theorem 6.2; Spirtes et al., 1999).
144a3f71a03ab7c4f46f9656608efdb2-Paper.pdf
Understanding the underlying mechanisms is crucial for tasks such as explaining a phenomenon, predicting, and decision making. Pearl (2009) provided a machinery for automating the process of answering interventional and (retrospective) counterfactual queries even when only observed data is available, and determining if a query cannot be answered given the available data type (identifiability). This requires knowledge about the true underlying causal structure; however, in many real-world situations, this structure is unknown.
PAG: Multi-Turn Reinforced LLM Self-Correction with Policy as Generative Verifier
Jiang, Yuhua, Xiong, Yuwen, Yuan, Yufeng, Xin, Chao, Xu, Wenyuan, Yue, Yu, Zhao, Qianchuan, Yan, Lin
Large Language Models (LLMs) have demonstrated impressive capabilities in complex reasoning tasks, yet they still struggle to reliably verify the correctness of their own outputs. Existing solutions to this verification challenge often depend on separate verifier models or require multi-stage self-correction training pipelines, which limit scalability. In this paper, we propose Policy as Generative Verifier (PAG), a simple and effective framework that empowers LLMs to self-correct by alternating between policy and verifier roles within a unified multi-turn reinforcement learning (RL) paradigm. Distinct from prior approaches that always generate a second attempt regardless of model confidence, PAG introduces a selective revision mechanism: the model revises its answer only when its own generative verification step detects an error. This verify-then-revise workflow not only alleviates model collapse but also jointly enhances both reasoning and verification abilities. Extensive experiments across diverse reasoning benchmarks highlight PAG's dual advancements: as a policy, it enhances direct generation and self-correction accuracy; as a verifier, its self-verification outperforms self-consistency.
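The selective revision mechanism described above can be illustrated with a small control-flow sketch. It assumes three hypothetical callables (`generate`, `verify`, `revise`) standing in for the model's policy and verifier roles; in PAG a single LLM plays both roles and is trained with multi-turn RL, which this sketch does not attempt to reproduce.

```python
def verify_then_revise(question, generate, verify, revise, max_turns=2):
    """Answer once, then revise only while the verifier flags an error."""
    answer = generate(question)
    for _ in range(max_turns):
        if verify(question, answer):
            break  # verifier accepts the answer: no second attempt is made
        answer = revise(question, answer)
    return answer


# Toy stand-ins (hypothetical) so the control flow can be run end to end.
calls = {"revise": 0}


def toy_generate(question):
    return "5"  # deliberately wrong first attempt


def toy_verify(question, answer):
    return answer == "4"


def toy_revise(question, answer):
    calls["revise"] += 1
    return "4"


final = verify_then_revise("2+2", toy_generate, toy_verify, toy_revise)
```

The point of the loop is that `revise` runs only when `verify` rejects the current answer, in contrast to pipelines that always produce a second attempt.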
A Method for Enhancing the Safety of Large Model Generation Based on Multi-dimensional Attack and Defense
Currently, large models are prone to generating harmful content when faced with complex attack instructions, significantly reducing their defensive capabilities. To address this issue, this paper proposes a method based on constructing data aligned with multi-dimensional attack defense to enhance the generative security of large models. The core of our method lies in improving the effectiveness of safe alignment learning for large models by innovatively increasing the diversity of attack instruction dimensions and the accuracy of generating safe responses. To validate the effectiveness of our method, beyond existing security evaluation benchmarks, we additionally designed new security evaluation benchmarks and conducted comparative experiments using Llama3.2 as the baseline model. The final experimental results demonstrate that our method can significantly improve the generative security of large models under complex instructional attacks, while also maintaining and enhancing the models' general capabilities.
MMDS: A Multimodal Medical Diagnosis System Integrating Image Analysis and Knowledge-based Departmental Consultation
Ren, Yi, Zhang, HanZhi, Li, Weibin, Fu, Jun, Liu, Diandong, Zhang, Tianyi, He, Jie, Jiao, Licheng
We present MMDS, a system capable of recognizing medical images and patient facial details, and providing professional medical diagnoses. The system consists of two core components. The first component is the analysis of medical images and videos. We trained a specialized multimodal medical model capable of interpreting medical images and accurately analyzing patients' facial emotions and facial paralysis conditions. The model achieved an accuracy of 72.59% on the FER2013 facial emotion recognition dataset, with a 91.1% accuracy in recognizing the "happy" emotion. In facial paralysis recognition, the model reached an accuracy of 92%, which is 30% higher than that of GPT-4o. Based on this model, we developed a parser for analyzing facial movement videos of patients with facial paralysis, achieving precise grading of the paralysis severity. In tests on 30 videos of facial paralysis patients, the system demonstrated a grading accuracy of 83.3%. The second component is the generation of professional medical responses. We employed a large language model, integrated with a medical knowledge base, to generate professional diagnoses based on the analysis of medical images or videos. The core innovation lies in our development of a department-specific knowledge base routing management mechanism, in which the large language model categorizes data by medical departments and, during the retrieval process, determines the appropriate knowledge base to query. This significantly improves retrieval accuracy in the RAG (retrieval-augmented generation) process.
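The department-specific routing step can be sketched as "pick a knowledge base first, then retrieve only inside it". The example below is an illustration under stated assumptions: a keyword-overlap scorer stands in for the large language model that MMDS uses for routing, and the department names, keyword sets and documents are hypothetical placeholders.

```python
def route_department(query, department_keywords):
    """Pick the department whose keyword set overlaps most with the query."""
    words = set(query.lower().split())
    scores = {dept: len(words & kws) for dept, kws in department_keywords.items()}
    return max(scores, key=scores.get)


def retrieve(query, knowledge_bases, department_keywords, top_k=1):
    """Route the query to one department, then rank only that department's documents."""
    dept = route_department(query, department_keywords)
    words = set(query.lower().split())
    ranked = sorted(
        knowledge_bases[dept],
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return dept, ranked[:top_k]


# Hypothetical routing keywords and placeholder documents.
department_keywords = {
    "neurology": {"facial", "paralysis", "nerve"},
    "dermatology": {"rash", "skin"},
}
knowledge_bases = {
    "neurology": ["facial paralysis grading notes", "headache triage notes"],
    "dermatology": ["skin rash care notes"],
}

dept, docs = retrieve(
    "patient with facial paralysis on left side",
    knowledge_bases,
    department_keywords,
)
```

Restricting retrieval to one department's knowledge base narrows the candidate set before ranking, which is the mechanism the abstract credits for improved retrieval accuracy.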
A Post-Training Enhanced Optimization Approach for Small Language Models
This paper studies continuous post-training optimization for small language models and proposes a method for constructing continuous post-training alignment data for such models. The core of the method is to use guidance from large models to optimize the diversity and accuracy of the alignment data. To verify its effectiveness, we used the Qwen2-0.5B-Instruct model as the baseline small language model. Using the alignment dataset constructed by our method, we trained and compared several groups of experiments: an SFT (Supervised Fine-Tuning) post-training experiment, a KTO (Kahneman-Tversky Optimization) post-training experiment, an SFT-KTO two-stage post-training experiment, and a model weight fusion experiment. Finally, we evaluated and analyzed the performance of the post-trained models and confirmed that the proposed continuous post-training optimization method can significantly improve the performance of small language models.