Goto

Collaborating Authors

 aig


Functional Matching of Logic Subgraphs: Beyond Structural Isomorphism

arXiv.org Artificial Intelligence

Subgraph matching in logic circuits is foundational for numerous Electronic Design Automation (EDA) applications, including datapath optimization, arithmetic verification, and hardware trojan detection. However, existing techniques rely primarily on structural graph isomorphism and thus fail to identify function-related subgraphs when synthesis transformations substantially alter circuit topology. To overcome this critical limitation, we introduce the concept of functional subgraph matching, a novel approach that identifies whether a given logic function is implicitly present within a larger circuit, irrespective of structural variations induced by synthesis or technology mapping. Specifically, we propose a two-stage multi-modal framework: (1) learning robust functional embeddings across AIG and post-mapping netlists for functional subgraph detection, and (2) identifying fuzzy boundaries using a graph segmentation approach. Evaluations on standard benchmarks (ITC99, OpenABCD, ForgeEDA) demonstrate significant performance improvements over existing structural methods, with average $93.8\%$ accuracy in functional subgraph detection and a dice score of $91.3\%$ in fuzzy boundary identification. The source code and implementation details can be found at https://github.com/zyzheng17/Functional_Subgraph_Matching-Neurips25.


TRACE: Learning to Compute on Graphs

arXiv.org Artificial Intelligence

Learning to compute, the ability to model the functional behavior of a computational graph, is a fundamental challenge for graph representation learning. Yet, the dominant paradigm is architecturally mismatched for this task. This flawed assumption, central to mainstream message passing neural networks (MPNNs) and their conventional Transformer-based counterparts, prevents models from capturing the position-aware, hierarchical nature of computation. To resolve this, we introduce \textbf{TRACE}, a new paradigm built on an architecturally sound backbone and a principled learning objective. First, TRACE employs a Hierarchical Transformer that mirrors the step-by-step flow of computation, providing a faithful architectural backbone that replaces the flawed permutation-invariant aggregation. Second, we introduce \textbf{function shift learning}, a novel objective that decouples the learning problem. Instead of predicting the complex global function directly, our model is trained to predict only the \textit{function shift}, the discrepancy between the true global function and a simple local approximation that assumes input independence. We validate this paradigm on electronic circuits, one of the most complex and economically critical classes of computational graphs. Across a comprehensive suite of benchmarks, TRACE substantially outperforms all prior architectures. These results demonstrate that our architecturally-aligned backbone and decoupled learning objective form a more robust paradigm for the fundamental challenge of learning to compute on graphs.


Alignment Unlocks Complementarity: A Framework for Multiview Circuit Representation Learning

arXiv.org Artificial Intelligence

Multiview learning on Boolean circuits holds immense promise, as different graph-based representations offer complementary structural and semantic information. However, the vast structural heterogeneity between views--such as an And-Inverter Graph (AIG) versus an XOR-Majority Graph (XMG)--poses a critical barrier to effective fusion, especially for self-supervised techniques like masked modeling. Naively applying such methods fails, as the cross-view context is perceived as noise. Our key insight is that functional alignment is a necessary precondition to unlock the power of multiview self-supervision. We introduce MixGate, a framework built on a principled training curriculum that first teaches the model a shared, function-aware representation space via an Equivalence Alignment Loss. Only then do we introduce a multiview masked modeling objective, which can now leverage the aligned views as a rich, complementary signal. Extensive experiments, including a crucial ablation study, demonstrate that our alignment-first strategy transforms masked modeling from an ineffective technique into a powerful performance driver. Multiview learning on Boolean circuits holds immense promise, as different graph-based representations offer complementary structural and semantic insights. While an And-Inverter Graph (AIG) provides a detailed structural view, a format like an XOR-Majority Graph (XMG) offers a semantically richer, high-level abstraction. This multiview approach has shown remarkable empirical success, surpassing earlier models that relied on single representations Li et al. (2022); Wang et al. (2022); Wu et al. (2023); Shi et al. (2023); Deng et al. (2024); Wang et al. (2024). The key challenge, however, arises from the vast structural heterogeneity between these views.


FuncGNN: Learning Functional Semantics of Logic Circuits with Graph Neural Networks

arXiv.org Artificial Intelligence

As integrated circuit scale grows and design complexity rises, effective circuit representation helps support logic synthesis, formal verification, and other automated processes in electronic design automation. And-Inverter Graphs (AIGs), as a compact and canonical structure, are widely adopted for representing Boolean logic in these workflows. However, the increasing complexity and integration density of modern circuits introduce structural heterogeneity and global logic information loss in AIGs, posing significant challenges to accurate circuit modeling. To address these issues, we propose FuncGNN, which integrates hybrid feature aggregation to extract multi-granularity topological patterns, thereby mitigating structural heterogeneity and enhancing logic circuit representations. FuncGNN further introduces gate-aware normalization that adapts to circuit-specific gate distributions, improving robustness to structural heterogeneity. Finally, FuncGNN employs multi-layer integration to merge intermediate features across layers, effectively synthesizing local and global semantic information for comprehensive logic representations. Experimental results on two logic-level analysis tasks (i.e., signal probability prediction and truth-table distance prediction) demonstrate that FuncGNN outperforms existing state-of-the-art methods, achieving improvements of 2.06% and 18.71%, respectively, while reducing training time by approximately 50.6% and GPU memory usage by about 32.8%.


Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment

arXiv.org Artificial Intelligence

Understanding the structure and function of circuits is crucial for electronic design automation (EDA). Circuits can be formulated as And-Inverter graphs (AIGs), enabling efficient implementation of representation learning through graph neural networks (GNNs). Masked modeling paradigms have been proven effective in graph representation learning. However, masking augmentation to original circuits will destroy their logical equivalence, which is unsuitable for circuit representation learning. Moreover, existing masked modeling paradigms often prioritize structural information at the expense of abstract information such as circuit function. To address these limitations, we introduce MGVGA, a novel constrained masked modeling paradigm incorporating masked gate modeling (MGM) and Verilog-AIG alignment (VGA). Specifically, MGM preserves logical equivalence by masking gates in the latent space rather than in the original circuits, subsequently reconstructing the attributes of these masked gates. Meanwhile, large language models (LLMs) have demonstrated an excellent understanding of the Verilog code functionality. Building upon this capability, VGA performs masking operations on original circuits and reconstructs masked gates under the constraints of equivalent Verilog codes, enabling GNNs to learn circuit functions from LLMs. We evaluate MGVGA on various logic synthesis tasks for EDA and show the superior performance of MGVGA compared to previous state-of-the-art methods. Our code is available at https://github.com/wuhy68/MGVGA.


OpenLS-DGF: An Adaptive Open-Source Dataset Generation Framework for Machine Learning Tasks in Logic Synthesis

arXiv.org Artificial Intelligence

--This paper introduces OpenLS-DGF, an adaptive logic synthesis dataset generation framework, to enhance machine learning (ML) applications within the logic synthesis process. Previous dataset generation flows were tailored for specific tasks or lacked integrated machine learning capabilities. While OpenLS-DGF supports various machine learning tasks by encapsulating the three fundamental steps of logic synthesis: Boolean representation, logic optimization, and technology mapping. It preserves the original information in both V erilog and machine-learning-friendly GraphML formats. The verilog files offer semi-customizable capabilities, enabling researchers to insert additional steps and incrementally refine the generated dataset. Furthermore, OpenLS-DGF includes an adaptive circuit engine that facilitates the final dataset management and downstream tasks. The generated OpenLS-D-v1 dataset comprises 46 combinational designs from established benchmarks, totaling over 966,000 Boolean circuits. OpenLS-D-v1 supports integrating new data features, making it more versatile for new challenges. This paper demonstrates the versatility of OpenLS-D-v1 through four distinct downstream tasks: circuit classification, circuit ranking, quality of results (QoR) prediction, and probability prediction. Each task is chosen to represent essential steps of logic synthesis, and the experimental results show the generated dataset from OpenLS-DGF achieves prominent diversity and applicability. OGIC synthesis is a key phase in the electronic design automation (EDA) flow of digital circuits, translating high-level specifications into a gate-level netlist. Recently, there has been a trend towards adopting ML approaches for the EDA [1] domain. V arious machine learning methodologies have been proposed, demonstrating improvements in different aspects of the logic synthesis process, including logic optimization [2], [3], [4], [5], [6], technology mapping [7], [8], [9], and formal verification [10], [11]. These machine learning-based techniques have shown their promise in improving the efficiency and quality of logic synthesis steps.


SeaDAG: Semi-autoregressive Diffusion for Conditional Directed Acyclic Graph Generation

arXiv.org Artificial Intelligence

We introduce SeaDAG, a semi-autoregressive diffusion model for conditional generation of Directed Acyclic Graphs (DAGs). Considering their inherent layer-wise structure, we simulate layer-wise autoregressive generation by designing different denoising speed for different layers. Unlike conventional autoregressive generation that lacks a global graph structure view, our method maintains a complete graph structure at each diffusion step, enabling operations such as property control that require the full graph structure. Leveraging this capability, we evaluate the DAG properties during training by employing a graph property decoder. We explicitly train the model to learn graph conditioning with a condition loss, which enhances the diffusion model's capacity to generate graphs that are both realistic and aligned with specified properties. We evaluate our method on two representative conditional DAG generation tasks: (1) circuit generation from truth tables, where precise DAG structures are crucial for realizing circuit functionality, and (2) molecule generation based on quantum properties. Our approach demonstrates promising results, generating high-quality and realistic DAGs that closely align with given conditions.


Logic Synthesis Optimization with Predictive Self-Supervision via Causal Transformers

arXiv.org Artificial Intelligence

Contemporary hardware design benefits from the abstraction provided by high-level logic gates, streamlining the implementation of logic circuits. Logic Synthesis Optimization (LSO) operates at one level of abstraction within the Electronic Design Automation (EDA) workflow, targeting improvements in logic circuits with respect to performance metrics such as size and speed in the final layout. Recent trends in the field show a growing interest in leveraging Machine Learning (ML) for EDA, notably through ML-guided logic synthesis utilizing policy-based Reinforcement Learning (RL) methods.Despite these advancements, existing models face challenges such as overfitting and limited generalization, attributed to constrained public circuits and the expressiveness limitations of graph encoders. To address these hurdles, and tackle data scarcity issues, we introduce LSOformer, a novel approach harnessing Autoregressive transformer models and predictive SSL to predict the trajectory of Quality of Results (QoR). LSOformer integrates cross-attention modules to merge insights from circuit graphs and optimization sequences, thereby enhancing prediction accuracy for QoR metrics. Experimental studies validate the effectiveness of LSOformer, showcasing its superior performance over baseline architectures in QoR prediction tasks, where it achieves improvements of 5.74%, 4.35%, and 17.06% on the EPFL, OABCD, and proprietary circuits datasets, respectively, in inductive setup.


Elucidating Optimal Reward-Diversity Tradeoffs in Text-to-Image Diffusion Models

arXiv.org Artificial Intelligence

Text-to-image (T2I) diffusion models have become prominent tools for generating high-fidelity images from text prompts. However, when trained on unfiltered internet data, these models can produce unsafe, incorrect, or stylistically undesirable images that are not aligned with human preferences. To address this, recent approaches have incorporated human preference datasets to fine-tune T2I models or to optimize reward functions that capture these preferences. Although effective, these methods are vulnerable to reward hacking, where the model overfits to the reward function, leading to a loss of diversity in the generated images. In this paper, we prove the inevitability of reward hacking and study natural regularization techniques like KL divergence and LoRA scaling, and their limitations for diffusion models. We also introduce Annealed Importance Guidance (AIG), an inference-time regularization inspired by Annealed Importance Sampling, which retains the diversity of the base model while achieving Pareto-Optimal reward-diversity tradeoffs. Our experiments demonstrate the benefits of AIG for Stable Diffusion models, striking the optimal balance between reward optimization and image diversity. Furthermore, a user study confirms that AIG improves diversity and quality of generated images across different model architectures and reward functions.


MTLSO: A Multi-Task Learning Approach for Logic Synthesis Optimization

arXiv.org Artificial Intelligence

Electronic Design Automation (EDA) is essential for IC design and has recently benefited from AI-based techniques to improve efficiency. Logic synthesis, a key EDA stage, transforms high-level hardware descriptions into optimized netlists. Recent research has employed machine learning to predict Quality of Results (QoR) for pairs of And-Inverter Graphs (AIGs) and synthesis recipes. However, the severe scarcity of data due to a very limited number of available AIGs results in overfitting, significantly hindering performance. Additionally, the complexity and large number of nodes in AIGs make plain GNNs less effective for learning expressive graph-level representations. To tackle these challenges, we propose MTLSO - a Multi-Task Learning approach for Logic Synthesis Optimization. On one hand, it maximizes the use of limited data by training the model across different tasks. This includes introducing an auxiliary task of binary multi-label graph classification alongside the primary regression task, allowing the model to benefit from diverse supervision sources. On the other hand, we employ a hierarchical graph representation learning strategy to improve the model's capacity for learning expressive graph-level representations of large AIGs, surpassing traditional plain GNNs. Extensive experiments across multiple datasets and against state-of-the-art baselines demonstrate the superiority of our method, achieving an average performance gain of 8.22\% for delay and 5.95\% for area.