 global placement


Recursive Learning-Based Virtual Buffering for Analytical Global Placement

Kahng, Andrew B., Liu, Yiting, Wang, Zhiang

arXiv.org Artificial Intelligence

With scaling of interconnect versus gate delays in advanced technology nodes, placement with buffer porosity awareness is essential for timing closure in physical synthesis flows. However, existing approaches face two key challenges: (i) traditional van Ginneken-Lillis-style buffering approaches [20], [27] are computationally expensive during global placement; and (ii) machine learning-based approaches, such as BufFormer [18], omit important Electrical Rule Check (ERC) considerations and typically fail to "close the loop" back into the physical design flow. In this work, we propose MLBuf-RePlAce, an open-source learning-driven virtual buffering-aware analytical global placement framework, built on top of the OpenROAD infrastructure [34]. MLBuf-RePlAce adopts an efficient recursive learning-based generative buffering approach to predict buffer types and locations, addressing ERC violations during global placement. We compare MLBuf-RePlAce against the default virtual buffering-based timing-driven global placer in OpenROAD, using open-source testcases from the TILOS MacroPlacement [36] and OpenROAD-flow-scripts [33] repositories. Without degradation of post-route power, MLBuf-RePlAce achieves (maximum, average) improvements of (56%, 31%) in total negative slack (TNS) within the open-source OpenROAD flow. When evaluated by completion in a commercial flow, MLBuf-RePlAce achieves (maximum, average) improvements of (53%, 28%) in TNS with an average of 0.2% improvement in post-route power.

Global placement is a critical step in VLSI physical design. State-of-the-art analytical global placers [8], [11], [14], [22] typically adopt the electrostatics-based placement approach [23], formulating global placement as nonlinear programming under density constraints. Additionally, for design implementation in advanced technology nodes, timing closure requires extensive buffer insertion [18] and brings a complex interplay with global placement.
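The TNS figures quoted in the abstract above can be made concrete with a small sketch. It assumes the usual definitions (TNS is the sum of all negative endpoint slacks, WNS is the most negative one) and assumes that "improvement" means the relative reduction in TNS magnitude; the abstract does not spell out the exact formula, and the function names below are hypothetical.

```python
from typing import Sequence


def worst_negative_slack(slacks: Sequence[float]) -> float:
    """Most negative endpoint slack, or 0.0 if no endpoint fails timing."""
    return min(min(slacks), 0.0) if slacks else 0.0


def total_negative_slack(slacks: Sequence[float]) -> float:
    """Sum of the negative endpoint slacks (non-failing endpoints contribute 0)."""
    return sum(s for s in slacks if s < 0.0)


def tns_improvement_pct(baseline_tns: float, new_tns: float) -> float:
    """Assumed metric: percent reduction in |TNS| relative to the baseline."""
    if baseline_tns == 0.0:
        raise ValueError("baseline TNS is zero; relative improvement is undefined")
    return 100.0 * (abs(baseline_tns) - abs(new_tns)) / abs(baseline_tns)


# Made-up endpoint slacks in ns, for illustration only.
example_slacks = [-0.5, 0.25, -1.5, 0.0]
example_tns = total_negative_slack(example_slacks)
example_wns = worst_negative_slack(example_slacks)
```

Under this assumed metric, a design whose TNS moves from -2.0 ns to -1.0 ns would be reported as a 50% improvement.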


Evolution of Optimization Algorithms for Global Placement via Large Language Models

Yao, Xufeng, Jiang, Jiaxi, Zhao, Yuxuan, Liao, Peiyu, Lin, Yibo, Yu, Bei

arXiv.org Artificial Intelligence

Optimization algorithms are widely employed to tackle complex problems, but designing them manually is often labor-intensive and requires significant expertise. Global placement is a fundamental step in electronic design automation (EDA). While analytical approaches represent the state-of-the-art (SOTA) in global placement, their core optimization algorithms remain heavily dependent on heuristics and customized components, such as initialization strategies, preconditioning methods, and line search techniques. This paper presents an automated framework that leverages large language models (LLMs) to evolve optimization algorithms for global placement. We first generate diverse candidate algorithms using LLMs through carefully crafted prompts. Then we introduce an LLM-based genetic flow to evolve selected candidate algorithms. The discovered optimization algorithms exhibit substantial performance improvements across many benchmarks. Specifically, our design-case-specific discovered algorithms achieve average HPWL improvements of 5.05%, 5.29% and 8.30% on the MMS, ISPD2005 and ISPD2019 benchmarks, respectively, and up to 17% improvement on individual cases. Additionally, the discovered algorithms demonstrate good generalization ability and are complementary to existing parameter-tuning methods.
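For readers unfamiliar with the metric behind these numbers, HPWL (half-perimeter wirelength) sums, over all nets, the half-perimeter of the bounding box enclosing each net's pins. The following is a minimal sketch of that standard definition; the data layout (a net as a list of `(x, y)` pin coordinates) and the function names are assumptions for illustration, not taken from the paper.

```python
from typing import List, Sequence, Tuple

Pin = Tuple[float, float]  # (x, y) pin location


def net_hpwl(pins: Sequence[Pin]) -> float:
    """Half-perimeter of the bounding box around one net's pins."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets: Sequence[Sequence[Pin]]) -> float:
    """Sum of per-net HPWL; nets with fewer than two pins contribute nothing."""
    return sum(net_hpwl(pins) for pins in nets if len(pins) >= 2)


# Two made-up nets: a 2-pin net and a 3-pin net.
example_nets: List[List[Pin]] = [
    [(0.0, 0.0), (3.0, 4.0)],
    [(1.0, 1.0), (2.0, 5.0), (4.0, 2.0)],
]
example_total = total_hpwl(example_nets)  # 7.0 + 7.0
```

An "HPWL improvement" in the sense of the abstract would then be a relative reduction of this total between two placements of the same netlist.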


Timing-Driven Global Placement by Efficient Critical Path Extraction

Shi, Yunqi, Xu, Siyuan, Kai, Shixiong, Lin, Xi, Xue, Ke, Yuan, Mingxuan, Qian, Chao

arXiv.org Artificial Intelligence

Initially, vanilla DREAMPlace [20] is run to distribute the cells within the layout. Subsequently, we perform a path-level timing analysis every m rounds to extract critical paths and update the pin-to-pin loss. This involves report_timing_endpoint(n,1), where n denotes the number of all failing endpoints, to collect data on critical paths. As we traverse these paths, each pin pair (i, j) involved is added to a maintained set P, unless it has already been included. To address the path-sharing effect, the weight $w_{(i,j)}$ of each pin pair is dynamically updated as follows:

$$w_{(i,j)} = \begin{cases} w_0, & \text{if } (i,j) \notin P, \\ w_{(i,j)} + w_1 \cdot (\text{slack}/\text{WNS}), & \text{otherwise}, \end{cases} \tag{9}$$

where $w_0$ and $w_1$ are hyperparameters, and slack indicates the negative slack of the respective critical path. The pin-to-pin attraction loss $PP(x, y)$ of the layout is then computed as:

$$PP(x, y) = \sum_{(i,j) \in P} w_{(i,j)} \, Q(i,j), \tag{10}$$

with $Q(i,j)$ and $w_{(i,j)}$ defined in Eqs. 8 and 9, respectively. After defining the loss function properly, we implement the CUDA kernel of the PP loss for GPU acceleration.
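The weight update of Eq. 9 and the loss of Eq. 10 can be sketched on the CPU as follows. This is an illustrative sketch, not the authors' CUDA kernel. Eq. 8 is not reproduced in this excerpt, so the sketch assumes Q(i, j) is the squared Euclidean distance between the two pins; the function names and the representation of a critical path as a `(slack, pin_pairs)` tuple are also hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

PinPair = Tuple[int, int]          # (i, j) pin indices
Path = Tuple[float, List[PinPair]]  # (negative slack, pin pairs on the path)


def update_pin_pair_weights(
    weights: Dict[PinPair, float],
    critical_paths: Iterable[Path],
    wns: float,
    w0: float,
    w1: float,
) -> Dict[PinPair, float]:
    """Apply Eq. 9 in place; the keys of `weights` play the role of the set P."""
    if wns >= 0.0:
        raise ValueError("WNS must be negative when critical paths exist")
    for slack, pairs in critical_paths:
        for pair in pairs:
            if pair not in weights:
                weights[pair] = w0                    # first time the pair is seen
            else:
                weights[pair] += w1 * (slack / wns)   # pair shared by another path
    return weights


def quadratic_distance(pos: Dict[int, Tuple[float, float]], i: int, j: int) -> float:
    """Assumed form of Q(i, j): squared Euclidean distance between pins."""
    (xi, yi), (xj, yj) = pos[i], pos[j]
    return (xi - xj) ** 2 + (yi - yj) ** 2


def pp_loss(weights: Dict[PinPair, float], pos: Dict[int, Tuple[float, float]]) -> float:
    """Eq. 10: weighted sum of Q(i, j) over all pin pairs in P."""
    return sum(w * quadratic_distance(pos, i, j) for (i, j), w in weights.items())


# Made-up data: two critical paths sharing the pin pair (1, 2).
paths: List[Path] = [(-2.0, [(0, 1), (1, 2)]), (-1.0, [(1, 2), (2, 3)])]
pin_pos = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 2.0), 3: (4.0, 2.0)}
w = update_pin_pair_weights({}, paths, wns=-2.0, w0=1.0, w1=0.5)
loss = pp_loss(w, pin_pos)
```

Because slack and WNS are both negative, the ratio slack/WNS lies in (0, 1], so a pair shared by several critical paths accumulates weight in proportion to how critical each of those paths is.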


DG-RePlAce: A Dataflow-Driven GPU-Accelerated Analytical Global Placement Framework for Machine Learning Accelerators

Kahng, Andrew B., Wang, Zhiang

arXiv.org Artificial Intelligence

Global placement is a fundamental step in VLSI physical design. The wide use of 2D processing element (PE) arrays in machine learning accelerators poses new challenges of scalability and Quality of Results (QoR) for state-of-the-art academic global placers. In this work, we develop DG-RePlAce, a new and fast GPU-accelerated global placement framework built on top of the OpenROAD infrastructure, which exploits the inherent dataflow and datapath structures of machine learning accelerators. Experimental results with a variety of machine learning accelerators using a commercial 12nm enablement show that, compared with RePlAce (DREAMPlace), our approach achieves an average reduction in routed wirelength by 10% (7%) and total negative slack (TNS) by 31% (34%), with faster global placement and on-par total runtimes relative to DREAMPlace. Empirical studies on the TILOS MacroPlacement Benchmarks further demonstrate that post-route improvements over RePlAce and DREAMPlace may reach beyond the motivating application to machine learning accelerators.


Escaping Local Optima in Global Placement

Xue, Ke, Lin, Xi, Shi, Yunqi, Kai, Shixiong, Xu, Siyuan, Qian, Chao

arXiv.org Artificial Intelligence

Placement is crucial in physical design, as it greatly affects power, performance, and area metrics. Recent advancements in analytical methods, such as DREAMPlace, have demonstrated impressive performance in global placement. However, DREAMPlace has some limitations; e.g., it may not guarantee legalizable placements under the same settings, leading to fragile and unpredictable results. This paper identifies the main issue as being stuck in local optima, and proposes a hybrid optimization framework that efficiently escapes local optima by iteratively perturbing the placement result. The proposed framework achieves significant improvements compared to state-of-the-art methods on two popular benchmarks.