dreamplace
Using Multi-modal Large Language Model to Boost Fireworks Algorithm's Ability in Settling Challenging Optimization Tasks
As optimization problems grow increasingly complex and diverse, advancements in optimization techniques and paradigm innovations hold significant importance. The challenges posed by optimization problems are primarily manifested in their non-convexity, high-dimensionality, black-box nature, and other unfavorable characteristics. Traditional zero-order or first-order methods, which are often characterized by low efficiency, inaccurate gradient information, and insufficient utilization of optimization information, are ill-equipped to address these challenges effectively. In recent years, the rapid development of large language models (LLM) has led to substantial improvements in their language understanding and code generation capabilities. Consequently, the design of optimization algorithms leveraging large language models has garnered increasing attention from researchers. In this study, we choose the fireworks algorithm(FWA) as the basic optimizer and propose a novel approach to assist the design of the FWA by incorporating multi-modal large language model(MLLM). To put it simply, we propose the concept of Critical Part(CP), which extends FWA to complex high-dimensional tasks, and further utilizes the information in the optimization process with the help of the multi-modal characteristics of large language models. We focus on two specific tasks: the \textit{traveling salesman problem }(TSP) and \textit{electronic design automation problem} (EDA). The experimental results show that FWAs generated under our new framework have achieved or surpassed SOTA results on many problem instances.
Chip Placement with Diffusion
Lee, Vint, Deng, Chun, Elzeiny, Leena, Abbeel, Pieter, Wawrzynek, John
Macro placement is a vital step in digital circuit design that defines the physical location of large collections of components, known as macros, on a 2-dimensional chip. The physical layout obtained during placement determines key performance metrics of the chip, such as power consumption, area, and performance. Existing learning-based methods typically fall short because of their reliance on reinforcement learning, which is slow and limits the flexibility of the agent by casting placement as a sequential process. Instead, we use a powerful diffusion model to place all components simultaneously. To enable such models to train at scale, we propose a novel architecture for the denoising model, as well as an algorithm to generate large synthetic datasets for pre-training. We empirically show that our model can tackle the placement task, and achieve competitive performance on placement benchmarks compared to state-of-the-art methods.
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Asia (0.04)
GOALPlace: Begin with the End in Mind
Agnesina, Anthony, Liang, Rongjian, Pradipta, Geraldo, Rajaram, Anand, Ren, Haoxing
Co-optimizing placement with congestion is integral to achieving high-quality designs. This paper presents GOALPlace, a new learning-based general approach to improving placement congestion by controlling cell density. Our method efficiently learns from an EDA tool's post-route optimized results and uses an empirical Bayes technique to adapt this goal/target to a specific placer's solutions, effectively beginning with the end in mind. It enhances correlation with the long-running heuristics of the tool's router and timing-opt engine -- while solving placement globally without expensive incremental congestion estimation and mitigation methods. A statistical analysis with a new hierarchical netlist clustering establishes the importance of density and the potential for an adequate cell density target across placements. Our experiments show that our method, integrated as a demonstration inside an academic GPU-accelerated global placer, consistently produces macro and standard cell placements of superior or comparable quality to commercial tools. Our empirical Bayes methodology also allows a substantial quality improvement over state-of-the-art academic mixed-size placers, achieving up to 10x fewer design rule check (DRC) violations, a 5% decrease in wirelength, and a 30% and 60% reduction in worst and total negative slack (WNS/TNS).
- North America > United States > District of Columbia > Washington (0.05)
- Europe > Netherlands > South Holland > Leiden (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia (0.04)
Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms
Wang, Zhihai, Geng, Zijie, Tu, Zhaojie, Wang, Jie, Qian, Yuxi, Xu, Zhexuan, Liu, Ziyan, Xu, Siyuan, Tang, Zhentao, Kai, Shixiong, Yuan, Mingxuan, Hao, Jianye, Li, Bin, Zhang, Yongdong, Wu, Feng
The increasing complexity of modern very-large-scale integration (VLSI) design highlights the significance of Electronic Design Automation (EDA) technologies. Chip placement is a critical step in the EDA workflow, which positions chip modules on the canvas with the goal of optimizing performance, power, and area (PPA) metrics of final chip designs. Recent advances have demonstrated the great potential of AI-based algorithms in enhancing chip placement. However, due to the lengthy workflow of chip design, the evaluations of these algorithms often focus on intermediate surrogate metrics, which are easy to compute but frequently reveal a substantial misalignment with the end-to-end performance (i.e., the final design PPA). To address this challenge, we introduce ChiPBench, which can effectively facilitate research in chip placement within the AI community. ChiPBench is a comprehensive benchmark specifically designed to evaluate the effectiveness of existing AI-based chip placement algorithms in improving final design PPA metrics. Specifically, we have gathered 20 circuits from various domains (e.g., CPU, GPU, and microcontrollers). These designs are compiled by executing the workflow from the verilog source code, which preserves necessary physical implementation kits, enabling evaluations for the placement algorithms on their impacts on the final design PPA. We executed six state-of-the-art AI-based chip placement algorithms on these designs and plugged the results of each single-point algorithm into the physical implementation workflow to obtain the final PPA results. Experimental results show that even if intermediate metric of a single-point algorithm is dominant, while the final PPA results are unsatisfactory. We believe that our benchmark will serve as an effective evaluation framework to bridge the gap between academia and industry.
- Asia > Middle East > Israel (0.04)
- Europe (0.04)
- Asia > China > Tianjin Province > Tianjin (0.04)
- Workflow (1.00)
- Research Report > New Finding (0.34)
DG-RePlAce: A Dataflow-Driven GPU-Accelerated Analytical Global Placement Framework for Machine Learning Accelerators
Kahng, Andrew B., Wang, Zhiang
Global placement is a fundamental step in VLSI physical design. The wide use of 2D processing element (PE) arrays in machine learning accelerators poses new challenges of scalability and Quality of Results (QoR) for state-of-the-art academic global placers. In this work, we develop DG-RePlAce, a new and fast GPU-accelerated global placement framework built on top of the OpenROAD infrastructure, which exploits the inherent dataflow and datapath structures of machine learning accelerators. Experimental results with a variety of machine learning accelerators using a commercial 12nm enablement show that, compared with RePlAce (DREAMPlace), our approach achieves an average reduction in routed wirelength by 10% (7%) and total negative slack (TNS) by 31% (34%), with faster global placement and on-par total runtimes relative to DREAMPlace. Empirical studies on the TILOS MacroPlacement Benchmarks further demonstrate that post-route improvements over RePlAce and DREAMPlace may reach beyond the motivating application to machine learning accelerators.
Escaping Local Optima in Global Placement
Xue, Ke, Lin, Xi, Shi, Yunqi, Kai, Shixiong, Xu, Siyuan, Qian, Chao
Placement is crucial in the physical design, as it greatly affects power, performance, and area metrics. Recent advancements in analytical methods, such as DREAMPlace, have demonstrated impressive performance in global placement. However, DREAMPlace has some limitations, e.g., may not guarantee legalizable placements under the same settings, leading to fragile and unpredictable results. This paper highlights the main issue as being stuck in local optima, and proposes a hybrid optimization framework to efficiently escape the local optima, by perturbing the placement result iteratively. The proposed framework achieves significant improvements compared to state-of-the-art methods on two popular benchmarks.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Asia > China > Jiangsu Province > Nanjing (0.06)
- (7 more...)
Delving into Macro Placement with Reinforcement Learning
Jiang, Zixuan, Songhori, Ebrahim, Wang, Shen, Goldie, Anna, Mirhoseini, Azalia, Jiang, Joe, Lee, Young-Joon, Pan, David Z.
In physical design, human designers typically place macros via trial and error, which is a Markov decision process. Reinforcement learning (RL) methods have demonstrated superhuman performance on the macro placement. In this paper, we propose an extension to this prior work (Mirhoseini et al., 2020). We first describe the details of the policy and value network architecture. We replace the force-directed method with DREAMPlace for placing standard cells in the RL environment. We also compare our improved method with other academic placers on public benchmarks.
- North America > United States > Texas > Travis County > Austin (0.15)
- North America > United States > California > Santa Clara County > Mountain View (0.06)