Proximal Diffusion Neural Sampler

Guo, Wei, Choi, Jaemoo, Zhu, Yuchen, Tao, Molei, Chen, Yongxin

arXiv.org Machine Learning

The task of learning a diffusion-based neural sampler for drawing samples from an unnormalized target distribution can be viewed as a stochastic optimal control problem on path measures. However, the training of neural samplers can be challenging when the target distribution is multimodal with significant barriers separating the modes, potentially leading to mode collapse. We propose a framework named \textbf{Proximal Diffusion Neural Sampler (PDNS)} that addresses these challenges by tackling the stochastic optimal control problem via the proximal point method on the space of path measures. PDNS decomposes the learning process into a series of simpler subproblems whose solutions gradually approach the desired distribution. This staged procedure traces a progressively refined path toward the target and promotes thorough exploration across modes. For a practical and efficient realization, we instantiate each proximal step with a proximal weighted denoising cross-entropy (WDCE) objective. We demonstrate the effectiveness and robustness of PDNS through extensive experiments on both continuous and discrete sampling tasks, including challenging scenarios in molecular dynamics and statistical physics.
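The proximal point scheme the abstract describes can be written schematically as follows. This is the generic proximal point template, assuming a KL proximal term; the paper's actual divergence, objective $\mathcal{F}$, and step-size schedule $\lambda_k$ may differ:

```latex
\mathbb{P}_{k+1} \;=\; \arg\min_{\mathbb{P}} \;\; \mathcal{F}(\mathbb{P}) \;+\; \frac{1}{\lambda_k}\,\mathrm{KL}\!\left(\mathbb{P} \,\|\, \mathbb{P}_k\right),
```

where $\mathcal{F}$ is the control objective on path measures and each subproblem stays close to the previous iterate $\mathbb{P}_k$, which is what yields the gradually refined path toward the target.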


Hierarchical Decoupling Capacitor Optimization for Power Distribution Network of 2.5D ICs with Co-Analysis of Frequency and Time Domains Based on Deep Reinforcement Learning

Duan, Yuanyuan, Feng, Haiyang, Yu, Zhiping, Wu, Hanming, Shao, Leilai, Zhu, Xiaolei

arXiv.org Artificial Intelligence

With the growing need for higher memory bandwidth and computation density, 2.5D design, which integrates multiple chiplets onto an interposer, emerges as a promising solution. However, this integration introduces significant challenges due to increasing data rates and a large number of I/Os, necessitating advanced optimization of the power distribution networks (PDNs) both on-chip and on-interposer to mitigate small-signal noise and simultaneous switching noise (SSN). Traditional PDN optimization strategies in 2.5D systems primarily focus on reducing impedance by integrating decoupling capacitors (decaps) to suppress small-signal noise. Unfortunately, relying solely on frequency-domain analysis has proven inadequate for addressing coupled SSN, as indicated by our experimental results. In this work, we introduce a novel two-phase optimization flow using deep reinforcement learning to tackle both the on-chip small-signal noise and SSN. Initially, we optimize the impedance in the frequency domain to keep the small-signal noise within acceptable limits while avoiding over-design. Subsequently, in the time domain, we refine the PDN to minimize the voltage violation integral (VVI), a more accurate measure of SSN severity. To the best of our knowledge, this is the first dual-domain optimization strategy that simultaneously addresses both small-signal noise and SSN propagation through strategic decap placement in on-chip and on-interposer PDNs, offering a significant step forward in the design of robust PDNs for 2.5D integrated systems.
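The voltage violation integral used in the time-domain phase can be illustrated with a small numeric sketch: integrate the area by which the supply voltage dips below a threshold. The waveform, threshold, and units below are invented for illustration and are not from the paper.

```python
# Sketch: voltage violation integral (VVI) over a sampled supply waveform.
# Threshold and waveform values are illustrative stand-ins.

def vvi(times, voltages, v_min):
    """Integrate the area by which the voltage dips below v_min,
    using the trapezoidal rule on the violation signal max(v_min - v, 0)."""
    total = 0.0
    for (t0, v0), (t1, v1) in zip(zip(times, voltages),
                                  zip(times[1:], voltages[1:])):
        d0 = max(v_min - v0, 0.0)
        d1 = max(v_min - v1, 0.0)
        total += 0.5 * (d0 + d1) * (t1 - t0)
    return total

# A droop event: the supply dips from 0.90 V to 0.84 V and recovers.
t = [0.0, 1.0, 2.0, 3.0, 4.0]        # time samples (ns)
v = [0.90, 0.86, 0.84, 0.88, 0.90]   # voltage samples (V)
print(vvi(t, v, v_min=0.85))
```

A larger VVI means deeper or longer excursions below the threshold, which is why it tracks SSN severity better than a single impedance number.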


DevFormer: A Symmetric Transformer for Context-Aware Device Placement

Kim, Haeyeon, Kim, Minsu, Berto, Federico, Kim, Joungho, Park, Jinkyoo

arXiv.org Artificial Intelligence

In this paper, we present DevFormer, a novel transformer-based architecture for addressing the complex and computationally demanding problem of hardware design optimization. Despite the demonstrated efficacy of transformers in domains including natural language processing and computer vision, their use in hardware design has been limited by the scarcity of offline data. Our approach addresses this limitation by introducing strong inductive biases such as relative positional embeddings and action-permutation symmetry that effectively capture the hardware context and enable efficient design optimization with limited offline data. We apply DevFormer to the problem of decoupling capacitor placement and show that it outperforms state-of-the-art methods in both simulated and real hardware, leading to improved performance while reducing the number of components by more than $30\%$. Finally, we show that our approach achieves promising results in other offline contextual learning-based combinatorial optimization tasks.
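The action-permutation symmetry mentioned above means that the value of a set of decap placements should not depend on the order in which the placements are listed. A toy illustration, with a mean-pooled encoder standing in for the paper's symmetric transformer (the feature map and weights here are hypothetical):

```python
# Toy illustration of action-permutation symmetry: scoring a set of
# (port, decap_type) placements with an order-invariant encoder.
# Feature map and linear-head weights are hypothetical.
from itertools import permutations

def embed(placement):
    """Map a (port, decap_type) placement to a tiny feature vector."""
    port, decap = placement
    return (float(port), float(decap), float(port * decap))

def score(placements):
    """Mean-pool placement embeddings, then apply a fixed linear head.
    Mean pooling makes the score invariant to the placement order."""
    feats = [embed(p) for p in placements]
    pooled = [sum(col) / len(feats) for col in zip(*feats)]
    weights = (0.5, -0.2, 0.1)
    return sum(w * x for w, x in zip(weights, pooled))

actions = [(1, 3), (4, 2), (2, 2)]
scores = {score(list(p)) for p in permutations(actions)}
print(len(scores))  # → 1: every ordering gets the same score
```

Baking this invariance into the architecture means the model never has to spend its limited offline data learning that reorderings are equivalent.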


Transformer Network-based Reinforcement Learning Method for Power Distribution Network (PDN) Optimization of High Bandwidth Memory (HBM)

Park, Hyunwook, Kim, Minsu, Kim, Seongguk, Kim, Keunwoo, Kim, Haeyeon, Shin, Taein, Son, Keeyoung, Sim, Boogyo, Kim, Subin, Jeong, Seungtaek, Hwang, Chulsoon, Kim, Joungho

arXiv.org Artificial Intelligence

In this article, for the first time, we propose a transformer network-based reinforcement learning (RL) method for power distribution network (PDN) optimization of high bandwidth memory (HBM). The proposed method can provide an optimal decoupling capacitor (decap) design to maximize the reduction of PDN self- and transfer impedance seen at multiple ports. An attention-based transformer network is implemented to directly parameterize the decap optimization policy. The optimality performance is significantly improved since the attention mechanism has the expressive power to explore the massive combinatorial space of decap assignments. Moreover, it can capture sequential relationships between the decap assignments. The computing time for optimization is dramatically reduced because the network is reusable across positions of probing ports and decap assignment candidates: the transformer network has a context embedding process that captures meta-features, including probing port positions. In addition, the network is trained with randomly generated data sets. Therefore, without additional training, the trained network can solve new decap optimization problems. The computing time for training and the data cost are critically decreased due to the scalability of the network. Thanks to its shared-weight property, the network can adapt to larger-scale problems without additional training. For verification, we compare the results with a conventional genetic algorithm (GA), random search (RS), and all the previous RL-based methods. As a result, the proposed method outperforms all of them in optimality performance, computing time, and data efficiency.
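The sequential decap assignment described above can be sketched as repeated masked selection over candidate positions: the policy scores remaining positions, picks one, and masks it out before the next step. The logits below are a fixed stand-in for the trained attention policy's outputs, so only the decoding mechanics are shown.

```python
# Sketch: sequential decap placement as repeated masked selection.
# A trained transformer would produce the logits; here they are fixed
# so the masking and greedy decoding loop is visible.
import math

def masked_softmax(logits, mask):
    """Softmax over positions where mask is True; others get probability 0."""
    exps = [math.exp(l) if m else 0.0 for l, m in zip(logits, mask)]
    z = sum(exps)
    return [e / z for e in exps]

def greedy_place(logits, n_decaps):
    """Pick n_decaps port positions one at a time, masking chosen ones."""
    mask = [True] * len(logits)
    chosen = []
    for _ in range(n_decaps):
        probs = masked_softmax(logits, mask)
        best = max(range(len(probs)), key=probs.__getitem__)
        chosen.append(best)
        mask[best] = False
    return chosen

logits = [0.2, 1.5, -0.3, 0.9, 1.1]
print(greedy_place(logits, 3))  # → [1, 4, 3]
```

During RL training the selection would be sampled from `probs` rather than taken greedily; in a real policy the logits would also be recomputed after each placement, which is how sequential relationships between assignments are captured.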


Pathfinder Discovery Networks for Neural Message Passing

Rozemberczki, Benedek, Englert, Peter, Kapoor, Amol, Blais, Martin, Perozzi, Bryan

arXiv.org Artificial Intelligence

In this work we propose Pathfinder Discovery Networks (PDNs), a method for jointly learning a message passing graph over a multiplex network with a downstream semi-supervised model. PDNs inductively learn an aggregated weight for each edge, optimized to produce the best outcome for the downstream learning task. PDNs are a generalization of attention mechanisms on graphs which allow flexible construction of similarity functions between nodes, edge convolutions, and cheap multiscale mixing layers. We show that PDNs overcome weaknesses of existing methods for graph attention (e.g. Graph Attention Networks), such as the diminishing weight problem. Our experimental results demonstrate competitive predictive performance on academic node classification tasks. Additional results from a challenging suite of node classification experiments show how PDNs can learn a wider class of functions than existing baselines. We analyze the relative computational complexity of PDNs, and show that PDN runtime is not considerably higher than static-graph models. Finally, we discuss how PDNs can be used to construct an easily interpretable attention mechanism that allows users to understand information propagation in the graph.
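The core PDN operation, learning one aggregated weight per edge from the edge's features across the multiplex layers and then message-passing with it, can be sketched as follows. A fixed linear combination plus softplus stands in for the learned aggregation network, and all weights are hypothetical.

```python
# Sketch: aggregate multiplex edge features into one nonnegative weight,
# then run a single message-passing step over an undirected graph.
# The linear combination stands in for the learned model; weights are
# hypothetical.
import math

def softplus(x):
    """Smooth nonnegative activation, keeping edge weights >= 0."""
    return math.log1p(math.exp(x))

def edge_weight(features, theta):
    """Combine per-layer edge features into one nonnegative weight."""
    return softplus(sum(w * f for w, f in zip(theta, features)))

def propagate(n_nodes, edges, x, theta):
    """One message-passing step: weighted sum of neighbor features."""
    out = [0.0] * n_nodes
    for u, v, feats in edges:
        w = edge_weight(feats, theta)
        out[v] += w * x[u]
        out[u] += w * x[v]
    return out

# Two edge layers (e.g. two similarity functions) give each edge 2 features.
edges = [(0, 1, (1.0, 0.2)), (1, 2, (0.1, 0.9))]
x = [1.0, 2.0, 3.0]
print(propagate(3, edges, x, theta=(0.7, 0.3)))
```

Because `theta` is shared across edges and trained end-to-end with the downstream classifier, the aggregation is optimized for the task rather than fixed a priori, which is the generalization over static graph attention the abstract describes.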


Precision disease networks (PDN)

Cabrera, J., Amaratunga, D., Kostis, W., Kostis, J.

arXiv.org Machine Learning

The arrows represent the frequency of the relationship from A to B in the cluster of patients: red arrows represent a frequency of 75% or more of the cluster observations containing the relationship, green arrows represent a frequency in the range 50%–75%, and yellow arrows represent a frequency in the range 25%–50%. For step 3 we performed a hierarchical cluster analysis using Ward's method, which resulted in 10 clusters for each of the 4 PCA datasets. Figure 2 shows a scatter plot of the first two components of the first cluster analysis, where the 10 clusters are shown in different colors. In Figure 3 we display the summaries of the 10 clusters. For step 4 we used the Cox proportional hazards model with the variable "all death" as the response, which records the date of death from any cause. We fitted 7 different models using different combinations of predictors, as shown in Table 1.


Plan-Recognition-Driven Attention Modeling for Visual Recognition

Zha, Yantian, Li, Yikang, Yu, Tianshu, Kambhampati, Subbarao, Li, Baoxin

arXiv.org Artificial Intelligence

Human visual recognition of activities or external agents involves an interplay between high-level plan recognition and low-level perception. Given that, a natural question to ask is: can low-level perception be improved by high-level plan recognition? We formulate the problem of leveraging recognized plans to generate better top-down attention maps \cite{gazzaniga2009,baluch2011} to improve perception performance. We refer to these top-down attention maps as plan-recognition-driven attention maps. To address this problem, we introduce the Pixel Dynamics Network. The Pixel Dynamics Network serves as an observation model, which predicts the next states of object points at each pixel location given observations of pixels and pixel-level action features. In effect, it internally learns a pixel-level dynamics model. The Pixel Dynamics Network is a kind of Convolutional Neural Network (ConvNet) with a specially designed architecture; it can therefore take advantage of the parallel computation of ConvNets while learning the pixel-level dynamics model. We further prove the equivalence between the Pixel Dynamics Network as an observation model and the belief update in the partially observable Markov decision process (POMDP) framework. We evaluate our Pixel Dynamics Network on event recognition tasks. We build an event recognition system, ER-PRN, which takes the Pixel Dynamics Network as a subroutine, to recognize events based on observations augmented by plan-recognition-driven attention.
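The POMDP belief update that the abstract relates to its observation model is, for the discrete case, b'(s') ∝ O(o | s') Σ_s T(s' | s, a) b(s). A minimal numeric sketch with toy transition and observation matrices (not from the paper):

```python
# Minimal discrete POMDP belief update:
#   b'(s') ∝ O(o | s') * sum_s T(s' | s, a) * b(s)
# T and O below are toy two-state matrices for illustration.

def belief_update(belief, T, O, obs):
    """T[s][s2]: transition prob for a fixed action; O[s2][o]: obs prob."""
    n = len(belief)
    # Predict: push the belief through the transition model.
    pred = [sum(T[s][s2] * belief[s] for s in range(n)) for s2 in range(n)]
    # Correct: weight by the observation likelihood, then normalize.
    post = [O[s2][obs] * pred[s2] for s2 in range(n)]
    z = sum(post)
    return [p / z for p in post]

T = [[0.9, 0.1], [0.2, 0.8]]   # transition matrix for one fixed action
O = [[0.7, 0.3], [0.1, 0.9]]   # observation likelihoods per state
b = belief_update([0.5, 0.5], T, O, obs=0)
print(b)  # posterior concentrates on state 0, which explains obs 0 best
```

The paper's claimed equivalence is that its convolutional observation model performs this predict-then-correct computation at every pixel location in parallel.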


Poisson Sum-Product Networks: A Deep Architecture for Tractable Multivariate Poisson Distributions

Molina, Alejandro (Technische Universität Dortmund) | Natarajan, Sriraam (Indiana University) | Kersting, Kristian (Technische Universität Dortmund)

AAAI Conferences

Multivariate count data are pervasive in science in the form of histograms, contingency tables and others. Previous work on modeling this type of distribution does not allow for fast and tractable inference. In this paper we present a novel Poisson graphical model, the first based on sum-product networks, called PSPN, allowing for positive as well as negative dependencies. We present algorithms for learning tree PSPNs from data as well as for tractable inference via symbolic evaluation. With these, information-theoretic measures such as entropy, mutual information, and distances among count variables can be computed without resorting to approximations. Additionally, we show a connection between PSPNs and LDA, linking the structure of tree PSPNs to a hierarchy of topics. The experimental results on several synthetic and real-world datasets demonstrate that PSPNs often outperform the state-of-the-art while remaining tractable.
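A minimal PSPN-style evaluation, with Poisson leaves, product nodes that factorize the variables, and a sum node mixing two product nodes, can be sketched as follows. The structure, rates, and mixture weights are made up for illustration, not a learned model.

```python
# Tiny sum-product network with Poisson leaves, evaluated in log space.
# Structure, rates, and sum-node weights are illustrative.
import math

def log_poisson(k, lam):
    """log P(X = k) for a Poisson(lam) leaf."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def log_pspn(x1, x2):
    """Sum node over two product nodes; each product factorizes (x1, x2)."""
    comp1 = log_poisson(x1, 2.0) + log_poisson(x2, 7.0)  # "low/high" component
    comp2 = log_poisson(x1, 6.0) + log_poisson(x2, 1.0)  # "high/low" component
    w1, w2 = 0.4, 0.6                                    # sum-node weights
    # Log-sum-exp of the weighted components, for numerical stability.
    m = max(comp1, comp2)
    return m + math.log(w1 * math.exp(comp1 - m) + w2 * math.exp(comp2 - m))

print(log_pspn(2, 7))   # a count pair well explained by the first component
print(log_pspn(6, 1))   # a count pair well explained by the second component
```

Because every node is either a product of independent leaves or a convex mixture, the whole expression stays a valid distribution, which is the tractability property the abstract exploits for exact entropy and mutual-information computations.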