pdn
Proximal Diffusion Neural Sampler
Guo, Wei, Choi, Jaemoo, Zhu, Yuchen, Tao, Molei, Chen, Yongxin
The task of learning a diffusion-based neural sampler for drawing samples from an unnormalized target distribution can be viewed as a stochastic optimal control problem on path measures. However, the training of neural samplers can be challenging when the target distribution is multimodal with significant barriers separating the modes, potentially leading to mode collapse. We propose a framework named \textbf{Proximal Diffusion Neural Sampler (PDNS)} that addresses these challenges by tackling the stochastic optimal control problem via proximal point method on the space of path measures. PDNS decomposes the learning process into a series of simpler subproblems that create a path gradually approaching the desired distribution. This staged procedure traces a progressively refined path to the desired distribution and promotes thorough exploration across modes. For a practical and efficient realization, we instantiate each proximal step with a proximal weighted denoising cross-entropy (WDCE) objective. We demonstrate the effectiveness and robustness of PDNS through extensive experiments on both continuous and discrete sampling tasks, including challenging scenarios in molecular dynamics and statistical physics.
- North America > United States (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland (0.04)
Hierarchical Decoupling Capacitor Optimization for Power Distribution Network of 2.5D ICs with Co-Analysis of Frequency and Time Domains Based on Deep Reinforcement Learning
Duan, Yuanyuan, Feng, Haiyang, Yu, Zhiping, Wu, Hanming, Shao, Leilai, Zhu, Xiaolei
With the growing need for higher memory bandwidth and computation density, 2.5D design, which involves integrating multiple chiplets onto an interposer, emerges as a promising solution. However, this integration introduces significant challenges due to increasing data rates and a large number of I/Os, necessitating advanced optimization of the power distribution networks (PDNs) both on-chip and on-interposer to mitigate the small signal noise and simultaneous switching noise (SSN). Traditional PDN optimization strategies in 2.5D systems primarily focus on reducing impedance by integrating decoupling capacitors (decaps) to lessen small signal noises. Unfortunately, relying solely on frequency-domain analysis has been proven inadequate for addressing coupled SSN, as indicated by our experimental results. In this work, we introduce a novel two-phase optimization flow using deep reinforcement learning to tackle both the on-chip small signal noise and SSN. Initially, we optimize the impedance in the frequency domain to maintain the small signal noise within acceptable limits while avoiding over-design. Subsequently, in the time domain, we refine the PDN to minimize the voltage violation integral (VVI), a more accurate measure of SSN severity. To the best of our knowledge, this is the first dual-domain optimization strategy that simultaneously addresses both the small signal noise and SSN propagation through strategic decap placement in on-chip and on-interposer PDNs, offering a significant step forward in the design of robust PDNs for 2.5D integrated systems.
- Energy > Power Industry (0.68)
- Semiconductors & Electronics (0.46)
DevFormer: A Symmetric Transformer for Context-Aware Device Placement
Kim, Haeyeon, Kim, Minsu, Berto, Federico, Kim, Joungho, Park, Jinkyoo
In this paper, we present DevFormer, a novel transformer-based architecture for addressing the complex and computationally demanding problem of hardware design optimization. Despite the demonstrated efficacy of transformers in domains including natural language processing and computer vision, their use in hardware design has been limited by the scarcity of offline data. Our approach addresses this limitation by introducing strong inductive biases such as relative positional embeddings and action-permutation symmetricity that effectively capture the hardware context and enable efficient design optimization with limited offline data. We apply DevFoemer to the problem of decoupling capacitor placement and show that it outperforms state-of-the-art methods in both simulated and real hardware, leading to improved performances while reducing the number of components by more than $30\%$. Finally, we show that our approach achieves promising results in other offline contextual learning-based combinatorial optimization tasks.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Europe (0.04)
- Asia (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Transformer Network-based Reinforcement Learning Method for Power Distribution Network (PDN) Optimization of High Bandwidth Memory (HBM)
Park, Hyunwook, Kim, Minsu, Kim, Seongguk, Kim, Keunwoo, Kim, Haeyeon, Shin, Taein, Son, Keeyoung, Sim, Boogyo, Kim, Subin, Jeong, Seungtaek, Hwang, Chulsoon, Kim, Joungho
In this article, for the first time, we propose a transformer network-based reinforcement learning (RL) method for power distribution network (PDN) optimization of high bandwidth memory (HBM). The proposed method can provide an optimal decoupling capacitor (decap) design to maximize the reduction of PDN self- and transfer impedance seen at multiple ports. An attention-based transformer network is implemented to directly parameterize decap optimization policy. The optimality performance is significantly improved since the attention mechanism has powerful expression to explore massive combinatorial space for decap assignments. Moreover, it can capture sequential relationships between the decap assignments. The computing time for optimization is dramatically reduced due to the reusable network on positions of probing ports and decap assignment candidates. This is because the transformer network has a context embedding process to capture meta-features including probing ports positions. In addition, the network is trained with randomly generated data sets. Therefore, without additional training, the trained network can solve new decap optimization problems. The computing time for training and data cost are critically decreased due to the scalability of the network. Thanks to its shared weight property, the network can adapt to a larger scale of problems without additional training. For verification, we compare the results with conventional genetic algorithm (GA), random search (RS), and all the previous RL-based methods. As a result, the proposed method outperforms in all the following aspects: optimality performance, computing time, and data efficiency.
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > Missouri > Phelps County > Rolla (0.04)
- (5 more...)
Pathfinder Discovery Networks for Neural Message Passing
Rozemberczki, Benedek, Englert, Peter, Kapoor, Amol, Blais, Martin, Perozzi, Bryan
In this work we propose Pathfinder Discovery Networks (PDNs), a method for jointly learning a message passing graph over a multiplex network with a downstream semi-supervised model. PDNs inductively learn an aggregated weight for each edge, optimized to produce the best outcome for the downstream learning task. PDNs are a generalization of attention mechanisms on graphs which allow flexible construction of similarity functions between nodes, edge convolutions, and cheap multiscale mixing layers. We show that PDNs overcome weaknesses of existing methods for graph attention (e.g. Graph Attention Networks), such as the diminishing weight problem. Our experimental results demonstrate competitive predictive performance on academic node classification tasks. Additional results from a challenging suite of node classification experiments show how PDNs can learn a wider class of functions than existing baselines. We analyze the relative computational complexity of PDNs, and show that PDN runtime is not considerably higher than static-graph models. Finally, we discuss how PDNs can be used to construct an easily interpretable attention mechanism that allows users to understand information propagation in the graph.
- Europe (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- (2 more...)
- Information Technology > Communications (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
- Information Technology > Data Science > Data Mining (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Precision disease networks (PDN)
Cabrera, J., Amaratunga, D., Kostis, W., Kostis, J
The a rrows represent the frequency of the relationship from A to B i n the cluster of pa tients, R ed arrows repre sente a frequency of 75% or more of the cluster observations containinig the relation ship, green arrows repre sent a frequency in the range 50% - 75%, whereas yellow arrows repre sent repre sent a frequency in the range 25% - 50%. For step 3 we performed a hierarchical cluster analysis using the WARD method that resulted in 10 clust ers for each of the 4 PCA datasets. Figure 2 shows a scatter plot of the first two components of the first of the cluster analysis where the 10 clusters are shown in different colors. In Figure 3 we display the summaries of the 10 clusters. For step 4 we used the cox proportional hazard model with the variable "all death" as response that measures the date of death for any cause of death. We fitted 7 different models using different combinations of predictors as shown on table 1.
- North America > United States > New Jersey (0.06)
- North America > United States > New York (0.05)
Plan-Recognition-Driven Attention Modeling for Visual Recognition
Zha, Yantian, Li, Yikang, Yu, Tianshu, Kambhampati, Subbarao, Li, Baoxin
Human visual recognition of activities or external agents involves an interplay between high-level plan recognition and low-level perception. Given that, a natural question to ask is: can low-level perception be improved by high-level plan recognition? We formulate the problem of leveraging recognized plans to generate better top-down attention maps \cite{gazzaniga2009,baluch2011} to improve the perception performance. We call these top-down attention maps specifically as plan-recognition-driven attention maps. To address this problem, we introduce the Pixel Dynamics Network. Pixel Dynamics Network serves as an observation model, which predicts next states of object points at each pixel location given observation of pixels and pixel-level action feature. This is like internally learning a pixel-level dynamics model. Pixel Dynamics Network is a kind of Convolutional Neural Network (ConvNet), with specially-designed architecture. Therefore, Pixel Dynamics Network could take the advantage of parallel computation of ConvNets, while learning the pixel-level dynamics model. We further prove the equivalence between Pixel Dynamics Network as an observation model, and the belief update in partially observable Markov decision process (POMDP) framework. We evaluate our Pixel Dynamics Network in event recognition tasks. We build an event recognition system, ER-PRN, which takes Pixel Dynamics Network as a subroutine, to recognize events based on observations augmented by plan-recognition-driven attention.
- North America > United States > California > Los Angeles County > Pasadena (0.04)
- North America > United States > Arizona (0.04)
- Africa > Middle East > Djibouti > Arta > `Arta (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Poisson Sum-Product Networks: A Deep Architecture for Tractable Multivariate Poisson Distributions
Molina, Alejandro (Technische Universität Dortmund) | Natarajan, Sriraam (Indiana University) | Kersting, Kristian (Technische Universität Dortmund)
Multivariate count data are pervasive in science in the form of histograms, contingency tables and others. Previous work on modeling this type of distributions do not allow for fast and tractable inference. In this paper we present a novel Poisson graphical model, the first based on sum product networks, called PSPN, allowing for positive as well as negative dependencies. We present algorithms for learning tree PSPNs from data as well as for tractable inference via symbolic evaluation. With these, information-theoretic measures such as entropy, mutual information, and distances among count variables can be computed without resorting to approximations. Additionally, we show a connection between PSPNs and LDA, linking the structure of tree PSPNs to a hierarchy of topics. The experimental results on several synthetic and real world datasets demonstrate that PSPN often outperform state-of-the-art while remaining tractable.
- North America > United States > Indiana (0.04)
- Europe > Germany (0.04)
- North America > United States > Michigan > Wayne County > Detroit (0.04)
- Europe > Portugal > Porto > Porto (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.41)