Jin, Yifei
Estimating Voltage Drop: Models, Features and Data Representation Towards a Neural Surrogate
Jin, Yifei, Koutlis, Dimitrios, Bandala, Hector, Daoutis, Marios
Accurate estimation of voltage drop (IR drop) in modern Application-Specific Integrated Circuits (ASICs) is highly time- and resource-demanding, due to the growing complexity and transistor density of recent technology nodes. To mitigate this challenge, we investigate how Machine Learning (ML) techniques, including Extreme Gradient Boosting (XGBoost), Convolutional Neural Networks (CNNs), and Graph Neural Networks (GNNs), can reduce the computational effort, and thereby the time, required to estimate IR drop in Integrated Circuits (ICs). These ML models are explored as an alternative that offers quick and precise IR drop estimation in considerably less time than conventional sign-off simulation. This study illustrates the effectiveness of ML algorithms in precisely estimating IR drop and optimizing ASIC sign-off.
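To make the boosting-based side of this estimation concrete, here is a minimal sketch of a gradient-boosting surrogate for per-tile IR drop regression. The features, data, and model settings are synthetic placeholders for illustration, not the paper's dataset or configuration.

```python
# Minimal sketch (illustrative only): an XGBoost surrogate for per-tile IR drop regression.
# Features and targets below are synthetic placeholders, not the paper's design data.
import numpy as np
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n_tiles = 5000
# Hypothetical per-tile features: local current demand, PDN resistance,
# cell density, and distance to the nearest power pad.
X = rng.normal(size=(n_tiles, 4))
# Synthetic target loosely mimicking I*R behaviour plus noise.
y = 0.7 * X[:, 0] * np.abs(X[:, 1]) + 0.2 * X[:, 2] + 0.05 * rng.normal(size=n_tiles)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = XGBRegressor(n_estimators=300, max_depth=6, learning_rate=0.05)
model.fit(X_tr, y_tr)
print("MAE (synthetic units):", mean_absolute_error(y_te, model.predict(X_te)))
```

In practice the same interface extends to CNN or GNN surrogates over the layout grid or netlist graph; the boosting variant is shown here only because it needs the least scaffolding.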
SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate
Jin, Yifei, Maatouk, Ali, Girdzijauskas, Sarunas, Xu, Shugong, Tassiulas, Leandros, Ying, Rex
Wireless ray-tracing (RT) is emerging as a key tool for three-dimensional (3D) wireless channel modeling, driven by advances in graphical rendering. Current approaches struggle to accurately model beyond 5G (B5G) network signaling, which often operates at higher frequencies and is more susceptible to environmental conditions and changes. Existing online learning solutions require real-time environmental supervision during training, which is both costly and incompatible with GPU-based processing. In response, we propose a novel approach that redefines ray trajectory generation as a sequential decision-making problem, leveraging generative models to jointly learn the optical, physical, and signal properties within each designated environment. Our work introduces the Scene-Aware Neural Decision Wireless Channel Raytracing Hierarchy (SANDWICH), an innovative offline, fully differentiable approach that can be trained entirely on GPUs. SANDWICH offers superior performance compared to existing online learning methods, outperforming the baseline by 4×10⁻² radians in RT accuracy while trailing the top-line channel gain estimation by only 0.5 dB.
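The core idea of treating ray trajectory generation as sequential decision-making can be illustrated with a small autoregressive decoder. This is a loose sketch under assumed choices (discretised bounce directions, a GRU decoder, a generic scene embedding), not the SANDWICH architecture itself.

```python
# Minimal sketch (illustrative assumptions, not the SANDWICH model): ray generation
# framed as sequential decisions, decoded autoregressively and trained offline on GPU.
import torch
import torch.nn as nn

class RaySequenceDecoder(nn.Module):
    def __init__(self, scene_dim=64, n_directions=32, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_directions + 1, hidden)   # +1 for a start token
        self.scene_proj = nn.Linear(scene_dim, hidden)
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_directions)           # logits over discretised bounce directions

    def forward(self, scene_feat, prev_tokens):
        # scene_feat: (B, scene_dim) environment embedding; prev_tokens: (B, T) previous decisions
        h0 = self.scene_proj(scene_feat).unsqueeze(0)          # initial hidden state from the scene
        out, _ = self.gru(self.embed(prev_tokens), h0)
        return self.head(out)                                  # (B, T, n_directions)

# Teacher-forced training step on random placeholder data.
decoder = RaySequenceDecoder()
scene = torch.randn(8, 64)
traj = torch.randint(1, 33, (8, 6))                            # ground-truth bounce tokens
inputs = torch.cat([torch.zeros(8, 1, dtype=torch.long), traj[:, :-1]], dim=1)
logits = decoder(scene, inputs)
loss = nn.functional.cross_entropy(logits.reshape(-1, 32), (traj - 1).reshape(-1))
loss.backward()
```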
Generalizing soft actor-critic algorithms to discrete action spaces
Zhang, Le, Gu, Yong, Zhao, Xin, Zhang, Yanshuo, Zhao, Shu, Jin, Yifei, Wu, Xinxin
ATARI is a suite of video games used by reinforcement learning (RL) researchers to test the effectiveness of learning algorithms. Receiving only the raw pixels and the game score, the agent learns to develop sophisticated strategies, even to a level comparable to a professional human games tester. Ideally, we also want an agent requiring very few interactions with the environment. Previous competitive model-free algorithms for the task use the value-based Rainbow algorithm without any policy head. In this paper, we change this by proposing a practical discrete variant of the soft actor-critic (SAC) algorithm. The new variant enables off-policy learning using policy heads for discrete domains. By incorporating it into the advanced Rainbow variant, i.e., the ``bigger, better, faster'' (BBF) agent, the resulting SAC-BBF improves the previous state-of-the-art interquartile mean (IQM) from 1.045 to 1.088, and it achieves these results using only a replay ratio (RR) of 2. With this lower RR of 2, the training time of SAC-BBF is strictly one-third of the time required for BBF to achieve an IQM of 1.045 using RR 8. As an IQM greater than one indicates super-human performance, SAC-BBF is also the only model-free algorithm reaching a super-human level using only RR 2. The code is publicly available on GitHub at https://github.com/lezhang-thu/bigger-better-faster-SAC.
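The key change when moving SAC to discrete actions is that the policy becomes categorical, so expectations over actions can be computed exactly from the action probabilities rather than sampled. The sketch below shows those two quantities (actor loss and soft state value) under placeholder network sizes; it is not the SAC-BBF configuration.

```python
# Minimal sketch of the discrete-action SAC objective (placeholder sizes, not SAC-BBF):
# with a categorical policy, expectations over actions are exact sums over action probabilities.
import torch
import torch.nn as nn

n_actions, obs_dim, alpha = 6, 84, 0.2

policy = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, n_actions))
q1 = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, n_actions))
q2 = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, n_actions))

obs = torch.randn(32, obs_dim)
logits = policy(obs)
probs = torch.softmax(logits, dim=-1)
log_probs = torch.log_softmax(logits, dim=-1)
q_min = torch.min(q1(obs), q2(obs))          # clipped double-Q estimate

# Actor loss: E_{a~pi}[ alpha * log pi(a|s) - Q(s,a) ], exact over discrete actions.
actor_loss = (probs * (alpha * log_probs - q_min.detach())).sum(dim=-1).mean()

# Soft state value used in the critic target: V(s) = E_{a~pi}[ Q(s,a) - alpha * log pi(a|s) ].
soft_value = (probs * (q_min - alpha * log_probs)).sum(dim=-1)
actor_loss.backward()
```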
Are We Wasting Time? A Fast, Accurate Performance Evaluation Framework for Knowledge Graph Link Predictors
Cornell, Filip, Jin, Yifei, Karlgren, Jussi, Girdzijauskas, Sarunas
The standard evaluation protocol for measuring the quality of Knowledge Graph Completion methods - the task of inferring new links to be added to a graph - typically involves a step which ranks every entity of a Knowledge Graph to assess its fit as the head or tail of a candidate link to be added. For large-scale Knowledge Graphs, this task rapidly becomes prohibitively expensive. Previous approaches mitigate this problem by using random sampling of entities to assess the quality of links predicted or suggested by a method. However, we show that this approach has serious limitations, since the ranking metrics produced do not properly reflect true outcomes. In this paper, we present a thorough analysis of these effects along with the following findings. First, we empirically find and theoretically motivate why sampling uniformly at random vastly overestimates the ranking performance of a method. We show that this can be attributed to the effect of easy versus hard negative candidates. Second, we propose a framework that uses relational recommenders to guide the selection of candidates for evaluation. We provide both theoretical and empirical justification of our methodology, and find that simple and fast methods can work extremely well, matching advanced neural approaches. Even when a large portion of true candidates for a property are missed, the estimation barely deteriorates. With our proposed framework, we can reduce the time and computation needed to levels similar to random sampling strategies while vastly improving the estimation; on ogbl-wikikg2, we show that accurate estimations of the full, filtered ranking can be obtained in 20 seconds instead of 30 minutes. We conclude that considerable computational effort can be saved by effective preprocessing and sampling methods while still reliably estimating the true performance of the entire ranking procedure.
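The overestimation effect of uniform negative sampling is easy to see numerically: with only k random negatives, the reported rank of the true entity is capped at k+1 no matter where it falls among all entities. The sketch below uses random placeholder scores, not a trained link predictor, to illustrate the gap between the full filtered rank and the sampled rank.

```python
# Minimal sketch (random placeholder scores, not a trained model): why uniform
# negative sampling inflates ranking metrics for knowledge graph link prediction.
import numpy as np

rng = np.random.default_rng(0)
n_entities, k = 100_000, 1000
true_score = 0.3
neg_scores = rng.normal(size=n_entities - 1)        # scores of all other candidate entities

full_rank = 1 + np.sum(neg_scores > true_score)     # filtered rank against every entity
sampled = rng.choice(neg_scores, size=k, replace=False)
sampled_rank = 1 + np.sum(sampled > true_score)     # rank against k uniform negatives

print("full reciprocal rank:   ", 1.0 / full_rank)
print("sampled reciprocal rank:", 1.0 / sampled_rank)   # typically far higher than the full value
```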
Learning Cellular Coverage from Real Network Configurations using GNNs
Jin, Yifei, Daoutis, Marios, Girdzijauskas, Sarunas, Gionis, Aristides
Cellular coverage quality estimation has been a critical task for self-organized networks. In real-world scenarios, deep-learning-powered coverage quality estimation methods cannot scale up to large areas because little ground truth can be provided during network design & optimization. In addition, they fall short in producing expressive embeddings that adequately capture the variations of the cells' configurations. To deal with this challenge, we formulate the task in a graph representation so that we can apply state-of-the-art graph neural networks, which show exemplary performance. We propose a novel training framework that can produce quality cell configuration embeddings for estimating multiple KPIs, and we show it is capable of generalising to large (area-wide) scenarios given very few labeled cells. We show that our framework yields accuracy comparable to models that have been trained using massively labeled samples.
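The graph formulation can be illustrated with a small node-regression sketch: cells become nodes carrying configuration features, neighbour information is aggregated by message passing, and a KPI head is supervised on only a few labelled cells. Everything below (graph, features, KPI, architecture) is a synthetic placeholder, not the paper's framework or data.

```python
# Minimal sketch (synthetic placeholders, not the paper's framework): cells as graph
# nodes, mean-aggregation message passing, and a KPI head trained on few labelled cells.
import torch
import torch.nn as nn

n_cells, feat_dim, hidden = 200, 16, 64
x = torch.randn(n_cells, feat_dim)                       # per-cell configuration features
adj = (torch.rand(n_cells, n_cells) < 0.05).float()      # random neighbour relation
adj = ((adj + adj.t()) > 0).float()
deg = adj.sum(dim=1, keepdim=True).clamp(min=1)

class MeanAggGNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin1 = nn.Linear(2 * feat_dim, hidden)
        self.lin2 = nn.Linear(2 * hidden, hidden)
        self.head = nn.Linear(hidden, 1)                  # single KPI head for the sketch

    def layer(self, h, lin):
        neigh = (adj @ h) / deg                           # mean over neighbouring cells
        return torch.relu(lin(torch.cat([h, neigh], dim=-1)))

    def forward(self, h):
        h = self.layer(h, self.lin1)
        h = self.layer(h, self.lin2)
        return self.head(h).squeeze(-1)

model = MeanAggGNN()
labelled = torch.randperm(n_cells)[:20]                  # only a few cells carry ground truth
kpi = torch.randn(n_cells)
loss = nn.functional.mse_loss(model(x)[labelled], kpi[labelled])
loss.backward()
```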
A Graph Attention Learning Approach to Antenna Tilt Optimization
Jin, Yifei, Vannella, Filippo, Bouton, Maxime, Jeong, Jaeseong, Hakim, Ezeddin Al
6G will move mobile networks towards increasing levels of complexity. To deal with this complexity, optimization of network parameters is key to ensure high performance and timely adaptivity to dynamic network environments. The optimization of the antenna tilt provides a practical and cost-efficient method to improve coverage and capacity in the network. Previous methods based on Reinforcement Learning (RL) have shown great promise for tilt optimization by learning adaptive policies that outperform traditional tilt optimization methods. However, most existing RL methods are based on single-cell feature representations, which fail to fully characterize the agent state, resulting in suboptimal performance. Moreover, most such methods lack scalability, due to state-action space explosion, and generalization ability. In this paper, we propose a Graph Attention Q-learning (GAQ) algorithm for tilt optimization. GAQ relies on a graph attention mechanism to select relevant neighbor information, improve the agent state representation, and update the tilt control policy based on a history of observations using a Deep Q-Network (DQN). We show that GAQ efficiently captures important network information and outperforms standard DQN with local information by a large margin. In addition, we demonstrate its ability to generalize to network deployments of different sizes and densities.
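The combination of attention over neighbouring cells and a Q-head over discrete tilt actions can be sketched in a few lines. The dimensions, attention form, and three-way action set below are illustrative assumptions, not the exact GAQ architecture.

```python
# Minimal sketch (illustrative assumptions, not the exact GAQ model): attention over
# neighbouring cells builds the agent state, then a Q-head scores discrete tilt actions.
import torch
import torch.nn as nn

feat_dim, hidden, n_tilt_actions = 12, 64, 3   # e.g. tilt down / keep / tilt up

class GraphAttentionQ(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(feat_dim, hidden)
        self.attn = nn.Linear(2 * hidden, 1)
        self.q_head = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                    nn.Linear(hidden, n_tilt_actions))

    def forward(self, self_feat, neigh_feat):
        # self_feat: (B, feat_dim); neigh_feat: (B, N, feat_dim) features of neighbouring cells
        h_self = self.proj(self_feat)                                 # (B, hidden)
        h_neigh = self.proj(neigh_feat)                               # (B, N, hidden)
        pair = torch.cat([h_self.unsqueeze(1).expand_as(h_neigh), h_neigh], dim=-1)
        alpha = torch.softmax(self.attn(pair).squeeze(-1), dim=-1)    # attention over neighbours
        context = (alpha.unsqueeze(-1) * h_neigh).sum(dim=1)          # aggregated neighbour state
        return self.q_head(torch.cat([h_self, context], dim=-1))      # Q-values per tilt action

model = GraphAttentionQ()
q_values = model(torch.randn(4, feat_dim), torch.randn(4, 6, feat_dim))
greedy_tilt_action = q_values.argmax(dim=-1)
```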
DoubleEnsemble: A New Ensemble Method Based on Sample Reweighting and Feature Selection for Financial Data Analysis
Zhang, Chuheng, Li, Yuanqi, Chen, Xi, Jin, Yifei, Tang, Pingzhong, Li, Jian
Modern machine learning models (such as deep neural networks and boosting decision tree models) have become increasingly popular in financial market prediction, due to their superior capacity to extract complex non-linear patterns. However, since financial datasets have very low signal-to-noise ratios and are non-stationary, complex models are often prone to overfitting and suffer from instability issues. Moreover, as various machine learning and data mining tools become more widely used in quantitative trading, many trading firms have been producing an increasing number of features (a.k.a. factors). Therefore, how to automatically select effective features becomes an imminent problem. To address these issues, we propose DoubleEnsemble, an ensemble framework leveraging learning-trajectory-based sample reweighting and shuffling-based feature selection. Specifically, we identify the key samples based on the training dynamics on each sample and elicit key features based on the ablation impact of each feature via shuffling. Our model is applicable to a wide range of base models, capable of extracting complex patterns, while mitigating the overfitting and instability issues for financial market prediction. We conduct extensive experiments, including price prediction for cryptocurrencies and stock trading, using both DNN and gradient boosting decision tree as base models. Our experiment results demonstrate that DoubleEnsemble achieves superior performance compared with several baseline methods.
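The shuffling-based feature selection idea can be illustrated with a small permutation-importance sketch: a feature matters if permuting its column degrades validation error. The data, base model, and threshold below are synthetic placeholders, and the learning-trajectory-based sample reweighting half of DoubleEnsemble is omitted for brevity.

```python
# Minimal sketch (synthetic placeholders, not the DoubleEnsemble implementation):
# shuffling-based feature selection via permutation importance on a validation split.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n, d = 2000, 10
X = rng.normal(size=(n, d))
y = 2 * X[:, 0] - X[:, 3] + 0.1 * rng.normal(size=n)     # only features 0 and 3 matter

X_tr, X_va, y_tr, y_va = X[:1500], X[1500:], y[:1500], y[1500:]
model = GradientBoostingRegressor().fit(X_tr, y_tr)
base_err = mean_squared_error(y_va, model.predict(X_va))

importance = []
for j in range(d):
    X_shuf = X_va.copy()
    X_shuf[:, j] = rng.permutation(X_shuf[:, j])          # destroy feature j's information
    importance.append(mean_squared_error(y_va, model.predict(X_shuf)) - base_err)

keep = [j for j, imp in enumerate(importance) if imp > 0.01 * base_err]
print("selected features:", keep)                         # expected: [0, 3]
```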