DNT: a Deeply Normalized Transformer that can be trained by Momentum SGD

Qi, Xianbiao, Chen, Marco, Xiao, Wenjie, Ye, Jiaquan, He, Yelin, Li, Chun-Guang, Lin, Zhouchen

arXiv.org Artificial Intelligence

Transformers have become the de facto backbone of modern deep learning, yet their training typically demands an advanced optimizer with an adaptive learning rate, such as AdamW, rather than momentum SGD with decoupled weight decay (mSGDW). Previous works show that this is mainly due to the heavy-tailed distribution of the gradients. In this paper, we introduce the Deeply Normalized Transformer (DNT), which is meticulously engineered to overcome this limitation, enabling seamless training with vanilla mSGDW while yielding performance comparable to Transformers trained via AdamW. Specifically, in DNT we strategically integrate normalization techniques at appropriate positions in the Transformer to effectively modulate the Jacobian matrices of each layer, balance the influence of weights, activations, and their interactions, and thereby concentrate the distributions of the gradients. We provide both theoretical justification of the normalization techniques used in DNT and extensive empirical evaluation on two popular Transformer architectures to validate that: a) DNT outperforms its counterparts (i.e., ViT and GPT), and b) DNT can be effectively trained with vanilla mSGDW.
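The abstract does not say exactly where the normalizations are placed; as a rough illustration of the general idea only, the sketch below (the names `rms_norm` and `normalized_linear` are hypothetical, not from the paper) normalizes both the activations and the weight rows of a toy linear layer, so the output scale is decoupled from the raw magnitudes of the inputs:

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    # Normalize each row to (approximately) unit root-mean-square.
    return x / np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)

def normalized_linear(x, w):
    # Hypothetical "deeply normalized" linear layer: normalize both the
    # activations and the weight rows before the matmul, so the output
    # scale is decoupled from the raw magnitudes of x and w.
    return rms_norm(x) @ rms_norm(w).T

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))
w = rng.normal(size=(8, 16))
y1 = normalized_linear(x, w)
y2 = normalized_linear(100.0 * x, 10.0 * w)  # wildly rescaled inputs
print(np.allclose(y1, y2, atol=1e-5))  # True: output is scale-invariant
```

With this kind of scale control at every layer, gradient magnitudes stay concentrated rather than heavy-tailed, which is the property the paper argues makes plain mSGDW sufficient.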


Synergizing AI and Digital Twins for Next-Generation Network Optimization, Forecasting, and Security

Zhang, Zifan, Fang, Minghong, Chen, Dianwei, Yang, Xianfeng, Liu, Yuchen

arXiv.org Artificial Intelligence

Digital network twins (DNTs) are virtual representations of physical networks, designed to enable real-time monitoring, simulation, and optimization of network performance. When integrated with machine learning (ML) techniques, particularly federated learning (FL) and reinforcement learning (RL), DNTs emerge as powerful solutions for managing the complexities of network operations. This article presents a comprehensive analysis of the synergy of DNT, FL, and RL techniques, showcasing their collective potential to address critical challenges in 6G networks. We highlight key technical challenges that need to be addressed, such as ensuring network reliability, achieving joint data-scenario forecasting, and maintaining security in high-risk environments. Additionally, we propose several pipelines that integrate DNT and ML within coherent frameworks to enhance network optimization and security. Case studies demonstrate the practical applications of our proposed pipelines in edge caching and vehicular networks. In edge caching, the pipeline achieves over 80% cache hit rates while balancing base station loads. In the autonomous vehicular system, it ensures a 100% no-collision rate, showcasing its reliability in safety-critical scenarios. By exploring these synergies, we offer insights into the future of intelligent and adaptive network systems that automate decision-making and problem-solving.


Optimizing Wireless Resource Management and Synchronization in Digital Twin Networks

Yu, Hanzhi, Liu, Yuchen, Yang, Zhaohui, Sun, Haijian, Chen, Mingzhe

arXiv.org Artificial Intelligence

In this paper, we investigate accurate synchronization between a physical network and its digital network twin (DNT), which serves as a virtual representation of the physical network. The considered network includes a set of base stations (BSs) that must allocate their limited spectrum resources to serve a set of users while also transmitting their partially observed physical network information to a cloud server to generate the DNT. Since the DNT can predict the physical network status based on its historical status, the BSs may not need to send their physical network information at each time slot, allowing them to conserve spectrum resources to serve the users. However, if the DNT does not receive the physical network information of the BSs over a long time period, the DNT's accuracy in representing the physical network may degrade. To this end, each BS must decide when to send the physical network information to the cloud server to update the DNT, while also determining the spectrum resource allocation policy for both DNT synchronization and serving the users. We formulate this resource allocation task as an optimization problem, aiming to maximize the total data rate of all users while minimizing the asynchronization between the physical network and the DNT. To address this problem, we propose a method based on gated recurrent units (GRUs) and a value decomposition network (VDN). Simulation results show that our GRU- and VDN-based algorithm improves the weighted sum of data rates and the similarity between the status of the DNT and the physical network by up to 28.96%, compared to a baseline method combining GRUs with independent Q-learning.
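The core trade-off, spending each spectrum block either on a user's data rate or on a DNT synchronization update, can be caricatured with a simple scalarized greedy rule. Everything below (`greedy_allocation`, the `lam` weight) is a hypothetical illustration of the objective, not the paper's GRU/VDN method:

```python
def greedy_allocation(user_rates, update_values, n_blocks, lam=0.5):
    # Toy scalarization of the trade-off: a spectrum block given to a user
    # is worth its achievable data rate; a block spent on a DNT update is
    # worth lam times the asynchronization it removes. Pick the
    # n_blocks highest-value assignments.
    candidates = [("user", r) for r in user_rates]
    candidates += [("update", lam * v) for v in update_values]
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:n_blocks]

blocks = greedy_allocation(user_rates=[3.0, 1.0], update_values=[4.0], n_blocks=2)
print(blocks)  # [('user', 3.0), ('update', 2.0)]
```

Raising `lam` models a stricter synchronization requirement: with `lam=2.0` the update block outranks every user in this example. The paper's learned policy additionally accounts for the temporal dynamics the greedy rule ignores, which is why GRUs and value decomposition are used.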


Deep Neural Networks Fused with Textures for Image Classification

Bera, Asish, Bhattacharjee, Debotosh, Nasipuri, Mita

arXiv.org Artificial Intelligence

Fine-grained image classification (FGIC) is a challenging task in computer vision due to small visual differences among sub-categories combined with large intra-class variations. Deep learning methods have achieved remarkable success in solving FGIC. In this paper, we propose a fusion approach to address FGIC by combining global texture with local patch-based information. The first pipeline extracts deep features from various fixed-size non-overlapping patches and encodes the features by sequential modelling using a long short-term memory (LSTM) network. Another path computes image-level textures at multiple scales using local binary patterns (LBP). The advantages of both streams are integrated to represent an efficient feature vector for image classification. The method is tested on eight datasets covering human faces, skin lesions, food dishes, marine life, etc., using four standard backbone CNNs. Our method attains better classification accuracy than existing methods by notable margins.
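The texture stream can be illustrated with a minimal single-scale LBP; the helper names below are hypothetical, and the paper additionally uses multiple scales and a CNN/LSTM patch stream:

```python
import numpy as np

def lbp_3x3(img):
    # Minimal single-scale local binary pattern: compare each interior
    # pixel with its 8 neighbours and pack the results into an 8-bit code.
    h, w = img.shape
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = img[1:-1, 1:-1]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neigh >= center).astype(np.uint8) << np.uint8(bit)
    return codes

def lbp_histogram(img):
    # Image-level texture descriptor: normalized histogram of LBP codes.
    hist, _ = np.histogram(lbp_3x3(img), bins=256, range=(0, 256))
    return hist / hist.sum()
```

In a fusion pipeline of this kind, such histograms computed at several scales would be concatenated with the LSTM-encoded patch features before the final classifier.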


Basic concepts, definitions, and methods in D number theory

Deng, Xinyang

arXiv.org Artificial Intelligence

Although DST has many advantages in representing and dealing with uncertainty, it is limited by some hypotheses and constraints that are hardly satisfied in some situations [3-6]. There are two main aspects. First, in DST a frame of discernment (FOD) must be composed of mutually exclusive elements, which is called the FOD's exclusiveness hypothesis. Second, in DST the sum of basic probabilities or beliefs m(.) in a basic probability assignment (BPA) must be 1 (equivalently, basic probabilities cannot be assigned to elements outside the FOD), which is called the BPA's completeness constraint. To overcome the above-mentioned limitations of DST, a new generalization of DST, called D number theory (DNT), has recently been proposed [7, 8] for the fusion of uncertain information with non-exclusiveness and incompleteness. DNT stems from the concept of D numbers [9-16] and aims to build a more sophisticated framework, similar to DST, for representing and reasoning with uncertain information from a generic set-membership perspective; in particular, DNT relaxes the exclusiveness hypothesis on the elements of the FOD and the completeness assumption on the BPA in DST.
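A minimal way to contrast the two constraints is to check them on dictionary-valued mass assignments (focal elements as tuples, masses as floats). The checks below are an illustrative sketch, not DNT's formal definitions:

```python
def is_valid_bpa(m):
    # Classical DST constraint: non-negative masses over focal elements
    # that sum exactly to 1 (the BPA completeness constraint).
    total = sum(m.values())
    return all(v >= 0 for v in m.values()) and abs(total - 1.0) < 1e-9

def is_valid_d_number(d):
    # D-number relaxation: masses may sum to less than 1 (incomplete
    # information), and focal elements need not be mutually exclusive.
    total = sum(d.values())
    return all(v >= 0 for v in d.values()) and total <= 1.0 + 1e-9

m = {('a',): 0.6, ('b',): 0.4}      # complete: a valid BPA
d = {('a',): 0.5, ('a', 'b'): 0.3}  # sums to 0.8: a D number, not a BPA
print(is_valid_bpa(m), is_valid_d_number(d), is_valid_bpa(d))  # True True False
```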


Belief and plausibility measures for D numbers

Deng, Xinyang

arXiv.org Artificial Intelligence

As a generalization of Dempster-Shafer theory, D number theory provides a framework to deal with uncertain information with non-exclusiveness and incompleteness. However, some basic concepts in D number theory are not well defined. In this note, belief and plausibility measures for D numbers are proposed, and basic properties of these measures are revealed as well. Keywords: Belief measure, Plausibility measure, D numbers, Dempster-Shafer theory. 1. Introduction. Dempster-Shafer evidence theory (DST) [1, 2] is one of the most popular theories for dealing with uncertain information, and has been widely used in various fields [3-5]. But it is limited by some hypotheses and constraints that are hardly satisfied in some situations [6-9]. There are two main aspects.
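Assuming the standard DST-style definitions carry over to D numbers (Bel(A) sums the masses of focal elements contained in A, Pl(A) sums the masses of those intersecting A; the note's exact generalized definitions may differ), a minimal sketch looks like:

```python
def belief(d, a):
    # Bel(A): total mass of focal elements entirely contained in A.
    return sum(v for b, v in d.items() if set(b) <= set(a))

def plausibility(d, a):
    # Pl(A): total mass of focal elements that intersect A.
    return sum(v for b, v in d.items() if set(b) & set(a))

# A D number whose masses sum to 0.9, i.e. the information is incomplete.
d = {('a',): 0.4, ('a', 'b'): 0.3, ('c',): 0.2}
print(belief(d, ('a',)) <= plausibility(d, ('a',)))  # True: Bel(A) <= Pl(A)
```

Here Bel({a}) = 0.4 and Pl({a}) = 0.7; because the total mass is below 1, the interval [Bel, Pl] no longer need satisfy Pl(A) = 1 - Bel(not A) as in classical DST.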