Zheng, Lizhong
Separable Computation of Information Measures
Xu, Xiangxiang, Zheng, Lizhong
We study a separable design for computing information measures, where the information measure is computed from learned feature representations instead of raw data. Under mild assumptions on the feature representations, we demonstrate that a class of information measures admit such separable computation, including mutual information, $f$-information, Wyner's common information, Gács–Körner common information, and Tishby's information bottleneck. Our development establishes several new connections between information measures and the statistical dependence structure. The characterizations also provide theoretical guarantees of practical designs for estimating information measures through representation learning.
Dependence Induced Representations
Xu, Xiangxiang, Zheng, Lizhong
We study the problem of learning feature representations from a pair of random variables, where we focus on the representations that are induced by their dependence. We provide sufficient and necessary conditions for such dependence induced representations, and illustrate their connections to Hirschfeld–Gebelein–Rényi (HGR) maximal correlation functions and minimal sufficient statistics. We characterize a large family of loss functions that can learn dependence induced representations, including cross entropy, hinge loss, and their regularized variants. In particular, we show that the features learned from this family can be expressed as the composition of a loss-dependent function and the maximal correlation function, which reveals a key connection between representations learned from different losses. Our development also gives a statistical interpretation of the neural collapse phenomenon observed in deep classifiers. Finally, we present the learning design based on the feature separation, which allows hyperparameter tuning during inference.
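As a small illustration of the HGR maximal correlation functions mentioned in this abstract, the sketch below handles the finite-alphabet case only, where the maximal correlation is the second-largest singular value of the normalized joint probability matrix. It is not the paper's learning procedure; the helper name `hgr_maximal_correlation` and the toy joint distribution are assumptions made for illustration.

```python
import numpy as np

def hgr_maximal_correlation(p_xy):
    """HGR maximal correlation for two finite-alphabet variables.

    p_xy: 2D array of joint probabilities (rows index x, columns index y)
    with strictly positive marginals. Hypothetical helper for illustration.
    """
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1)
    p_y = p_xy.sum(axis=0)
    # Normalized joint matrix B[x, y] = P(x, y) / sqrt(P(x) P(y)).
    b = p_xy / np.sqrt(np.outer(p_x, p_y))
    u, s, vt = np.linalg.svd(b)
    # s[0] equals 1 and corresponds to constant functions;
    # s[1] is the maximal correlation.
    f = u[:, 1] / np.sqrt(p_x)   # zero-mean, unit-variance function of x
    g = vt[1, :] / np.sqrt(p_y)  # zero-mean, unit-variance function of y
    return s[1], f, g

# Toy example (assumed): uniform binary input through a binary symmetric
# channel with crossover probability 0.1.
p = np.array([[0.45, 0.05],
              [0.05, 0.45]])
rho, f, g = hgr_maximal_correlation(p)
```

Here `f` and `g` are the maximal correlation functions, and the expectation of `f(X) g(Y)` under `p` equals `rho`.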
Towards xAI: Configuring RNN Weights using Domain Knowledge for MIMO Receive Processing
Jere, Shashank, Zheng, Lizhong, Said, Karim, Liu, Lingjia
Deep learning is making a profound impact in the physical layer of wireless communications. Despite its outstanding empirical performance in tasks such as MIMO receive processing, the reasons behind this superior performance remain largely unclear. In this work, we advance the field of Explainable AI (xAI) in the physical layer of wireless communications utilizing signal processing principles. Specifically, we focus on the task of MIMO-OFDM receive processing (e.g., symbol detection) using reservoir computing (RC), a framework within recurrent neural networks (RNNs), which outperforms both conventional and other learning-based MIMO detectors. Our analysis provides a signal processing-based, first-principles understanding of the corresponding operation of the RC. Building on this fundamental understanding, we are able to systematically incorporate the domain knowledge of wireless systems (e.g., channel statistics) into the design of the underlying RNN by directly configuring the untrained RNN weights for MIMO-OFDM symbol detection. The introduced RNN weight configuration has been validated through extensive simulations demonstrating significant performance improvements. This establishes a foundation for explainable RC-based architectures in MIMO-OFDM receive processing and provides a roadmap for incorporating domain knowledge into the design of neural networks for NextG systems.
Neural Network-based Two-Dimensional Filtering for OTFS Symbol Detection
Xu, Jiarui, Said, Karim, Zheng, Lizhong, Liu, Lingjia
Orthogonal time frequency space (OTFS) is a promising modulation scheme for wireless communication in high-mobility scenarios. Recently, a reservoir computing (RC) based approach has been introduced for online subframe-based symbol detection in the OTFS system, where only the limited over-the-air (OTA) pilot symbols are utilized for training. However, the previous RC-based approach does not design the RC architecture based on the properties of the OTFS system to fully unlock the potential of RC. This paper introduces a novel two-dimensional RC (2D-RC) approach for online symbol detection on a subframe basis in the OTFS system. The 2D-RC is designed to have a two-dimensional (2D) filtering structure to equalize the 2D circular channel effect in the delay-Doppler (DD) domain of the OTFS system. With the introduced architecture, the 2D-RC can operate in the DD domain with only a single neural network, unlike our previous work which requires multiple RCs to track channel variations in the time domain. Experimental results demonstrate the advantages of the 2D-RC approach over the previous RC-based approach and the compared model-based methods across different modulation orders.
Learning at the Speed of Wireless: Online Real-Time Learning for AI-Enabled MIMO in NextG
Xu, Jiarui, Jere, Shashank, Song, Yifei, Kao, Yi-Hung, Zheng, Lizhong, Liu, Lingjia
Integration of artificial intelligence (AI) and machine learning (ML) into the air interface has been envisioned as a key technology for next-generation (NextG) cellular networks. At the air interface, multiple-input multiple-output (MIMO) and its variants such as multi-user MIMO (MU-MIMO) and massive/full-dimension MIMO have been key enablers across successive generations of cellular networks with evolving complexity and design challenges. Actively investigating how to leverage AI/ML tools to address these challenges for MIMO is therefore a critical step towards an AI-enabled NextG air interface. At the NextG air interface, the underlying wireless environment will be extremely dynamic with operation adaptations performed on a sub-millisecond basis by MIMO operations such as MU-MIMO scheduling and rank/link adaptation. Given the enormous number of operation adaptation possibilities, we contend that online real-time AI/ML-based approaches constitute a promising paradigm. To this end, we outline the inherent challenges and offer insights into the design of such online real-time AI/ML-based solutions for MIMO operations. An online real-time AI/ML-based method for MIMO-OFDM channel estimation is then presented, serving as a potential roadmap for developing similar techniques across various MIMO operations in NextG.
Operator SVD with Neural Networks via Nested Low-Rank Approximation
Ryu, J. Jon, Xu, Xiangxiang, Erol, H. S. Melihcan, Bu, Yuheng, Zheng, Lizhong, Wornell, Gregory W.
Computing the eigenvalue decomposition (EVD) of a given linear operator, or finding its leading eigenvalues and eigenfunctions, is a fundamental task in many machine learning and scientific computing problems. For high-dimensional eigenvalue problems, training neural networks to parameterize the eigenfunctions is considered a promising alternative to classical numerical linear algebra techniques. This paper proposes a new optimization framework based on the low-rank approximation characterization of a truncated singular value decomposition, accompanied by new techniques called nesting for learning the top-$L$ singular values and singular functions in the correct order. The proposed method promotes the desired orthogonality in the learned functions implicitly and efficiently via an unconstrained optimization formulation, which is easy to solve with off-the-shelf gradient-based optimization algorithms. We demonstrate the effectiveness of the proposed optimization framework for use cases in computational physics and machine learning.
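The low-rank approximation characterization of a truncated SVD referred to in this abstract can be checked numerically in the finite-dimensional case (the Eckart-Young theorem). The sketch below is only a matrix stand-in: it does not implement the nesting technique or any neural parameterization, and the matrix size, the rank `L`, and the random comparison factor `F` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 6))  # stand-in for a linear operator
L = 2                            # target rank

# Truncated SVD: keep the top-L singular triplets.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_L = (U[:, :L] * s[:L]) @ Vt[:L, :]
err_svd = np.linalg.norm(A - A_L, "fro") ** 2

# Eckart-Young: the squared error equals the energy of the
# discarded singular values.
tail = np.sum(s[L:] ** 2)

# Compare against another rank-L factorization F @ G_T: fix a random
# left factor F and fit the best right factor by least squares.
F = rng.standard_normal((8, L))
G_T, *_ = np.linalg.lstsq(F, A, rcond=None)
err_other = np.linalg.norm(A - F @ G_T, "fro") ** 2
```

Because the truncated SVD minimizes the Frobenius error over all rank-`L` matrices, `err_other` cannot fall below `err_svd`; this is the property that lets singular functions be learned by minimizing an unconstrained approximation loss.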
2D-RC: Two-Dimensional Neural Network Approach for OTFS Symbol Detection
Xu, Jiarui, Said, Karim, Zheng, Lizhong, Liu, Lingjia
Orthogonal time frequency space (OTFS) is a promising modulation scheme for wireless communication in high-mobility scenarios. Recently, a reservoir computing (RC) based approach has been introduced for online subframe-based symbol detection in the OTFS system, where only a limited number of over-the-air (OTA) pilot symbols are utilized for training. However, this approach does not leverage the domain knowledge specific to the OTFS system to fully unlock the potential of RC. This paper introduces a novel two-dimensional RC (2D-RC) method that incorporates the domain knowledge of the OTFS system into the design for symbol detection in an online subframe-based manner. Specifically, as the channel interaction in the delay-Doppler (DD) domain is a two-dimensional (2D) circular operation, the 2D-RC is designed to have the 2D circular padding procedure and the 2D filtering structure to embed this knowledge. With the introduced architecture, 2D-RC can operate in the DD domain with only a single neural network, instead of necessitating multiple RCs to track channel variations in the time domain as in previous work. Numerical experiments demonstrate the advantages of the 2D-RC approach over the previous RC-based approach and compared model-based methods across different OTFS system variants and modulation orders.
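To make the 2D circular operation concrete, the sketch below applies a small 2D filter to a delay-Doppler grid using circular (wrap) padding and checks it against the FFT form of 2D circular convolution. It is a simplified model under stated assumptions: noise and any additional phase terms of a real OTFS channel are ignored, and the function name `circular_filter_2d`, the grid size, and the kernel are hypothetical rather than taken from the paper.

```python
import numpy as np

def circular_filter_2d(x, h):
    """2D circular convolution of grid x with a small kernel h.

    Uses wrap padding so every output sample sees its circular neighbors.
    """
    kd, kv = h.shape
    m, n = x.shape
    # Prepend wrapped rows/columns: xp[a, b] = x[(a-(kd-1)) % m, (b-(kv-1)) % n].
    xp = np.pad(x, ((kd - 1, 0), (kv - 1, 0)), mode="wrap")
    y = np.zeros(x.shape, dtype=np.result_type(x, h))
    for i in range(kd):
        for j in range(kv):
            # Tap h[i, j] weights x circularly shifted by (i, j).
            y += h[i, j] * xp[kd - 1 - i : kd - 1 - i + m,
                              kv - 1 - j : kv - 1 - j + n]
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5))   # toy delay-Doppler grid (assumed size)
h = rng.standard_normal((2, 3))   # toy 2D channel/filter taps (assumed)

y_pad = circular_filter_2d(x, h)
# Reference: circular convolution theorem with the kernel zero-padded to x.shape.
y_fft = np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(h, s=x.shape)))
```

The two results agree, which is why a filtering structure with circular padding can represent, and hence equalize, a 2D circular channel effect of this idealized form.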
Detect to Learn: Structure Learning with Attention and Decision Feedback for MIMO-OFDM Receive Processing
Xu, Jiarui, Li, Lianjun, Zheng, Lizhong, Liu, Lingjia
The limited number of over-the-air (OTA) pilot symbols in multiple-input-multiple-output orthogonal-frequency-division-multiplexing (MIMO-OFDM) systems presents a major challenge for detecting transmitted data symbols at the receiver, especially for machine learning-based approaches. While it is crucial to explore effective ways to exploit pilots, one can also take advantage of the data symbols to improve detection performance. Thus, this paper introduces an online attention-based approach, namely RC-AttStructNet-DF, that can efficiently utilize pilot symbols and be dynamically updated with the detected payload data using the decision feedback (DF) mechanism. Reservoir computing (RC) is employed in the time domain network to facilitate efficient online training. The frequency domain network adopts the novel 2D multi-head attention (MHA) module to capture the time and frequency correlations, and the structural-based StructNet to facilitate the DF mechanism. The attention loss is designed to learn the frequency domain network. The DF mechanism further enhances detection performance by dynamically tracking the channel changes through detected data symbols. The effectiveness of the RC-AttStructNet-DF approach is demonstrated through extensive experiments in MIMO-OFDM and massive MIMO-OFDM systems with different modulation orders and under various scenarios.
Towards Explainable Machine Learning: The Effectiveness of Reservoir Computing in Wireless Receive Processing
Jere, Shashank, Said, Karim, Zheng, Lizhong, Liu, Lingjia
Deep learning has seen rapid adoption in a variety of wireless communications applications, including at the physical layer. While it has delivered impressive performance in tasks such as channel equalization and receive processing/symbol detection, it leaves much to be desired when it comes to explaining this superior performance. In this work, we investigate the specific task of channel equalization by applying a popular learning-based technique known as Reservoir Computing (RC), which has shown superior performance compared to conventional methods and other learning-based approaches. Specifically, we apply the echo state network (ESN) as a channel equalizer and provide a first principles-based signal processing understanding of its operation. With this groundwork, we incorporate the available domain knowledge in the form of the statistics of the wireless channel directly into the weights of the ESN model. This paves the way for optimized initialization of the ESN model weights, which are traditionally untrained and randomly initialized. Finally, we show the improvement in receive processing/symbol detection performance with this optimized initialization through simulations. This is a first step towards explainable machine learning (XML) and assigning practical model interpretability that can be utilized together with the available domain knowledge to improve performance and enhance detection reliability.
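For readers unfamiliar with the baseline this abstract starts from, the following is a minimal sketch of a conventional ESN equalizer with randomly initialized, untrained reservoir weights and a ridge-regression readout. It does not implement the paper's channel-statistics-based weight initialization. The toy BPSK setup, the 2-tap channel, the reservoir size, the spectral radius of 0.9, and the regularization value are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed): BPSK symbols through a 2-tap channel plus noise.
n_sym = 500
d = rng.choice([-1.0, 1.0], size=n_sym)
h = np.array([1.0, 0.5])
r = np.convolve(d, h)[:n_sym] + 0.05 * rng.standard_normal(n_sym)

# Untrained reservoir: random weights, rescaled to spectral radius 0.9.
n_res = 30
W_in = rng.uniform(-0.5, 0.5, size=n_res)
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

# Drive the reservoir with the received samples and record its states.
states = np.zeros((n_sym, n_res))
s = np.zeros(n_res)
for t in range(n_sym):
    s = np.tanh(W @ s + W_in * r[t])
    states[t] = s

# Only the linear readout is trained (closed-form ridge regression).
lam = 1e-3
w_out = np.linalg.solve(states.T @ states + lam * np.eye(n_res),
                        states.T @ d)
d_hat = states @ w_out            # soft symbol estimates
mse = np.mean((d_hat - d) ** 2)   # training mean squared error
```

Since `W` and `W_in` stay fixed, training reduces to one linear solve, which is what makes the ESN attractive when only a few pilot symbols are available; replacing the random draw of `W` and `W_in` with a structured choice is the kind of change the abstract describes.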
A Geometric Framework for Neural Feature Learning
Xu, Xiangxiang, Zheng, Lizhong
We present a novel framework for learning system design based on neural feature extractors by exploiting geometric structures in feature spaces. First, we introduce the feature geometry, which unifies statistical dependence and features in the same functional space with geometric structures. By applying the feature geometry, we formulate each learning problem as solving the optimal feature approximation of the dependence component specified by the learning setting. We propose a nesting technique for designing learning algorithms to learn the optimal features from data samples, which can be applied to off-the-shelf network architectures and optimizers. To demonstrate the application of the nesting technique, we further discuss multivariate learning problems, including conditioned inference and multimodal learning, where we present the optimal features and reveal their connections to classical approaches.