Telecommunications
Psychoacoustic Challenges Of Speech Enhancement On VoIP Platforms
Konan, Joseph, Bhargave, Ojas, Agnihotri, Shikhar, Han, Shuo, Zeng, Yunyang, Shah, Ankit, Raj, Bhiksha
Within the ambit of VoIP (Voice over Internet Protocol) telecommunications, the complexities introduced by acoustic transformations merit rigorous analysis. This research, rooted in the exploration of proprietary sender-side denoising effects, meticulously evaluates platforms such as Google Meets and Zoom. The study draws upon the Deep Noise Suppression (DNS) 2020 dataset, ensuring a structured examination tailored to various denoising settings and receiver interfaces. A methodological novelty is introduced via the Oaxaca decomposition, traditionally an econometric tool, repurposed herein to analyze acoustic-phonetic perturbations within VoIP systems. To further ground the implications of these transformations, psychoacoustic metrics, specifically PESQ and STOI, were harnessed to furnish a comprehensive understanding of speech alterations. Cumulatively, the insights garnered underscore the intricate landscape of VoIP-influenced acoustic dynamics. In addition to the primary findings, a multitude of metrics are reported, extending the research purview. Moreover, out-of-domain benchmarking for both time and time-frequency domain speech enhancement models is included, thereby enhancing the depth and applicability of this inquiry. Repository: github.com/deepology/VoIP-DNS-Challenge
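For readers unfamiliar with the Oaxaca decomposition mentioned above, here is a minimal sketch of its textbook two-fold form with one regressor. The abstract does not say how the authors set it up, so the variables (input SNR as the regressor, a PESQ-like score as the outcome) and all data values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def oaxaca_twofold(x_a, y_a, x_b, y_b):
    """Split the mean outcome gap (group A minus group B) into two parts.

    explained:   due to different mean regressor values, priced at B's slope
    unexplained: due to different fitted coefficients (intercept and slope)
    """
    a_a, b_a = fit_line(x_a, y_a)
    a_b, b_b = fit_line(x_b, y_b)
    gap = mean(y_a) - mean(y_b)
    explained = b_b * (mean(x_a) - mean(x_b))
    unexplained = (a_a - a_b) + mean(x_a) * (b_a - b_b)
    return gap, explained, unexplained


# Hypothetical measurements: x = input SNR in dB, y = PESQ-like score.
snr_off = [0.0, 5.0, 10.0, 15.0, 20.0]   # denoising disabled (assumed)
pesq_off = [1.2, 1.6, 2.1, 2.5, 3.0]
snr_on = [2.0, 6.0, 12.0, 18.0, 22.0]    # denoising enabled (assumed)
pesq_on = [1.8, 2.1, 2.7, 3.1, 3.4]

gap, explained, unexplained = oaxaca_twofold(snr_on, pesq_on, snr_off, pesq_off)
print(f"gap={gap:.3f} explained={explained:.3f} unexplained={unexplained:.3f}")
```

Because each fitted line passes through its group's means, the two parts sum exactly to the gap; the unexplained part is the share attributable to the groups responding differently to the same input.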
Robust Network Slicing: Multi-Agent Policies, Adversarial Attacks, and Defensive Strategies
Wang, Feng, Gursoy, M. Cenk, Velipasalar, Senem
In this paper, we present a multi-agent deep reinforcement learning (deep RL) framework for network slicing in a dynamic environment with multiple base stations and multiple users. In particular, we propose a novel deep RL framework with multiple actors and centralized critic (MACC) in which actors are implemented as pointer networks to fit the varying dimension of input. We evaluate the performance of the proposed deep RL algorithm via simulations to demonstrate its effectiveness. Subsequently, we develop a deep RL based jammer with limited prior information and limited power budget. The goal of the jammer is to minimize the transmission rates achieved with network slicing and thus degrade the network slicing agents' performance. We design a jammer with both listening and jamming phases and address jamming location optimization as well as jamming channel optimization via deep RL. We evaluate the jammer at the optimized location, generating interference attacks in the optimized set of channels by switching between the jamming phase and listening phase. We show that the proposed jammer can significantly reduce the victims' performance without direct feedback or prior knowledge on the network slicing policies. Finally, we devise a Nash-equilibrium-supervised policy ensemble mixed strategy profile for network slicing (as a defensive measure) and jamming. We evaluate the performance of the proposed policy ensemble algorithm by applying on the network slicing agents and the jammer agent in simulations to show its effectiveness.
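To illustrate what a Nash-equilibrium mixed strategy between a defender and a jammer looks like, the sketch below solves a 2x2 zero-sum game in closed form. It is not the paper's policy-ensemble algorithm: the payoff matrix, the interpretation of rows as slicing policies and columns as jamming policies, and the function name are all assumptions made for illustration.

```python
def solve_2x2_zero_sum(payoff):
    """Mixed Nash equilibrium of a 2x2 zero-sum game.

    payoff[i][j] is the row player's reward when the row player picks i and
    the column player picks j; the column player receives the negative.
    Returns (p, q, value): p is the probability of row 0, q the probability
    of column 0, and value the expected payoff at equilibrium.
    """
    (a, b), (c, d) = payoff
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        raise ValueError("pure-strategy saddle point exists; no mixing needed")
    denom = a - b - c + d
    p = (d - c) / denom
    q = (d - b) / denom
    value = (a * d - b * c) / denom
    return p, q, value


# Hypothetical payoffs: rows = two slicing policies, columns = two jammer
# policies, entries = achieved rate for the slicing agents.
payoff = [[3.0, 1.0],
          [0.0, 2.0]]
p, q, value = solve_2x2_zero_sum(payoff)
print(p, q, value)
```

At the equilibrium each player is indifferent between its two pure policies, so neither side gains by deviating unilaterally.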
Qualcomm announces Snapdragon 7 Gen 3 mobile chipset with AI acceleration
Qualcomm just unveiled the latest mobile chipset to join its armada, the Snapdragon 7 Gen 3. Obviously, this is a refresh of the mid-range Snapdragon 7 Gen 2 and brings some new features to the table. We've long known that Qualcomm chips were about to get on-device AI integration, and the Snapdragon 7 Gen 3 is no exception. Nearly every aspect of this chip seems to have been designed with artificial intelligence in mind, with Qualcomm saying that the components "deliver across-the-board advancements to ignite on-device AI." This should significantly speed up generative AI applications, with advertised benchmarks of just one second to create Stable Diffusion images from a text prompt. Of course, a mobile CPU is more than just AI, despite what marketing wants you to believe, and the 7 Gen 3 seems powerful for a mid-range chipset.
Wideband Audio Waveform Evaluation Networks: Efficient, Accurate Estimation of Speech Qualities
Catellier, Andrew, Voran, Stephen
Wideband Audio Waveform Evaluation Networks (WAWEnets) are convolutional neural networks that operate directly on wideband audio waveforms in order to produce evaluations of those waveforms. In the present work these evaluations give qualities of telecommunications speech (e.g., noisiness, intelligibility, overall speech quality). WAWEnets are no-reference networks because they do not require ``reference'' (original or undistorted) versions of the waveforms they evaluate. Our initial WAWEnet publication introduced four WAWEnets and each emulated the output of an established full-reference speech quality or intelligibility estimation algorithm. We have updated the WAWEnet architecture to be more efficient and effective. Here we present a single WAWEnet that closely tracks seven different quality and intelligibility values. We create a second network that additionally tracks four subjective speech quality dimensions. We offer a third network that focuses on just subjective quality scores and achieves very high levels of agreement. This work has leveraged 334 hours of speech in 13 languages, over two million full-reference target values and over 93,000 subjective mean opinion scores. We also interpret the operation of WAWEnets and identify the key to their operation using the language of signal processing: ReLUs strategically move spectral information from non-DC components into the DC component. The DC values of 96 output signals define a vector in a 96-D latent space and this vector is then mapped to a quality or intelligibility value for the input waveform.
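The remark that ReLUs move spectral information into the DC component can be checked on a toy signal. The sketch below is not a WAWEnet; it only shows that half-wave rectifying a zero-mean tone produces a nonzero mean (close to 1/pi for a unit-amplitude sine). The sample count and tone frequency are arbitrary assumptions.

```python
import math


def dc_component(signal):
    """The DC component of a finite signal is its sample mean."""
    return sum(signal) / len(signal)


n, cycles = 1000, 10  # assumed: 1000 samples spanning exactly 10 periods
tone = [math.sin(2 * math.pi * cycles * k / n) for k in range(n)]
rectified = [max(v, 0.0) for v in tone]  # ReLU applied sample by sample

dc_before = dc_component(tone)
dc_after = dc_component(rectified)
print(f"DC before ReLU: {dc_before:.6f}, after ReLU: {dc_after:.6f}")
```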
The FCC will crack down on ISPs to improve connectivity in poorer areas
The Federal Communications Commission (FCC) is keeping a close eye on internet providers to make sure they provide Americans with equal access to broadband services regardless of customers' "income level, race, ethnicity, color, religion or national origin." Two years after the Bipartisan Infrastructure Law became official, the FCC has adopted a final set of relevant rules to enforce. The Commission will have the power to investigate possible instances of "digital discrimination" under the new rules and could penalize providers for violating them. It could, for instance, look into a company's pricing, network upgrades and maintenance procedures to decide whether a provider is keeping an affluent area well-maintained while failing to provide the same level of service to a low-income area. As The Wall Street Journal explains, it could hold companies like AT&T and Comcast liable even if they weren't intentionally discriminatory, as long as their actions "differentially impact consumers' access to broadband." If the FCC does receive complaints against a particular provider, though, it will take into account any technical and economic challenges the provider may be facing that prevent it from providing equal access to its services.
Algebraic Topological Networks via the Persistent Local Homology Sheaf
Cesa, Gabriele, Behboodi, Arash
In this work, we introduce a novel approach based on algebraic topology to enhance graph convolution and attention modules by incorporating local topological properties of the data. To do so, we consider the framework of sheaf neural networks, which has been previously leveraged to incorporate additional structure into graph neural networks' features and construct more expressive, non-isotropic messages. Specifically, given an input simplicial complex (e.g., generated by the cliques of a graph or the neighbors in a point cloud), we construct its local homology sheaf, which assigns to each node the vector space of its local homology. The intermediate features of our networks live in these vector spaces and we leverage the associated sheaf Laplacian to construct more complex linear messages between them. Moreover, we extend this approach by considering the persistent version of local homology associated with a weighted simplicial complex (e.g., built from pairwise distances of node embeddings). This i) solves the problem of the lack of a natural choice of basis for the local homology vector spaces and ii) makes the sheaf itself differentiable, which enables our models to directly optimize the topology of their intermediate features.
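As a rough illustration of the sheaf Laplacian referred to above, the sketch below builds one for a generic cellular sheaf on a graph with one-dimensional stalks and scalar restriction maps. This is a simplifying assumption: it is not the local homology sheaf of the paper, whose stalks are homology vector spaces, and the function name and example maps are hypothetical.

```python
def sheaf_laplacian(num_nodes, edges):
    """Sheaf Laplacian L = delta^T delta for scalar stalks.

    edges: list of (u, v, r_u, r_v), where r_u and r_v are the scalar
    restriction maps from the stalks at u and v into the stalk on the edge.
    The coboundary is (delta x)_e = r_v * x_v - r_u * x_u.
    """
    delta = []
    for u, v, r_u, r_v in edges:
        row = [0.0] * num_nodes
        row[u] = -r_u
        row[v] = r_v
        delta.append(row)
    return [
        [sum(row[i] * row[j] for row in delta) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


# With identity restriction maps this reduces to the ordinary graph Laplacian
# of the path 0 - 1 - 2.
trivial = sheaf_laplacian(3, [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0)])

# Non-trivial (hypothetical) maps give direction-dependent messages.
twisted = sheaf_laplacian(3, [(0, 1, 2.0, 1.0), (1, 2, 1.0, -1.0)])
print(trivial)
print(twisted)
```

Applying `twisted` to a feature vector plays the role that the graph Laplacian plays in standard diffusion-style message passing, except that neighbors are compared only after being mapped into the shared edge space.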
Joint User Pairing and Beamforming Design of Multi-STAR-RISs-Aided NOMA in the Indoor Environment via Multi-Agent Reinforcement Learning
Park, Yu Min, Tun, Yan Kyaw, Hong, Choong Seon
The development of 6G/B5G wireless networks, which have requirements that go beyond current 5G networks, is gaining interest from academia and industry. However, to increase 6G/B5G network quality, conventional cellular networks that rely on terrestrial base stations are constrained geographically and economically. Meanwhile, NOMA allows multiple users to share the same resources, which improves the spectral efficiency of the system and has the advantage of supporting a larger number of users. Additionally, by intelligently manipulating the phase and amplitude of both the reflected and transmitted signals, STAR-RISs can achieve improved coverage, increased spectral efficiency, and enhanced communication reliability. However, STAR-RISs must simultaneously optimize the amplitude and phase shift corresponding to reflection and transmission, which makes the existing terrestrial networks more complicated and is considered a major challenging issue. Motivated by the above, we study the joint user pairing for NOMA and beamforming design of Multi-STAR-RISs in an indoor environment. Then, we formulate the optimization problem with the objective of maximizing the total throughput of MUs by jointly optimizing the decoding order, user pairing, active beamforming, and passive beamforming. However, the formulated problem is a MINLP. To address this challenge, we first introduce the decoding order for NOMA networks. Next, we decompose the original problem into two subproblems, namely: 1) MU pairing and 2) Beamforming optimization under the optimal decoding order. For the first subproblem, we employ correlation-based K-means clustering to solve the user pairing problem. Then, to jointly deal with beamforming vector optimizations, we propose MAPPO, which can make quick decisions in the given environment owing to its low complexity.
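To make the role of the decoding order concrete, here is a minimal sketch of the standard two-user downlink NOMA rate expressions with successive interference cancellation (SIC). It is a textbook model, not the paper's system model; the channel gains, power split, and noise power are hypothetical values.

```python
import math


def noma_pair_rates(gain_strong, gain_weak, power_strong, power_weak, noise=1.0):
    """Achievable rates (bit/s/Hz) for one NOMA pair on a shared resource.

    Assumes gain_strong >= gain_weak, so the strong user can decode and
    remove the weak user's signal before decoding its own.
    """
    # Weak user decodes its own signal and treats the strong user's as noise.
    sinr_weak = power_weak * gain_weak / (power_strong * gain_weak + noise)
    # Strong user applies SIC first, so it sees no intra-pair interference.
    snr_strong = power_strong * gain_strong / noise
    return math.log2(1 + snr_strong), math.log2(1 + sinr_weak)


# Hypothetical pair: most of the power goes to the user with the weak channel.
rate_strong, rate_weak = noma_pair_rates(
    gain_strong=10.0, gain_weak=1.0, power_strong=0.2, power_weak=0.8
)
print(f"strong user: {rate_strong:.3f}, weak user: {rate_weak:.3f}")
```

Pairing users with dissimilar channel gains is what makes this decoding order effective, which is why user pairing and decoding order appear together in the optimization.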
Fairness-Driven Optimization of RIS-Augmented 5G Networks for Seamless 3D UAV Connectivity Using DRL Algorithms
Tian, Yu, Alhammadi, Ahmed, He, Jiguang, Fakhreddine, Aymen, Bader, Faouzi
In this paper, we study the problem of joint active and passive beamforming for reconfigurable intelligent surface (RIS)-assisted massive multiple-input multiple-output systems towards the extension of the wireless cellular coverage in 3D, where multiple RISs, each equipped with an array of passive elements, are deployed to assist a base station (BS) to simultaneously serve multiple unmanned aerial vehicles (UAVs) in the same time-frequency resource of 5G wireless communications. With a focus on ensuring fairness among UAVs, our objective is to maximize the minimum signal-to-interference-plus-noise ratio (SINR) at UAVs by jointly optimizing the transmit beamforming parameters at the BS and phase shift parameters at RISs. We propose two novel algorithms to address this problem. The first algorithm aims to mitigate interference by calculating the BS beamforming matrix through matrix inverse operations once the phase shift parameters are determined. The second one is based on the principle that one RIS element only serves one UAV and the phase shift parameter of this RIS element is optimally designed to compensate the phase offset caused by the propagation and fading. To obtain the optimal parameters, we utilize one state-of-the-art reinforcement learning algorithm, deep deterministic policy gradient, to solve these two optimization problems. Simulation results are provided to illustrate the effectiveness of our proposed solution and some insightful remarks are observed.
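The phase-compensation principle behind the second algorithm can be sketched as follows: each RIS element applies the negative of the phase of its cascaded channel, so all reflected paths add coherently at the receiver. The single-antenna, single-UAV setup, the channel coefficients, and the function names are assumptions for illustration, not the paper's formulation.

```python
import cmath


def aligned_phases(h, g):
    """Phase shift per RIS element that cancels the phase of h_n * g_n.

    h: BS-to-RIS coefficients, g: RIS-to-UAV coefficients (both assumed).
    """
    return [-cmath.phase(hn * gn) for hn, gn in zip(h, g)]


def effective_channel(h, g, thetas):
    """Cascaded channel through the RIS for the given phase shifts."""
    return sum(hn * gn * cmath.exp(1j * t) for hn, gn, t in zip(h, g, thetas))


# Hypothetical coefficients given as (magnitude, phase in radians).
h = [cmath.rect(1.0, 0.4), cmath.rect(0.5, -1.2), cmath.rect(0.8, 2.0)]
g = [cmath.rect(0.9, 1.1), cmath.rect(0.7, 0.3), cmath.rect(0.6, -2.5)]

thetas = aligned_phases(h, g)
aligned_gain = abs(effective_channel(h, g, thetas))
unaligned_gain = abs(effective_channel(h, g, [0.0] * len(h)))
print(f"aligned: {aligned_gain:.3f}, no phase shift: {unaligned_gain:.3f}")
```

With aligned phases the channel magnitude equals the sum of the per-element magnitudes, which is the largest value any choice of phase shifts can reach for a single receiver.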
A Fast and Simple Algorithm for computing the MLE of Amplitude Density Function Parameters
Over the last decades, the family of $\alpha$-stable distributions has proven to be useful for modelling in telecommunication systems. Particularly, in the case of radar applications, finding a fast and accurate estimator of the amplitude density function parameters appears to be very important. In this work, the maximum likelihood estimator (MLE) is proposed for parameters of the amplitude distribution. To do this, the amplitude data are \emph{projected} on the horizontal and vertical axes using two simple transformations. It is proved that the \emph{projected} data follow a zero-location symmetric $\alpha$-stable distribution for which the MLE can be computed quite fast. The average of the MLEs computed from the two \emph{projections} is taken as the estimator of the parameters of the amplitude distribution. Performance of the proposed \emph{projection} method is demonstrated through a simulation study and an analysis of two sets of real radar data.
Learning RL-Policies for Joint Beamforming Without Exploration: A Batch Constrained Off-Policy Approach
Kim, Heasung, Ankireddy, Sravan Kumar
In this work, we consider the problem of network parameter optimization for rate maximization. We frame this as a joint optimization problem of power control, beamforming, and interference cancellation. We consider the setting where multiple Base Stations (BSs) communicate with multiple user equipment (UEs). Because of the exponential computational complexity of brute force search, we instead solve this nonconvex optimization problem using deep reinforcement learning (RL) techniques. Modern communication systems are notoriously difficult to model exactly. This limits us in using RL-based algorithms, as interaction with the environment is needed for the agent to explore and learn efficiently. Further, it is ill-advised to deploy the algorithm in the real world for exploration and learning because of the high cost of failure. In contrast to the previous RL-based solutions proposed, such as deep-Q network (DQN) based control, we suggest an offline model-based approach. We specifically consider discrete batch-constrained deep Q-learning (BCQ) and show that performance similar to DQN can be achieved with only a fraction of the data without exploring. This maximizes sample efficiency and minimizes risk in deploying a new algorithm to commercial networks. We provide the entire project resource, including code and data, at the following link: https://github.com/Heasung-Kim/safe-rl-deployment-for-5g.
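The core of discrete BCQ is its action-selection rule: act greedily on the Q-values, but only among actions that the logged data makes sufficiently likely. The sketch below shows that rule in isolation, assuming the Q-values and behavior-policy probabilities are already available for a state; the function name, the threshold value, and the numbers are hypothetical and not taken from the linked repository.

```python
def bcq_select_action(q_values, behavior_probs, threshold=0.3):
    """Greedy action restricted to actions well supported by the batch.

    An action is allowed when its behavior probability, relative to the most
    likely action, exceeds the threshold. A threshold of 0 recovers plain
    greedy Q-learning over all actions with nonzero probability.
    """
    max_prob = max(behavior_probs)
    allowed = [
        a for a, p in enumerate(behavior_probs) if p / max_prob > threshold
    ]
    return max(allowed, key=lambda a: q_values[a])


# Hypothetical state with three discrete power/beam settings. Action 1 has
# the highest Q-value but almost never appears in the logged data.
q_values = [1.0, 5.0, 2.0]
behavior_probs = [0.60, 0.02, 0.38]

constrained = bcq_select_action(q_values, behavior_probs, threshold=0.3)
unconstrained = bcq_select_action(q_values, behavior_probs, threshold=0.0)
print(constrained, unconstrained)
```

Filtering out rarely observed actions avoids trusting Q-value estimates that the batch cannot support, which is what allows learning without further exploration.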