Telecommunications
SoftBank to back AI startup Perplexity at $3 billion valuation
SoftBank Group's Vision Fund 2 is investing in U.S. artificial intelligence startup Perplexity AI at a $3 billion (¥481.4 billion) valuation, Masayoshi Son's latest bet on a sector he deems crucial to securing his legacy. SoftBank will invest between $10 million and $20 million in the firm, which aims to use AI to compete with Alphabet's Google search, according to people familiar with the matter. It's investing as part of a larger $250 million funding round that triples Perplexity's valuation and makes it one of the industry's most highly valued companies. The deal underscores how SoftBank is preparing to sharply accelerate its pace of AI investment. Its billionaire founder last week laid out a sprawling vision for the future of AI, including a commitment to realize what he called "artificial superintelligence."
Enhanced ASR Robustness to Packet Loss with a Front-End Adaptation Network
Dissen, Yehoshua, Yonash, Shiry, Cohen, Israel, Keshet, Joseph
In the realm of automatic speech recognition (ASR), robustness in noisy environments remains a significant challenge. Recent ASR models, such as Whisper, have shown promise, but their efficacy in noisy conditions can be further enhanced. This study focuses on recovering from packet loss to improve the word error rate (WER) of ASR models. We propose using a front-end adaptation network connected to a frozen ASR model. The adaptation network is trained to modify the corrupted input spectrum by minimizing the criteria of the ASR model in addition to an enhancement loss function. Our experiments demonstrate that the adaptation network, trained on Whisper's criteria, notably reduces word error rates across domains and languages in packet-loss scenarios. This improvement is achieved with minimal effect on the Whisper model's foundational performance, underscoring our method's practicality and potential in enhancing ASR models in challenging acoustic environments.
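As a rough illustration of the objective described above, the sketch below simulates packet loss by zeroing runs of spectrogram frames and adds an enhancement term to a stand-in ASR criterion. The function names, the frame-dropping model, the mean-squared-error enhancement term, and the weight `alpha` are all assumptions made for illustration; none of them is taken from the paper.

```python
import random


def drop_packets(frames, loss_rate, packet_len=2, seed=0):
    """Zero out whole packets (runs of `packet_len` frames) with probability `loss_rate`.

    `frames` is a list of equal-length lists of floats (a toy spectrogram).
    The input is not modified; a corrupted copy is returned.
    """
    rng = random.Random(seed)
    corrupted = [list(frame) for frame in frames]
    for start in range(0, len(frames), packet_len):
        if rng.random() < loss_rate:
            for i in range(start, min(start + packet_len, len(frames))):
                corrupted[i] = [0.0] * len(frames[i])
    return corrupted


def enhancement_loss(adapted, clean):
    """Mean squared error between adapted and clean spectra (an assumed choice)."""
    total = 0.0
    count = 0
    for adapted_frame, clean_frame in zip(adapted, clean):
        for a, c in zip(adapted_frame, clean_frame):
            total += (a - c) ** 2
            count += 1
    return total / count


def combined_loss(asr_loss, adapted, clean, alpha=0.5):
    """ASR criterion plus a weighted enhancement term.

    `asr_loss` stands in for the frozen ASR model's loss on the adapted input;
    `alpha` is a hypothetical weighting factor.
    """
    return asr_loss + alpha * enhancement_loss(adapted, clean)
```

In a real training loop only the adaptation network's parameters would be updated, since the abstract states that the ASR model stays frozen.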
ViT LoS V2X: Vision Transformers for Environment-aware LoS Blockage Prediction for 6G Vehicular Networks
Gharsallah, Ghazi, Kaddoum, Georges
As wireless communication technology progresses towards the sixth generation (6G), high-frequency millimeter-wave (mmWave) communication has emerged as a promising candidate for enabling vehicular networks. It offers high data rates and low-latency communication. However, obstacles such as buildings, trees, and other vehicles can cause signal attenuation and blockage, leading to communication failures that can result in fatal accidents or traffic congestion. Predicting blockages is crucial for ensuring reliable and efficient communications. Furthermore, the advent of 6G technology is anticipated to integrate advanced sensing capabilities, utilizing a variety of sensor types. These sensors, ranging from traditional RF sensors to cameras and Lidar sensors, are expected to provide access to rich multimodal data, thereby enriching communication systems with a wealth of additional contextual information. Leveraging this multimodal data becomes essential for making precise network management decisions, including the crucial task of blockage detection. In this paper, we propose a Deep Learning (DL)-based approach that combines Convolutional Neural Networks (CNNs) and customized Vision Transformers (ViTs) to effectively extract essential information from multimodal data and predict blockages in vehicular networks. Our method capitalizes on the synergistic strengths of CNNs and ViTs to extract features from time-series multimodal data, which include images and beam vectors. To capture temporal dependencies between the extracted features and the blockage state at future time steps, we employ a Gated Recurrent Unit (GRU)-based architecture. Our results show that the proposed approach achieves high accuracy and outperforms state-of-the-art solutions, achieving more than $95\%$ accurate predictions.
AND: Audio Network Dissection for Interpreting Deep Acoustic Models
Wu, Tung-Yu, Lin, Yu-Xiang, Weng, Tsui-Wei
Neuron-level interpretations aim to explain network behaviors and properties by investigating neurons responsive to specific perceptual or structural input patterns. Although there is emerging work in the vision and language domains, none is explored for acoustic models. To bridge the gap, we introduce $\textit{AND}$, the first $\textbf{A}$udio $\textbf{N}$etwork $\textbf{D}$issection framework that automatically establishes natural language explanations of acoustic neurons based on highly-responsive audio. $\textit{AND}$ features the use of LLMs to summarize mutual acoustic features and identities among audio. Extensive experiments are conducted to verify $\textit{AND}$'s precise and informative descriptions. In addition, we demonstrate a potential use of $\textit{AND}$ for audio machine unlearning by conducting concept-specific pruning based on the generated descriptions. Finally, we highlight two acoustic model behaviors with analysis by $\textit{AND}$: (i) models discriminate audio with a combination of basic acoustic features rather than high-level abstract concepts; (ii) training strategies affect model behaviors and neuron interpretability -- supervised training guides neurons to gradually narrow their attention, while self-supervised learning encourages neurons to be polysemantic for exploring high-level features.
Towards Neural Scaling Laws for Foundation Models on Temporal Graphs
Shirzadkhani, Razieh, Ngo, Tran Gia Bao, Shamsi, Kiarash, Huang, Shenyang, Poursafaei, Farimah, Azad, Poupak, Rabbany, Reihaneh, Coskunuzer, Baris, Rabusseau, Guillaume, Akcora, Cuneyt Gurcan
The field of temporal graph learning aims to learn from evolving network data to forecast future interactions. Given a collection of observed temporal graphs, is it possible to predict the evolution of an unseen network from the same domain? To answer this question, we first present the Temporal Graph Scaling (TGS) dataset, a large collection of temporal graphs consisting of eighty-four ERC20 token transaction networks collected from 2017 to 2023. Next, we evaluate the transferability of Temporal Graph Neural Networks (TGNNs) for the temporal graph property prediction task by pre-training on a collection of up to sixty-four token transaction networks and then evaluating the downstream performance on twenty unseen token networks. We find that the neural scaling law observed in NLP and Computer Vision also applies in temporal graph learning, where pre-training on a greater number of networks leads to improved downstream performance. To the best of our knowledge, this is the first empirical demonstration of the transferability of temporal graph learning. On downstream token networks, the largest pre-trained model outperforms single-model TGNNs on thirteen unseen test networks. Therefore, we believe that this is a promising first step towards building foundation models for temporal graphs.
UAV Networks Surveillance Implementing an Effective Load-Aware Multipath Routing Protocol (ELAMRP)
Vavekanand, Raja, Sam, Kira, Singh, Vijay
This work uses innovative multi-channel load-sensing techniques to deploy unmanned aerial vehicles (UAVs) for surveillance. The research aims to improve the quality of data transmission and the efficiency and reliability of surveillance systems by exploiting the mobility and adaptability of UAVs. The proposed protocol intelligently distributes network traffic across multiple channels, considering the load of each channel, while addressing challenges such as load balancing. The study investigates the effectiveness of the protocol through simulations or practical tests. The expected results are improved UAV-based surveillance systems and more flexible and efficient networks for applications such as security, emergency response, and environmental monitoring, offering infrastructures that contribute to efficient and reliable monitoring solutions.
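The idea of spreading traffic across channels according to their load can be sketched as a greedy least-utilization assignment. This is only a minimal illustration, not the ELAMRP protocol itself: the function names, the per-channel `capacities`, and the utilization metric (load divided by capacity) are assumptions.

```python
def pick_channel(loads, capacities):
    """Return the index of the channel with the lowest utilization (load / capacity).

    Ties are resolved in favor of the lowest channel index.
    """
    return min(range(len(loads)), key=lambda i: loads[i] / capacities[i])


def distribute(packet_sizes, capacities):
    """Assign each packet to the currently least-utilized channel.

    Returns the per-packet channel assignment and the final per-channel loads.
    """
    loads = [0.0] * len(capacities)
    assignment = []
    for size in packet_sizes:
        channel = pick_channel(loads, capacities)
        loads[channel] += size
        assignment.append(channel)
    return assignment, loads
```

A real UAV network would also have to account for link quality and node mobility, which this sketch ignores.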
Telecom Language Models: Must They Be Large?
Piovesan, Nicola, De Domenico, Antonio, Ayed, Fadhel
The increasing interest in Large Language Models (LLMs) within the telecommunications sector underscores their potential to revolutionize operational efficiency. However, the deployment of these sophisticated models is often hampered by their substantial size and computational demands, raising concerns about their viability in resource-constrained environments. Addressing this challenge, recent advancements have seen the emergence of small language models that surprisingly exhibit performance comparable to their larger counterparts in many tasks, such as coding and common-sense reasoning. Phi-2, a compact yet powerful model, exemplifies this new wave of efficient small language models. This paper conducts a comprehensive evaluation of Phi-2's intrinsic understanding of the telecommunications domain. Recognizing the scale-related limitations, we enhance Phi-2's capabilities through a Retrieval-Augmented Generation approach, meticulously integrating an extensive knowledge base specifically curated with telecom standard specifications. The enhanced Phi-2 model demonstrates a profound improvement in accuracy, answering questions about telecom standards with a precision that closely rivals the more resource-intensive GPT-3.5. The paper further explores the refined capabilities of Phi-2 in addressing problem-solving scenarios within the telecom sector, highlighting its potential and limitations.
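The retrieval step of a Retrieval-Augmented Generation pipeline can be sketched as follows. This is a minimal illustration under stated assumptions: the bag-of-words cosine scoring, the function names, and the prompt layout are all hypothetical, and the abstract does not describe the retriever or prompt format the authors actually used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(query, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages, k=2):
    """Prepend the retrieved passages to the question (assumed prompt layout)."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

The resulting prompt would then be passed to the language model; a production system would more likely use dense embeddings than word counts for retrieval.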
Semantic Revolution from Communications to Orchestration for 6G: Challenges, Enablers, and Research Directions
Shokrnezhad, Masoud, Mazandarani, Hamidreza, Taleb, Tarik, Song, Jaeseung, Li, Richard
In the context of emerging 6G services, the realization of everything-to-everything interactions involving a myriad of physical and digital entities presents a crucial challenge. This challenge is exacerbated by resource scarcity in communication infrastructures, necessitating innovative solutions for effective service implementation. Exploring the potential of Semantic Communications (SemCom) to enhance point-to-point physical layer efficiency shows great promise in addressing this challenge. However, achieving efficient SemCom requires overcoming the significant hurdle of knowledge sharing between semantic decoders and encoders, particularly in the dynamic and non-stationary environment with stringent end-to-end quality requirements. To bridge this gap in existing literature, this paper introduces the Knowledge Base Management And Orchestration (KB-MANO) framework. Rooted in the concepts of Computing-Network Convergence (CNC) and lifelong learning, KB-MANO is crafted for the allocation of network and computing resources dedicated to updating and redistributing KBs across the system. The primary objective is to minimize the impact of knowledge management activities on actual service provisioning. A proof-of-concept is proposed to showcase the integration of KB-MANO with resource allocation in radio access networks. Finally, the paper offers insights into future research directions, emphasizing the transformative potential of semantic-oriented communication systems in the realm of 6G technology.
When Large Language Models Meet Optical Networks: Paving the Way for Automation
Wang, Danshi, Wang, Yidi, Jiang, Xiaotian, Zhang, Yao, Pang, Yue, Zhang, Min
Since the advent of GPT, large language models (LLMs) have brought about revolutionary advancements in all walks of life. As a superior natural language processing (NLP) technology, LLMs have consistently achieved state-of-the-art performance in numerous areas. However, LLMs are considered to be general-purpose models for NLP tasks, which may encounter challenges when applied to complex tasks in specialized fields such as optical networks. In this study, we propose a framework of LLM-empowered optical networks, facilitating intelligent control of the physical layer and efficient interaction with the application layer through an LLM-driven agent (AI-Agent) deployed in the control layer. The AI-Agent can leverage external tools and extract domain knowledge from a comprehensive resource library specifically established for optical networks. This is achieved through user input and well-crafted prompts, enabling the generation of control instructions and result representations for autonomous operation and maintenance in optical networks. To improve the capability of LLMs in professional fields and stimulate their potential on complex tasks, the details of performing prompt engineering, establishing a domain knowledge library, and implementing complex tasks are illustrated in this study. Moreover, the proposed framework is verified on two typical tasks: network alarm analysis and network performance optimization. The good response accuracies and semantic similarities across 2,400 test situations exhibit the great potential of LLMs in optical networks.
Integrating Generative AI with Network Digital Twins for Enhanced Network Operations
Muhammad, Kassi, David, Teef, Nassisid, Giulia, Farus, Tina
As telecommunications networks become increasingly complex, the integration of advanced technologies such as network digital twins and generative artificial intelligence (AI) emerges as a pivotal solution to enhance network operations and resilience. This paper explores the synergy between network digital twins, which provide a dynamic virtual representation of physical networks, and generative AI, particularly focusing on Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). We propose a novel architectural framework that incorporates these technologies to significantly improve predictive maintenance, network scenario simulation, and real-time data-driven decision-making. Through extensive simulations, we demonstrate how generative AI can enhance the accuracy and operational efficiency of network digital twins, effectively handling real-world complexities such as unpredictable traffic loads and network failures. The findings suggest that this integration not only boosts the capability of digital twins in scenario forecasting and anomaly detection but also facilitates a more adaptive and intelligent network management system.