Cloud Computing: Overviews
Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications
Leon, Vasileios, Hanif, Muhammad Abdullah, Armeniakos, Giorgos, Jiao, Xun, Shafique, Muhammad, Pekmestzi, Kiamal, Soudris, Dimitrios
The challenging deployment of compute-intensive applications from domains such Artificial Intelligence (AI) and Digital Signal Processing (DSP), forces the community of computing systems to explore new design approaches. Approximate Computing appears as an emerging solution, allowing to tune the quality of results in the design of a system in order to improve the energy efficiency and/or performance. This radical paradigm shift has attracted interest from both academia and industry, resulting in significant research on approximation techniques and methodologies at different design layers (from system down to integrated circuits). Motivated by the wide appeal of Approximate Computing over the last 10 years, we conduct a two-part survey to cover key aspects (e.g., terminology and applications) and review the state-of-the art approximation techniques from all layers of the traditional computing stack. In Part II of our survey, we classify and present the technical details of application-specific and architectural approximation techniques, which both target the design of resource-efficient processors/accelerators & systems. Moreover, we present a detailed analysis of the application spectrum of Approximate Computing and discuss open challenges and future directions.
An Overview on Generative AI at Scale with Edge-Cloud Computing
Wang, Yun-Cheng, Xue, Jintang, Wei, Chengwei, Kuo, C. -C. Jay
As a specific category of artificial intelligence (AI), generative artificial intelligence (GenAI) generates new content that resembles what is created by humans. The rapid development of GenAI systems has created a huge amount of new data on the Internet, posing new challenges to current computing and communication frameworks. Currently, GenAI services rely on the traditional cloud computing framework due to the need for large computation resources. However, such services will encounter high latency because of data transmission and a high volume of requests. On the other hand, edge-cloud computing can provide adequate computation power and low latency at the same time through the collaboration between edges and the cloud. Thus, it is attractive to build GenAI systems at scale by leveraging the edge-cloud computing paradigm. In this overview paper, we review recent developments in GenAI and edge-cloud computing, respectively. Then, we use two exemplary GenAI applications to discuss technical challenges in scaling up their solutions using edge-cloud collaborative systems. Finally, we list design considerations for training and deploying GenAI systems at scale and point out future research directions.
AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges
Cheng, Qian, Sahoo, Doyen, Saha, Amrita, Yang, Wenzhuo, Liu, Chenghao, Woo, Gerald, Singh, Manpreet, Saverese, Silvio, Hoi, Steven C. H.
Artificial Intelligence for IT operations (AIOps) aims to combine the power of AI with the big data generated by IT Operations processes, particularly in cloud infrastructures, to provide actionable insights with the primary goal of maximizing availability. There are a wide variety of problems to address, and multiple use-cases, where AI capabilities can be leveraged to enhance operational efficiency. Here we provide a review of the AIOps vision, trends challenges and opportunities, specifically focusing on the underlying AI techniques. We discuss in depth the key types of data emitted by IT Operations activities, the scale and challenges in analyzing them, and where they can be helpful. We categorize the key AIOps tasks as - incident detection, failure prediction, root cause analysis and automated actions. We discuss the problem formulation for each task, and then present a taxonomy of techniques to solve these problems. We also identify relatively under explored topics, especially those that could significantly benefit from advances in AI literature. We also provide insights into the trends in this field, and what are the key investment opportunities.
A Survey on Task Allocation and Scheduling in Robotic Network Systems
Alirezazadeh, Saeid, Alexandre, Luรญs A.
Cloud Robotics is helping to create a new generation of robots that leverage the nearly unlimited resources of large data centers (i.e., the cloud), overcoming the limitations imposed by on-board resources. Different processing power, capabilities, resource sizes, energy consumption, and so forth, make scheduling and task allocation critical components. The basic idea of task allocation and scheduling is to optimize performance by minimizing completion time, energy consumption, delays between two consecutive tasks, along with others, and maximizing resource utilization, number of completed tasks in a given time interval, and suchlike. In the past, several works have addressed various aspects of task allocation and scheduling. In this paper, we provide a comprehensive overview of task allocation and scheduling strategies and related metrics suitable for robotic network cloud systems. We discuss the issues related to allocation and scheduling methods and the limitations that need to be overcome. The literature review is organized according to three different viewpoints: Architectures and Applications, Methods and Parameters. In addition, the limitations of each method are highlighted for future research.
A Survey of Secure Computation Using Trusted Execution Environments
Li, Xiaoguo, Zhao, Bowen, Yang, Guomin, Xiang, Tao, Weng, Jian, Deng, Robert H.
As an essential technology underpinning trusted computing, the trusted execution environment (TEE) allows one to launch computation tasks on both on- and off-premises data while assuring confidentiality and integrity. This article provides a systematic review and comparison of TEE-based secure computation protocols. We first propose a taxonomy that classifies secure computation protocols into three major categories, namely secure outsourced computation, secure distributed computation and secure multi-party computation. To enable a fair comparison of these protocols, we also present comprehensive assessment criteria with respect to four aspects: setting, methodology, security and performance. Based on these criteria, we review, discuss and compare the state-of-the-art TEE-based secure computation protocols for both general-purpose computation functions and special-purpose ones, such as privacy-preserving machine learning and encrypted database queries. To the best of our knowledge, this article is the first survey to review TEE-based secure computation protocols and the comprehensive comparison can serve as a guideline for selecting suitable protocols for deployment in practice. Finally, we also discuss several future research directions and challenges.
Autonomy and Intelligence in the Computing Continuum: Challenges, Enablers, and Future Directions for Orchestration
Kokkonen, Henna, Lovรฉn, Lauri, Motlagh, Naser Hossein, Kumar, Abhishek, Partala, Juha, Nguyen, Tri, Pujol, Vรญctor Casamayor, Kostakos, Panos, Leppรคnen, Teemu, Gonzรกlez-Gil, Alfonso, Sola, Ester, Angulo, Iรฑigo, Liyanage, Madhusanka, Bennis, Mehdi, Tarkoma, Sasu, Dustdar, Schahram, Pirttikangas, Susanna, Riekki, Jukka
Future AI applications require performance, reliability and privacy that the existing, cloud-dependant system architectures cannot provide. In this article, we study orchestration in the device-edge-cloud continuum, and focus on edge AI for resource orchestration. We claim that to support the constantly growing requirements of intelligent applications in the device-edge-cloud computing continuum, resource orchestration needs to embrace edge AI and emphasize local autonomy and intelligence. To justify the claim, we provide a general definition for continuum orchestration, and look at how current and emerging orchestration paradigms are suitable for the computing continuum. We describe certain major emerging research themes that may affect future orchestration, and provide an early vision of an orchestration paradigm that embraces those research themes. Finally, we survey current key edge AI methods and look at how they may contribute into fulfilling the vision of future continuum orchestration.
From Traditional Adaptive Data Caching to Adaptive Context Caching: A Survey
Weerasinghe, Shakthi, Zaslavsky, Arkady, Loke, Seng W., Hassani, Alireza, Abken, Amin, Medvedev, Alexey
Context information is in demand more than ever with the rapid increase in the number of context-aware Internet of Things applications developed worldwide. Research in context and context-awareness is being conducted to broaden its applicability in light of many practical and technical challenges. One of the challenges is improving performance when responding to a large number of context queries. Context Management Platforms that infer and deliver context to applications measure this problem using Quality of Service (QoS) parameters. Although caching is a proven way to improve QoS, transiency of context and features such as variability and heterogeneity of context queries pose an additional real-time cost management problem. This paper presents a critical survey of the state-of-the-art in adaptive data caching with the objective of developing a body of knowledge in cost- and performance-efficient adaptive caching strategies. We comprehensively survey a large number of research publications and evaluate, compare, and contrast different techniques, policies, approaches, and schemes in adaptive caching. Our critical analysis is motivated by the focus on adaptively caching context as a core research problem. A formal definition for adaptive context caching is then proposed, followed by identified features and requirements of a well-designed, objective optimal adaptive context caching strategy.
Topology-aware Federated Learning in Edge Computing: A Comprehensive Survey
Wu, Jiajun, Drew, Steve, Dong, Fan, Zhu, Zhuangdi, Zhou, Jiayu
The ultra-low latency requirements of 5G/6G applications and privacy constraints call for distributed machine learning systems to be deployed at the edge. With its simple yet effective approach, federated learning (FL) is proved to be a natural solution for massive user-owned devices in edge computing with distributed and private training data. Most vanilla FL algorithms based on FedAvg follow a naive star topology, ignoring the heterogeneity and hierarchy of the volatile edge computing architectures and topologies in reality. In this paper, we conduct a comprehensive survey on the existing work of optimized FL models, frameworks, and algorithms with a focus on their network topologies. After a brief recap of FL and edge computing networks, we introduce various types of edge network topologies, along with the optimizations under the aforementioned network topologies. Lastly, we discuss the remaining challenges and future works for applying FL in topology-specific edge networks.
Need of 6G for the Metaverse Realization
Siniarski, Bartlomiej, De Alwis, Chamitha, Yenduri, Gokul, Huynh-The, Thien, Gรr, Gรrkan, Gadekallu, Thippa Reddy, Liyanage, Madhusanka
The concept of the Metaverse aims to bring a fully-fledged extended reality environment to provide next generation applications and services. Development of the Metaverse is backed by many technologies, including, 5G, artificial intelligence, edge computing and extended reality. The advent of 6G is envisaged to mark a significant milestone in the development of the Metaverse, facilitating near-zero-latency, a plethora of new services and upgraded real-world infrastructure. This paper establishes the advantages of providing the Metaverse services over 6G along with an overview of the demanded technical requirements. The paper provides an insight to the concepts of the Metaverse and the envisaged technical capabilities of 6G mobile networks. Then, the technical aspects covering 6G for the development of the Metaverse, ranging from validating digital assets, interoperability, and efficient user interaction in the Metaverse to related security and privacy aspects are elaborated. Subsequently, the role of 6G technologies towards enabling the Metaverse, including artificial intelligence, blockchain, open radio access networks, edge computing, cloudification and internet of everything. The paper also presents 6G integration challenges and outlines ongoing projects towards developing the Metaverse technologies to facilitate the Metaverse applications and services.
Enabling All In-Edge Deep Learning: A Literature Review
Joshi, Praveen, Hasanuzzaman, Mohammed, Thapa, Chandra, Afli, Haithem, Scully, Ted
In recent years, deep learning (DL) models have demonstrated remarkable achievements on non-trivial tasks such as speech recognition and natural language understanding. One of the significant contributors to its success is the proliferation of end devices that acted as a catalyst to provide data for data-hungry DL models. However, computing DL training and inference is the main challenge. Usually, central cloud servers are used for the computation, but it opens up other significant challenges, such as high latency, increased communication costs, and privacy concerns. To mitigate these drawbacks, considerable efforts have been made to push the processing of DL models to edge servers. Moreover, the confluence point of DL and edge has given rise to edge intelligence (EI). This survey paper focuses primarily on the fifth level of EI, called all in-edge level, where DL training and inference (deployment) are performed solely by edge servers. All in-edge is suitable when the end devices have low computing resources, e.g., Internet-of-Things, and other requirements such as latency and communication cost are important in mission-critical applications, e.g., health care. Firstly, this paper presents all in-edge computing architectures, including centralized, decentralized, and distributed. Secondly, this paper presents enabling technologies, such as model parallelism and split learning, which facilitate DL training and deployment at edge servers. Thirdly, model adaptation techniques based on model compression and conditional computation are described because the standard cloud-based DL deployment cannot be directly applied to all in-edge due to its limited computational resources. Fourthly, this paper discusses eleven key performance metrics to evaluate the performance of DL at all in-edge efficiently. Finally, several open research challenges in the area of all in-edge are presented.