We produce a multitude of attributes (characteristics attached to an entity -- building, parcel, etc.) using various sources such as aerial imagery. The idea is to build Deep Learning models from a few thousand buildings using in-house-tagged labels or existing labels from open data. In a second step, the models are deployed on the whole French territory, which represents more than 35 million images to process (i.e. 4 TB of data to deal with). This second step is the focus of this post. The challenge is to be able to infer at low cost and in a short amount of time, (less than a day).
Video analytics pipelines have steadily shifted to edge deployments to reduce bandwidth overheads and privacy violations, but in doing so, face an ever-growing resource tension. Most notably, edge-box GPUs lack the memory needed to concurrently house the growing number of (increasingly complex) models for real-time inference. Unfortunately, existing solutions that rely on time/space sharing of GPU resources are insufficient as the required swapping delays result in unacceptable frame drops and accuracy violations. We present model merging, a new memory management technique that exploits architectural similarities between edge vision models by judiciously sharing their layers (including weights) to reduce workload memory costs and swapping delays. Our system, GEMEL, efficiently integrates merging into existing pipelines by (1) leveraging several guiding observations about per-model memory usage and inter-layer dependencies to quickly identify fruitful and accuracy-preserving merging configurations, and (2) altering edge inference schedules to maximize merging benefits. Experiments across diverse workloads reveal that GEMEL reduces memory usage by up to 60.7%, and improves overall accuracy by 8-39% relative to time/space sharing alone.
Edge technology aims to bring Cloud resources (specifically, the compute, storage, and network) to the closed proximity of the Edge devices, i.e., smart devices where the data are produced and consumed. Embedding computing and application in Edge devices lead to emerging of two new concepts in Edge technology, namely, Edge computing and Edge analytics. Edge analytics uses some techniques or algorithms to analyze the data generated by the Edge devices. With the emerging of Edge analytics, the Edge devices have become a complete set. Currently, Edge analytics is unable to provide full support for the execution of the analytic techniques. The Edge devices cannot execute advanced and sophisticated analytic algorithms following various constraints such as limited power supply, small memory size, limited resources, etc. This article aims to provide a detailed discussion on Edge analytics. A clear explanation to distinguish between the three concepts of Edge technology, namely, Edge devices, Edge computing, and Edge analytics, along with their issues. Furthermore, the article discusses the implementation of Edge analytics to solve many problems in various areas such as retail, agriculture, industry, and healthcare. In addition, the research papers of the state-of-the-art edge analytics are rigorously reviewed in this article to explore the existing issues, emerging challenges, research opportunities and their directions, and applications.
There has been a rise in the use of Machine Learning as a Service (MLaaS) Vision APIs as they offer multiple services including pre-built models and algorithms, which otherwise take a huge amount of resources if built from scratch. As these APIs get deployed for high-stakes applications, it's very important that they are robust to different manipulations. Recent works have only focused on typical adversarial attacks when evaluating the robustness of vision APIs. We propose two new aspects of adversarial image generation methods and evaluate them on the robustness of Google Cloud Vision API's optical character recognition service and object detection APIs deployed in real-world settings such as sightengine.com, picpurify.com, Google Cloud Vision API, and Microsoft Azure's Computer Vision API. Specifically, we go beyond the conventional small-noise adversarial attacks and introduce secret embedding and transparent adversarial examples as a simpler way to evaluate robustness. These methods are so straightforward that even non-specialists can craft such attacks. As a result, they pose a serious threat where APIs are used for high-stakes applications. Our transparent adversarial examples successfully evade state-of-the art object detections APIs such as Azure Cloud Vision (attack success rate 52%) and Google Cloud Vision (attack success rate 36%). 90% of the images have a secret embedded text that successfully fools the vision of time-limited humans but is detected by Google Cloud Vision API's optical character recognition. Complementing to current research, our results provide simple but unconventional methods on robustness evaluation.
Many adversarial attacks and defenses have recently been proposed for Deep Neural Networks (DNNs). While most of them are in the white-box setting, which is impractical, a new class of query-based hard-label (QBHL) black-box attacks pose a significant threat to real-world applications (e.g., Google Cloud, Tencent API). Till now, there has been no generalizable and practical approach proposed to defend against such attacks. This paper proposes and evaluates PredCoin, a practical and generalizable method for providing robustness against QBHL attacks. PredCoin poisons the gradient estimation step, an essential component of most QBHL attacks. PredCoin successfully identifies gradient estimation queries crafted by an attacker and introduces uncertainty to the output. Extensive experiments show that PredCoin successfully defends against four state-of-the-art QBHL attacks across various settings and tasks while preserving the target model's overall accuracy. PredCoin is also shown to be robust and effective against several defense-aware attacks, which may have full knowledge regarding the internal mechanisms of PredCoin.
The remarkable predictive performance of deep neural networks (DNNs) has led to their adoption in service domains of unprecedented scale and scope. However, the widespread adoption and growing commercialization of DNNs have underscored the importance of intellectual property (IP) protection. Devising techniques to ensure IP protection has become necessary due to the increasing trend of outsourcing the DNN computations on the untrusted accelerators in cloud-based services. The design methodologies and hyper-parameters of DNNs are crucial information, and leaking them may cause massive economic loss to the organization. Furthermore, the knowledge of DNN's architecture can increase the success probability of an adversarial attack where an adversary perturbs the inputs and alter the prediction. In this work, we devise a two-stage attack methodology "DeepPeep" which exploits the distinctive characteristics of design methodologies to reverse-engineer the architecture of building blocks in compact DNNs. We show the efficacy of "DeepPeep" on P100 and P4000 GPUs. Additionally, we propose intelligent design maneuvering strategies for thwarting IP theft through the DeepPeep attack and proposed "Secure MobileNet-V1". Interestingly, compared to vanilla MobileNet-V1, secure MobileNet-V1 provides a significant reduction in inference latency ($\approx$60%) and improvement in predictive performance ($\approx$2%) with very-low memory and computation overheads.
As the advancement of deep learning (DL), the Internet of Things and cloud computing techniques for biomedical and healthcare problems, mobile healthcare systems have received unprecedented attention. Since DL techniques usually require enormous amount of computation, most of them cannot be directly deployed on the resource-constrained mobile and IoT devices. Hence, most of the mobile healthcare systems leverage the cloud computing infrastructure, where the data collected by the mobile and IoT devices would be transmitted to the cloud computing platforms for analysis. However, in the contested environments, relying on the cloud might not be practical at all times. For instance, the satellite communication might be denied or disrupted. We propose SAIA, a Split Artificial Intelligence Architecture for mobile healthcare systems. Unlike traditional approaches for artificial intelligence (AI) which solely exploits the computational power of the cloud server, SAIA could not only relies on the cloud computing infrastructure while the wireless communication is available, but also utilizes the lightweight AI solutions that work locally on the client side, hence, it can work even when the communication is impeded. In SAIA, we propose a meta-information based decision unit, that could tune whether a sample captured by the client should be operated by the embedded AI (i.e., keeping on the client) or the networked AI (i.e., sending to the server), under different conditions. In our experimental evaluation, extensive experiments have been conducted on two popular healthcare datasets. Our results show that SAIA consistently outperforms its baselines in terms of both effectiveness and efficiency.
In the last five years, edge computing has attracted tremendous attention from industry and academia due to its promise to reduce latency, save bandwidth, improve availability, and protect data privacy to keep data secure. At the same time, we have witnessed the proliferation of AI algorithms and models which accelerate the successful deployment of intelligence mainly in cloud services. These two trends, combined together, have created a new horizon: Edge Intelligence (EI). The development of EI requires much attention from both the computer systems research community and the AI community to meet these demands. However, existing computing techniques used in the cloud are not applicable to edge computing directly due to the diversity of computing sources and the distribution of data sources. We envision that there missing a framework that can be rapidly deployed on edge and enable edge AI capabilities. To address this challenge, in this paper we first present the definition and a systematic review of EI. Then, we introduce an Open Framework for Edge Intelligence (OpenEI), which is a lightweight software platform to equip edges with intelligent processing and data sharing capability. We analyze four fundamental EI techniques which are used to build OpenEI and identify several open problems based on potential research directions. Finally, four typical application scenarios enabled by OpenEI are presented.
The volume, veracity, variability and velocity of data produced from the ever increasing network of sensors connected to Internet pose challenges for power management, scalability and sustainability of cloud computing infrastructure. Increasing the data processing capability of edge computing devices at lower power requirements can reduce the overheads for cloud computing solutions. This paper provides the review of neuromorphic CMOS-memristive architectures that can be integrated into edge computing devices. We discuss why the neuromorphic architectures are useful for edge devices and show the advantages, drawbacks and open problems in the field of memristive circuit and architectures in terms of edge computing perspective.
Chinese technology company Baidu today announced the launch of a few new services within its Baidu Open Cloud public cloud infrastructure portfolio. Baidu TianSuan (Smart Big Data) lets customers "collect, store, process and analyze big data," Baidu said in a statement. Baidu TianXiang (Smart Multimedia Cloud) includes face recognition and live video streaming, while Baidu TianGong (Intelligent IoT Service) is a full-stack platform for integrating cloud applications with internet-connected devices. The additions bring Baidu more in line with the world's leading cloud infrastructure providers, including Amazon Web Services, Microsoft Azure, and Google Cloud Platform. Amazon and Microsoft have both introduced Internet of Things (IoT) services.