Fast convolutional neural networks on FPGAs with hls4ml

arXiv.org Machine Learning

The hls4ml library [1, 2] is an open-source software package designed to facilitate the deployment of machine learning (ML) models on field-programmable gate arrays (FPGAs), targeting low-latency and low-power edge applications. Taking as input a neural network model, hls4ml generates C/C++ code designed to be transpiled into FPGA firmware by processing it with a high-level synthesis (HLS) library. The development of hls4ml was historically driven by the need to integrate ML algorithms into the first stage of the real-time data processing of particle physics experiments operating at the CERN Large Hadron Collider (LHC). The LHC produces high-energy proton collisions (or events) every 25 ns, each consisting of about 1 MB of raw data. Since this throughput is overwhelming for the currently available processing and storage resources, the LHC experiments run a real-time event selection system, the so-called Level-1 trigger (L1T), to reduce the event rate from 40 MHz to 100 kHz [3-6]. Due to the size of the buffering system, the L1T system operates with a fixed latency of O(1 µs). While hls4ml excels as a tool to automatically generate low-latency ML firmware for L1T applications, it also offers interesting opportunities for edge-computing applications beyond particle physics whenever efficient, low-power, low-latency on-device processing is required.
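As a concrete illustration of the workflow described above, the sketch below converts a small Keras model into an HLS project. The function names follow what I understand to be the public hls4ml Python conversion API (config_from_keras_model, convert_from_keras_model); the toy model, output directory, and FPGA part number are placeholder assumptions, not a configuration taken from the paper, so consult the hls4ml documentation for the exact interface.

import hls4ml
from tensorflow import keras

# A small Keras CNN standing in for the trained model to be deployed.
model = keras.Sequential([
    keras.layers.Input(shape=(32, 32, 3)),
    keras.layers.Conv2D(16, (3, 3), activation='relu'),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation='softmax'),
])

# Generate a per-model hls4ml configuration (fixed-point precision, reuse factor, ...).
config = hls4ml.utils.config_from_keras_model(model, granularity='model')

# Translate the network into an HLS C++ project for a target FPGA part.
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='hls4ml_prj',        # placeholder project directory
    part='xcvu9p-flga2104-2-e',     # example Xilinx part; adjust to your board
)

# Emulate the fixed-point design in software; synthesis requires an HLS tool install.
hls_model.compile()
# hls_model.build(csim=False)  # runs the HLS flow if Vivado/Vitis HLS is available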


ResNet, AlexNet, VGG, Inception: Understanding various architectures of Convolutional Networks

#artificialintelligence

Convolutional neural networks are fantastic for visual recognition tasks. Good ConvNets are beasts with millions of parameters and many hidden layers. In fact, a bad rule of thumb is: 'the higher the number of hidden layers, the better the network'. AlexNet, VGG, Inception, and ResNet are some of the most popular networks. Why do these networks work so well?
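To put the "millions of parameters" claim in perspective, the short sketch below counts the parameters of a few of the architectures named above using their torchvision reference implementations; the choice of torchvision is an illustrative assumption, and any framework exposing these models would do.

import torchvision.models as models

# Count the parameters of a few well-known ConvNet architectures.
for name, ctor in [("alexnet", models.alexnet),
                   ("vgg16", models.vgg16),
                   ("resnet50", models.resnet50)]:
    net = ctor()  # architecture only; no pretrained weights are downloaded
    n_params = sum(p.numel() for p in net.parameters())
    print(f"{name}: {n_params / 1e6:.1f} M parameters")

# Typical output: alexnet roughly 61 M, vgg16 roughly 138 M, resnet50 roughly 26 M parameters.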


ResNet, AlexNet, VGGNet, Inception: Understanding various architectures of Convolutional Networks - CV-Tricks.com

#artificialintelligence

Good ConvNets are beasts with millions of parameters and many hidden layers. In fact, a bad rule of thumb is: 'the higher the number of hidden layers, the better the network'. AlexNet, VGG, Inception, and ResNet are some of the most popular networks. Why do these networks work so well? Why do they have the structures they have?


Conditionally Deep Hybrid Neural Networks Across Edge and Cloud

arXiv.org Machine Learning

The pervasiveness of the "Internet of Things" in our daily lives has led to a recent surge in fog computing, which combines cloud computing with edge intelligence. To that effect, deep learning has been a major driving force towards enabling such intelligent systems. However, growing model sizes in deep learning pose a significant challenge to deployment on resource-constrained edge devices. Moreover, in a distributed intelligence environment, efficient workload distribution between edge and cloud systems is necessary. To address these challenges, we propose a conditionally deep hybrid neural network for enabling AI-based fog computing. The proposed network can be deployed in a distributed manner, consisting of quantized layers and early exits at the edge and full-precision layers in the cloud. During inference, if an early exit is highly confident in its classification, the sample exits at the edge; otherwise, the deeper layers in the cloud are activated conditionally, which can improve energy efficiency and reduce inference latency. We perform an extensive design space exploration with the goal of minimizing energy consumption at the edge while achieving state-of-the-art classification accuracies on image classification tasks. We show that with binarized layers at the edge, the proposed conditional hybrid network can process 65% of inferences at the edge, leading to a 5.5x reduction in computational energy with minimal accuracy degradation on the CIFAR-10 dataset. For the more complex CIFAR-100 dataset, we observe that the proposed network with 4-bit quantization at the edge achieves 52% early classification at the edge with a 4.8x energy reduction. The analysis gives us insights into designing efficient hybrid networks that achieve significantly higher energy efficiency than full-precision networks for edge-cloud based distributed intelligence systems.
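The sketch below illustrates the conditional edge-cloud inference pattern described in this abstract: a shallow edge model with an early-exit classifier hands its features to a deeper cloud model only when the exit's softmax confidence falls below a threshold. The layer sizes, the 0.9 threshold, and the ten-class setup are illustrative assumptions, and the edge-side quantization/binarization is omitted for brevity; this is a sketch of the idea, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeModel(nn.Module):
    """Shallow feature extractor plus an early-exit classifier (edge side)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.exit_head = nn.Linear(32 * 4 * 4, num_classes)

    def forward(self, x):
        feats = self.features(x)
        logits = self.exit_head(feats.flatten(1))
        return feats, logits

class CloudModel(nn.Module):
    """Deeper full-precision layers that refine the edge features (cloud side)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.deep = nn.Sequential(
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, num_classes)

    def forward(self, feats):
        return self.head(self.deep(feats).flatten(1))

@torch.no_grad()
def conditional_inference(x, edge, cloud, threshold=0.9):
    """Exit at the edge when softmax confidence clears the threshold (single sample)."""
    feats, edge_logits = edge(x)
    confidence = F.softmax(edge_logits, dim=1).max(dim=1).values
    if confidence.item() >= threshold:
        return edge_logits, 'edge'   # sample classified entirely at the edge
    return cloud(feats), 'cloud'     # deeper cloud layers activated conditionally

# Usage on one CIFAR-sized input (batch of a single 3x32x32 image).
edge, cloud = EdgeModel().eval(), CloudModel().eval()
logits, where = conditional_inference(torch.randn(1, 3, 32, 32), edge, cloud)
print(where, logits.argmax(dim=1).item())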