Lowering PyTorch's Memory Consumption for Selective Differentiation

Bhatia, Samarth, Dangel, Felix

arXiv.org Artificial Intelligence

Memory is a limiting resource for many deep learning tasks. Besides the neural network weights, one main memory consumer is the computation graph built up by automatic differentiation (AD) for backpropagation. We observe that PyTorch's current AD implementation neglects information about parameter differentiability when storing the computation graph. This information is useful, though, for reducing memory whenever gradients are requested only for a parameter subset, as is the case in many modern fine-tuning tasks. Specifically, the inputs to layers that act linearly in their parameters (dense, convolution, or normalization layers) can be discarded whenever those parameters are marked as non-differentiable. We provide a drop-in, differentiability-agnostic implementation of such layers and demonstrate its ability to reduce memory without affecting run time.
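The core observation can be sketched with a custom PyTorch autograd function. This is an illustrative toy version, not the paper's actual implementation; the names `MemSavingLinearFn` and `MemSavingLinear` are invented for this sketch. For a linear layer, the weight gradient needs the layer input, but the input gradient only needs the weight, so when the weight is non-differentiable the input need not be saved:

```python
import torch
from torch import nn

class MemSavingLinearFn(torch.autograd.Function):
    """Linear op that stores the layer input only when the weight gradient is needed."""

    @staticmethod
    def forward(ctx, x, weight, weight_needs_grad):
        ctx.weight_needs_grad = weight_needs_grad
        if weight_needs_grad:
            # dW = dY^T X needs the input, so keep it in the graph.
            ctx.save_for_backward(x, weight)
        else:
            # Input can be discarded: dX = dY W only needs the weight.
            ctx.save_for_backward(weight)
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        if ctx.weight_needs_grad:
            x, weight = ctx.saved_tensors
            grad_w = grad_out.t() @ x
        else:
            (weight,) = ctx.saved_tensors
            grad_w = None
        return grad_out @ weight, grad_w, None

class MemSavingLinear(nn.Module):
    """Drop-in linear layer that respects the weight's differentiability flag."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)

    def forward(self, x):
        return MemSavingLinearFn.apply(x, self.weight, self.weight.requires_grad)
```

With `weight.requires_grad_(False)`, as in parameter-frozen fine-tuning, the backward pass still produces input gradients while the (potentially large) layer input is never stored.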


Introducing an ensemble method for the early detection of Alzheimer's disease through the analysis of PET scan images

Borji, Arezoo, Hejazi, Taha-Hossein, Seifi, Abbas

arXiv.org Artificial Intelligence

Alzheimer's disease is a progressive neurodegenerative disorder that primarily affects cognitive functions such as memory, thinking, and behavior. The disease has a critical phase, mild cognitive impairment (MCI), whose early diagnosis is important because some patients with progressive MCI will go on to develop Alzheimer's. This study addresses the challenging task of classifying patients into four distinct groups: control normal (CN), progressive mild cognitive impairment (pMCI), stable mild cognitive impairment (sMCI), and Alzheimer's disease (AD). The classification is based on a thorough examination of PET scan images obtained from the ADNI dataset, which captures the disease's progression. Several deep-learning and traditional machine-learning models have been used to detect Alzheimer's disease. In this paper, three deep-learning models, namely VGG16, AlexNet, and a custom convolutional neural network (CNN), with 8-fold cross-validation have been used for classification. Finally, an ensemble technique is used to improve the overall result of these models. The classification results show that using deep-learning models to distinguish between MCI patient groups gives an overall average accuracy of 93.13% and an AUC of 94.4%. Keywords: Alzheimer's disease; convolutional neural networks; PET scan images; voxel-based morphometry; ensemble methods.
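The abstract does not specify the combination rule, but a common way to ensemble classifiers like VGG16, AlexNet, and a custom CNN is soft voting: average the per-class probabilities and take the argmax. A minimal sketch over the four classes (function name and toy probabilities are illustrative, not from the paper):

```python
import numpy as np

CLASSES = ["CN", "pMCI", "sMCI", "AD"]

def soft_vote(model_probs):
    """Average per-class probabilities across models, then take the argmax.

    model_probs: list of (n_samples, n_classes) arrays, one per model.
    """
    avg = np.mean(np.stack(model_probs), axis=0)
    return avg.argmax(axis=1)

# Toy example: two models, two scans, four classes.
p_model_a = np.array([[0.7, 0.1, 0.1, 0.1], [0.2, 0.5, 0.2, 0.1]])
p_model_b = np.array([[0.6, 0.2, 0.1, 0.1], [0.1, 0.6, 0.2, 0.1]])
predictions = [CLASSES[i] for i in soft_vote([p_model_a, p_model_b])]
```

Soft voting tends to outperform hard (majority) voting when the individual models are well calibrated, since confident models contribute more to the averaged score.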


Imitation Learning Inputting Image Feature to Each Layer of Neural Network

Yamane, Koki, Sakaino, Sho, Tsuji, Toshiaki

arXiv.org Artificial Intelligence

Imitation learning enables robots to learn and replicate human behavior from training data. Recent advances in machine learning enable end-to-end learning approaches that directly process high-dimensional observation data, such as images. However, these approaches face a critical challenge when processing data from multiple modalities: they inadvertently ignore data with a lower correlation to the desired output, especially when using short sampling periods. This paper presents a method that addresses this challenge by amplifying the influence of data with a relatively low correlation to the output, feeding that data into each layer of the neural network. The proposed approach effectively incorporates diverse data sources into the learning process. In experiments on a simple pick-and-place operation with raw images and joint information as input, significant improvements in success rates are demonstrated even when dealing with data from short sampling periods.
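The idea of feeding a modality into every layer can be sketched as an MLP that concatenates the image feature vector to each hidden layer's input, rather than only to the first. This is a hedged illustration of the general pattern, not the paper's architecture; the class name and dimensions are invented:

```python
import torch
from torch import nn

class FeatureInjectedMLP(nn.Module):
    """MLP that concatenates an image feature vector to every layer's input.

    Injecting the feature at each depth keeps the low-correlation modality
    from being washed out by the layers in between.
    """

    def __init__(self, joint_dim, feat_dim, hidden_dim, out_dim):
        super().__init__()
        self.l1 = nn.Linear(joint_dim + feat_dim, hidden_dim)
        self.l2 = nn.Linear(hidden_dim + feat_dim, hidden_dim)
        self.l3 = nn.Linear(hidden_dim + feat_dim, out_dim)

    def forward(self, joints, img_feat):
        h = torch.relu(self.l1(torch.cat([joints, img_feat], dim=-1)))
        h = torch.relu(self.l2(torch.cat([h, img_feat], dim=-1)))
        return self.l3(torch.cat([h, img_feat], dim=-1))
```

Compared with concatenating only at the input layer, every layer here retains a direct, one-hop path from the image features to its activations.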


COMET: Coverage-guided Model Generation For Deep Learning Library Testing

Li, Meiziniu, Cao, Jialun, Tian, Yongqiang, Li, Tsz On, Wen, Ming, Cheung, Shing-Chi

arXiv.org Artificial Intelligence

Recent deep learning (DL) applications are mostly built on top of DL libraries. The quality assurance of these libraries is critical to the dependable deployment of DL applications. Techniques have been proposed to generate various DL models and apply them to test these libraries. However, their test effectiveness is constrained by the diversity of layer API calls in their generated DL models. Our study reveals that these techniques cover at most 34.1% of layer inputs, 25.9% of layer parameter values, and 15.6% of layer sequences. As a result, many bugs arising from specific layer API calls (i.e., specific layer inputs, parameter values, or layer sequences) are missed by existing techniques. Motivated by this limitation, we propose COMET to effectively generate DL models with diverse layer API calls for DL library testing. COMET (1) designs a set of mutation operators and a coverage-based search algorithm to diversify layer inputs, layer parameter values, and layer sequences in DL models, and (2) proposes a model synthesis method to boost test efficiency without compromising layer API call diversity. Our evaluation shows that COMET outperforms baselines by covering twice as many layer inputs (69.7% vs. 34.1%), layer parameter values (50.2% vs. 25.9%), and layer sequences (39.0% vs. 15.6%) as the state-of-the-art. Moreover, COMET covers 3.4% more library branches than existing techniques. Finally, COMET detects 32 new bugs in the latest versions of eight popular DL libraries, including TensorFlow and MXNet; 21 of these have been confirmed by DL library developers, and 7 of the confirmed bugs have been fixed.
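The coverage-guided mutation idea can be sketched as follows. This is a simplified toy, not COMET's actual operators or search algorithm: a mutator picks one layer parameter and prefers candidate values that no previously generated model has exercised, so each new test model tends to grow the covered set (all names are illustrative):

```python
import random

def mutate_layer(layer_cfg, value_pool, covered):
    """Mutate one layer parameter, preferring values not yet covered by any test.

    layer_cfg:  dict mapping parameter name -> current value
    value_pool: dict mapping parameter name -> list of candidate values
    covered:    set of (param, value) pairs already exercised by earlier models
    """
    cfg = dict(layer_cfg)
    param = random.choice(sorted(value_pool))
    # Prefer unexplored values; fall back to the full pool once all are covered.
    fresh = [v for v in value_pool[param] if (param, v) not in covered]
    cfg[param] = random.choice(fresh if fresh else value_pool[param])
    covered.add((param, cfg[param]))
    return cfg
```

A driver loop would build a model from each mutated config, run it against the DL library under test, and keep configs that reach new library branches, which is the feedback signal that makes the search coverage-guided.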


Integer-Only Inference for Deep Learning in Native C

#artificialintelligence

Integer-only inference allows for the compression of deep learning models for deployment on low-compute and low-latency devices. Many embedded devices are programmed using native C and do not support floating-point operations and dynamic allocation. Nevertheless, small deep learning models can be deployed to such devices with an integer-only inference pipeline through uniform quantization and the fixed-point representation. We employed these methods to deploy a deep reinforcement learning (RL) model on a network interface card (NIC) (Tessler et al. 2021[1]). Successfully deploying the RL model required inference latency of O(microseconds) on a device with no floating-point operation support.
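The two building blocks mentioned, uniform quantization and fixed-point rescaling, can be sketched in a few lines. This is a generic textbook sketch (int8 range, round-to-nearest), not the authors' NIC deployment code:

```python
def quantize(x, scale, zero_point=0):
    """Map a float to an int8 code under uniform affine quantization."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))  # clamp to the int8 range

def dequantize(q, scale, zero_point=0):
    """Recover the (approximate) float value from its int8 code."""
    return (q - zero_point) * scale

def fixed_point_mul(acc, multiplier, shift):
    """Integer-only rescaling: acc * multiplier / 2**shift, rounded to nearest.

    This is how an int32 accumulator is requantized without any
    floating-point operations, which is what a native-C, FPU-less
    target requires.
    """
    return (acc * multiplier + (1 << (shift - 1))) >> shift
```

At deployment time, all scales are folded offline into integer `(multiplier, shift)` pairs, so the on-device inner loop is pure integer arithmetic and static arrays, which maps directly to C with no floating point and no dynamic allocation.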


10 most impressive Research Papers around Artificial Intelligence

@machinelearnbot

Artificial Intelligence research advances are transforming technology as we know it. The AI research community is solving some of the toughest technology problems related to software and hardware infrastructure, theory, and algorithms. Interestingly, the field of AI research has drawn acolytes from non-tech fields as well. Case in point: prolific Hollywood actor Kristen Stewart's highly publicized paper on Artificial Intelligence, originally published at Cornell University Library's open-access site. Stewart co-authored the paper, titled "Bringing Impressionism to Life with Neural Style Transfer in Come Swim," with the American poet and literary critic David Shapiro and Adobe Research engineer Bhautik Joshi.

