AskariHemmat, MohammadHossein
QGen: On the Ability to Generalize in Quantization Aware Training
AskariHemmat, MohammadHossein, Jeddi, Ahmadreza, Hemmat, Reyhane Askari, Lazarevich, Ivan, Hoffman, Alexander, Sah, Sudhakar, Saboori, Ehsan, Savaria, Yvon, David, Jean-Pierre
Quantization lowers memory usage, computational requirements, and latency by utilizing fewer bits to represent model weights and activations. In this work, we investigate the generalization properties of quantized neural networks, a characteristic that has received little attention despite its implications for model performance. In particular, we first develop a theoretical model for quantization in neural networks and demonstrate how quantization functions as a form of regularization. Second, motivated by recent work connecting the sharpness of the loss landscape and generalization, we derive an approximate bound for the generalization of quantized models conditioned on the amount of quantization noise. We then validate our hypothesis by experimenting with over 2000 models trained on the CIFAR-10, CIFAR-100, and ImageNet datasets, covering both convolutional and transformer-based architectures.
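As a concrete illustration of the quantization noise discussed in this abstract, the following minimal NumPy sketch (a toy example under assumed settings, not the paper's formulation) applies symmetric uniform fake quantization to a weight tensor and measures the resulting error, which is bounded by half the quantization step and shrinks as the bit-width grows.

```python
# Minimal NumPy sketch (not the paper's exact formulation): uniform fake
# quantization of weights and the error it introduces. Inside the clipping
# range the error is bounded by half the step size; this bounded perturbation
# is the "quantization noise" that the generalization bound is conditioned on.
import numpy as np

def fake_quantize(w, num_bits=4):
    """Symmetric uniform quantization: quantize then dequantize."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 7 for signed 4-bit
    scale = np.max(np.abs(w)) / qmax        # step size
    w_q = np.clip(np.round(w / scale), -qmax - 1, qmax) * scale
    return w_q, scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=10_000)      # toy weight tensor

for bits in (8, 4, 2):
    w_q, scale = fake_quantize(w, bits)
    noise = w_q - w                          # quantization "noise"
    print(f"{bits}-bit: step={scale:.4f}, max|noise|={np.abs(noise).max():.4f}")
```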
DeepliteRT: Computer Vision at the Edge
Ashfaq, Saad, Hoffman, Alexander, Mitra, Saptarshi, Sah, Sudhakar, AskariHemmat, MohammadHossein, Saboori, Ehsan
The proliferation of edge devices has unlocked unprecedented opportunities for deep learning model deployment in computer vision applications. However, these complex models require considerable power, memory and compute resources that are typically not available on edge platforms. Ultra low-bit quantization presents an attractive solution to this problem by scaling down the model weights and activations from 32-bit to less than 8-bit. We implement highly optimized ultra low-bit convolution operators for ARM-based targets that outperform existing methods by up to 4.34x. Our operator is implemented within Deeplite Runtime (DeepliteRT), an end-to-end solution for the compilation, tuning, and inference of ultra low-bit models on ARM devices. Compiler passes in DeepliteRT automatically convert a fake-quantized model in full precision to a compact ultra low-bit representation, easing the process of quantized model deployment on commodity hardware. We analyze the performance of DeepliteRT on classification and detection models against optimized 32-bit floating-point, 8-bit integer, and 2-bit baselines, achieving significant speedups of up to 2.20x, 2.33x and 2.17x, respectively.
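As a rough illustration of the "compact ultra low-bit representation" mentioned in this abstract, the Python sketch below packs fake-quantized 2-bit weights (stored as full-precision floats) into bytes, four values per byte. The code levels and packing layout here are simplified assumptions for illustration, not DeepliteRT's actual compiler pass or memory format.

```python
# Simplified illustration (assumed layout, not DeepliteRT's): convert
# fake-quantized 2-bit weights held as floats into a packed byte
# representation with four 2-bit codes per byte, and back.
import numpy as np

def pack_2bit(w_fakequant, scale):
    """Map fake-quantized float weights to 2-bit codes and pack 4 per byte."""
    codes = np.round(w_fakequant / scale).astype(np.int8) + 2    # {-2..1} -> {0..3}
    codes = np.clip(codes, 0, 3).astype(np.uint8)
    assert codes.size % 4 == 0, "pad to a multiple of 4 in practice"
    codes = codes.reshape(-1, 4)
    packed = (codes[:, 0]
              | (codes[:, 1] << 2)
              | (codes[:, 2] << 4)
              | (codes[:, 3] << 6)).astype(np.uint8)
    return packed                                                # 4x smaller than int8

def unpack_2bit(packed, scale):
    """Recover the 2-bit codes and dequantize back to floats."""
    codes = np.stack([(packed >> s) & 0b11 for s in (0, 2, 4, 6)], axis=1)
    return (codes.astype(np.int8) - 2).ravel() * scale

scale = 0.05
w = np.array([-0.10, -0.05, 0.0, 0.05, 0.05, -0.10, 0.0, -0.05], dtype=np.float32)
packed = pack_2bit(w, scale)
print(packed.nbytes, "bytes for", w.size, "weights")             # 2 bytes for 8 weights
print(np.allclose(unpack_2bit(packed, scale), w))                # True
```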
DeepGEMM: Accelerated Ultra Low-Precision Inference on CPU Architectures using Lookup Tables
Ganji, Darshan C., Ashfaq, Saad, Saboori, Ehsan, Sah, Sudhakar, Mitra, Saptarshi, AskariHemmat, MohammadHossein, Hoffman, Alexander, Hassanien, Ahmed, Léonardon, Mathieu
Quantization methods such as Learned Step Size Quantization can achieve model accuracy that is comparable to full-precision floating-point baselines even with sub-byte quantization. However, it is extremely challenging to deploy these ultra low-bit quantized models on mainstream CPU devices because commodity SIMD (Single Instruction, Multiple Data) hardware typically supports no less than 8-bit precision. To overcome this limitation, we propose DeepGEMM, a lookup table based approach for the execution of ultra low-precision convolutional neural networks on SIMD hardware. The proposed method precomputes all possible products of weights and activations, stores them in a lookup table, and efficiently accesses them at inference time to avoid costly multiply-accumulate operations. Our 2-bit implementation outperforms corresponding 8-bit integer kernels in the QNNPACK framework by up to 1.74x on x86 platforms.
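To make the lookup table idea concrete, below is a toy Python/NumPy sketch (scalar code, not the paper's vectorized SIMD kernels): with 2-bit weights and 2-bit activations there are only 4 x 4 = 16 possible products, so they can be precomputed once and the inner product reduces to table lookups plus accumulation, with no multiplications. The specific code levels chosen here are illustrative assumptions.

```python
# Toy sketch of the lookup-table idea (not the DeepGEMM kernels themselves):
# precompute every weight-level x activation-level product once, then compute
# dot products by indexing into the table instead of multiplying.
import numpy as np

W_LEVELS = np.array([-2, -1, 0, 1], dtype=np.int32)   # assumed 2-bit signed weights
A_LEVELS = np.array([0, 1, 2, 3], dtype=np.int32)     # assumed 2-bit unsigned activations

# Precompute all 16 possible products: shape (4, 4).
LUT = W_LEVELS[:, None] * A_LEVELS[None, :]

def lut_dot(w_codes, a_codes):
    """Dot product of 2-bit codes using only table lookups and adds."""
    return int(LUT[w_codes, a_codes].sum())

rng = np.random.default_rng(0)
w_codes = rng.integers(0, 4, size=64)                 # indices into W_LEVELS
a_codes = rng.integers(0, 4, size=64)                 # indices into A_LEVELS

ref = int((W_LEVELS[w_codes] * A_LEVELS[a_codes]).sum())
assert lut_dot(w_codes, a_codes) == ref               # matches the multiply-accumulate result
print(lut_dot(w_codes, a_codes))
```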
Deeplite Neutrino: An End-to-End Framework for Constrained Deep Learning Model Optimization
Sankaran, Anush, Mastropietro, Olivier, Saboori, Ehsan, Idris, Yasser, Sawyer, Davis, AskariHemmat, MohammadHossein, Hacene, Ghouthi Boukli
Designing deep learning-based solutions is becoming a race for training deeper models with a greater number of layers. While a larger, deeper model could provide competitive accuracy, it creates a lot of logistical challenges and unreasonable resource requirements during development and deployment. This has been one of the key reasons why deep learning models are not extensively used in various production environments, especially on edge devices. There is an immediate requirement for optimizing and compressing these deep learning models to enable on-device intelligence. In this research, we introduce a black-box framework, Deeplite Neutrino, for production-ready optimization of deep learning models. The framework provides an easy mechanism for end-users to specify constraints, such as a tolerable drop in accuracy or a target size for the optimized model, to guide the whole optimization process. The framework is easy to include in an existing production pipeline and is available as a Python package supporting the PyTorch and TensorFlow libraries. The optimization performance of the framework is shown across multiple benchmark datasets and popular deep learning models. Further, the framework is currently used in production, and results and testimonials from several clients are summarized.
U-Net Fixed-Point Quantization for Medical Image Segmentation
AskariHemmat, MohammadHossein, Honari, Sina, Rouhier, Lucas, Perone, Christian S., Cohen-Adad, Julien, Savaria, Yvon, David, Jean-Pierre
Model quantization is leveraged to reduce the memory consumption and the computation time of deep neural networks. This is achieved by representing weights and activations with a lower bit resolution when compared to their high precision floating point counterparts. The suitable level of quantization is directly related to the model performance. Lowering the quantization precision (e.g., to 2 bits) reduces the amount of memory required to store model parameters and the amount of logic required to implement computational blocks, which contributes to reducing the power consumption of the entire system. These benefits typically come at the cost of reduced accuracy. The main challenge is to quantize a network as much as possible while maintaining the performance accuracy. In this work, we present a quantization method for the U-Net architecture, a popular model in medical image segmentation. We then apply our quantization algorithm to three datasets: (1) the Spinal Cord Gray Matter Segmentation (GM) dataset, (2) the ISBI challenge for segmentation of neuronal structures in Electron Microscopic (EM) images, and (3) the public National Institutes of Health (NIH) dataset for pancreas segmentation in abdominal CT scans. The reported results demonstrate that with only 4 bits for weights and 6 bits for activations, we obtain an 8-fold reduction in memory requirements while losing only 2.21%, 0.57% and 2.09% Dice overlap score for the EM, GM and NIH datasets, respectively. Our fixed-point quantization provides a flexible trade-off between accuracy and memory requirements which is not provided by previous quantization methods for U-Net such as TernaryNet.
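For illustration, the following NumPy sketch shows a generic fixed-point (Q-format) quantizer with separate bit-widths for weights and activations, in the spirit of the 4-bit/6-bit configuration reported above; the specific Q-formats and value ranges chosen here are assumptions, not the paper's exact quantizer or training procedure.

```python
# Minimal sketch of fixed-point quantization with separate bit-widths for
# weights and activations (illustrative Q-formats, not the paper's quantizer).
import numpy as np

def fixed_point_quantize(x, total_bits, frac_bits, signed=True):
    """Quantize to a fixed-point grid with 2**-frac_bits resolution."""
    step = 2.0 ** -frac_bits
    if signed:
        qmin, qmax = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    else:
        qmin, qmax = 0, 2 ** total_bits - 1
    return np.clip(np.round(x / step), qmin, qmax) * step

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.2, size=1000)          # signed 4-bit, assumed Q1.3 format
activations = rng.uniform(0.0, 4.0, size=1000)     # unsigned (post-ReLU) 6-bit, assumed Q2.4

w_q = fixed_point_quantize(weights, total_bits=4, frac_bits=3, signed=True)
a_q = fixed_point_quantize(activations, total_bits=6, frac_bits=4, signed=False)
print("max weight error:", np.abs(w_q - weights).max())
print("max activation error:", np.abs(a_q - activations).max())
```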