AITopics | Qiu, Yuxian

Collaborating Authors

Qiu, Yuxian

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Efficient Adaptive Activation Rounding for Post-Training Quantization

Li, Zhengyi, Guo, Cong, Zhu, Zhanda, Zhou, Yangjie, Qiu, Yuxian, Gao, Xiaotian, Leng, Jingwen, Guo, Minyi

arXiv.org Artificial IntelligenceAug-23-2023

Post-training quantization attracts increasing attention due to its convenience in deploying quantized neural networks. Although rounding-to-nearest remains the prevailing method for DNN quantization, prior research has demonstrated its suboptimal nature when applied to weight quantization. They propose optimizing weight rounding schemes by leveraging output error rather than the traditional weight quantization error. Our study reveals that similar rounding challenges also extend to activation quantization. Despite the easy generalization, the challenges lie in the dynamic nature of activation. Adaptive rounding is expected for varying activations and the method is subjected to runtime overhead. To tackle this, we propose the AQuant quantization framework with a novel perspective to reduce output error by adjusting rounding schemes of activations. Instead of using the constant rounding border 0.5 of the rounding-to-nearest operation, we make the border become a function w.r.t. the activation value to change the activation rounding by the adaptive border. To deal with the runtime overhead, we use a coarse-grained version of the border function. Finally, we introduce our framework to optimize the border function. Extensive experiments show that AQuant achieves notable improvements compared to state-of-the-art works and pushes the accuracy of ResNet-18 up to 60.31% under the 2-bit weight and activation quantization.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2208.11945

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation

Guo, Cong, Qiu, Yuxian, Leng, Jingwen, Gao, Xiaotian, Zhang, Chen, Liu, Yunxin, Yang, Fan, Zhu, Yuhao, Guo, Minyi

arXiv.org Artificial IntelligenceFeb-13-2022

Quantization of deep neural networks (DNN) has been proven effective for compressing and accelerating DNN models. Data-free quantization (DFQ) is a promising approach without the original datasets under privacy-sensitive and confidential scenarios. However, current DFQ solutions degrade accuracy, need synthetic data to calibrate networks, and are time-consuming and costly. This paper proposes an on-the-fly DFQ framework with sub-second quantization time, called SQuant, which can quantize networks on inference-only devices with low computation and memory requirements. With the theoretical analysis of the second-order information of DNN task loss, we decompose and approximate the Hessian-based optimization objective into three diagonal sub-items, which have different areas corresponding to three dimensions of weight tensor: element-wise, kernel-wise, and output channel-wise. Then, we progressively compose sub-items and propose a novel data-free optimization objective in the discrete domain, minimizing Constrained Absolute Sum of Error (or CASE in short), which surprisingly does not need any dataset and is even not aware of network architecture. We also design an efficient algorithm without back-propagation to further reduce the computation complexity of the objective solver. Finally, without fine-tuning and synthetic datasets, SQuant accelerates the data-free quantization process to a sub-second level with >30% accuracy improvement over the existing data-free post-training quantization works, with the evaluated models under 4-bit quantization. We have open-sourced the SQuant framework at https://github.com/clevercool/SQuant.

artificial intelligence, machine learning, on-the-fly data-free quantization, (3 more...)

arXiv.org Artificial Intelligence

2202.07471

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)

Add feedback

Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity

Guo, Cong, Hsueh, Bo Yang, Leng, Jingwen, Qiu, Yuxian, Guan, Yue, Wang, Zehuan, Jia, Xiaoying, Li, Xipeng, Guo, Minyi, Zhu, Yuhao

arXiv.org Artificial IntelligenceAug-29-2020

Network pruning can reduce the high computation cost of deep neural network (DNN) models. However, to maintain their accuracies, sparse models often carry randomly-distributed weights, leading to irregular computations. Consequently, sparse models cannot achieve meaningful speedup on commodity hardware (e.g., GPU) built for dense matrix computations. As such, prior works usually modify or design completely new sparsity-optimized architectures for exploiting sparsity. We propose an algorithm-software co-designed pruning method that achieves latency speedups on existing dense architectures. Our work builds upon the insight that the matrix multiplication generally breaks the large matrix into multiple smaller tiles for parallel execution. We propose a tiling-friendly "tile-wise" sparsity pattern, which maintains a regular pattern at the tile level for efficient execution but allows for irregular, arbitrary pruning at the global scale to maintain the high accuracy. We implement and evaluate the sparsity pattern on GPU tensor core, achieving a 1.95x speedup over the dense model.

deep learning, neural network, sparsity, (21 more...)

arXiv.org Artificial Intelligence

2008.13006

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Adversarial Defense Through Network Profiling Based Path Extraction

Qiu, Yuxian, Leng, Jingwen, Guo, Cong, Chen, Quan, Li, Chao, Guo, Minyi, Zhu, Yuhao

arXiv.org Machine LearningApr-17-2019

Recently, researchers have started decomposing deep neural network models according to their semantics or functions. Recent work has shown the effectiveness of decomposed functional blocks for defending adversarial attacks, which add small input perturbation to the input image to fool the DNN models. This work proposes a profiling-based method to decompose the DNN models to different functional blocks, which lead to the effective path as a new approach to exploring DNNs' internal organization. Specifically, the per-image effective path can be aggregated to the class-level effective path, through which we observe that adversarial images activate effective path different from normal images. We propose an effective path similarity-based method to detect adversarial images with an interpretable model, which achieve better accuracy and broader applicability than the state-of-the-art technique.

deep learning, effective path, neural network, (21 more...)

arXiv.org Machine Learning

1904.08089

Country:

North America > United States (0.93)
North America > Canada > Quebec (0.14)

Genre: Research Report (0.83)

Industry:

Information Technology > Security & Privacy (0.50)
Government > Military (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback