Huang, Guyue
MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
Zhao, Tianchen, Ning, Xuefei, Fang, Tongcheng, Liu, Enshu, Huang, Guyue, Lin, Zinan, Yan, Shengen, Dai, Guohao, Wang, Yu
Diffusion models have achieved impressive visual generation quality. However, their substantial computational and memory costs pose challenges for deployment on resource-constrained mobile devices, or even on desktop GPUs. Recent few-step diffusion models reduce inference time by reducing the number of denoising steps, but their memory consumption remains excessive. Post-Training Quantization (PTQ), which replaces high-bit-width floating-point representations with low-bit integer values (INT4/8), is an effective and efficient technique for reducing memory cost. However, when applied to few-step diffusion models, existing quantization methods struggle to preserve both image quality and text alignment. To address this issue, we propose a mixed-precision quantization framework, MixDQ. First, we design a specialized BOS-aware quantization method for the highly sensitive text embedding. Then, we conduct a metric-decoupled sensitivity analysis to measure the sensitivity of each layer. Finally, we develop an integer-programming-based method for bit-width allocation. Whereas existing quantization methods fall short even at W8A8, MixDQ achieves W8A8 without performance loss, and W4A8 with negligible visual degradation. Compared with FP16, we achieve a 3-4x reduction in model size and memory cost, and a 1.45x latency speedup.
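As a concrete illustration of the PTQ idea in this abstract, below is a minimal NumPy sketch of symmetric per-tensor uniform quantization and dequantization. It is not the MixDQ implementation: the function names and the per-tensor scaling choice are our own assumptions, and real INT4 storage would pack two values per byte.

```python
import numpy as np

def quantize_symmetric(x: np.ndarray, n_bits: int):
    """Symmetric uniform quantization: map FP values to signed integers.

    Returns the integer tensor and the scale needed to dequantize.
    (Illustrative sketch; not the MixDQ implementation.)
    """
    qmax = 2 ** (n_bits - 1) - 1             # e.g. 127 for INT8, 7 for INT4
    scale = np.abs(x).max() / qmax           # per-tensor scale factor
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an FP approximation of the original tensor."""
    return q.astype(np.float32) * scale

# Example: a weight tensor quantized to INT8 (2x smaller than FP16)
# with small reconstruction error.
w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_symmetric(w, n_bits=8)
err = np.abs(w - dequantize(q, s)).mean()
print(f"mean abs error at 8 bits: {err:.5f}")
```

A mixed-precision scheme like the one described above would then choose n_bits per layer (e.g., via integer programming over measured per-layer sensitivities) rather than applying a single global bit-width.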
TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs
Wang, Yuke, Feng, Boyuan, Wang, Zheng, Huang, Guyue, Ding, Yufei
Recently, graph neural networks (GNNs), the backbone of graph-based machine learning, have demonstrated great success in various domains (e.g., e-commerce). However, GNN performance is often unsatisfactory due to highly sparse and irregular graph-based operations. To this end, we propose TC-GNN, the first GNN acceleration framework based on GPU Tensor Core Units (TCUs). The core idea is to reconcile the "sparse" GNN computation with the high-performance "dense" TCUs. Specifically, we conduct an in-depth analysis of the sparse operations in mainstream GNN computing frameworks, and introduce a novel sparse graph translation technique to make the sparse GNN workload amenable to TCU processing. We implement an effective CUDA-core and TCU collaboration design to fully utilize GPU resources, and integrate TC-GNN with the PyTorch framework for high programmability. Rigorous experiments show an average 1.70x speedup over the state-of-the-art DGL framework across various models and datasets.
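To make the "sparse graph translation" idea more tangible, here is a minimal NumPy sketch of the condensing step it rests on: all-zero columns of each sparse adjacency row tile are removed so the remainder forms a small dense block suitable for dense GEMM, the operation TCUs accelerate. The function names and tile size are illustrative assumptions; the actual TC-GNN kernels run on GPU TCUs in CUDA.

```python
import numpy as np

def condense_tile(adj_tile: np.ndarray):
    """Illustrative 'sparse graph translation': drop all-zero columns of a
    sparse adjacency tile so the surviving columns form a dense block of
    the kind a Tensor Core GEMM consumes.

    Returns the condensed dense tile and the original column indices.
    """
    nz_cols = np.flatnonzero(adj_tile.any(axis=0))  # columns with a neighbor
    return adj_tile[:, nz_cols], nz_cols

def spmm_via_condensed_tiles(adj: np.ndarray, feats: np.ndarray,
                             tile_rows: int = 16) -> np.ndarray:
    """Sparse-dense matmul (adj @ feats) computed tile by tile: each row
    tile of `adj` is condensed to a dense block, then multiplied densely."""
    out = np.zeros((adj.shape[0], feats.shape[1]), dtype=feats.dtype)
    for r0 in range(0, adj.shape[0], tile_rows):
        dense, cols = condense_tile(adj[r0:r0 + tile_rows])
        if cols.size:                       # dense GEMM on the condensed tile
            out[r0:r0 + tile_rows] = dense @ feats[cols]
    return out

# Tiny correctness check against the direct product.
adj = (np.random.rand(64, 64) < 0.05).astype(np.float32)   # sparse graph
feats = np.random.randn(64, 8).astype(np.float32)
assert np.allclose(spmm_via_condensed_tiles(adj, feats), adj @ feats, atol=1e-5)
```

Dropping empty columns never changes the product, since those columns contribute nothing; it only shrinks each tile so that dense hardware is not wasted multiplying zeros.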
Machine Learning for Electronic Design Automation: A Survey
Huang, Guyue, Hu, Jingbo, He, Yifan, Liu, Jialong, Ma, Mingyuan, Shen, Zhaoyang, Wu, Juejian, Xu, Yuanfan, Zhang, Hengrui, Zhong, Kai, Ning, Xuefei, Ma, Yuzhe, Yang, Haoyu, Yu, Bei, Yang, Huazhong, Wang, Yu
In recent years, with the development of semiconductor technology, the scale of integrated circuits (ICs) has grown exponentially, challenging the scalability and reliability of the circuit design flow. Electronic design automation (EDA) algorithms and software are therefore required to be more effective and efficient in dealing with extremely large search spaces at low latency. Machine learning (ML), which has been widely adopted in many scenarios, is playing an increasingly important role in our lives. ML methods, including traditional and deep learning algorithms, achieve remarkable performance in solving classification, detection, and design space exploration problems. Additionally, ML methods show great potential to generate high-quality solutions for many NP-complete (NPC) problems, which are common in the EDA field, whereas traditional methods incur huge time and resource costs to solve them. Traditional methods usually solve every problem from scratch, with no accumulation of knowledge; ML algorithms instead focus on extracting high-level features or patterns that can be reused in related or similar situations, avoiding repeated, complicated analysis. Applying machine learning methods is therefore a promising direction for accelerating the solving of EDA problems.