
DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks

Fu, Yonggan, Yang, Haichuan, Yuan, Jiayi, Li, Meng, Wan, Cheng, Krishnamoorthi, Raghuraman, Chandra, Vikas, Lin, Yingyan Celine

arXiv.org Artificial Intelligence

Efficient deep neural network (DNN) models equipped with compact operators (e.g., depthwise convolutions) have shown great potential in reducing DNNs' theoretical complexity (e.g., the total number of weights/operations) while maintaining a decent model accuracy. However, existing efficient DNNs are still limited in fulfilling their promise of boosting real-hardware efficiency, due to their commonly adopted compact operators' low hardware utilization. In this work, we open up a new compression paradigm for developing real-hardware efficient DNNs, leading to boosted hardware efficiency while maintaining model accuracy. Interestingly, we observe that while some DNN layers' activation functions help DNNs' training optimization and achievable accuracy, they can be properly removed after training without compromising the model accuracy. Inspired by this observation, we propose a framework dubbed DepthShrinker, which develops hardware-friendly compact networks via shrinking the basic building blocks of existing efficient DNNs that feature irregular computation patterns into dense ones with much improved hardware utilization and thus real-hardware efficiency. Excitingly, our DepthShrinker framework delivers hardware-friendly compact networks that outperform both state-of-the-art efficient DNNs and compression techniques, e.g., 3.06% higher accuracy and 1.53$\times$ higher throughput on a Tesla V100 over the SOTA channel-wise pruning method MetaPruning. Our code is available at: https://github.com/facebookresearch/DepthShrinker.
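The key observation above is that once the activation function between two layers is removed after training, the two now-linear operations compose into a single equivalent one. A minimal NumPy sketch of this merging, using consecutive 1x1 convolutions (viewed per-pixel as channel-mixing matrix multiplies); the shapes and variable names here are illustrative assumptions, not from the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
C_in, C_mid, C_out, H, W = 8, 16, 4, 5, 5

x = rng.standard_normal((C_in, H, W))
W1 = rng.standard_normal((C_mid, C_in)); b1 = rng.standard_normal(C_mid)
W2 = rng.standard_normal((C_out, C_mid)); b2 = rng.standard_normal(C_out)

def conv1x1(w, b, x):
    # A 1x1 convolution is a channel-mixing matmul at every spatial location.
    return np.einsum('oc,chw->ohw', w, x) + b[:, None, None]

# Original two-layer computation, with the intermediate activation removed
# (as the trained network allows, per the observation above).
y_two = conv1x1(W2, b2, conv1x1(W1, b1, x))

# Merged single layer: W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
W_m = W2 @ W1
b_m = W2 @ b1 + b2
y_one = conv1x1(W_m, b_m, x)

print(np.allclose(y_two, y_one))  # → True: the merged conv is equivalent
```

The merged layer does the work of two in one dense operation, which is the source of the improved hardware utilization: one well-shaped matmul replaces two thinner ones.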


Efficient Latency-Aware CNN Depth Compression via Two-Stage Dynamic Programming

Kim, Jinuk, Jeong, Yeonwoo, Lee, Deokjae, Song, Hyun Oh

arXiv.org Artificial Intelligence

Recent works on neural network pruning advocate that reducing the depth of the network is more effective in reducing run-time memory usage and accelerating inference latency than reducing the width of the network through channel pruning. In this regard, some recent works propose depth compression algorithms that merge convolution layers. However, the existing algorithms have a constricted search space and rely on human-engineered heuristics. In this paper, we propose a novel depth compression algorithm which targets general convolution operations. We propose a subset selection problem that replaces inefficient activation layers with identity functions and optimally merges consecutive convolution operations into shallow equivalent convolution operations for efficient end-to-end inference latency. Since the proposed subset selection problem is NP-hard, we formulate a surrogate optimization problem that can be solved exactly via two-stage dynamic programming within a few seconds. We evaluate our methods and baselines with TensorRT for a fair inference latency comparison. Our method outperforms the baseline method with higher accuracy and faster inference speed in MobileNetV2 on the ImageNet dataset. Specifically, we achieve a $1.41\times$ speed-up with a $0.11$\%p accuracy gain in MobileNetV2-1.0 on ImageNet.
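The selection problem described above can be viewed as a segmentation: each activation kept as identity lets the convolutions on either side merge, so the network is partitioned into contiguous merged segments. A toy pure-Python sketch of this framing, not the paper's two-stage algorithm; the latency table `cost` and function name are hypothetical:

```python
def min_latency(cost, n, k):
    """cost[i][j]: assumed latency of running layers i..j (0-indexed,
    inclusive) as one merged convolution; returns the minimum total
    latency of partitioning n layers into at most k merged segments."""
    INF = float('inf')
    # dp[s][j]: min latency covering layers 0..j with exactly s segments
    dp = [[INF] * n for _ in range(k + 1)]
    for j in range(n):
        dp[1][j] = cost[0][j]
    for s in range(2, k + 1):
        for j in range(s - 1, n):
            # Last segment is layers i..j; layers 0..i-1 use s-1 segments.
            dp[s][j] = min(dp[s - 1][i - 1] + cost[i][j]
                           for i in range(s - 1, j + 1))
    return min(dp[s][n - 1] for s in range(1, k + 1))

# Toy example: 4 layers; merging a longer run costs more per segment,
# so the optimum trades off segment count against per-segment latency.
cost = [
    [3, 5, 9, 15],
    [0, 3, 6, 11],
    [0, 0, 3, 6],
    [0, 0, 0, 3],
]
print(min_latency(cost, 4, 2))  # → 11 (merge layers 0-1 and 2-3)
```

This kind of segmentation DP runs in $O(kn^2)$, consistent with the paper's claim that the surrogate problem is solvable exactly within seconds; the paper's actual formulation and cost model are more involved.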


Poor Hardware Utilization Puts Squeeze on AI Compression

#artificialintelligence

One of the most pressing challenges in deploying deep learning at scale, especially for social media giant Meta, is making full use of hardware for inference as well as training. Researchers have been chipping away at this problem via various compression and pruning techniques, including MetaPruning, which in 2019 represented the state of the art in pruning for maximum hardware efficiency. This has been in use at Meta (although, oddly, the techniques were developed by a collection of universities in Asia and are not connected with Facebook/Meta efforts). Despite hardware efficiency gains, there is still plenty of room for improvement, according to researchers from Meta and Rice University. The team is taking a closer look at the hardware efficiency left on the table by more traditional compression techniques for deep learning tasks, all without sacrificing accuracy.