Lin, Sihao
Efficient Training of Large Vision Models via Advanced Automated Progressive Learning
Li, Changlin, Zhang, Jiawei, Lin, Sihao, Yang, Zongxin, Liang, Junwei, Liang, Xiaodan, Chang, Xiaojun
The rapid advancements in Large Vision Models (LVMs), such as Vision Transformers (ViTs) and diffusion models, have led to an increasing demand for computational resources, resulting in substantial financial and environmental costs. This growing challenge highlights the necessity of developing efficient training methods for LVMs. Progressive learning, a training strategy in which model capacity gradually increases during training, has shown potential in addressing these challenges. In this paper, we present an advanced automated progressive learning (AutoProg) framework for efficient training of LVMs. We begin by focusing on the pre-training of LVMs, using ViTs as a case study, and propose AutoProg-One, an AutoProg scheme featuring momentum growth (MoGrow) and a one-shot growth schedule search. Beyond pre-training, we extend our approach to transfer learning and fine-tuning of LVMs, expanding the scope of AutoProg to cover a wider range of LVMs, including diffusion models. First, we introduce AutoProg-Zero, which enhances the AutoProg framework with a novel zero-shot unfreezing schedule search, eliminating the need for one-shot supernet training. Second, we propose a Unique Stage Identifier (SID) scheme to bridge the gap during network growth. These innovations, integrated with the core principles of AutoProg, offer a comprehensive solution for efficient training across various LVM scenarios. Extensive experiments show that AutoProg accelerates ViT pre-training by up to 1.85x on ImageNet and accelerates fine-tuning of diffusion models by up to 2.86x, with comparable or even higher performance. This work provides a robust and scalable approach to efficient training of LVMs, with potential applications in a wide range of vision tasks. Code: https://github.com/changlin31/AutoProg-Zero
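A minimal sketch of the progressive-learning idea behind AutoProg, in PyTorch: the network starts shallow and grows on a schedule, so early epochs run on a cheaper sub-model. The TinyViT module, the schedule values, and the copy-the-last-block initialization are illustrative assumptions; AutoProg searches the growth schedule automatically, and MoGrow uses a momentum-based parameter mapping rather than a plain copy.

```python
import copy
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    """Toy ViT-style encoder whose depth can grow during training."""
    def __init__(self, dim=192, depth=4, heads=3):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
            for _ in range(depth)
        )

    def grow(self, new_depth):
        # New blocks are initialized as copies of the last existing block --
        # a simple stand-in for the paper's momentum growth (MoGrow).
        while len(self.blocks) < new_depth:
            self.blocks.append(copy.deepcopy(self.blocks[-1]))

    def forward(self, x):
        for blk in self.blocks:
            x = blk(x)
        return x

# Hypothetical growth schedule (epoch -> depth); AutoProg searches this.
schedule = {0: 4, 30: 8, 60: 12}
model = TinyViT()
for epoch in range(90):
    if epoch in schedule:
        model.grow(schedule[epoch])
        # Rebuild the optimizer so newly grown parameters are trained.
        optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    # ... run one training epoch on the current (cheaper) sub-model ...
```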
Self-Supervised Multi-Frame Neural Scene Flow
Liu, Dongrui, Liu, Daqi, Li, Xueqian, Lin, Sihao, Xie, Hongwei, Wang, Bing, Chang, Xiaojun, Chu, Lei
Neural Scene Flow Prior (NSFP) and Fast Neural Scene Flow (FNSF) have shown remarkable adaptability in large-scale, out-of-distribution autonomous driving scenarios. Despite their success, the underlying reasons for their astonishing generalization capabilities remain unclear. Our research addresses this gap by examining the generalization capabilities of NSFP through the lens of uniform stability, revealing that its generalization error is inversely proportional to the number of input point clouds. This finding explains NSFP's effectiveness on large-scale point cloud scene flow estimation tasks. Motivated by this theoretical insight, we further explore improving scene flow estimation by leveraging historical point clouds across multiple frames, which inherently increases the number of point clouds. Consequently, we propose a simple and effective method for multi-frame point cloud scene flow estimation, along with a theoretical evaluation of its generalization abilities. Our analysis confirms that the proposed method maintains a bounded generalization error, suggesting that adding multiple frames to the scene flow optimization process does not detract from its generalizability. Extensive experimental results on the large-scale Waymo Open and Argoverse LiDAR datasets demonstrate that the proposed method achieves state-of-the-art performance.
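A minimal sketch of the multi-frame idea in PyTorch: an NSFP-style coordinate MLP is optimized at runtime to warp the current frame onto its neighbors, so a historical frame simply adds a second Chamfer term (and more points) to the objective. The two-network setup, loss weighting, and network sizes are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

def chamfer(a, b):
    """One-directional Chamfer distance from point set a (N,3) to b (M,3)."""
    d = torch.cdist(a, b)            # (N, M) pairwise distances
    return d.min(dim=1).values.mean()

class FlowMLP(nn.Module):
    """NSFP-style coordinate network: 3D point -> 3D flow vector."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, x):
        return self.net(x)

def optimize_flow(pc_prev, pc_t, pc_next, iters=500, lr=1e-3):
    """Runtime optimization over frames t-1, t, t+1 (each an (N,3) tensor)."""
    f_fwd = FlowMLP()                # flow t -> t+1
    f_bwd = FlowMLP()                # flow t -> t-1: the extra historical frame
    opt = torch.optim.Adam(
        list(f_fwd.parameters()) + list(f_bwd.parameters()), lr=lr)
    for _ in range(iters):
        loss = chamfer(pc_t + f_fwd(pc_t), pc_next) \
             + chamfer(pc_t + f_bwd(pc_t), pc_prev)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return f_fwd(pc_t).detach()      # estimated forward scene flow
```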
Exploring Inter-Channel Correlation for Diversity-preserved Knowledge Distillation
Liu, Li, Huang, Qingle, Lin, Sihao, Xie, Hongwei, Wang, Bing, Chang, Xiaojun, Liang, Xiaodan
Knowledge Distillation has shown very promising ability in transferring learned representations from the larger model (teacher) to the smaller one (student). Despite many efforts, prior methods ignore the important role of retaining the inter-channel correlation of features, and thus fail to capture the intrinsic distribution of the teacher's feature space and the sufficient diversity of its features. To solve this issue, we propose the novel Inter-Channel Correlation for Knowledge Distillation (ICKD), with which the diversity and homology of the student network's feature space can align with those of the teacher network. The correlation between two channels is interpreted as diversity if they are irrelevant to each other, and as homology otherwise. The student is then required to mimic this correlation within its own embedding space. In addition, we introduce a grid-level inter-channel correlation, making the method capable of dense prediction tasks.
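A minimal sketch of inter-channel correlation matching in PyTorch: each feature map is flattened per channel, channel-wise inner products form a C x C correlation matrix, and the student is trained to match the teacher's matrix. The per-channel L2 normalization and the MSE objective are illustrative assumptions; the paper's grid-level variant would compute the same matrix within local spatial grids.

```python
import torch
import torch.nn.functional as F

def channel_correlation(feat):
    """Inter-channel correlation matrix.
    feat: (B, C, H, W) -> (B, C, C) of inner products between channel maps."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    f = F.normalize(f, dim=2)        # assumption: L2-normalize each channel map
    return f @ f.transpose(1, 2)

def ickd_style_loss(feat_student, feat_teacher):
    """Student mimics the teacher's inter-channel correlation.
    Assumes matching channel counts (otherwise add a 1x1 conv projector)."""
    return F.mse_loss(channel_correlation(feat_student),
                      channel_correlation(feat_teacher))

# Usage (hypothetical weighting): loss = task_loss + 2.5 * ickd_style_loss(f_s, f_t)
```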