Lee, Jinwon
APOLLO: SGD-like Memory, AdamW-level Performance
Zhu, Hanqing, Zhang, Zhenyu, Cong, Wenyan, Liu, Xi, Park, Sem, Chandra, Vikas, Long, Bo, Pan, David Z., Wang, Zhangyang, Lee, Jinwon
Large language models (LLMs) are notoriously memory-intensive during training, particularly with the popular AdamW optimizer. This memory burden necessitates using more or higher-end GPUs or reducing batch sizes, limiting training scalability and throughput. To address this, various memory-efficient optimizers have been proposed to reduce optimizer memory usage. However, they face critical challenges: (i) reliance on costly SVD operations; (ii) significant performance trade-offs compared to AdamW; and (iii) still substantial optimizer memory overhead to maintain competitive performance. In this work, we identify that AdamW's learning rate adaptation rule can be effectively coarsened as a structured learning rate update. Based on this insight, we propose Approximated Gradient Scaling for Memory-Efficient LLM Optimization (APOLLO), which approximates learning rate scaling using an auxiliary low-rank optimizer state based on pure random projection. This structured learning rate update rule makes APOLLO highly tolerant to further memory reductions while delivering comparable pre-training performance. Even its rank-1 variant, APOLLO-Mini, achieves pre-training performance superior to AdamW while incurring only SGD-level memory costs. Extensive experiments demonstrate that the APOLLO series performs on par with or better than AdamW, while achieving greater memory savings by nearly eliminating the optimization states of AdamW. These savings provide significant system-level benefits: (1) Enhanced Throughput: 3x throughput on an 8xA100-80GB setup compared to AdamW by supporting 4x larger batch sizes. (2) Improved Model Scalability: Pre-training LLaMA-13B with naive DDP on A100-80GB GPUs without system-level optimizations. (3) Low-End GPU Friendly Pre-training: Pre-training LLaMA-7B on a single GPU using less than 12 GB of memory with weight quantization.
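To make the structured learning rate update concrete, below is a minimal sketch of how an APOLLO-style channel-wise gradient scaling factor could be derived from an auxiliary low-rank state built with a fixed random projection. The tensor shapes, the per-row scaling rule, and all names here are illustrative assumptions for exposition, not the authors' implementation.

```python
import torch

def apollo_style_scaled_grad(grad, proj, m, v, step,
                             beta1=0.9, beta2=0.999, eps=1e-8):
    """Illustrative sketch: structured (channel-wise) gradient scaling
    estimated from a low-rank optimizer state.

    grad: full gradient of a weight matrix, shape (out_dim, in_dim)
    proj: fixed random projection, shape (in_dim, rank), rank << in_dim
    m, v: AdamW-style moments kept only in the low-rank space, (out_dim, rank)
    """
    # Project the gradient into the small auxiliary space.
    g_lr = grad @ proj                                   # (out_dim, rank)

    # AdamW-style moment updates, but only on the tiny low-rank state.
    m.mul_(beta1).add_(g_lr, alpha=1 - beta1)
    v.mul_(beta2).addcmul_(g_lr, g_lr, value=1 - beta2)
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)
    update_lr = m_hat / (v_hat.sqrt() + eps)

    # Per-channel scaling factor: how strongly Adam would rescale each row,
    # measured in the low-rank space and transferred to the raw gradient.
    scale = update_lr.norm(dim=1, keepdim=True) / (g_lr.norm(dim=1, keepdim=True) + eps)
    return grad * scale
```

In this sketch the only persistent optimizer state is `m` and `v` of shape (out_dim, rank); a rank of 1 would correspond roughly to the memory footprint of the APOLLO-Mini variant, and the random projection can be regenerated from a seed rather than stored.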
Generative Model-based Simulation of Driver Behavior when Using Control Input Interface for Teleoperated Driving in Unstructured Canyon Terrains
Yun, Hyeonggeun, Cho, Younggeol, Lee, Jinwon, Ha, Arim, Yun, Jihyeok
Unmanned ground vehicles (UGVs) in unstructured environments mostly operate through teleoperation. To enable stable teleoperated driving in unstructured environments, prior research has suggested driver assistance and evaluation methods that rely on user studies, which are costly and demand considerable time and effort. Simulation-model-based approaches have been proposed to complement user studies; however, existing models of teleoperated driving do not account for unstructured environments. We propose simulation models of teleoperated driving that reproduce driver behavior using a deep generative model. We first build a teleoperated driving simulator that imitates unstructured environments based on previous research and collect driving data from drivers. We then design and implement the simulation models based on a conditional variational autoencoder (CVAE). Our evaluation results demonstrate that the proposed teleoperated driving model can generate data that appropriately simulates driver behavior in unstructured canyon terrains.
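As a rough illustration of the CVAE-based driver model, the sketch below conditions a small variational autoencoder on environment/vehicle features and reconstructs (or samples) driver control inputs. The dimensions, the choice of conditioning signal, and the layer layout are assumptions made for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class DrivingCVAE(nn.Module):
    """Illustrative conditional VAE over driver control inputs."""
    def __init__(self, ctrl_dim=2, cond_dim=16, latent_dim=8, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(ctrl_dim + cond_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent_dim)
        self.logvar = nn.Linear(hidden, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, ctrl_dim))

    def forward(self, ctrl, cond):
        h = self.encoder(torch.cat([ctrl, cond], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        recon = self.decoder(torch.cat([z, cond], dim=-1))
        return recon, mu, logvar

def cvae_loss(recon, ctrl, mu, logvar):
    # Reconstruction error plus KL divergence to a standard normal prior.
    rec = nn.functional.mse_loss(recon, ctrl)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld
```

At simulation time, new driver control inputs would be generated by sampling `z` from the prior and decoding it together with the current environment condition.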
GoonDAE: Denoising-Based Driver Assistance for Off-Road Teleoperation
Cho, Younggeol, Yun, Hyeonggeun, Lee, Jinwon, Ha, Arim, Yun, Jihyeok
Because of the limitations of autonomous driving technologies, teleoperation is widely used in dangerous environments such as military operations. However, teleoperated driving performance depends considerably on the driver's skill level, and unskilled drivers need extensive training time for teleoperation in unusual and harsh environments. To address this problem, we propose a novel denoising-based driver assistance method, namely GoonDAE, for real-time teleoperated off-road driving. We assume that an unskilled driver's control input is a noisy version of a skilled driver's control input. We designed a skip-connected long short-term memory (LSTM)-based denoising autoencoder (DAE) model that assists the unskilled driver by denoising their control input. The proposed GoonDAE was trained with skilled driver control inputs and sensor data collected from our simulated off-road driving environment. To evaluate GoonDAE, we conducted an experiment with unskilled drivers in the simulated environment. The results revealed that the proposed system considerably enhanced driving performance in terms of driving stability.
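A minimal sketch of the skip-connected LSTM denoising autoencoder idea follows: the unskilled driver's control sequence and sensor data are encoded and decoded by LSTMs, with a skip connection letting the raw control input bypass the bottleneck, and the model would be trained with an MSE loss against the skilled driver's control input. Layer sizes and the placement of the skip connection are assumptions, not the published GoonDAE design.

```python
import torch
import torch.nn as nn

class SkipLSTMDAE(nn.Module):
    """Illustrative skip-connected LSTM denoising autoencoder for control input."""
    def __init__(self, ctrl_dim=2, sensor_dim=10, hidden=64):
        super().__init__()
        self.encoder = nn.LSTM(ctrl_dim + sensor_dim, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        # Skip connection: the noisy control input is concatenated at the output.
        self.out = nn.Linear(hidden + ctrl_dim, ctrl_dim)

    def forward(self, noisy_ctrl, sensors):
        x = torch.cat([noisy_ctrl, sensors], dim=-1)   # (B, T, ctrl+sensor)
        enc, _ = self.encoder(x)
        dec, _ = self.decoder(enc)
        return self.out(torch.cat([dec, noisy_ctrl], dim=-1))

# Training target (assumed): the skilled driver's control sequence.
# loss = nn.functional.mse_loss(model(noisy_ctrl, sensors), skilled_ctrl)
```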
LSQ+: Improving low-bit quantization through learnable offsets and better initialization
Bhalgat, Yash, Lee, Jinwon, Nagel, Markus, Blankevoort, Tijmen, Kwak, Nojun
Unlike ReLU, newer activation functions (such as Swish, H-swish, and Mish) that are frequently employed in popular efficient architectures can also produce negative activation values, with skewed positive and negative ranges. Typical learnable quantization schemes [PACT, LSQ] assume unsigned quantization for activations and quantize all negative activations to zero, which leads to a significant loss in performance. Naively using signed quantization to accommodate these negative values requires an extra sign bit, which is expensive for low-bit (2-, 3-, 4-bit) quantization. To solve this problem, we propose LSQ+, a natural extension of LSQ, in which we introduce a general asymmetric quantization scheme with trainable scale and offset parameters that can learn to accommodate the negative activations. Gradient-based learnable quantization schemes also commonly suffer from high instability or variance in the final training performance, hence requiring a great deal of hyper-parameter tuning to reach satisfactory performance. LSQ+ alleviates this problem by using an MSE-based initialization scheme for the quantization parameters. We show that this initialization leads to significantly lower variance in final performance across multiple training runs. Overall, LSQ+ shows state-of-the-art results for EfficientNet and MixNet and also significantly outperforms LSQ for low-bit quantization of neural nets with Swish activations (e.g., a 1.8% gain with W4A4 quantization and up to a 5.6% gain with W2A2 quantization of EfficientNet-B0 on the ImageNet dataset). To the best of our knowledge, ours is the first work to quantize such architectures to extremely low bit-widths.
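The core idea of asymmetric quantization with a learnable scale and offset, initialized by minimizing quantization error, can be sketched as follows. The straight-through gradient handling used to train the scale and offset is omitted here, and the simple grid search for the MSE-based initialization is an illustrative assumption rather than the paper's exact procedure.

```python
import torch

def asymmetric_quantize(x, scale, offset, n_bits=4):
    """Illustrative LSQ+-style asymmetric (unsigned-grid) fake quantization.

    The learnable offset shifts negative activations into the representable
    range; scale and offset would be trained with straight-through gradients.
    """
    qmin, qmax = 0, 2 ** n_bits - 1
    q = torch.clamp(torch.round((x - offset) / scale), qmin, qmax)
    return q * scale + offset  # dequantized value

def mse_init(x, n_bits=4, n_steps=100):
    """Assumed MSE-based initialization: search over candidate ranges and keep
    the (scale, offset) pair with the lowest quantization error."""
    best, best_err = None, float("inf")
    lo, hi = x.min(), x.max()
    for frac in torch.linspace(0.5, 1.0, n_steps):
        scale = (hi - lo) * frac / (2 ** n_bits - 1)
        offset = lo * frac
        err = ((asymmetric_quantize(x, scale, offset, n_bits) - x) ** 2).mean()
        if err < best_err:
            best, best_err = (scale.clone(), offset.clone()), err
    return best
```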