AITopics | Jeong, Doo Seok

Collaborating Authors

Jeong, Doo Seok

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

IterNorm: Fast Iterative Normalization

Ye, ChangMin, Sim, Yonguk, Kim, Youngchae, Jin, SeongMin, Jeong, Doo Seok

arXiv.org Artificial IntelligenceDec-6-2024

Transformer-based large language models are a memory-bound model whose operation is based on a large amount of data that are marginally reused. Thus, the data movement between a host and accelerator likely dictates the total wall-clock time. Layer normalization is one of the key workloads in the transformer model, following each of multi-head attention and feed-forward network blocks. To reduce data movement, layer normalization needs to be performed on the same chip as the matrix-matrix multiplication engine. To this end, we introduce an iterative L2-normalization method for 1D input (IterNorm), ensuring fast convergence to the steady-state solution within five iteration steps and high precision, outperforming the fast inverse square root algorithm in six out of nine cases for FP32 and five out of nine for BFloat16 across the embedding lengths used in the OPT models. Implemented in 32/28nm CMOS, the IterNorm macro normalizes $d$-dimensional vectors, where $64 \leq d \leq 1024$, with a latency of 112-227 cycles at 100MHz/1.05V.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2412.04778

Genre: Research Report (0.64)

Industry: Semiconductors & Electronics (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Training Spiking Neural Networks Using Lessons From Deep Learning

Eshraghian, Jason K., Ward, Max, Neftci, Emre, Wang, Xinxin, Lenz, Gregor, Dwivedi, Girish, Bennamoun, Mohammed, Jeong, Doo Seok, Lu, Wei D.

arXiv.org Artificial IntelligenceAug-13-2023

The brain is the perfect place to look for inspiration to develop more efficient neural networks. The inner workings of our synapses and neurons provide a glimpse at what the future of deep learning might look like. This paper serves as a tutorial and perspective showing how to apply the lessons learnt from several decades of research in deep learning, gradient descent, backpropagation and neuroscience to biologically plausible spiking neural neural networks. We also explore the delicate interplay between encoding data as spikes and the learning process; the challenges and solutions of applying gradient-based learning to spiking neural networks (SNNs); the subtle link between temporal backpropagation and spike timing dependent plasticity, and how deep learning might move towards biologically plausible online learning. Some ideas are well accepted and commonly used amongst the neuromorphic engineering community, while others are presented or justified for the first time here. The fields of deep learning and spiking neural networks evolve very rapidly. We endeavour to treat this document as a 'dynamic' manuscript that will continue to be updated as the common practices in training SNNs also change. A series of companion interactive tutorials complementary to this paper using our Python package, snnTorch, are also made available. See https://snntorch.readthedocs.io/en/latest/tutorials/index.html .

artificial intelligence, machine learning, survey article, (17 more...)

arXiv.org Artificial Intelligence

2109.12894

Country: North America > United States (0.14)

Genre:

Instructional Material > Course Syllabus & Notes (0.85)
Research Report (0.81)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Education > Educational Setting > Online (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CBP: Backpropagation with constraint on weight precision using a pseudo-Lagrange multiplier method

Kim, Guhyun, Jeong, Doo Seok

arXiv.org Artificial IntelligenceOct-25-2021

Backward propagation of errors (backpropagation) is a method to minimize objective functions (e.g., loss functions) of deep neural networks by identifying optimal sets of weights and biases. Imposing constraints on weight precision is often required to alleviate prohibitive workloads on hardware. Despite the remarkable success of backpropagation, the algorithm itself is not capable of considering such constraints unless additional algorithms are applied simultaneously. To address this issue, we propose the constrained backpropagation (CBP) algorithm based on a pseudo-Lagrange multiplier method to obtain the optimal set of weights that satisfy a given set of constraints. The defining characteristic of the proposed CBP algorithm is the utilization of a Lagrangian function (loss function plus constraint function) as its objective function. We considered various types of constraints--binary, ternary, one-bit shift, and two-bit shift weight constraints. As a post-training method, CBP applied to AlexNet, ResNet-18, ResNet-50, and GoogLeNet on ImageNet, which were pre-trained using the conventional backpropagation. For all cases, the proposed algorithm outperforms the state-of-the-art methods on ImageNet, e.g., 66.6%, 74.4%, and 64.0% top-1 accuracy for ResNet-18, ResNet-50, and GoogLeNet with binary weights, respectively. This highlights CBP as a learning algorithm to address diverse constraints with the minimal performance loss by employing appropriate constraint functions.

machine learning, teaching medhods, teaching method, (20 more...)

arXiv.org Artificial Intelligence

2110.0255

Country: Oceania > Australia (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (1.00)

Add feedback