Kang, Mingu
DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding
Cho, Jungbin, Kim, Junwan, Kim, Jisoo, Kim, Minseo, Kang, Mingu, Hong, Sungeun, Oh, Tae-Hyun, Yu, Youngjae
Human motion, inherently continuous and dynamic, presents significant challenges for generative models. Despite their dominance, discrete quantization methods, such as VQ-VAEs, suffer from inherent limitations, including restricted expressiveness and frame-wise noise artifacts. Continuous approaches, while producing smoother and more natural motions, often falter due to high-dimensional complexity and limited training data. To resolve this "discord" between discrete and continuous representations, we introduce DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding, a novel method that decodes discrete motion tokens into continuous motion through rectified flow. By employing an iterative refinement process in the continuous space, DisCoRD captures fine-grained dynamics and ensures smoother and more natural motions. Compatible with any discrete-based framework, our method enhances naturalness without compromising faithfulness to the conditioning signals. Extensive evaluations demonstrate that DisCoRD achieves state-of-the-art performance, with an FID of 0.032 on HumanML3D and 0.169 on KIT-ML. These results solidify DisCoRD as a robust solution for bridging the divide between discrete efficiency and continuous realism. Our project page is available at: https://whwjdqls.github.io/discord.github.io/.
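The abstract does not spell out the decoding loop, so the following is a minimal, hypothetical sketch of what rectified-flow decoding of discrete motion tokens could look like: a learned velocity network (here called `velocity_net`, an assumed interface) conditioned on token embeddings is integrated with a simple Euler solver from noise to a continuous motion sequence. Names, shapes, and the solver choice are illustrative, not taken from the paper.

```python
import torch

@torch.no_grad()
def decode_tokens_with_rectified_flow(velocity_net, token_embeddings, num_steps=16):
    """Hypothetical sketch: integrate a rectified-flow ODE from Gaussian noise
    to a continuous motion sequence, conditioned on discrete-token embeddings.

    token_embeddings: (batch, num_frames, embed_dim) embeddings of VQ tokens.
    velocity_net(x_t, t, cond) -> predicted velocity with the same shape as x_t.
    """
    batch, num_frames = token_embeddings.shape[0], token_embeddings.shape[1]
    motion_dim = 263                                    # e.g. HumanML3D feature dimension
    x = torch.randn(batch, num_frames, motion_dim)      # start from noise at t = 0
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = torch.full((batch,), step * dt)             # current time in [0, 1)
        v = velocity_net(x, t, token_embeddings)        # predicted straight-line velocity
        x = x + dt * v                                  # Euler step toward the data endpoint
    return x                                            # continuous motion at t = 1
```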
SoftLMs: Efficient Adaptive Low-Rank Approximation of Language Models using Soft-Thresholding Mechanism
Bhatnagar, Priyansh, Wen, Linfeng, Kang, Mingu
Extensive efforts have been made to boost the performance of language models by introducing various attention-based transformers. However, the inclusion of linear layers with large dimensions contributes to significant computational and memory overheads. The escalating computational demands of these models necessitate the development of various compression techniques to ensure their deployment on devices, particularly in resource-constrained environments. In this paper, we propose a novel compression methodology that dynamically determines the rank of each layer using a soft-thresholding mechanism, which clips small-magnitude singular values in a differentiable form. This approach automates the decision-making process to identify the optimal degree of compression for each layer. We have successfully applied the proposed technique to attention-based architectures, including BERT for discriminative tasks and GPT2 and TinyLlama for generative tasks. Additionally, we have validated our method on Mamba, a recently proposed state-space model. Our experiments demonstrate that the proposed technique achieves a speed-up of 1.33X to 1.72X in the encoder/decoder with a 50% reduction in total parameters.
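As a rough illustration of the core idea (not the authors' code), a differentiable soft-thresholding of singular values for a single linear layer might look like the sketch below; the learnable threshold `tau`, its initialization, and the SVD-based reconstruction are assumptions made for the example.

```python
import torch
import torch.nn as nn

class SoftThresholdLowRankLinear(nn.Module):
    """Illustrative sketch: factor a weight matrix by SVD and softly shrink its
    singular values with a learnable threshold, so that the effective rank of
    the layer is selected in a differentiable way during training."""

    def __init__(self, weight: torch.Tensor):
        super().__init__()
        # weight has shape (out_features, in_features), as in nn.Linear.
        u, s, vh = torch.linalg.svd(weight, full_matrices=False)
        self.u = nn.Parameter(u)
        self.s = nn.Parameter(s)
        self.vh = nn.Parameter(vh)
        self.tau = nn.Parameter(torch.tensor(0.01))  # learnable threshold (assumed init)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Soft-threshold: singular values below tau shrink to zero, pruning rank.
        s_shrunk = torch.relu(self.s - torch.abs(self.tau))
        weight = self.u @ torch.diag(s_shrunk) @ self.vh
        return x @ weight.T
```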
MaD-Scientist: AI-based Scientist solving Convection-Diffusion-Reaction Equations Using Massive PINN-Based Prior Data
Kang, Mingu, Lee, Dongseok, Cho, Woojin, Park, Jaehyeon, Lee, Kookjin, Gruber, Anthony, Hong, Youngjoon, Park, Noseong
Large language models (LLMs), like ChatGPT, have shown that even when trained with noisy prior data, they can generalize effectively to new tasks through in-context learning (ICL) and pre-training techniques. Motivated by this, we explore whether a similar approach can be applied to scientific foundation models (SFMs). Our methodology is structured as follows: (i) we collect low-cost physics-informed neural network (PINN)-based approximated prior data in the form of solutions to partial differential equations (PDEs) constructed through an arbitrary linear combination of mathematical dictionaries; (ii) we utilize Transformer architectures with self- and cross-attention mechanisms to predict PDE solutions without knowledge of the governing equations in a zero-shot setting; (iii) we provide experimental evidence on the one-dimensional convection-diffusion-reaction equation, which demonstrates that pre-training remains robust even with approximated prior data, with only marginal impacts on test accuracy. Notably, this finding opens the path to pre-training SFMs with realistic, low-cost data instead of (or in conjunction with) numerical high-cost data. These results support the conjecture that SFMs can improve in a manner similar to LLMs, where fully cleaning the vast set of sentences crawled from the Internet is nearly impossible.
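Step (i) above could, under stated assumptions, be sketched as follows: random linear combinations of a small mathematical dictionary are sampled as candidate one-dimensional fields, which a PINN would then refine into approximate PDE solutions. The dictionary terms, coefficient ranges, and grid are illustrative choices, not the paper's.

```python
import numpy as np

def sample_dictionary_prior(num_samples=1000, num_points=128, seed=0):
    """Illustrative sketch: draw random linear combinations of simple basis
    functions on [0, 1] as low-cost prior data for a 1D convection-diffusion-
    reaction setting. The dictionary below is an assumption, not the paper's."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, num_points)
    dictionary = np.stack([
        np.sin(2 * np.pi * x),
        np.cos(2 * np.pi * x),
        x,
        x ** 2,
        np.exp(-x),
    ])                                             # (num_terms, num_points)
    coeffs = rng.uniform(-1.0, 1.0, size=(num_samples, dictionary.shape[0]))
    return coeffs @ dictionary                     # (num_samples, num_points) candidate fields
```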
PillarAcc: Sparse PointPillars Accelerator for Real-Time Point Cloud 3D Object Detection on Edge Devices
Lee, Minjae, Kim, Hyungmin, Park, Seongmin, Yoon, Minyong, Lee, Janghwan, Choi, Junwon, Kang, Mingu, Choi, Jungwook
PointPillars, a widely adopted bird's-eye view (BEV) encoding, aggregates 3D point cloud data into 2D pillars for high-accuracy 3D object detection. However, most state-of-the-art methods employing PointPillars overlook the inherent sparsity of pillar encoding, missing opportunities for significant computational reduction. In this study, we propose a groundbreaking algorithm-hardware co-design that accelerates sparse convolution processing and maximizes sparsity utilization in pillar-based 3D object detection networks. We investigate sparsification opportunities using an advanced pillar-pruning method, achieving an optimal balance between accuracy and sparsity. We introduce PillarAcc, a state-of-the-art sparsity support mechanism that enhances sparse pillar convolution through linear-complexity input-output mapping generation and conflict-free gather-scatter memory access. Additionally, we propose dataflow optimization techniques that dynamically adjust the pillar processing schedule for optimal hardware utilization under diverse sparsity operations. We evaluate PillarAcc on various cutting-edge 3D object detection networks and benchmarks, achieving remarkable speedup and energy savings compared to representative edge platforms, demonstrating a record-breaking PointPillars speed of 500 FPS with minimal compromise in accuracy.
Figure 1: Challenges in PointPillars acceleration and improvements by this work: (a) up to three orders of magnitude increase in frames per second at equivalent accuracy by the proposed PillarAcc, (b) degraded sparsity across layers by convolution (conv.) vs. maintained sparsity (this work), (c) significant sparsity mapping overhead in a conventional system (conv.) vs. reduced mapping overhead and enhanced computing efficiency (this work).
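To give a flavor of the input-output mapping that sparse pillar convolution relies on (a simplified software analogue, not the hardware design), the sketch below builds, in time linear in the number of active pillars, the gather indices a 3x3 convolution would consume; the hash-based neighbor lookup and the triple format are assumptions for illustration.

```python
def build_sparse_conv_mapping(active_pillars):
    """Simplified sketch: for each active pillar (y, x) on the BEV grid, find which
    of its 3x3 neighbors are also active, producing (output_idx, kernel_tap,
    input_idx) triples that a gather-scatter engine could consume.

    active_pillars: list of (y, x) integer coordinates of non-empty pillars.
    """
    index_of = {coord: i for i, coord in enumerate(active_pillars)}  # O(N) hash build
    offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]   # 3x3 kernel taps
    mapping = []
    for out_idx, (y, x) in enumerate(active_pillars):
        for k, (dy, dx) in enumerate(offsets):
            in_idx = index_of.get((y + dy, x + dx))
            if in_idx is not None:
                mapping.append((out_idx, k, in_idx))  # gather input in_idx for kernel tap k
    return mapping
```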
Benchmarking Self-Supervised Learning on Diverse Pathology Datasets
Kang, Mingu, Song, Heon, Park, Seonwook, Yoo, Donggeun, Pereira, Sérgio
Computational pathology can lead to saving human lives, but models are annotation-hungry and pathology images are notoriously expensive to annotate. Self-supervised learning (SSL) has been shown to be an effective method for utilizing unlabeled data, and its application to pathology could greatly benefit its downstream tasks. Yet, there are no principled studies that compare SSL methods and discuss how to adapt them for pathology. To address this need, we execute the largest-scale study of SSL pre-training on pathology image data, to date. Our study is conducted using 4 representative SSL methods on diverse downstream tasks. We establish that large-scale domain-aligned pre-training in pathology consistently outperforms ImageNet pre-training in standard SSL settings such as linear and fine-tuning evaluations, as well as in low-label regimes. Moreover, we propose a set of domain-specific techniques that we experimentally show lead to a performance boost. Lastly, for the first time, we apply SSL to the challenging task of nuclei instance segmentation and show large and consistent performance improvements under diverse settings.
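As a generic illustration of the linear evaluation protocol mentioned above (not the paper's codebase), a frozen SSL-pretrained backbone can be probed with a single linear classifier; `backbone`, the data loader, and the hyperparameters are placeholders.

```python
import torch
import torch.nn as nn

def linear_probe(backbone, feature_dim, num_classes, train_loader, epochs=10, lr=1e-3):
    """Generic sketch of linear evaluation: freeze the SSL-pretrained backbone
    and train only a linear classifier on top of its features."""
    backbone.eval()
    for p in backbone.parameters():
        p.requires_grad_(False)                      # keep pretrained features fixed
    head = nn.Linear(feature_dim, num_classes)
    optimizer = torch.optim.Adam(head.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in train_loader:
            with torch.no_grad():
                feats = backbone(images)             # frozen feature extraction
            loss = criterion(head(feats), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return head
```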