AITopics | kim

Collaborating Authors

kim

PLLay: Efficient Topological Layer based on Persistent Landscapes

Neural Information Processing Systems | Dec-24-2025, 12:06:29 GMT

We propose PLLay, a novel topological layer for general deep learning models based on persistence landscapes, in which we can efficiently exploit the underlying topological features of the input data structure. In this work, we show differentiability with respect to layer inputs, for a general persistent homology with arbitrary filtration. Thus, our proposed layer can be placed anywhere in the network and feed critical information on the topological features of input data into subsequent layers to improve the learnability of the networks toward a given task. A task-optimal structure of PLLay is learned during training via backpropagation, without requiring any input featurization or data preprocessing. We provide a novel adaptation for the DTM function-based filtration, and show that the proposed layer is robust against noise and outliers through a stability analysis. We demonstrate the effectiveness of our approach by classification experiments on various datasets.
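As a rough illustration of the object the layer is built on, the sketch below evaluates a persistence landscape from a persistence diagram using the standard definition: each (birth, death) pair contributes a tent function, and the k-th landscape at t is the k-th largest tent value. The function name `persistence_landscape` and the toy diagram are assumptions made for this example. It does not reproduce the PLLay layer, its learned structure, or the DTM-based filtration.

```python
def persistence_landscape(diagram, k, t):
    """Value at t of the k-th landscape function (k = 1 is the outermost).

    diagram is a list of (birth, death) pairs. The function name and
    argument layout are hypothetical, chosen only for this sketch.
    """
    # Each pair contributes a tent that peaks halfway between birth and death.
    tents = sorted(
        (max(0.0, min(t - birth, death - t)) for birth, death in diagram),
        reverse=True,
    )
    # The k-th landscape is the k-th largest tent value, or 0 if there are fewer tents.
    return tents[k - 1] if k <= len(tents) else 0.0


# Toy diagram with two nested features (illustrative values only).
diagram = [(0.0, 4.0), (1.0, 3.0)]
first = persistence_landscape(diagram, 1, 2.0)
second = persistence_landscape(diagram, 2, 2.0)
```

Because each landscape value is a piecewise-linear function of the birth and death times, gradients can flow through it almost everywhere, which is the kind of property a differentiable topological layer relies on.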

efficient topological layer, name change, pllay

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Leveraging Early-Stage Robustness in Diffusion Models for Efficient and High-Quality Image Synthesis

Neural Information Processing Systems | Dec-23-2025, 17:52:57 GMT

While diffusion models have demonstrated exceptional image generation capabilities, the iterative noise estimation process they require is compute-intensive, and their practical use is limited by slow sampling speeds. In this paper, we propose a novel approach to speed up the noise estimation network by leveraging the robustness of early-stage diffusion models. Our findings indicate that inaccurate computation during the early stage of the reverse diffusion process has minimal impact on the quality of generated images, as this stage primarily outlines the image while later stages handle the finer details that require more sensitive information. To improve computational efficiency, we combine our findings with post-training quantization (PTQ) to introduce a method that uses low-bit activations for the early reverse diffusion process while maintaining high-bit activations for the later stages. Experimental results show that the proposed method can accelerate the early-stage computation without sacrificing the quality of the generated images.
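The idea of a step-dependent activation bit width can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the switch point (`early_fraction`), the bit widths (4 and 8), the min-max uniform quantizer, and all function names are hypothetical choices made for the example.

```python
def activation_bits(step, num_steps, early_fraction=0.5, low_bits=4, high_bits=8):
    """Pick the activation bit width for one reverse-diffusion step.

    step counts denoising steps from 0 (pure noise) to num_steps - 1
    (final image). early_fraction, low_bits and high_bits are
    illustrative assumptions, not values taken from the paper.
    """
    return low_bits if step < early_fraction * num_steps else high_bits


def fake_quantize(values, bits):
    """Uniformly quantize values to 2**bits levels over their own range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # a constant tensor needs no rounding
    scale = (hi - lo) / (2 ** bits - 1)
    return [lo + round((v - lo) / scale) * scale for v in values]


def quantized_activations(values, step, num_steps):
    """Apply the step-dependent bit width to a list of activations."""
    return fake_quantize(values, activation_bits(step, num_steps))


# Early steps are rounded coarsely, late steps finely.
coarse = quantized_activations([0.0, 0.4, 1.0], step=0, num_steps=10)
fine = quantized_activations([0.0, 0.4, 1.0], step=9, num_steps=10)
```

In a real pipeline the quantizer would act on the activations of the noise estimation network and use calibrated ranges; the sketch only shows how the bit width can be switched as sampling progresses.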

diffusion model, efficient and high-quality image synthesis, leveraging early-stage robustness

Genre: Research Report > New Finding (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

UniCLIP: Unified Framework for Contrastive Language-Image Pre-training

Neural Information Processing Systems | Dec-23-2025, 17:32:37 GMT

Pre-training vision-language models with contrastive objectives has shown promising results that are both scalable to large uncurated datasets and transferable to many downstream applications. Subsequent works have aimed to improve data efficiency by adding self-supervision terms, but in those works the inter-domain (image-text) contrastive loss and the intra-domain (image-image) contrastive loss are defined on individual spaces, so many feasible combinations of supervision are overlooked. To overcome this issue, we propose UniCLIP, a Unified framework for Contrastive Language-Image Pre-training.
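To make the contrast with per-space losses concrete, the sketch below computes a generic multi-positive InfoNCE-style loss over one shared pool of embeddings, so that image-text and image-image pairs are scored in the same space. It is not UniCLIP's actual objective, which the abstract does not specify; the function names, the `group_ids` convention (items sharing an id are treated as positives), and the temperature value are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def multi_positive_nce(embeddings, group_ids, temperature=0.1):
    """Generic multi-positive contrastive loss in a single embedding space.

    embeddings may mix modalities (e.g. image views and captions);
    items with the same group id count as positives of each other.
    This is a hypothetical stand-in, not the loss used by UniCLIP.
    """
    losses = []
    n = len(embeddings)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if group_ids[j] == group_ids[i]]
        if not positives:
            continue  # anchors without a positive contribute nothing
        logits = {j: cosine(embeddings[i], embeddings[j]) / temperature for j in others}
        peak = max(logits.values())  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(s - peak) for s in logits.values()))
        losses.append(-sum(logits[p] - log_denominator for p in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Toy pool: two items per group, groups orthogonal to each other.
pool = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
aligned = multi_positive_nce(pool, [0, 0, 1, 1])
mismatched = multi_positive_nce(pool, [0, 1, 0, 1])
```

With the toy pool, the loss is close to zero when the group ids match the geometry and much larger when they do not, which is the behavior a contrastive objective is meant to reward.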

contrastive language-image pre-training, uniclip, unified framework

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.87)

Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation

Neural Information Processing Systems | Jun-2-2025, 06:27:07 GMT

Recent work shows promising results in expanding the capabilities of large language models (LLMs) to directly understand and synthesize speech. However, an LLM-based strategy for modeling spoken dialogs remains elusive, calling for further investigation. This paper introduces an extensive speech-text LLM framework, the Unified Spoken Dialog Model (USDM), designed to generate coherent spoken responses with naturally occurring prosodic features relevant to the given input speech without relying on explicit automatic speech recognition (ASR) or text-to-speech (TTS) systems. We have verified the inclusion of prosody in speech tokens that predominantly contain semantic information and have used this foundation to construct a prosody-infused speech-text model. Additionally, we propose a generalized speech-text pretraining scheme that enhances the capture of cross-modal semantics.

language model, natural conversation, paralinguistic-aware speech-empowered

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Reviews: Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections

Neural Information Processing Systems | Jan-20-2025, 06:16:37 GMT

The proposed method is closely related to Kim et al.'s work, but the latter is not mentioned at all. At the time of submission, the arXiv version of Kim's paper was already online, so it should have been cited and discussed. In Kim et al.'s work, the model consists of multiple flat convolution layers, and the input is connected to the output to form residual learning. Compared with Kim's model, the proposed model replaces half of the convolutional layers with de-convolution layers and has more skip connections between the layers.

deep convolutional encoder-decoder network, kim, symmetric skip connection

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.53)
Information Technology > Sensing and Signal Processing > Image Processing (0.43)
