AITopics | Hwang, Kyumin

Hwang, Kyumin

Intrinsic Image Decomposition for Robust Self-supervised Monocular Depth Estimation on Reflective Surfaces

Choi, Wonhyeok, Hwang, Kyumin, Choi, Minwoo, Han, Kiljoon, Choi, Wonjoon, Shin, Mingyu, Im, Sunghoon

arXiv.org Artificial Intelligence, Mar-28-2025

Self-supervised monocular depth estimation (SSMDE) has gained attention in the field of deep learning as it estimates depth without requiring ground truth depth maps. This approach typically uses a photometric consistency loss between a synthesized image, generated from the estimated depth, and the original image, thereby reducing the need for extensive dataset acquisition. However, the conventional photometric consistency loss relies on the Lambertian assumption, which often leads to significant errors when dealing with reflective surfaces that deviate from this model. To address this limitation, we propose a novel framework that incorporates intrinsic image decomposition into SSMDE. Our method synergistically trains for both monocular depth estimation and intrinsic image decomposition. The accurate depth estimation facilitates multi-image consistency for intrinsic image decomposition by aligning different view coordinate systems, while the decomposition process identifies reflective areas and excludes corrupted gradients from the depth training process. Furthermore, our framework introduces a pseudo-depth generation and knowledge distillation technique to further enhance the performance of the student model across both reflective and non-reflective surfaces. Comprehensive evaluations on multiple datasets show that our approach significantly outperforms existing SSMDE baselines in depth prediction, especially on reflective surfaces.
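As a rough illustration of the idea described in the abstract above, the sketch below computes a per-pixel L1 photometric error between a target image and a synthesized image, then averages it only over pixels not flagged as reflective. The function name `masked_photometric_loss`, the boolean `reflective_mask` input, and the plain L1 error are assumptions made for illustration. The paper's actual loss, and how it derives reflective regions from the intrinsic image decomposition, are not given in this listing.

```python
import numpy as np

def masked_photometric_loss(target, synthesized, reflective_mask):
    """Mean L1 photometric error over pixels not flagged as reflective.

    target, synthesized: float arrays of shape (H, W, 3).
    reflective_mask: boolean array of shape (H, W); True marks pixels
        to exclude (a hypothetical input, assumed to be given).
    """
    # Per-pixel L1 error, averaged over the colour channels -> shape (H, W).
    per_pixel = np.abs(target - synthesized).mean(axis=-1)
    valid = ~reflective_mask
    if not valid.any():
        return 0.0  # nothing left to supervise
    return float(per_pixel[valid].mean())

# Toy example: one pixel with a large error is flagged as reflective.
target = np.zeros((2, 2, 3))
synth = np.zeros((2, 2, 3))
synth[0, 0] = 1.0   # large error, e.g. a specular highlight
synth[1, 1] = 0.3   # ordinary small error
mask = np.zeros((2, 2), dtype=bool)
mask[0, 0] = True

loss = masked_photometric_loss(target, synth, mask)
unmasked = masked_photometric_loss(target, synth, np.zeros((2, 2), dtype=bool))
```

In this toy case the masked loss ignores the flagged pixel, so the large error there no longer dominates the average.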

artificial intelligence, distillation, machine learning, (19 more...)

arXiv:2503.22209

Country: Asia (0.28)

Genre: Research Report (0.64)

Industry:

Information Technology (0.48)
Education > Educational Technology > Educational Software (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Self-supervised Monocular Depth Estimation Robust to Reflective Surface Leveraged by Triplet Mining

Choi, Wonhyeok, Hwang, Kyumin, Peng, Wei, Choi, Minwoo, Im, Sunghoon

arXiv.org Artificial Intelligence, Feb-20-2025

Published as a conference paper at ICLR 2025.

Wonhyeok Choi (1), Kyumin Hwang (1), Wei Peng (2), Minwoo Choi (1), Sunghoon Im (1)
(1) Electrical Engineering and Computer Science, Daegu Gyeongbuk Institute of Science and Technology, South Korea
(2) Psychiatry and Behavioral Sciences, Stanford University, USA
{smu06117,kyumin,subminu,sunghoonim}@dgist.ac.kr, wepeng@stanford.edu

Abstract: Self-supervised monocular depth estimation (SSMDE) aims to predict the dense depth map of a monocular image by learning depth from RGB image sequences, eliminating the need for ground-truth depth labels. Although this approach simplifies data acquisition compared to supervised methods, it struggles with reflective surfaces, as they violate the assumptions of Lambertian reflectance, leading to inaccurate training on such surfaces. To tackle this problem, we propose a novel training strategy for SSMDE that leverages triplet mining to pinpoint reflective regions at the pixel level, guided by the camera geometry between different viewpoints. The proposed reflection-aware triplet mining loss specifically penalizes the inappropriate photometric error minimization on the localized reflective regions while preserving depth accuracy in non-reflective areas. We also incorporate a reflection-aware knowledge distillation method that enables a student model to selectively learn the pixel-level knowledge from reflective and non-reflective regions. Evaluation results on multiple datasets demonstrate that our method effectively enhances depth quality on reflective surfaces and outperforms state-of-the-art SSMDE baselines.

This approach significantly simplifies data acquisition compared to traditional supervised methods (Fu et al., 2018; Lee et al., 2019; Bhat et al., 2021), which often involve high costs for annotation. As such, many SSMDE studies (Godard et al., 2019; Zhou et al., 2017; Garg et al., 2016; Guizilini et al., 2020) have explored its viability as a mainstay for applications such as autonomous driving, highlighting its potential in outdoor environments. Despite these advantages, SSMDE approaches typically struggle to estimate depth accurately on non-Lambertian surfaces such as mirrors, transparent objects, and specular surfaces. This difficulty primarily arises from the assumption of Lambertian reflectance (Basri & Jacobs, 2003) embedded in most SSMDE methods.
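For readers unfamiliar with the triplet terminology used in the abstract above, the following is a minimal sketch of the standard triplet margin loss, applied independently to each feature vector. It is not the paper's reflection-aware triplet mining loss: how the paper selects anchors, positives, and negatives from photometric errors and camera geometry is not described in this listing. The function name `triplet_margin_loss` and the `margin` value are assumptions chosen for illustration.

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss, one value per feature vector.

    anchor, positive, negative: float arrays of shape (..., D).
    Returns an array of shape (...) holding
    max(d(anchor, positive) - d(anchor, negative) + margin, 0).
    """
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0)

# Two toy "pixels" with 2-D features (illustrative values only).
anchor = np.array([[0.0, 0.0], [0.0, 0.0]])
positive = np.array([[0.0, 0.1], [1.0, 0.0]])
negative = np.array([[1.0, 0.0], [0.0, 0.1]])

losses = triplet_margin_loss(anchor, positive, negative)
```

The first triplet already satisfies the margin and contributes no loss, while the second is penalized because its positive lies farther from the anchor than its negative.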

artificial intelligence, depth estimation, machine learning, (16 more...)

arXiv:2502.14573

Country:

Europe (0.46)
North America > United States > California > Santa Clara County > Palo Alto (0.24)
Asia > South Korea > Daegu > Daegu (0.24)

Genre: Research Report > New Finding (0.68)

Industry:

Education > Educational Technology > Educational Software (0.48)
Information Technology (0.34)
Transportation > Ground (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.84)

A Study on the Generality of Neural Network Structures for Monocular Depth Estimation

Bae, Jinwoo, Hwang, Kyumin, Im, Sunghoon

arXiv.org Artificial Intelligence, Dec-10-2023

Monocular depth estimation has been widely studied, and significant improvements in performance have been recently reported. However, most previous works are evaluated on a few benchmark datasets, such as KITTI, and none of them provides an in-depth analysis of the generalization performance of monocular depth estimation. In this paper, we investigate how various backbone networks (e.g., CNN and Transformer models) generalize for monocular depth estimation. First, we evaluate state-of-the-art models on both in-distribution datasets and out-of-distribution datasets, the latter never seen during network training. Then, we investigate the internal properties of the representations from the intermediate layers of CNN-/Transformer-based models using synthetic texture-shifted datasets. Through extensive experiments, we observe that Transformers exhibit a strong shape bias, whereas CNNs exhibit a strong texture bias. We also discover that texture-biased models exhibit worse generalization performance for monocular depth estimation than shape-biased models. We demonstrate that similar trends are observed in real-world driving datasets captured under diverse environments. Lastly, we conduct a dense ablation study with various backbone networks that are used in modern strategies. The experiments demonstrate that the intrinsic locality of CNNs and the self-attention of Transformers induce texture bias and shape bias, respectively.

artificial intelligence, image understanding, machine learning, (15 more...)

arXiv:2301.03169

Country: Asia > South Korea (0.14)

Genre: Research Report > Promising Solution (0.48)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
