Collaborating Authors

Park, Jinsun


Deep Depth Estimation from Thermal Image: Dataset, Benchmark, and Challenges

arXiv.org Artificial Intelligence

--Achieving robust and accurate spatial perception under adverse weather and lighting conditions is crucial for the high-level autonomy of self-driving vehicles and robots. However, existing perception algorithms relying on the visible spectrum are highly affected by weather and lighting conditions. A long-wave infrared (LWIR) camera (i.e., a thermal imaging camera) is a potential solution for achieving high-level robustness. However, the absence of large-scale datasets and standardized benchmarks remains a significant bottleneck to progress in active research on robust visual perception from thermal images. To this end, this work presents a dataset and a standardized benchmark for deep depth estimation from thermal images. Lastly, we provide in-depth analyses and discuss the challenges revealed by the benchmark results, such as the performance variability of each modality under adverse conditions, the domain shift between different sensor modalities, and potential research directions for thermal perception.

AUTONOMOUS driving aims to develop intelligent vehicles capable of perceiving their surrounding environments, understanding current contextual information, and making decisions to drive safely without human intervention. Recent advancements in autonomous vehicles, such as those from Tesla and Waymo, have been driven by deep neural networks and large-scale vehicular datasets such as KITTI [1], DDAD [2], and nuScenes [3]. However, a major drawback of existing vehicular datasets is their reliance on visible-spectrum images, which are easily affected by weather and lighting conditions such as rain, fog, dust, haze, and low light. Therefore, recent research has actively explored alternative sensors, such as Near-Infrared (NIR) cameras [8], LiDARs [9], [10], radars [11], [12], and long-wave infrared (LWIR) cameras [13], [14], to achieve reliable and robust visual perception in adverse weather and lighting conditions. Among these sensors, the LWIR camera (i.e., thermal camera) has gained popularity because of its competitive price, robustness to adverse weather, and unique modality information (i.e., temperature).

Manuscript received March XX, 2025; revised April XX, 2025. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00358935). Ukcheol Shin is with the Robotics Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America (e-mail: ushin@andrew.cmu.edu). Jinsun Park is with the School of Computer Science and Engineering, Pusan National University, Busan, Republic of Korea (e-mail: jspark@pusan.ac.kr).


Federated Domain Generalization with Data-free On-server Gradient Matching

arXiv.org Artificial Intelligence

Domain Generalization (DG) aims to learn, from multiple known source domains, a model that generalizes well to unknown target domains. One of the key approaches in DG is training an encoder that generates domain-invariant representations. However, this approach is not applicable in Federated Domain Generalization (FDG), where data from various domains are distributed across different clients. In this paper, we introduce a novel approach, dubbed Federated Learning via On-server Matching Gradient (FedOMG), which can efficiently leverage domain information from distributed domains. Specifically, we utilize the local gradients as information about the distributed models to find an invariant gradient direction across all domains through gradient inner product maximization. The advantages are two-fold: 1) FedOMG can aggregate the characteristics of distributed models on the centralized server without incurring any additional communication cost, and 2) FedOMG is orthogonal to many existing FL/FDG methods, allowing for additional performance improvements when seamlessly integrated with them. We conduct extensive experimental evaluations in various settings to demonstrate the robustness of FedOMG compared to other FL/FDG baselines. Our method outperforms recent SOTA baselines on four FL benchmark datasets (MNIST, EMNIST, CIFAR-10, and CIFAR-100) and three FDG benchmark datasets (PACS, VLCS, and OfficeHome).
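The core idea described above, searching on the server for an aggregate gradient direction that has a large inner product with every client's local gradient, can be illustrated with a small sketch. This is not the paper's exact algorithm; the function name, the simplex parameterization, and the worst-case (min inner product) objective are all illustrative assumptions:

```python
import numpy as np

def server_gradient_matching(client_grads, lr=0.1, steps=200):
    # Hypothetical sketch: find an aggregate gradient g in the convex hull
    # of the client gradients that maximizes the worst-case alignment
    # min_i <g, g_i>, so no client's domain is ignored.
    G = np.stack(client_grads)                 # (num_clients, dim)
    w = np.full(G.shape[0], 1.0 / G.shape[0])  # uniform mixing weights
    for _ in range(steps):
        g = w @ G                              # candidate aggregate direction
        inner = G @ g                          # inner products <g_i, g>
        i = int(np.argmin(inner))              # worst-aligned client
        w += lr * (G @ G[i])                   # subgradient ascent on min_i <g, g_i>
        w = np.clip(w, 0.0, None)
        w /= w.sum()                           # crude projection back to the simplex
    return w @ G                               # matched server update direction
```

Because the search runs entirely on gradients the clients already send for aggregation, it illustrates why such a scheme needs no extra communication round: the server only re-weights information it has already received.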