Seo, Junghoon
Posture-Informed Muscular Force Learning for Robust Hand Pressure Estimation
Seo, Kyungjin, Seo, Junghoon, Jeong, Hanseok, Kim, Sangpil, Yoon, Sang Ho
We present PiMForce, a novel framework that enhances hand pressure estimation by leveraging 3D hand posture information to augment forearm surface electromyography (sEMG) signals. Our approach utilizes detailed spatial information from 3D hand poses in conjunction with dynamic muscle activity from sEMG to enable accurate and robust whole-hand pressure measurements under diverse hand-object interactions. We also developed a multimodal data collection system that combines a pressure glove, an sEMG armband, and a markerless finger-tracking module. We created a comprehensive dataset from 21 participants, capturing synchronized data of hand posture, sEMG signals, and exerted hand pressure across various hand postures and hand-object interaction scenarios using our collection system. Our framework enables precise hand pressure estimation in complex and natural interaction scenarios. Our approach substantially mitigates the limitations of traditional sEMG-based or vision-based methods by integrating 3D hand posture information with sEMG signals. Video demos, data, and code are available online.
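As a rough illustration of the fusion idea (not the authors' architecture; every module size and name below is an assumption), a posture-conditioned sEMG model could encode the two streams separately and regress pressure from the fused embedding:

```python
# Minimal sketch (PyTorch) of posture-informed fusion: encode an sEMG window and
# 3D hand-pose joints separately, then regress whole-hand pressure from the
# concatenated embedding. All dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class PosturedEMGPressureNet(nn.Module):
    def __init__(self, emg_channels=8, num_joints=21, num_taxels=512):
        super().__init__()
        # Temporal encoder for the sEMG window (channels x time).
        self.emg_encoder = nn.Sequential(
            nn.Conv1d(emg_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),          # -> (B, 64)
        )
        # Encoder for the 3D joint positions of the hand pose.
        self.pose_encoder = nn.Sequential(
            nn.Flatten(), nn.Linear(num_joints * 3, 64), nn.ReLU(),
        )
        # The fused head regresses per-taxel pressure over the whole hand.
        self.head = nn.Sequential(
            nn.Linear(64 + 64, 128), nn.ReLU(),
            nn.Linear(128, num_taxels), nn.Softplus(),      # pressures are non-negative
        )

    def forward(self, emg, pose):
        z = torch.cat([self.emg_encoder(emg), self.pose_encoder(pose)], dim=-1)
        return self.head(z)

model = PosturedEMGPressureNet()
pressure = model(torch.randn(4, 8, 64), torch.randn(4, 21, 3))  # (4, 512)
```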
Hausdorff Distance Matching with Adaptive Query Denoising for Rotated Detection Transformer
Lee, Hakjin, Song, Minki, Koo, Jamyoung, Seo, Junghoon
The Detection Transformer (DETR) has come to play a pivotal role in object detection, setting new performance benchmarks due to its end-to-end design and scalability. Despite these advancements, the application of DETR to detecting rotated objects has demonstrated suboptimal performance relative to established oriented object detectors. Our analysis identifies a key limitation: the L1 cost used in Hungarian matching leads to duplicate predictions due to the square-like problem in oriented object detection, thereby obstructing the detector's training process. We introduce a Hausdorff distance-based cost for Hungarian matching, which more accurately quantifies the discrepancy between predictions and ground truths. Moreover, we note that a static denoising approach hampers the training of rotated DETR, particularly when the detector's predictions surpass the quality of noised ground truths. We propose an adaptive query denoising technique that employs Hungarian matching to selectively filter out superfluous noised queries that no longer contribute to model improvement. Our proposed modifications to DETR yield superior performance, surpassing previous rotated DETR models and other alternatives, as evidenced by our model's state-of-the-art results on benchmarks such as DOTA-v1.0/v1.5/v2.0 and DIOR-R.
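As a hedged illustration of the matching idea (the paper's exact box parametrization and cost weighting may differ), the sketch below computes a corner-set Hausdorff distance between rotated boxes and feeds it to Hungarian matching. Note how a square-like box and its 90-degree-rotated twin incur near-zero cost, unlike a naive L1 cost on (w, h, angle) parameters:

```python
# Hausdorff-distance matching cost for rotated boxes, as an illustrative sketch.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import directed_hausdorff

def box_corners(box):
    """(cx, cy, w, h, angle_rad) -> (4, 2) corner coordinates."""
    cx, cy, w, h, a = box
    c, s = np.cos(a), np.sin(a)
    offsets = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    rot = np.array([[c, -s], [s, c]])
    return offsets @ rot.T + np.array([cx, cy])

def hausdorff_cost(pred, gt):
    """Symmetric Hausdorff distance between the two corner sets."""
    p, g = box_corners(pred), box_corners(gt)
    return max(directed_hausdorff(p, g)[0], directed_hausdorff(g, p)[0])

def hungarian_match(pred_boxes, gt_boxes):
    cost = np.array([[hausdorff_cost(p, g) for g in gt_boxes] for p in pred_boxes])
    return linear_sum_assignment(cost)   # (pred indices, gt indices)

# A near-square box matched against a 90-degree-rotated version of itself:
# the corner sets nearly coincide, so the Hausdorff cost is close to zero.
preds = [np.array([10.0, 10.0, 5.0, 5.2, np.pi / 2])]
gts = [np.array([10.0, 10.0, 5.2, 5.0, 0.0])]
print(hungarian_match(preds, gts))
```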
On Pitfalls of RemOve-And-Retrain: Data Processing Inequality Perspective
Song, Junhwa, Cha, Keumgang, Seo, Junghoon
Approaches for appraising feature importance approximations, alternatively referred to as attribution methods, have been established across an extensive array of contexts. The development of resilient techniques for performance benchmarking constitutes a critical concern in the sphere of explainable deep learning. This study scrutinizes the dependability of the RemOve-And-Retrain (ROAR) procedure, which is prevalently employed for gauging the performance of feature importance estimates. The insights gleaned from our theoretical foundation and empirical investigations reveal that attributions containing lesser information about the decision function may yield superior results in ROAR benchmarks, contradicting the original intent of ROAR. This occurrence is similarly observed in the recently introduced variant RemOve-And-Debias (ROAD), and we posit a persistent pattern of blurriness bias in ROAR attribution metrics. Our findings serve as a warning against indiscriminate use of ROAR metrics. The code is available as open source.
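To make the critique concrete, here is a minimal sketch of the ROAR protocol itself, assuming user-supplied `attribution_fn`, `train_fn`, and `eval_fn` callables and simple global-mean imputation (the original protocol uses per-channel means):

```python
# RemOve-And-Retrain (ROAR): mask the top-t fraction of pixels ranked by an
# attribution method, retrain from scratch on the masked data, and track how
# test accuracy degrades as t grows.
import numpy as np

def roar_curve(X_train, y_train, X_test, y_test,
               attribution_fn, train_fn, eval_fn,
               fractions=(0.1, 0.3, 0.5, 0.7, 0.9)):
    fill = X_train.mean()                 # simple global-mean imputation (assumption)
    accs = []
    for t in fractions:
        def mask(X):
            A = attribution_fn(X).reshape(len(X), -1)        # per-pixel importance
            k = max(1, int(t * A.shape[1]))
            top = np.argpartition(-A, k - 1, axis=1)[:, :k]  # top-k pixels per image
            Xm = X.reshape(len(X), -1).copy()
            np.put_along_axis(Xm, top, fill, axis=1)         # remove most-important pixels
            return Xm.reshape(X.shape)
        model = train_fn(mask(X_train), y_train)             # retrain from scratch
        accs.append(eval_fn(model, mask(X_test), y_test))
    return dict(zip(fractions, accs))
```

The data-processing-inequality concern is visible in this loop: the masking step is itself a channel, so a less informative (e.g., blurrier) attribution can remove less of the label-relevant signal and thereby score better, despite explaining the decision function worse.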
A Billion-scale Foundation Model for Remote Sensing Images
Cha, Keumgang, Seo, Junghoon, Lee, Taekyung
As the potential of foundation models in visual tasks has garnered significant attention, pretraining these models before downstream tasks has become a crucial step. The three key factors in pretraining foundation models are the pretraining method, the size of the pretraining dataset, and the number of model parameters. Recently, research in the remote sensing field has focused primarily on the pretraining method and the size of the dataset, with limited emphasis on the number of model parameters. This paper addresses this gap by examining the effect of increasing the number of model parameters on the performance of foundation models in downstream tasks such as rotated object detection and semantic segmentation. We pretrained foundation models with varying numbers of parameters, including 86M, 605.26M, 1.3B, and 2.4B, to determine whether performance in downstream tasks improved with an increase in parameters. To the best of our knowledge, this is the first billion-scale foundation model in the remote sensing field. Furthermore, we propose an effective method for scaling up and fine-tuning a vision transformer in the remote sensing field. To evaluate general performance in downstream tasks, we employed the DOTA v2.0 and DIOR-R benchmark datasets for rotated object detection, and the Potsdam and LoveDA datasets for semantic segmentation. Experimental results demonstrated that, across all benchmark datasets and downstream tasks, both the performance and the data efficiency of the foundation models improved as the number of parameters increased. Moreover, our models achieve state-of-the-art performance on several datasets, including DIOR-R, Potsdam, and LoveDA.
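For intuition about these scales, a standard back-of-the-envelope Transformer parameter count (roughly 12·L·d² per model, ignoring embeddings, norms, and task heads) recovers familiar ViT sizes. The configurations below are generic reference points, not necessarily the authors' exact configurations:

```python
# Rough Transformer parameter estimate: per layer, attention contributes ~4*d^2
# (Q, K, V, and output projections) and a 4x-wide MLP contributes ~8*d^2.
def vit_param_estimate(depth, width, mlp_ratio=4.0):
    attn = 4 * width * width                 # Q, K, V, and output projections
    mlp = 2 * mlp_ratio * width * width      # two linear layers in the MLP block
    return depth * (attn + mlp)

print(f"{vit_param_estimate(12, 768) / 1e6:.0f}M")    # ~85M, ViT-Base scale
print(f"{vit_param_estimate(32, 1280) / 1e6:.0f}M")   # ~629M, ViT-Huge scale
```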
On the Power of Deep but Naive Partial Label Learning
Seo, Junghoon, Huh, Joon Suk
Partial label learning (PLL) is a class of weakly supervised learning in which each training instance consists of a data point and a set of candidate labels containing a unique ground-truth label. To tackle this problem, a majority of current state-of-the-art methods employ either label disambiguation or averaging strategies. So far, PLL methods without such techniques have been considered impractical. In this paper, we challenge this view by revealing the hidden power of the oldest and most naive PLL method when it is instantiated with deep neural networks. Specifically, we show that, with deep neural networks, the naive model can achieve competitive performance against the other state-of-the-art methods, suggesting it as a strong baseline for PLL. We also address the question of how and why such a naive model works well with deep neural networks. Our empirical results indicate that deep neural networks trained on partially labeled examples generalize very well, even in the over-parametrized regime and without label disambiguation or regularization. We point out that existing learning theories on PLL are vacuous in the over-parametrized regime and hence cannot explain why the deep naive method works. We propose an alternative theory on how deep learning generalizes in PLL problems.
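For concreteness, one classical "naive" objective simply averages the cross-entropy over each candidate set; whether this is exactly the loss studied in the paper is an assumption, but it illustrates the family of methods that use neither disambiguation nor any further averaging tricks:

```python
# Naive PLL loss: treat every candidate label as if it were correct and
# average the cross-entropy over the candidate set.
import torch
import torch.nn.functional as F

def naive_pll_loss(logits, candidate_mask):
    """logits: (B, C); candidate_mask: (B, C) with 1 marking candidate labels."""
    log_probs = F.log_softmax(logits, dim=-1)
    # Uniformly average the negative log-likelihood over each candidate set.
    per_example = -(candidate_mask * log_probs).sum(-1) / candidate_mask.sum(-1)
    return per_example.mean()

logits = torch.randn(4, 10, requires_grad=True)
mask = torch.zeros(4, 10)
mask[torch.arange(4), torch.randint(0, 10, (4,))] = 1   # hidden true labels
mask[:, 0] = 1                                          # a distractor in every set
naive_pll_loss(logits, mask).backward()
```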
Revisiting Classical Bagging with Modern Transfer Learning for On-the-fly Disaster Damage Detector
Seo, Junghoon, Lee, Seungwon, Kim, Beomsu, Jeon, Taegyun
Automatic post-disaster damage detection using aerial imagery is crucial for quickly assessing the damage caused by a disaster and developing a recovery plan. The main problem preventing us from creating a model applicable in practice is that the damaged (positive) examples we are trying to detect are much harder to obtain than undamaged (negative) examples, especially on short notice. In this paper, we revisit the classical bootstrap aggregating approach in the context of modern transfer learning for data-efficient disaster damage detection. Unlike previous work on classical ensemble learning, our work points out the effectiveness of simple bagging in deep transfer learning, which has been underestimated in the context of imbalanced classification. Benchmark results on the AIST Building Change Detection dataset show that our approach significantly outperforms existing methodologies, including the recently proposed disentanglement learning.
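A minimal sketch of the recipe follows, assuming a scikit-learn-style `predict_proba` interface and a balanced-bootstrap sampling scheme chosen here for illustration (the paper's exact sampling may differ):

```python
# Classical bagging on top of transfer learning for imbalanced detection:
# each ensemble member fine-tunes the same pretrained backbone on a bootstrap
# that keeps all scarce positives and resamples the abundant negatives.
import numpy as np

def bagged_transfer_ensemble(X, y, make_pretrained_model, fine_tune,
                             n_members=5, seed=0):
    rng = np.random.default_rng(seed)
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    members = []
    for _ in range(n_members):
        # Keep every positive; bootstrap an equal number of negatives.
        boot = np.concatenate([pos, rng.choice(neg, size=len(pos), replace=True)])
        model = make_pretrained_model()          # e.g. an ImageNet-pretrained CNN
        members.append(fine_tune(model, X[boot], y[boot]))
    return members

def ensemble_predict(members, X):
    # Soft voting: average member probabilities for the positive class.
    return np.mean([m.predict_proba(X)[:, 1] for m in members], axis=0)
```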
Deep Closed-Form Subspace Clustering
Seo, Junghoon, Koo, Jamyoung, Jeon, Taegyun
We propose Deep Closed-Form Subspace Clustering (DCFSC), a new, embarrassingly simple model for subspace clustering that learns a non-linear mapping. Compared with previous deep subspace clustering (DSC) techniques, our DCFSC has no parameters at all for the self-expressive layer. Instead, DCFSC utilizes an implicit, data-driven self-expressive layer derived from a closed-form shallow auto-encoder. Moreover, DCFSC requires no complicated optimization scheme, unlike other subspace clustering methods. With its extreme simplicity, DCFSC has significant memory-related benefits over existing DSC methods, especially on large datasets. Several experiments showed that our DCFSC model has enough potential to be a new reference model for subspace clustering on large-scale, high-dimensional datasets.
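For concreteness, the ridge self-expression problem min_C ||Z − CZ||² + λ||C||² subject to diag(C) = 0 admits a well-known closed form (in the style of Steck's closed-form shallow auto-encoder). Whether DCFSC adopts exactly this variant is an assumption, but it shows how the self-expressive coefficients can be obtained with no learned parameters:

```python
# Parameter-free self-expressive coefficients for latent features Z (n x d):
# closed-form solution of the zero-diagonal ridge self-expression problem.
import numpy as np

def closed_form_self_expression(Z, lam=1.0):
    """Returns C (n x n) with zero diagonal such that Z ~= C @ Z."""
    G = Z @ Z.T + lam * np.eye(len(Z))    # regularized Gram matrix over samples
    P = np.linalg.inv(G)
    C = -P / np.diag(P)[:, None]          # row-wise rescaling enforces diag(C) = 0
    np.fill_diagonal(C, 0.0)
    return C

Z = np.random.randn(100, 16)
C = closed_form_self_expression(Z, lam=10.0)
# Sanity checks: zero diagonal, and a reasonable self-expression residual.
print(np.abs(np.diag(C)).max(), np.linalg.norm(Z - C @ Z) / np.linalg.norm(Z))
```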
NL-LinkNet: Toward Lighter but More Accurate Road Extraction with Non-Local Operations
Wang, Yooseung, Seo, Junghoon, Jeon, Taegyun
Road extraction from very-high-resolution satellite images is one of the most important topics in the field of remote sensing. For the road segmentation problem, spatial properties of the data can usually be captured using convolutional neural networks. However, this approach only considers a few local neighborhoods at a time and has difficulty capturing long-range dependencies. To overcome this problem, we propose Non-Local LinkNet (NL-LinkNet) with non-local blocks that can capture relations between global features. It enables each spatial feature point to refer to all other contextual information and results in more accurate road segmentation. Specifically, our method achieved a 65.00% mIOU score on the DeepGlobe 2018 Road Extraction Challenge dataset. Our best model outperformed D-LinkNet, the 1st-ranked solution, by a significant margin of 0.88% mIOU with far fewer parameters. We also present empirical analyses on the proper usage of non-local blocks for the baseline model.
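A minimal non-local block in the embedded-Gaussian style of Wang et al. (with illustrative channel sizes, not the paper's exact configuration) shows how every spatial position attends to all others, capturing the long-range road continuity that plain convolutions miss:

```python
# Non-local block: each spatial position aggregates features from all other
# positions via a pairwise affinity (attention) matrix, plus a residual.
import torch
import torch.nn as nn

class NonLocalBlock2d(nn.Module):
    def __init__(self, channels, inner=None):
        super().__init__()
        inner = inner or channels // 2
        self.theta = nn.Conv2d(channels, inner, 1)   # query embedding
        self.phi = nn.Conv2d(channels, inner, 1)     # key embedding
        self.g = nn.Conv2d(channels, inner, 1)       # value embedding
        self.out = nn.Conv2d(inner, channels, 1)     # restore channel count

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (B, HW, inner)
        k = self.phi(x).flatten(2)                     # (B, inner, HW)
        v = self.g(x).flatten(2).transpose(1, 2)       # (B, HW, inner)
        attn = torch.softmax(q @ k, dim=-1)            # all-pairs affinities
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                         # residual connection

feat = torch.randn(2, 64, 32, 32)
print(NonLocalBlock2d(64)(feat).shape)   # torch.Size([2, 64, 32, 32])
```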
Bridging Adversarial Robustness and Gradient Interpretability
Kim, Beomsu, Seo, Junghoon, Jeon, Taegyun
Adversarial training is a training scheme designed to counter adversarial attacks, i.e., imperceptible perturbations of an input image that cause a deep neural network (DNN) to misclassify the perturbed input with high confidence (Goodfellow et al., 2015), by augmenting the training dataset with adversarial examples. Surprisingly, several studies have observed that loss gradients from adversarially trained DNNs are visually more interpretable than those from standard DNNs. Although this phenomenon is interesting, only a few works have offered an explanation. In this paper, we attempted to bridge this gap between adversarial robustness and gradient interpretability. To this end, we identified that loss gradients from adversarially trained DNNs align better with human perception because adversarial training restricts gradients closer to the image manifold. We then demonstrated that adversarial training causes loss gradients to be quantitatively meaningful. Finally, we showed that, under the adversarial training framework, there exists an empirical tradeoff between test accuracy and loss gradient interpretability, and we proposed two potential approaches to resolving this tradeoff.
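For reference, here is a minimal FGSM-style adversarial training step (a one-step L-infinity attack per Goodfellow et al., 2015); the paper's exact attack and hyperparameters may differ, and stronger attacks such as PGD slot in the same way:

```python
# One adversarial training step: craft an FGSM perturbation of the batch,
# then take a gradient step on the adversarial examples.
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, eps=8 / 255):
    x_adv = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    # One-step L-inf attack: perturb along the sign of the input gradient.
    x_adv = (x + eps * grad.sign()).clamp(0, 1).detach()
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)   # train on adversarial examples
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```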
Why are Saliency Maps Noisy? Cause of and Solution to Noisy Saliency Maps
Kim, Beomsu, Seo, Junghoon, Jeon, SeungHyun, Koo, Jamyoung, Choe, Jeongyeol, Jeon, Taegyun
The saliency map, the gradient of the score function with respect to the input, is the most basic technique for interpreting deep neural network decisions. However, saliency maps are often visually noisy. Although several hypotheses have been proposed to account for this phenomenon, few works provide rigorous analyses of noisy saliency maps. In this paper, we identify that noise occurs in saliency maps when irrelevant features pass through ReLU activation functions. We then propose Rectified Gradient, a method that solves this problem through layer-wise thresholding during backpropagation. Experiments with neural networks trained on CIFAR-10 and ImageNet showed the effectiveness of our method and its superiority over other attribution methods.
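A sketch of the thresholding rule follows, implemented as a custom autograd function; the percentile value and the exact importance score are illustrative assumptions, not necessarily the paper's final formulation:

```python
# Rectified Gradient, sketched: during the backward pass through a ReLU,
# propagate gradient only where the unit's importance (activation x incoming
# gradient) exceeds a layer-wise percentile threshold.
import torch

class RectifiedReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, q=0.8):
        out = x.clamp(min=0)
        ctx.save_for_backward(out)
        ctx.q = q
        return out

    @staticmethod
    def backward(ctx, grad_out):
        (out,) = ctx.saved_tensors
        importance = out * grad_out                    # activation x gradient
        # Keep only units above the q-th quantile of importance in this layer,
        # and only where the ReLU was active.
        tau = torch.quantile(importance.flatten(), ctx.q)
        keep = (importance >= tau) & (out > 0)
        return grad_out * keep.float(), None           # no gradient for q

x = torch.randn(1, 16, requires_grad=True)
RectifiedReLU.apply(x).sum().backward()
print((x.grad != 0).float().mean())   # fraction of units passing the threshold
```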