Jin, Hyundong
Continual Learning for Multiple Modalities
Jin, Hyundong, Kim, Eunwoo
Continual learning aims to learn knowledge of tasks observed in sequential time steps while mitigating the forgetting of previously learned knowledge. Existing methods were proposed under the assumption of learning a single modality (e.g., image) over time, which limits their applicability in scenarios involving multiple modalities. In this work, we propose a novel continual learning framework that accommodates multiple modalities (image, video, audio, depth, and text). We train a model to align various modalities with text, leveraging its rich semantic information. However, this increases the risk of forgetting previously learned knowledge, exacerbated by the differing input traits of each task. To alleviate the overwriting of the previous knowledge of modalities, we propose a method for aggregating knowledge within and across modalities. The aggregated knowledge is obtained by assimilating new information through self-regularization within each modality and associating knowledge between modalities by prioritizing contributions from relevant modalities. Furthermore, we propose a strategy that re-aligns the embeddings of modalities to resolve biased alignment between modalities. We evaluate the proposed method in a wide range of continual learning scenarios using multiple datasets with different modalities. Extensive experiments demonstrate that ours outperforms existing methods in the scenarios, regardless of whether the identity of the modality is given.
Detection of LLM-Paraphrased Code and Identification of the Responsible LLM Using Coding Style Features
Park, Shinwoo, Jin, Hyundong, Cha, Jeong-won, Han, Yo-Sub
Recent progress in large language models (LLMs) for code generation has raised serious concerns about intellectual property protection. Malicious users can exploit LLMs to produce paraphrased versions of proprietary code that closely resemble the original. While the potential for LLM-assisted code paraphrasing continues to grow, research on detecting it remains limited, underscoring an urgent need for detection system. We respond to this need by proposing two tasks. The first task is to detect whether code generated by an LLM is a paraphrased version of original human-written code. The second task is to identify which LLM is used to paraphrase the original code. For these tasks, we construct a dataset LPcode consisting of pairs of human-written code and LLM-paraphrased code using various LLMs. We statistically confirm significant differences in the coding styles of human-written and LLM-paraphrased code, particularly in terms of naming consistency, code structure, and readability. Based on these findings, we develop LPcodedec, a detection method that identifies paraphrase relationships between human-written and LLM-generated code, and discover which LLM is used for the paraphrasing. LPcodedec outperforms the best baselines in two tasks, improving F1 scores by 2.64% and 15.17% while achieving speedups of 1,343x and 213x, respectively. Our code and data are available at https://github.com/Shinwoo-Park/detecting_llm_paraphrased_code_via_coding_style_features.
PAC-FNO: Parallel-Structured All-Component Fourier Neural Operators for Recognizing Low-Quality Images
Jeon, Jinsung, Jin, Hyundong, Choi, Jonghyun, Hong, Sanghyun, Lee, Dongeun, Lee, Kookjin, Park, Noseong
A standard practice in developing image recognition models is to train a model on a specific image resolution and then deploy it. However, in real-world inference, models often encounter images different from the training sets in resolution and/or subject to natural variations such as weather changes, noise types and compression artifacts. While traditional solutions involve training multiple models for different resolutions or input variations, these methods are computationally expensive and thus do not scale in practice. To this end, we propose a novel neural network model, parallel-structured and all-component Fourier neural operator (PAC-FNO), that addresses the problem. Unlike conventional feed-forward neural networks, PAC-FNO operates in the frequency domain, allowing it to handle images of varying resolutions within a single model. We also propose a twostage algorithm for training PAC-FNO with a minimal modification to the original, downstream model. Moreover, the proposed PAC-FNO is ready to work with existing image recognition models. Extensively evaluating methods with seven image recognition benchmarks, we show that the proposed PAC-FNO improves the performance of existing baseline models on images with various resolutions by up to 77.1% and various types of natural variations in the images at inference. Deep neural networks have enabled many breakthroughs in visual recognition (Simonyan & Zisserman, 2014; He et al., 2016; Szegedy et al., 2016; Krizhevsky et al., 2017; Dosovitskiy et al., 2020; Liu et al., 2022). A common practice of developing these models is to learn a model on training images with a fixed input resolution and then deploy the model to many applications. In practice, when these models are deployed to real world, they are likely to face low-quality inputs at inference, e.g., images with resolutions different from the training data and/or those with natural input variations such as weather changes, noise types, and compression artifacts. For example, Figure 1 shows that the ConvNeXt models (Liu et al., 2022) trained on ImageNet-1k (Russakovsky et al., 2015) suffer from (top-1) accuracy degradation when their inputs are of low-quality. 'resize' baselines which is resize-and-feed using interpolation.
Operator-learning-inspired Modeling of Neural Ordinary Differential Equations
Cho, Woojin, Cho, Seunghyeon, Jin, Hyundong, Jeon, Jinsung, Lee, Kookjin, Hong, Sanghyun, Lee, Dongeun, Choi, Jonghyun, Park, Noseong
Neural ordinary differential equations (NODEs), one of the most influential works of the differential equation-based deep learning, are to continuously generalize residual networks and opened a new field. They are currently utilized for various downstream tasks, e.g., image classification, time series classification, image generation, etc. Its key part is how to model the time-derivative of the hidden state, denoted dh(t)/dt. People have habitually used conventional neural network architectures, e.g., fully-connected layers followed by non-linear activations. In this paper, however, we present a neural operator-based method to define the time-derivative term. Neural operators were initially proposed to model the differential operator of partial differential equations (PDEs). Since the time-derivative of NODEs can be understood as a special type of the differential operator, our proposed method, called branched Fourier neural operator (BFNO), makes sense. In our experiments with general downstream tasks, our method significantly outperforms existing methods.