AITopics | Cho, Hyunsoo

Collaborating Authors

Cho, Hyunsoo

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Investigating the Influence of Prompt-Specific Shortcuts in AI Generated Text Detection

Park, Choonghyun, Kim, Hyuhng Joon, Kim, Junyeob, Kim, Youna, Kim, Taeuk, Cho, Hyunsoo, Jo, Hwiyeol, Lee, Sang-goo, Yoo, Kang Min

arXiv.org Artificial IntelligenceJun-23-2024

AI Generated Text (AIGT) detectors are developed with texts from humans and LLMs of common tasks. Despite the diversity of plausible prompt choices, these datasets are generally constructed with a limited number of prompts. The lack of prompt variation can introduce prompt-specific shortcut features that exist in data collected with the chosen prompt, but do not generalize to others. In this paper, we analyze the impact of such shortcuts in AIGT detection. We propose Feedback-based Adversarial Instruction List Optimization (FAILOpt), an attack that searches for instructions deceptive to AIGT detectors exploiting prompt-specific shortcuts. FAILOpt effectively drops the detection performance of the target detector, comparable to other attacks based on adversarial in-context examples. We also utilize our method to enhance the robustness of the detector by mitigating the shortcuts. Based on the findings, we further train the classifier with the dataset augmented by FAILOpt prompt. The augmented classifier exhibits improvements across generation models, tasks, and attacks. Our code will be available at https://github.com/zxcvvxcz/FAILOpt.

detector, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2406.16275

Country:

North America > Canada (0.14)
Europe > Italy (0.14)
Europe > Belgium (0.14)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (0.67)
Media > News (0.46)
Government > Military (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)

Add feedback

Revealing and Utilizing In-group Favoritism for Graph-based Collaborative Filtering

Jung, Hoin, Cho, Hyunsoo, Choi, Myungje, Lee, Joowon, Park, Jung Ho, Kang, Myungjoo

arXiv.org Artificial IntelligenceApr-23-2024

When it comes to a personalized item recommendation system, It is essential to extract users' preferences and purchasing patterns. Assuming that users in the real world form a cluster and there is common favoritism in each cluster, in this work, we introduce Co-Clustering Wrapper (CCW). We compute co-clusters of users and items with co-clustering algorithms and add CF subnetworks for each cluster to extract the in-group favoritism. Combining the features from the networks, we obtain rich and unified information about users. We experimented real world datasets considering two aspects: Finding the number of groups divided according to in-group preference, and measuring the quantity of improvement of the performance.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2404.17598

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Add feedback

Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model

Cho, Hyunsoo

arXiv.org Artificial IntelligenceApr-15-2024

Many recent studies endeavor to improve open-source language models through imitation learning, and re-training on the synthetic instruction data from state-of-the-art proprietary models like ChatGPT and GPT-4. However, the innate nature of synthetic data inherently contains noisy data, giving rise to a substantial presence of low-quality data replete with erroneous responses, and flawed reasoning. Although we intuitively grasp the potential harm of noisy data, we lack a quantitative understanding of its impact. To this end, this paper explores the correlation between the degree of noise and its impact on language models through instruction tuning. We first introduce the Falsity-Controllable (FACO) dataset, which comprises pairs of true answers with corresponding reasoning, as well as false pairs to manually control the falsity ratio of the dataset.Through our extensive experiments, we found multiple intriguing findings of the correlation between the factuality of the dataset and instruction tuning: Specifically, we verified falsity of the instruction is highly relevant to various benchmark scores. Moreover, when LLMs are trained with false instructions, they learn to lie and generate fake unfaithful answers, even though they know the correct answer for the user request. Additionally, we noted that once the language model is trained with a dataset contaminated by noise, restoring its original performance is possible, but it failed to reach full performance.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2404.09717

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

HyperCLOVA X Technical Report

Yoo, Kang Min, Han, Jaegeun, In, Sookyo, Jeon, Heewon, Jeong, Jisu, Kang, Jaewook, Kim, Hyunwook, Kim, Kyung-Min, Kim, Munhyong, Kim, Sungju, Kwak, Donghyun, Kwak, Hanock, Kwon, Se Jung, Lee, Bado, Lee, Dongsoo, Lee, Gichang, Lee, Jooho, Park, Baeseong, Shin, Seongjin, Yu, Joonsang, Baek, Seolki, Byeon, Sumin, Cho, Eungsup, Choe, Dooseok, Han, Jeesung, Jin, Youngkyun, Jun, Hyein, Jung, Jaeseung, Kim, Chanwoong, Kim, Jinhong, Kim, Jinuk, Lee, Dokyeong, Park, Dongwook, Sohn, Jeong Min, Han, Sujung, Heo, Jiae, Hong, Sungju, Jeon, Mina, Jung, Hyunhoon, Jung, Jungeun, Jung, Wangkyo, Kim, Chungjoon, Kim, Hyeri, Kim, Jonghyun, Kim, Min Young, Lee, Soeun, Park, Joonhee, Shin, Jieun, Yang, Sojin, Yoon, Jungsoon, Lee, Hwaran, Bae, Sanghwan, Cha, Jeehwan, Gylleus, Karl, Ham, Donghoon, Hong, Mihak, Hong, Youngki, Hong, Yunki, Jang, Dahyun, Jeon, Hyojun, Jeon, Yujin, Jeong, Yeji, Ji, Myunggeun, Jin, Yeguk, Jo, Chansong, Joo, Shinyoung, Jung, Seunghwan, Kim, Adrian Jungmyung, Kim, Byoung Hoon, Kim, Hyomin, Kim, Jungwhan, Kim, Minkyoung, Kim, Minseung, Kim, Sungdong, Kim, Yonghee, Kim, Youngjun, Kim, Youngkwan, Ko, Donghyeon, Lee, Dughyun, Lee, Ha Young, Lee, Jaehong, Lee, Jieun, Lee, Jonghyun, Lee, Jongjin, Lee, Min Young, Lee, Yehbin, Min, Taehong, Min, Yuri, Moon, Kiyoon, Oh, Hyangnam, Park, Jaesun, Park, Kyuyon, Park, Younghun, Seo, Hanbae, Seo, Seunghyun, Sim, Mihyun, Son, Gyubin, Yeo, Matt, Yeom, Kyung Hoon, Yoo, Wonjoon, You, Myungin, Ahn, Doheon, Ahn, Homin, Ahn, Joohee, Ahn, Seongmin, An, Chanwoo, An, Hyeryun, An, Junho, An, Sang-Min, Byun, Boram, Byun, Eunbin, Cha, Jongho, Chang, Minji, Chang, Seunggyu, Cho, Haesong, Cho, Youngdo, Choi, Dalnim, Choi, Daseul, Choi, Hyoseok, Choi, Minseong, Choi, Sangho, Choi, Seongjae, Choi, Wooyong, Chun, Sewhan, Go, Dong Young, Ham, Chiheon, Han, Danbi, Han, Jaemin, Hong, Moonyoung, Hong, Sung Bum, Hwang, Dong-Hyun, Hwang, Seongchan, Im, Jinbae, Jang, Hyuk Jin, Jang, Jaehyung, Jang, Jaeni, Jang, Sihyeon, Jang, Sungwon, Jeon, Joonha, Jeong, Daun, Jeong, Joonhyun, Jeong, Kyeongseok, Jeong, Mini, Jin, Sol, Jo, Hanbyeol, Jo, Hanju, Jo, Minjung, Jung, Chaeyoon, Jung, Hyungsik, Jung, Jaeuk, Jung, Ju Hwan, Jung, Kwangsun, Jung, Seungjae, Ka, Soonwon, Kang, Donghan, Kang, Soyoung, Kil, Taeho, Kim, Areum, Kim, Beomyoung, Kim, Byeongwook, Kim, Daehee, Kim, Dong-Gyun, Kim, Donggook, Kim, Donghyun, Kim, Euna, Kim, Eunchul, Kim, Geewook, Kim, Gyu Ri, Kim, Hanbyul, Kim, Heesu, Kim, Isaac, Kim, Jeonghoon, Kim, Jihye, Kim, Joonghoon, Kim, Minjae, Kim, Minsub, Kim, Pil Hwan, Kim, Sammy, Kim, Seokhun, Kim, Seonghyeon, Kim, Soojin, Kim, Soong, Kim, Soyoon, Kim, Sunyoung, Kim, Taeho, Kim, Wonho, Kim, Yoonsik, Kim, You Jin, Kim, Yuri, Kwon, Beomseok, Kwon, Ohsung, Kwon, Yoo-Hwan, Lee, Anna, Lee, Byungwook, Lee, Changho, Lee, Daun, Lee, Dongjae, Lee, Ha-Ram, Lee, Hodong, Lee, Hwiyeong, Lee, Hyunmi, Lee, Injae, Lee, Jaeung, Lee, Jeongsang, Lee, Jisoo, Lee, Jongsoo, Lee, Joongjae, Lee, Juhan, Lee, Jung Hyun, Lee, Junghoon, Lee, Junwoo, Lee, Se Yun, Lee, Sujin, Lee, Sungjae, Lee, Sungwoo, Lee, Wonjae, Lee, Zoo Hyun, Lim, Jong Kun, Lim, Kun, Lim, Taemin, Na, Nuri, Nam, Jeongyeon, Nam, Kyeong-Min, Noh, Yeonseog, Oh, Biro, Oh, Jung-Sik, Oh, Solgil, Oh, Yeontaek, Park, Boyoun, Park, Cheonbok, Park, Dongju, Park, Hyeonjin, Park, Hyun Tae, Park, Hyunjung, Park, Jihye, Park, Jooseok, Park, Junghwan, Park, Jungsoo, Park, Miru, Park, Sang Hee, Park, Seunghyun, Park, Soyoung, Park, Taerim, Park, Wonkyeong, Ryu, Hyunjoon, Ryu, Jeonghun, Ryu, Nahyeon, Seo, Soonshin, Seo, Suk Min, Shim, Yoonjeong, Shin, Kyuyong, Shin, Wonkwang, Sim, Hyun, Sim, Woongseob, Soh, Hyejin, Son, Bokyong, Son, Hyunjun, Son, Seulah, Song, Chi-Yun, Song, Chiyoung, Song, Ka Yeon, Song, Minchul, Song, Seungmin, Wang, Jisung, Yeo, Yonggoo, Yi, Myeong Yeon, Yim, Moon Bin, Yoo, Taehwan, Yoo, Youngjoon, Yoon, Sungmin, Yoon, Young Jin, Yu, Hangyeol, Yu, Ui Seon, Zuo, Xingdong, Bae, Jeongin, Bae, Joungeun, Cho, Hyunsoo, Cho, Seonghyun, Cho, Yongjin, Choi, Taekyoon, Choi, Yera, Chung, Jiwan, Han, Zhenghui, Heo, Byeongho, Hong, Euisuk, Hwang, Taebaek, Im, Seonyeol, Jegal, Sumin, Jeon, Sumin, Jeong, Yelim, Jeong, Yonghyun, Jiang, Can, Jiang, Juyong, Jin, Jiho, Jo, Ara, Jo, Younghyun, Jung, Hoyoun, Jung, Juyoung, Kang, Seunghyeong, Kim, Dae Hee, Kim, Ginam, Kim, Hangyeol, Kim, Heeseung, Kim, Hyojin, Kim, Hyojun, Kim, Hyun-Ah, Kim, Jeehye, Kim, Jin-Hwa, Kim, Jiseon, Kim, Jonghak, Kim, Jung Yoon, Kim, Rak Yeong, Kim, Seongjin, Kim, Seoyoon, Kim, Sewon, Kim, Sooyoung, Kim, Sukyoung, Kim, Taeyong, Ko, Naeun, Koo, Bonseung, Kwak, Heeyoung, Kwon, Haena, Kwon, Youngjin, Lee, Boram, Lee, Bruce W., Lee, Dagyeong, Lee, Erin, Lee, Euijin, Lee, Ha Gyeong, Lee, Hyojin, Lee, Hyunjeong, Lee, Jeeyoon, Lee, Jeonghyun, Lee, Jongheok, Lee, Joonhyung, Lee, Junhyuk, Lee, Mingu, Lee, Nayeon, Lee, Sangkyu, Lee, Se Young, Lee, Seulgi, Lee, Seung Jin, Lee, Suhyeon, Lee, Yeonjae, Lee, Yesol, Lee, Youngbeom, Lee, Yujin, Li, Shaodong, Liu, Tianyu, Moon, Seong-Eun, Moon, Taehong, Nihlenramstroem, Max-Lasse, Oh, Wonseok, Oh, Yuri, Park, Hongbeen, Park, Hyekyung, Park, Jaeho, Park, Nohil, Park, Sangjin, Ryu, Jiwon, Ryu, Miru, Ryu, Simo, Seo, Ahreum, Seo, Hee, Seo, Kangdeok, Shin, Jamin, Shin, Seungyoun, Sin, Heetae, Wang, Jiangping, Wang, Lei, Xiang, Ning, Xiao, Longxiang, Xu, Jing, Yi, Seonyeong, Yoo, Haanju, Yoo, Haneul, Yoo, Hwanhee, Yu, Liang, Yu, Youngjae, Yuan, Weijie, Zeng, Bo, Zhou, Qian, Cho, Kyunghyun, Ha, Jung-Woo, Park, Joonsuk, Hwang, Jihyun, Kwon, Hyoung Jo, Kwon, Soonyong, Lee, Jungyeon, Lee, Seungho, Lim, Seonghyeon, Noh, Hyunkyung, Choi, Seungho, Lee, Sang-Woo, Lim, Jung Hwa, Sung, Nako

arXiv.org Artificial IntelligenceApr-13-2024

We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.

benchmark, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2404.01954

Country:

Asia (0.67)
Europe (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.45)

Industry:

Health & Medicine (1.00)
Government (0.93)
Law (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Add feedback

Universal Domain Adaptation for Robust Handling of Distributional Shifts in NLP

Kim, Hyuhng Joon, Cho, Hyunsoo, Lee, Sang-Woo, Kim, Junyeob, Park, Choonghyun, Lee, Sang-goo, Yoo, Kang Min, Kim, Taeuk

arXiv.org Artificial IntelligenceOct-23-2023

When deploying machine learning systems to the wild, it is highly desirable for them to effectively leverage prior knowledge to the unfamiliar domain while also firing alarms to anomalous inputs. In order to address these requirements, Universal Domain Adaptation (UniDA) has emerged as a novel research area in computer vision, focusing on achieving both adaptation ability and robustness (i.e., the ability to detect out-of-distribution samples). While UniDA has led significant progress in computer vision, its application on language input still needs to be explored despite its feasibility. In this paper, we propose a comprehensive benchmark for natural language that offers thorough viewpoints of the model's generalizability and robustness. Our benchmark encompasses multiple datasets with varying difficulty levels and characteristics, including temporal shifts and diverse domains. On top of our testbed, we validate existing UniDA methods from computer vision and state-of-the-art domain adaptation techniques from NLP literature, yielding valuable findings: We observe that UniDA methods originally designed for image input can be effectively transferred to the natural language domain while also underscoring the effect of adaptation difficulty in determining the model's performance.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2310.14849

Country:

North America > United States > Minnesota (0.28)
Asia > Middle East (0.28)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Instruction Tuning with Human Curriculum

Lee, Bruce W., Cho, Hyunsoo, Yoo, Kang Min

arXiv.org Artificial IntelligenceOct-14-2023

The dominant paradigm for instruction tuning is the random-shuffled training of maximally diverse instruction-response pairs. This paper explores the potential benefits of applying a structured cognitive learning approach to instruction tuning in contemporary large language models like ChatGPT and GPT-4. Unlike the previous conventional randomized instruction dataset, we propose a highly structured synthetic dataset that mimics the progressive and organized nature of human education. We curate our dataset by aligning it with educational frameworks, incorporating meta information including its topic and cognitive rigor level for each sample. Our dataset covers comprehensive fine-grained topics spanning diverse educational stages (from middle school to graduate school) with various questions for each topic to enhance conceptual depth using Bloom's taxonomy-a classification framework distinguishing various levels of human cognition for each concept. The results demonstrate that this cognitive rigorous training approach yields significant performance enhancements - +3.06 on the MMLU benchmark and an additional +1.28 on AI2 Reasoning Challenge (hard set) - compared to conventional randomized training, all while avoiding additional computational costs. This research highlights the potential of leveraging human learning principles to enhance the capabilities of language models in comprehending and responding to complex instructions and tasks.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2310.09518

Country: North America > United States > Minnesota (0.28)

Genre:

Instructional Material > Course Syllabus & Notes (1.00)
Research Report > New Finding (0.87)

Industry:

Education > Curriculum (0.93)
Education > Educational Setting > K-12 Education > Secondary School (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Prompt-Augmented Linear Probing: Scaling beyond the Limit of Few-shot In-Context Learners

Cho, Hyunsoo, Kim, Hyuhng Joon, Kim, Junyeob, Lee, Sang-Woo, Lee, Sang-goo, Yoo, Kang Min, Kim, Taeuk

arXiv.org Artificial IntelligenceJun-13-2023

Through in-context learning (ICL), large-scale language models are effective few-shot learners without additional model fine-tuning. However, the ICL performance does not scale well with the number of available training samples as it is limited by the inherent input length constraint of the underlying language model. Meanwhile, many studies have revealed that language models are also powerful feature extractors, allowing them to be utilized in a black-box manner and enabling the linear probing paradigm, where lightweight discriminators are trained on top of the pre-extracted input representations. This paper proposes prompt-augmented linear probing (PALP), a hybrid of linear probing and ICL, which leverages the best of both worlds. PALP inherits the scalability of linear probing and the capability of enforcing language models to derive more meaningful representations via tailoring input into a more conceivable form. Throughout in-depth investigations on various datasets, we verified that PALP significantly enhances the input representations closing the gap between ICL in the data-hungry scenario and fine-tuning in the data-abundant scenario with little training overhead, potentially making PALP a strong alternative in a black-box scenario.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2212.10873

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning

Cho, Hyunsoo, Park, Choonghyun, Kim, Junyeop, Kim, Hyuhng Joon, Yoo, Kang Min, Lee, Sang-goo

arXiv.org Artificial IntelligenceJun-13-2023

As the size of the pre-trained language model (PLM) continues to increase, numerous parameter-efficient transfer learning methods have been proposed recently to compensate for the tremendous cost of fine-tuning. Despite the impressive results achieved by large pre-trained language models (PLMs) and various parameter-efficient transfer learning (PETL) methods on sundry benchmarks, it remains unclear if they can handle inputs that have been distributionally shifted effectively. In this study, we systematically explore how the ability to detect out-of-distribution (OOD) changes as the size of the PLM grows or the transfer methods are altered. Specifically, we evaluated various PETL techniques, including fine-tuning, Adapter, LoRA, and prefix-tuning, on three different intention classification tasks, each utilizing various language models with different scales.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2301.1166

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.82)
(2 more...)

Add feedback

CELDA: Leveraging Black-box Language Model as Enhanced Classifier without Labels

Cho, Hyunsoo, Kim, Youna, Lee, Sang-goo

arXiv.org Artificial IntelligenceJun-9-2023

Utilizing language models (LMs) without internal access is becoming an attractive paradigm in the field of NLP as many cutting-edge LMs are released through APIs and boast a massive scale. The de-facto method in this type of black-box scenario is known as prompting, which has shown progressive performance enhancements in situations where data labels are scarce or unavailable. Despite their efficacy, they still fall short in comparison to fully supervised counterparts and are generally brittle to slight modifications. In this paper, we propose Clustering-enhanced Linear Discriminative Analysis, a novel approach that improves the text classification accuracy with a very weak-supervision signal (i.e., name of the labels). Our framework draws a precise decision boundary without accessing weights or gradients of the LM model or data labels. The core ideas of CELDA are twofold: (1) extracting a refined pseudo-labeled dataset from an unlabeled dataset, and (2) training a lightweight and robust model on the top of LM, which learns an accurate decision boundary from an extracted noisy dataset. Throughout in-depth investigations on various datasets, we demonstrated that CELDA reaches new state-of-the-art in weakly-supervised text classification and narrows the gap with a fully-supervised model. Additionally, our proposed methodology can be applied universally to any LM and has the potential to scale to larger models, making it a more viable option for utilizing large LMs.

machine learning, natural language, text classification, (16 more...)

arXiv.org Artificial Intelligence

2306.02693

Genre: Research Report > Promising Solution (0.34)

Industry:

Leisure & Entertainment (0.68)
Transportation > Air (0.62)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.55)

Add feedback

Masked Contrastive Learning for Anomaly Detection

Cho, Hyunsoo, Seol, Jinseok, Lee, Sang-goo

arXiv.org Artificial IntelligenceJan-30-2023

Detecting anomalies is one fundamental aspect of a safety-critical software system, however, it remains a long-standing problem. Numerous branches of works have been proposed to alleviate the complication and have demonstrated their efficiencies. In particular, self-supervised learning based methods are spurring interest due to their capability of learning diverse representations without additional labels. Among self-supervised learning tactics, contrastive learning is one specific framework validating their superiority in various fields, including anomaly detection. However, the primary objective of contrastive learning is to learn task-agnostic features without any labels, which is not entirely suited to discern anomalies. In this paper, we propose a task-specific variant of contrastive learning named masked contrastive learning, which is more befitted for anomaly detection. Moreover, we propose a new inference method dubbed self-ensemble inference that further boosts performance by leveraging the ability learned through auxiliary self-supervision tasks. By combining our models, we can outperform previous state-of-the-art methods by a significant margin on various benchmark datasets.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2105.08793

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback