Choi, Minjin
From Reading to Compressing: Exploring the Multi-document Reader for Prompt Compression
Choi, Eunseong, Lee, Sunkyung, Choi, Minjin, Park, June, Lee, Jongwuk
Large language models (LLMs) have achieved significant performance gains across various tasks using advanced prompting techniques. However, the increasing length of prompts leads to high computational costs and often obscures crucial information. Prompt compression has been proposed to alleviate these issues, but it faces challenges in (i) capturing the global context and (ii) training the compressor effectively. To tackle these challenges, we introduce a novel prompt compression method, namely Reading To Compressing (R2C), which utilizes the Fusion-in-Decoder (FiD) architecture to identify the important information in the prompt. Specifically, the cross-attention scores of the FiD are used to discern essential chunks and sentences from the prompt. R2C effectively captures the global context without compromising semantic consistency, while avoiding the need for pseudo-labels to train the compressor. Empirical results show that R2C retains key contexts, enhancing LLM performance by 6% in out-of-domain evaluations while reducing the prompt length by 80%.
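A minimal sketch of the selection step described above, under stated assumptions: the placeholder chunks, random attention tensors, mean-pooling aggregation, and two-chunk budget are illustrative choices, not the R2C implementation. It only shows how cross-attention mass from a FiD-style decoder could be pooled into per-chunk importance scores and used to keep the highest-scoring chunks.

```python
# Hedged sketch: rank prompt chunks by pooled cross-attention scores.
# The attention tensors below are random placeholders; in practice they
# would come from a Fusion-in-Decoder model's decoder cross-attention.
import numpy as np

rng = np.random.default_rng(0)

chunks = ["chunk 1 text ...", "chunk 2 text ...", "chunk 3 text ..."]

def chunk_importance(cross_attn: np.ndarray) -> float:
    """cross_attn: (layers, heads, dec_len, enc_len) attention weights for
    one encoded chunk. Average over layers, heads, and decoder positions to
    get per-encoder-token scores, then average over tokens so longer chunks
    are not favored by length alone."""
    per_token = cross_attn.mean(axis=(0, 1, 2))  # (enc_len,)
    return float(per_token.mean())

# Placeholder attention tensors, one per chunk (encoder length varies).
attn_per_chunk = [rng.random((4, 8, 16, int(rng.integers(20, 40)))) for _ in chunks]

scores = np.array([chunk_importance(a) for a in attn_per_chunk])
budget = 2  # number of chunks to keep (the compression ratio is a free choice)
keep = sorted(np.argsort(scores)[::-1][:budget])

compressed_prompt = "\n".join(chunks[i] for i in keep)
print(compressed_prompt)
```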
Temporal Linear Item-Item Model for Sequential Recommendation
Park, Seongmin, Yoon, Mincheol, Choi, Minjin, Lee, Jongwuk
In sequential recommendation (SR), neural models have been actively explored due to their remarkable performance, but they suffer from inefficiency inherent to their complexity. On the other hand, linear SR models exhibit high efficiency and achieve competitive or superior accuracy compared to neural models. However, they only consider the sequential order of items (i.e., sequential information) and overlook the actual timestamps (i.e., temporal information), which limits their ability to capture various user preference drifts over time. To address this issue, we propose a novel linear SR model, named TemporAl LinEar item-item model (TALE), which incorporates temporal information while preserving training/inference efficiency, with three key components. (i) Single-target augmentation concentrates on a single target item, enabling us to learn the temporal correlation for the target item. (ii) Time interval-aware weighting utilizes the actual timestamps to discern item correlations depending on time intervals. (iii) Trend-aware normalization reflects the dynamic shift of item popularity over time. Our empirical studies show that TALE outperforms ten competing SR models on five benchmark datasets, with gains of up to 18.71%, and is particularly effective for long-tail items, with gains of up to 30.45%. The source code is available at https://github.com/psm1206/TALE.
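A hedged sketch of the time interval-aware weighting idea only, not the TALE model itself: the toy interaction log, the exponential decay form, and the tau scale are assumptions made for illustration, and the learned linear item-item weights of the actual model are not reproduced here. It shows how consecutive interactions can contribute to an item-item statistic with a weight that shrinks as the time gap between them grows.

```python
# Hedged sketch: a time-decayed item-item co-occurrence statistic, where
# item pairs consumed closer together in time contribute more. All
# constants and data below are illustrative placeholders.
import numpy as np

n_items = 5
tau = 86_400.0  # decay scale in seconds (assumed hyperparameter)

# Toy interaction log: (user_id, item_id, unix_timestamp)
log = [
    (0, 0, 1_000),
    (0, 2, 5_000),
    (0, 3, 90_000),
    (1, 1, 2_000),
    (1, 2, 3_000),
]

# Group interactions per user, sorted by timestamp.
sequences = {}
for user, item, ts in log:
    sequences.setdefault(user, []).append((ts, item))

W = np.zeros((n_items, n_items))
for seq in sequences.values():
    seq.sort()
    for (t_prev, i_prev), (t_next, i_next) in zip(seq, seq[1:]):
        # Larger time gaps get exponentially smaller weight.
        W[i_prev, i_next] += np.exp(-(t_next - t_prev) / tau)

# Next-item scores for a user whose latest item is item 2:
print(W[2])
```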
Collaborative Distillation for Top-N Recommendation
Lee, Jae-woong, Choi, Minjin, Lee, Jongwuk, Shim, Hyunjung
Knowledge distillation (KD) is a well-known method to reduce inference latency by compressing a cumbersome teacher model into a small student model. Despite the success of KD in classification tasks, applying KD to recommender models is challenging due to the sparsity of positive feedback, the ambiguity of missing feedback, and the ranking nature of top-N recommendation. To address these issues, we propose a new KD model for the collaborative filtering approach, namely collaborative distillation (CD). Specifically, we reformulate the loss function to deal with the ambiguity of missing feedback. Experimental results demonstrate that the proposed model outperforms the state-of-the-art method by 2.7-33.2%. Moreover, the proposed model achieves performance comparable to the teacher model.

Neural recommender models [1]-[9] have achieved better performance than conventional latent factor models, either by capturing nonlinear and complex correlation patterns among users/items or by leveraging hidden features extracted from auxiliary information such as texts and images. However, the number of parameters in neural models exceeds that of conventional models by one or more orders of magnitude, indicating a trade-off between accuracy and efficiency. As a result, neural recommender models usually suffer from higher latency during the inference phase. Our primary goal is to develop a recommender model that balances effectiveness and efficiency. In this paper, we employ knowledge distillation (KD) [10], a network compression technique that transfers the distilled knowledge of a large model (a.k.a. a teacher model) to a small model (a.k.a. a student model). Because the student model can utilize the knowledge transferred from the teacher model, it naturally exhibits computational efficiency and low memory usage, and is thus capable of balancing effectiveness and efficiency. Specifically, the training procedure for KD consists of two steps. In the offline training phase, the teacher model is supervised by a training dataset with labels.
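A hedged sketch of a distillation-style objective for top-N recommendation under ambiguous missing feedback: the function name cd_style_loss, the per-entry weighting, and the mixing coefficient lam are illustrative assumptions rather than the paper's exact CD loss.

```python
# Hedged sketch: combine a hard loss on observed positives with a soft
# distillation term that follows the teacher's scores on missing entries,
# where the true label is ambiguous. Not the paper's exact formulation.
import torch
import torch.nn.functional as F

def cd_style_loss(student_logits, teacher_probs, observed_mask, lam=0.5):
    """student_logits, teacher_probs, observed_mask: (batch, n_items).
    observed_mask is 1.0 for observed positives, 0.0 for missing feedback."""
    # Hard loss on observed positives only.
    hard = F.binary_cross_entropy_with_logits(
        student_logits, observed_mask, weight=observed_mask, reduction="sum"
    ) / observed_mask.sum().clamp(min=1)

    # Soft loss: imitate the teacher on unobserved (ambiguous) entries.
    missing = 1.0 - observed_mask
    soft = F.binary_cross_entropy_with_logits(
        student_logits, teacher_probs, weight=missing, reduction="sum"
    ) / missing.sum().clamp(min=1)

    return (1 - lam) * hard + lam * soft

# Toy usage with random tensors standing in for real model outputs.
torch.manual_seed(0)
student_scores = torch.randn(2, 6)                 # student raw scores
teacher_scores = torch.sigmoid(torch.randn(2, 6))  # teacher probabilities
positives = (torch.rand(2, 6) > 0.7).float()       # observed positives
print(cd_style_loss(student_scores, teacher_scores, positives))
```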