AITopics | Lee, Kyumin

Collaborating Authors

Lee, Kyumin

MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning

Li, Yichuan, Ma, Xiyao, Lu, Sixing, Lee, Kyumin, Liu, Xiaohu, Guo, Chenlei

arXiv.org Artificial Intelligence | Mar-12-2024

Large language models (LLMs) have demonstrated impressive in-context learning (ICL) capabilities, where an LLM makes predictions for a given test input together with a few input-output pairs (demonstrations). Nevertheless, the inclusion of demonstrations leads to a quadratic increase in the computational overhead of the self-attention mechanism. Existing solutions attempt to distill lengthy demonstrations into compact vectors. However, they often require task-specific retraining or compromise the LLM's in-context learning performance. To mitigate these challenges, we present Meta dEmonstratioN Distillation (MEND), where a language model learns to distill any lengthy demonstrations into vectors without retraining for a new downstream task. We exploit knowledge distillation to enhance alignment between MEND and the LLM, achieving both efficiency and effectiveness simultaneously. MEND is endowed with the meta-knowledge of distilling demonstrations through a two-stage training process, which includes meta-distillation pretraining and fine-tuning. Comprehensive evaluations across seven diverse ICL task partitions using decoder-only (GPT-2) and encoder-decoder (T5) models attest to MEND's prowess. It not only matches but often outperforms the Vanilla ICL as well as other state-of-the-art distillation models, while significantly reducing the computational demands. This innovation promises enhanced scalability and efficiency for the practical deployment of large language models.
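The quadratic cost mentioned in the abstract above can be illustrated with a minimal Python sketch. It counts the query-key pairs that full self-attention scores for a prompt, with and without replacing the demonstrations by a fixed number of distilled vectors. All names and numbers (16 demonstrations of 50 tokens each, 100 distilled vectors) are hypothetical and not taken from the paper; constant factors and causal masking are ignored.

```python
def attention_pair_count(num_tokens: int) -> int:
    """Query-key pairs scored by full self-attention over one sequence."""
    return num_tokens * num_tokens


def vanilla_icl_length(num_demos: int, tokens_per_demo: int, test_tokens: int) -> int:
    """Prompt length when every demonstration is included as raw tokens."""
    return num_demos * tokens_per_demo + test_tokens


def distilled_length(num_vectors: int, test_tokens: int) -> int:
    """Prompt length when demonstrations are replaced by distilled vectors."""
    return num_vectors + test_tokens


# Hypothetical sizes, chosen only for illustration.
full = vanilla_icl_length(num_demos=16, tokens_per_demo=50, test_tokens=30)
short = distilled_length(num_vectors=100, test_tokens=30)

full_cost = attention_pair_count(full)
short_cost = attention_pair_count(short)
print(f"vanilla ICL: {full} tokens, {full_cost} attention pairs")
print(f"distilled:   {short} tokens, {short_cost} attention pairs")
print(f"reduction:   {full_cost / short_cost:.1f}x")
```

Because the pair count grows with the square of the sequence length, shrinking the prompt by a given factor reduces the attention work by roughly the square of that factor.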

demonstration, large language model, machine learning

2403.06914

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs

Li, Yichuan, Ding, Kaize, Lee, Kyumin

arXiv.org Artificial Intelligence | Oct-23-2023

Self-supervised representation learning on text-attributed graphs, which aims to create expressive and generalizable representations for various downstream tasks, has received increasing research attention lately. However, existing methods either struggle to capture the full extent of structural context information or rely on task-specific training labels, which largely hampers their effectiveness and generalizability in practice. To solve the problem of self-supervised representation learning on text-attributed graphs, we develop a novel Graph-Centric Language model, GRENADE. Specifically, GRENADE exploits the synergistic effect of both a pre-trained language model and a graph neural network by optimizing with two specialized self-supervised learning algorithms: graph-centric contrastive learning and graph-centric knowledge alignment. The proposed graph-centric self-supervised learning algorithms effectively help GRENADE to capture informative textual semantics as well as structural context information on text-attributed graphs. Through extensive experiments, GRENADE shows its superiority over state-of-the-art methods. Implementation is available at https://github.com/bigheiniu/GRENADE.
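As a rough illustration of the contrastive objective named in the abstract above, here is a generic InfoNCE-style loss in plain Python. This is a textbook formulation, not GRENADE's actual loss: treating the language-model embedding and the graph-neural-network embedding of the same node as the positive pair is an assumption, and the function names, vectors, and temperature are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE loss: -log softmax score of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_sum - logits[0]


# Hypothetical embeddings of one node: text view (anchor) and graph view.
text_view = [1.0, 0.0]
aligned_graph_view = [0.9, 0.1]
misaligned_graph_view = [0.0, 1.0]
other_nodes = [[0.0, 1.0], [-1.0, 0.0]]

loss_aligned = info_nce(text_view, aligned_graph_view, other_nodes)
loss_misaligned = info_nce(text_view, misaligned_graph_view, other_nodes)
print(loss_aligned, loss_misaligned)
```

The loss is lower when the two views of the same node agree, which is the pressure a contrastive objective applies during training.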

artificial intelligence, machine learning, natural language

2310.15109

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

KEPLET: Knowledge-Enhanced Pretrained Language Model with Topic Entity Awareness

Li, Yichuan, Han, Jialong, Lee, Kyumin, Ma, Chengyuan, Yao, Benjamin, Liu, Derek

arXiv.org Artificial Intelligence | May-2-2023

In recent years, Pre-trained Language Models (PLMs) have shown their superiority by pre-training on unstructured text corpora and then fine-tuning on downstream tasks. On entity-rich textual resources like Wikipedia, Knowledge-Enhanced PLMs (KEPLMs) incorporate the interactions between tokens and mentioned entities in pre-training, and are thus more effective on entity-centric tasks such as entity linking and relation classification. Although they exploit Wikipedia's rich structures to some extent, conventional KEPLMs still neglect a unique layout of the corpus, where each Wikipedia page is centered on a topic entity (identified by the page URL and shown in the page title). In this paper, we demonstrate that KEPLMs that do not incorporate the topic entities suffer from insufficient entity interaction and biased (relation) word semantics. We thus propose KEPLET, a novel Knowledge-Enhanced Pre-trained LanguagE model with Topic entity awareness. In an end-to-end manner, KEPLET identifies where to add the topic entity's information in a Wikipedia sentence, fuses such information into token and mentioned-entity representations, and supervises the network learning, through which it takes topic entities back into consideration. Experiments demonstrated the generality and superiority of KEPLET, which was applied to two representative KEPLMs, achieving significant improvements on four entity-centric tasks.

machine learning, natural language, topic entity

2305.01810

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Leisure & Entertainment (1.00)
Media (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Hierarchical Multi-head Attentive Network for Evidence-aware Fake News Detection

Vo, Nguyen, Lee, Kyumin

arXiv.org Artificial Intelligence | Feb-4-2021

The proliferation of biased news, misleading claims, disinformation and fake news has caused heightened negative effects on modern society in various domains ranging from politics, economics to public health. A recent study showed that maliciously fabricated and partisan stories possibly caused citizens' misperception about political candidates (Allcott and Gentzkow, 2017) during the 2016 U.S. presidential elections. In economics, the spread of fake news has manipulated stock price [...]

To detect fake news, researchers proposed to use linguistics and textual content (Castillo et al., 2011; Zhao et al., 2015; Liu et al., 2015). Since textual claims are usually deliberately written to deceive readers, it is hard to detect fake news by solely relying on the content claims. Therefore, multiple works utilized other signals such as temporal spreading patterns (Liu and Wu, 2018), network structures (Wu and Liu, 2018; Vo and Lee, 2018; Shu et al., 2020) and users' feedbacks (Vo and Lee, 2019; Shu et al., 2019; Vo and Lee, 2020a).

deep learning, neural network, proceedings

2102.02680

Country: North America > United States (0.66)

Genre: Research Report > New Finding (0.48)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (0.34)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

HABERTOR: An Efficient and Effective Deep Hatespeech Detector

Tran, Thanh, Hu, Yifan, Hu, Changwei, Yen, Kevin, Tan, Fei, Lee, Kyumin, Park, Serim

arXiv.org Artificial Intelligence | Oct-17-2020

We present our HABERTOR model for detecting hatespeech in large-scale user-generated content. Inspired by the recent success of the BERT model, we propose several modifications to BERT to enhance the performance on the downstream hatespeech classification task. HABERTOR inherits BERT's architecture, but is different in four aspects: (i) it generates its own vocabularies and is pre-trained from scratch using the largest-scale hatespeech dataset; (ii) it consists of Quaternion-based factorized components, resulting in a much smaller number of parameters, faster training and inference, as well as less memory usage; (iii) it uses our proposed multi-source ensemble heads with a pooling layer for separate input sources, to further enhance its effectiveness; and (iv) it uses a regularized adversarial training with our proposed fine-grained and adaptive noise magnitude to enhance its robustness. Through experiments on the large-scale real-world hatespeech dataset with 1.4M annotated comments, we show that HABERTOR works better than 15 state-of-the-art hatespeech detection methods, including fine-tuning Language Models. In particular, compared with BERT, our HABERTOR is 4-5 times faster in the training/inference phase, uses less than 1/3 of the memory, and has better performance, even though we pre-train it by using less than 1% of the number of words. Our generalizability analysis shows that HABERTOR transfers well to other unseen hatespeech datasets and is a more efficient and effective alternative to BERT for hatespeech classification.

deep learning, habertor, neural network

2010.08865

Genre: Research Report (1.00)

Industry:

Information Technology (0.46)
Law (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News

Vo, Nguyen, Lee, Kyumin

arXiv.org Artificial Intelligence | Oct-7-2020

Although many fact-checking systems have been developed in academia and industry, fake news is still proliferating on social media. These systems mostly focus on fact-checking but usually neglect online users who are the main drivers of the spread of misinformation. How can we use fact-checked information to improve users' consciousness of fake news to which they are exposed? How can we stop users from spreading fake news? To tackle these questions, we propose a novel framework to search for fact-checking articles, which address the content of an original tweet (that may contain misinformation) posted by online users. The search can directly warn fake news posters and online users (e.g. the posters' followers) about misinformation, discourage them from spreading fake news, and scale up verified content on social media. Our framework uses both text and images to search for fact-checking articles, and achieves promising results on real-world datasets. Our code and datasets are released at https://github.com/nguyenvo09/EMNLP2020.

deep learning, neural network, tweet

2010.03159

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (1.00)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Information Management (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Quaternion-Based Self-Attentive Long Short-Term User Preference Encoding for Recommendation

Tran, Thanh, You, Di, Lee, Kyumin

arXiv.org Artificial Intelligence | Aug-30-2020

Quaternion space has brought several benefits over the traditional Euclidean space: Quaternions (i) consist of a real and three imaginary components, encouraging richer representations; (ii) utilize the Hamilton product, which better encodes the inter-latent interactions across multiple Quaternion components; and (iii) result in a model with fewer degrees of freedom that is less prone to overfitting. Unfortunately, most of the current recommender systems rely on real-valued representations in Euclidean space to model either a user's long-term or short-term interests. In this paper, we fully utilize Quaternion space to model both a user's long-term and short-term preferences. We first propose a QUaternion-based self-Attentive Long term user Encoding (QUALE) to study the user's long-term intents. Then, we propose a QUaternion-based self-Attentive Short term user Encoding (QUASE) to learn the user's short-term interests. To enhance our models' capability, we propose to fuse QUALE and QUASE into one model, namely QUALSE, by using a Quaternion-based gating mechanism. We further develop Quaternion-based Adversarial learning along with the Bayesian Personalized Ranking (QABPR) to improve our model's robustness. Extensive experiments on six real-world datasets show that our fused QUALSE model outperformed 11 state-of-the-art baselines, improving 8.43% at HIT@1 and 10.27% at NDCG@1 on average compared with the best baseline.
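The Hamilton product in point (ii) of the abstract above is the standard quaternion multiplication. A minimal sketch, with quaternions stored as (real, i, j, k) tuples; the function name and the tuple representation are illustrative choices and are not taken from the paper's code.

```python
def hamilton_product(p, q):
    """Multiply two quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# Basis quaternions.
ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

print(hamilton_product(I, J))  # i * j = k
print(hamilton_product(J, I))  # j * i = -k, so the product is not commutative
```

Every output component mixes all four components of both inputs, which is the cross-component interaction the abstract refers to.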

deep learning, neural network, quaternion

doi: 10.1145/3340531.3411926

2008.13335

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (0.68)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Signed Distance-based Deep Memory Recommender

Tran, Thanh, Liu, Xinyue, Lee, Kyumin, Kong, Xiangnan

arXiv.org Artificial Intelligence | May-1-2019

Personalized recommendation algorithms learn a user's preference for an item by measuring a distance/similarity between them. However, some of the existing recommendation models (e.g., matrix factorization) assume a linear relationship between the user and item. This approach limits the capacity of recommender systems, since the interactions between users and items in real-world applications are much more complex than a linear relationship. To overcome this limitation, in this paper, we design and propose a deep learning framework called Signed Distance-based Deep Memory Recommender, which captures non-linear relationships between users and items explicitly and implicitly, and works well in both the general recommendation task and the shopping basket-based recommendation task. Through an extensive empirical study on six real-world datasets in the two recommendation tasks, our proposed approach achieved significant improvement over ten state-of-the-art recommendation models.

dataset, deep learning, neural network

doi: 10.1145/3308558.3313460

1905.00453

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (1.00)
Information Technology > Services (0.46)
Media > Music (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Regularizing Matrix Factorization with User and Item Embeddings for Recommendation

Tran, Thanh, Lee, Kyumin, Liao, Yiming, Lee, Dongwon

arXiv.org Artificial Intelligence | Aug-31-2018

Following recent successes in exploiting both latent factor and word embedding models in recommendation, we propose a novel Regularized Multi-Embedding (RME) based recommendation model that simultaneously encapsulates the following ideas via decomposition: (1) which items a user likes, (2) which two users co-like the same items, (3) which two items users often co-liked, and (4) which two items users often co-disliked. In experimental validation, the RME outperforms competing state-of-the-art models in both explicit and implicit feedback datasets, significantly improving Recall@5 by 5.9-7.0%, NDCG@20 by 4.3-5.6%, and MAP@10 by 7.9-8.9%. In addition, under the cold-start scenario for users with the lowest number of interactions, the RME improves NDCG@5 over the competing models by 20.2% and 29.4% in the MovieLens-10M and MovieLens-20M datasets, respectively. Our datasets and source code are available at: https://github.com/thanhdtran/RME.git.
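For readers unfamiliar with the ranking metrics reported in the abstract above, the sketch below computes Recall@k and NDCG@k under the common binary-relevance definitions. These are assumed definitions: the paper may use a variant (for example, a different Recall@k denominator), and the function names and the toy ranking are hypothetical.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the ideal gain."""
    dcg = sum(
        1.0 / math.log2(pos + 2)  # positions are 0-based, so rank = pos + 1
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0


# Toy example: a model's ranking for one user and the items the user liked.
ranking = ["a", "b", "c", "d"]
liked = {"a", "d"}

print(recall_at_k(ranking, liked, k=2))
print(ndcg_at_k(ranking, liked, k=2))
```

Recall@k only checks whether liked items reach the top-k, while NDCG@k also rewards placing them nearer the top.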

artificial intelligence, dataset, natural language

doi: 10.1145/3269206.3271730

1809.00979

Country:

North America > United States (0.14)
Europe > Italy (0.14)

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Crowds, Gigs, and Super Sellers: A Measurement Study of a Supply-Driven Crowdsourcing Marketplace

Ge, Hancheng (Texas A&M University) | Caverlee, James (Texas A&M University) | Lee, Kyumin (Utah State University)

AAAI Conferences | Apr-4-2015

The crowdsourcing movement has spawned a host of successful efforts that organize large numbers of globally distributed participants to tackle a range of tasks. While many demand-driven crowd marketplaces have emerged (like Amazon Mechanical Turk, often resulting in workers that are essentially replaceable), we are witnessing the rise of supply-driven marketplaces where specialized workers offer their expertise. In this paper, we present a comprehensive data-driven measurement study of one prominent supply-driven marketplace, Fiverr, wherein we investigate the sellers and their offerings (called "gigs"). As part of this investigation, we identify the key features distinguishing "super sellers" from regular participants and develop a machine-learning-based approach for inferring the quality of gigs, which is especially important for the vast majority of gigs with little feedback.

measurement study, super seller, supply-driven crowdsourcing marketplace

Ninth International AAAI Conference on Web and Social Media

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (0.89)
Information Technology > Artificial Intelligence (0.87)
