AITopics | Yu, Changlong

Yu, Changlong

DECIDER: A Dual-System Rule-Controllable Decoding Framework for Language Generation

Xu, Chen, Lan, Tian, Yu, Changlong, Wang, Wei, Gao, Jun, Ji, Yu, Dong, Qunxi, Qian, Kun, Li, Piji, Bi, Wei, Hu, Bin

arXiv.org Artificial Intelligence, Jul-7-2024

Constrained decoding approaches aim to control the meaning or style of text generated by a Pre-trained Language Model (PLM) using specific target words during inference. However, these methods often guide plausible continuations by greedily selecting targets, which, while completing the task, may disrupt the natural patterns of human language generation. In this work, we propose a novel decoding framework, DECIDER, which enables us to program rules on how we complete tasks to control a PLM. Differing from previous work, our framework transforms the encouragement of target words into the encouragement of all words that satisfy the rule. Specifically, DECIDER is a dual system where a PLM is equipped with a First-Order Logic (FOL) reasoner to express and evaluate the rules, and a decision function to merge the outputs from both systems to steer the generation. Experiments on CommonGen and PersonaChat demonstrate that DECIDER can effectively follow given rules to achieve generation tasks in a more human-like manner.
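As a rough illustration of the dual-system idea in the abstract above, the sketch below merges next-token logits from a language model with per-token rule-satisfaction scores. Everything in it is an assumption made for illustration: the toy vocabulary, the numeric values, the additive form of `merge_logits`, and the weight `alpha` are hypothetical and are not taken from the DECIDER paper.

```python
import math

def softmax(logits):
    """Convert a list of logits into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def merge_logits(lm_logits, rule_scores, alpha=2.0):
    """Hypothetical decision function: shift each logit by alpha times
    the degree (0..1) to which the token satisfies the rule."""
    return [logit + alpha * score for logit, score in zip(lm_logits, rule_scores)]

# Toy vocabulary and scores (all values are made up for illustration).
vocab = ["dog", "frisbee", "the", "runs"]
lm_logits = [1.0, 0.2, 2.0, 0.5]     # language model preferences
rule_scores = [1.0, 1.0, 0.0, 0.0]   # every word satisfying the rule is encouraged

probs = softmax(merge_logits(lm_logits, rule_scores))
best = vocab[probs.index(max(probs))]
```

In this toy setup the language model alone would pick "the", while the merged distribution favors a rule-satisfying word. Both rule-satisfying words are boosted rather than one greedily chosen target, which is the contrast the abstract draws with earlier constrained decoding.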

computational linguistic, large language model, machine learning

arXiv: 2403.01954

Country:

Europe (1.00)
Asia > China (0.28)
North America > United States > Louisiana (0.14)

Genre: Research Report (0.50)

Industry:

Education > Educational Setting (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.84)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Benchmarking Large Language Models with Augmented Instructions for Fine-grained Information Extraction

Gao, Jun, Zhao, Huan, Zhang, Yice, Wang, Wei, Yu, Changlong, Xu, Ruifeng

arXiv.org Artificial Intelligence, Oct-8-2023

Information Extraction (IE) is an essential task in Natural Language Processing. Traditional methods have relied on coarse-grained extraction with simple instructions. However, with the emergence of Large Language Models (LLMs), there is a need to adapt IE techniques to leverage the capabilities of these models. This paper introduces a fine-grained IE benchmark dataset tailored for LLMs, employing augmented instructions for each information type, which includes task descriptions, extraction rules, output formats, and examples. Through extensive evaluations, we observe that encoder-decoder models, particularly T5 and FLAN-T5, perform well in generalizing to unseen information types, while ChatGPT exhibits greater adaptability to new task forms. Our results also indicate that performance is not solely dictated by model scale, and highlight the significance of architecture, data diversity, and learning techniques. This work paves the way for a more refined and versatile utilization of LLMs in Information Extraction.
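The abstract above names four components of an augmented instruction: task description, extraction rules, output format, and examples. A minimal sketch of how such an instruction might be assembled into a prompt follows. The class name, field names, section labels, and sample strings are all hypothetical and do not reflect the benchmark's actual format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AugmentedInstruction:
    """The four components named in the abstract, for one information type."""
    task_description: str
    extraction_rules: List[str]
    output_format: str
    examples: List[str] = field(default_factory=list)

    def to_prompt(self, text: str) -> str:
        """Render the instruction and the input text as a single prompt.
        The section labels and their order are an assumed layout."""
        rules = "\n".join(f"- {rule}" for rule in self.extraction_rules)
        examples = "\n".join(self.examples)
        return (
            f"Task: {self.task_description}\n"
            f"Rules:\n{rules}\n"
            f"Output format: {self.output_format}\n"
            f"Examples:\n{examples}\n"
            f"Input: {text}"
        )

instruction = AugmentedInstruction(
    task_description="Extract every person name mentioned in the text.",
    extraction_rules=["Return full names only.", "Ignore job titles."],
    output_format="A JSON list of strings.",
    examples=['Input: Dr. Ada Lovelace spoke. Output: ["Ada Lovelace"]'],
)
prompt = instruction.to_prompt("Alan Turing met Grace Hopper.")
```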

large language model, machine learning, natural language

arXiv: 2310.05092

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)

FolkScope: Intention Knowledge Graph Construction for E-commerce Commonsense Discovery

Yu, Changlong, Wang, Weiqi, Liu, Xin, Bai, Jiaxin, Song, Yangqiu, Li, Zheng, Gao, Yifan, Cao, Tianyu, Yin, Bing

arXiv.org Artificial Intelligence, May-11-2023

Understanding users' intentions in e-commerce platforms requires commonsense knowledge. In this paper, we present FolkScope, an intention knowledge graph construction framework to reveal the structure of humans' minds about purchasing items. As commonsense knowledge is usually ineffable and not expressed explicitly, it is challenging to perform information extraction. Thus, we propose a new approach that leverages the generation power of large language models (LLMs) and human-in-the-loop annotation to semi-automatically construct the knowledge graph. LLMs first generate intention assertions via e-commerce-specific prompts to explain shopping behaviors, where the intention can be an open reason or a predicate falling into one of 18 categories aligning with ConceptNet, e.g., IsA, MadeOf, UsedFor, etc. Then we annotate plausibility and typicality labels of sampled intentions as training data in order to populate human judgments to all automatic generations. Last, to structurize the assertions, we propose pattern mining and conceptualization to form more condensed and abstract knowledge. Extensive evaluations and studies demonstrate that our constructed knowledge graph can model e-commerce knowledge well and has many potential applications.

artificial intelligence, knowledge, natural language

arXiv: 2211.08316

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.46)

Genre: Research Report (0.82)

Industry:

Media (1.00)
Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Exploring the Feasibility of ChatGPT for Event Extraction

Gao, Jun, Zhao, Huan, Yu, Changlong, Xu, Ruifeng

arXiv.org Artificial Intelligence, Mar-9-2023

Event extraction is a fundamental task in natural language processing that involves identifying and extracting information about events mentioned in text. However, it is a challenging task due to the lack of annotated data, which is expensive and time-consuming to obtain. The emergence of large language models (LLMs) such as ChatGPT provides an opportunity to solve language tasks with simple prompts without the need for task-specific datasets and fine-tuning. While ChatGPT has demonstrated impressive results in tasks like machine translation, text summarization, and question answering, it presents challenges when used for complex tasks like event extraction. Unlike other tasks, event extraction requires the model to be provided with a complex set of instructions defining all event types and their schemas. To explore the feasibility of ChatGPT for event extraction and the challenges it poses, we conducted a series of experiments. Our results show that ChatGPT has, on average, only 51.04% of the performance of a task-specific model such as EEQA in long-tail and complex scenarios. Our usability testing experiments indicate that ChatGPT is not robust enough, and continuous refinement of the prompt does not lead to stable performance improvements, which can result in a poor user experience. Besides, ChatGPT is highly sensitive to different prompt styles.

artificial intelligence, machine learning, natural language

arXiv: 2303.03836

Country: Asia > Middle East > Republic of Türkiye (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mask-then-Fill: A Flexible and Effective Data Augmentation Framework for Event Extraction

Gao, Jun, Yu, Changlong, Wang, Wei, Zhao, Huan, Xu, Ruifeng

arXiv.org Artificial Intelligence, Jan-6-2023

We present Mask-then-Fill, a flexible and effective data augmentation framework for event extraction. Our approach allows for more flexible manipulation of text and thus can generate more diverse data while keeping the original event structure unchanged as much as possible. Specifically, it first randomly masks out an adjunct sentence fragment and then infills a variable-length text span with a fine-tuned infilling model. The main advantage lies in that it can replace a fragment of arbitrary length in the text with another fragment of variable length, compared to the existing methods which can only replace a single word or a fixed-length fragment. On trigger and argument extraction tasks, the proposed framework is more effective than baseline methods and it demonstrates particularly strong results in the low-resource setting. Our further analysis shows that it achieves a good balance between diversity and distributional similarity.
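The two steps described in the abstract above, masking an adjunct fragment and then infilling a span of variable length, can be sketched roughly as follows. This is only an illustration under stated assumptions: the placeholder token `[BLANK]`, the fixed mask positions (the abstract describes random masking), and `toy_infill`, which stands in for the fine-tuned infilling model, are all hypothetical.

```python
from typing import Callable, List

MASK = "[BLANK]"  # placeholder token; the name is an assumption

def mask_fragment(tokens: List[str], start: int, end: int) -> List[str]:
    """Replace tokens[start:end] with a single placeholder."""
    return tokens[:start] + [MASK] + tokens[end:]

def fill_fragment(masked: List[str],
                  infill: Callable[[List[str]], List[str]]) -> List[str]:
    """Substitute the placeholder with a span of any length from `infill`."""
    i = masked.index(MASK)
    return masked[:i] + infill(masked) + masked[i + 1:]

def toy_infill(masked: List[str]) -> List[str]:
    """Stand-in for a fine-tuned infilling model (returns a fixed span)."""
    return ["at", "dawn"]

tokens = "The rebels attacked the convoy near the border yesterday".split()
# Mask the adjunct "near the border yesterday" (positions 5..8), leaving
# the trigger "attacked" and its arguments untouched.
masked = mask_fragment(tokens, 5, 9)
augmented = fill_fragment(masked, toy_infill)
```

Here a four-token fragment is replaced by a two-token one, which a single-word or fixed-length replacement scheme could not do, while the trigger and arguments of the event stay in place.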

computational linguistic, machine learning, natural language

arXiv: 2301.02427

Country:

Europe (1.00)
North America > United States (0.69)
Asia > China (0.48)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)

A Generative Approach for Script Event Prediction via Contrastive Fine-tuning

Zhu, Fangqi, Gao, Jun, Yu, Changlong, Wang, Wei, Xu, Chen, Mu, Xin, Yang, Min, Xu, Ruifeng

arXiv.org Artificial Intelligence, Dec-9-2022

Script event prediction aims to predict the subsequent event given the context. This requires the capability to infer the correlations between events. Recent works have attempted to improve event correlation reasoning by using pretrained language models and incorporating external knowledge (e.g., discourse relations). Though promising results have been achieved, some challenges still remain. First, the pretrained language models adopted by current works ignore event-level knowledge, resulting in an inability to capture the correlations between events well. Second, modeling correlations between events with discourse relations is limited because it can only capture explicit correlations between events with discourse markers, and cannot capture many implicit correlations. To this end, we propose a novel generative approach for this task, in which a pretrained language model is fine-tuned with an event-centric pretraining objective and predicts the next event within a generative paradigm. Specifically, we first introduce a novel event-level blank infilling strategy as the learning objective to inject event-level knowledge into the pretrained language model, and then design a likelihood-based contrastive loss for fine-tuning the generative model. Instead of using an additional prediction layer, we perform prediction by using sequence likelihoods generated by the generative model. Our approach models correlations between events in a soft way without any external knowledge. The likelihood-based prediction eliminates the need to use additional networks to make predictions and is somewhat interpretable since it scores each word in the event. Experimental results on the multi-choice narrative cloze (MCNC) task demonstrate that our approach achieves better results than other state-of-the-art baselines. Our code will be available at https://github.com/zhufq00/mcnc.
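To make the likelihood-based prediction in the abstract above concrete, the sketch below scores each candidate next event by the sum of its per-token log-probabilities and picks the highest-scoring one. The contrastive loss shown is one plausible form (softmax cross-entropy over sequence scores) and is an assumption, not necessarily the loss used in the paper. The function names and all numeric values are hypothetical.

```python
import math
from typing import List

def sequence_log_likelihood(token_log_probs: List[float]) -> float:
    """Sum per-token log-probabilities to score one candidate event."""
    return sum(token_log_probs)

def predict(candidates: List[List[float]]) -> int:
    """Return the index of the candidate with the highest likelihood."""
    scores = [sequence_log_likelihood(c) for c in candidates]
    return scores.index(max(scores))

def contrastive_loss(candidates: List[List[float]], gold: int) -> float:
    """Assumed form: cross-entropy over a softmax of sequence scores,
    which raises the gold event's likelihood relative to the distractors."""
    scores = [sequence_log_likelihood(c) for c in candidates]
    m = max(scores)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_norm - scores[gold]

# Made-up per-token log-probabilities for three candidate next events.
candidates = [
    [-0.2, -0.5, -0.1],   # sums to about -0.8
    [-1.0, -1.2, -0.9],   # sums to about -3.1
    [-0.7, -2.0, -0.4],   # sums to about -3.1
]
choice = predict(candidates)
loss = contrastive_loss(candidates, gold=0)
```

Because the score is a sum over tokens, each word's contribution to the decision can be inspected directly, which matches the abstract's remark that the prediction is somewhat interpretable. No extra prediction layer is involved.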

artificial intelligence, machine learning, natural language

arXiv: 2212.03496

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
