
Collaborating Authors: Jiang, Haiyun


A Benchmark for Text Expansion: Datasets, Metrics, and Baselines

arXiv.org Artificial Intelligence

This work presents a new task of Text Expansion (TE), which aims to insert fine-grained modifiers at proper locations in plain text to concretize or vivify human writing. Unlike existing insertion-based writing-assistance tasks, TE requires the model to be more flexible in both locating and generating, and more cautious in preserving basic semantics. We leverage four complementary approaches to construct a dataset with 12 million automatically generated instances and 2K human-annotated references for both English and Chinese. To facilitate automatic evaluation, we design metrics from multiple perspectives. In particular, we propose Info-Gain to effectively measure the informativeness of expansions, an important quality dimension in TE. On top of a pre-trained text-infilling model, we build both pipelined and joint Locate&Infill models, which demonstrate superiority over Text2Text baselines, especially in expansion informativeness. Experiments verify the feasibility of the TE task and point out potential directions for future research toward better automatic text expansion.
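The paper's exact Info-Gain formula is not given in this abstract, so the following is only a minimal sketch of one way an informativeness score for expansions could be approximated: measure the extra surprisal the inserted modifiers contribute under a pre-trained language model, normalized by the number of added tokens. The use of GPT-2 negative log-likelihood and the length normalization are illustrative assumptions, not the paper's definition.

```python
# Hypothetical Info-Gain-style informativeness proxy (NOT the paper's metric):
# extra LM surprisal contributed by an expansion, per added token.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def neg_log_likelihood(text: str) -> float:
    """Total negative log-likelihood of `text` under the LM."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # `out.loss` is the mean NLL per predicted token; rescale to a total.
    return out.loss.item() * (ids.size(1) - 1)

def info_gain(plain: str, expanded: str) -> float:
    """Surprisal added by the expansion, normalized by added length."""
    added = len(tokenizer(expanded).input_ids) - len(tokenizer(plain).input_ids)
    if added <= 0:
        return 0.0
    return (neg_log_likelihood(expanded) - neg_log_likelihood(plain)) / added

print(info_gain("The cat sat on the mat.",
                "The sleepy ginger cat sat on the worn straw mat."))
```

Under this proxy, a trivial expansion that merely repeats existing words adds little surprisal and scores low, while a genuinely informative modifier scores higher.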


Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: An Empirical Study

arXiv.org Artificial Intelligence

Evaluating the quality of generated text is a challenging task in NLP due to the inherent complexity and diversity of text. Recently, large language models (LLMs) have garnered significant attention for their impressive performance on various tasks. We therefore investigate the effectiveness of LLMs, especially ChatGPT, and explore ways to optimize their use in assessing text quality. We compare three kinds of reference-free evaluation methods. The experimental results show that ChatGPT can evaluate text quality effectively from various perspectives without references, and that it outperforms most existing automatic metrics. In particular, the Explicit Score, which uses ChatGPT to generate a numeric score measuring text quality, is the most effective and reliable of the three approaches. However, directly comparing the quality of two texts may lead to suboptimal results. We believe this paper provides valuable insights for evaluating text quality with LLMs, and we have released the data we used.
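To make the Explicit Score idea concrete, here is a minimal sketch in which an LLM is prompted to return a bare numeric quality score for a generated text with no reference. The prompt wording, the 1-5 scale, and the model choice are assumptions, not the paper's exact template.

```python
# Hypothetical "Explicit Score"-style reference-free scoring sketch.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def explicit_score(source: str, generated: str, aspect: str = "fluency") -> float:
    prompt = (
        f"Score the following generated text for {aspect} on a scale "
        f"from 1 (worst) to 5 (best). Reply with only the number.\n\n"
        f"Source: {source}\nGenerated text: {generated}\nScore:"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic scoring
    )
    return float(resp.choices[0].message.content.strip())
```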


Towards Visual Taxonomy Expansion

arXiv.org Artificial Intelligence

The taxonomy expansion task is essential for organizing the ever-increasing volume of new concepts into existing taxonomies. Most existing methods focus exclusively on textual semantics, which leads to an inability to generalize to unseen terms and to the "Prototypical Hypernym Problem." In this paper, we propose Visual Taxonomy Expansion (VTE), which introduces visual features into the taxonomy expansion task. We propose a textual hypernymy learning task and a visual prototype learning task to cluster textual and visual semantics. In addition to the tasks on the respective modalities, we introduce a hyper-proto constraint that integrates textual and visual semantics to produce fine-grained visual semantics. Our method is evaluated on two datasets, where it obtains compelling results: on the Chinese taxonomy dataset, it significantly improves accuracy by 8.75% and also performs better than ChatGPT.


TeGit: Generating High-Quality Instruction-Tuning Data with Text-Grounded Task Design

arXiv.org Artificial Intelligence

High-quality instruction-tuning data is critical to improving LLM capabilities. Existing data collection methods are limited either by unrealistic manual labeling costs or by the hallucination that comes with relying solely on LLM generation. To address these problems, this paper presents a scalable method for automatically collecting high-quality instruction-tuning data by training language models to design tasks based on human-written texts. Intuitively, grounding in human-written text helps the model attenuate hallucinations during task generation. Unlike instruction back-translation-based methods that directly take the given text as the response, we require the model to generate the instruction, input, and output simultaneously in order to filter out noise. Results from both automatic and manual evaluation demonstrate the quality of our dataset.
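A minimal sketch of the text-grounded idea: prompt a model to design an (instruction, input, output) triple answerable from a given passage, and keep only well-formed triples. The prompt template and the JSON-based filter are illustrative assumptions, and the API call stands in for TeGit's trained task-design model, which this sketch does not reproduce.

```python
# Hypothetical text-grounded task design sketch (not TeGit's exact recipe).
import json
from typing import Optional
from openai import OpenAI

client = OpenAI()  # stand-in for a language model trained to design tasks

TEMPLATE = (
    "Design one task that can be answered using only the passage below. "
    'Return JSON with keys "instruction", "input", and "output".\n\n'
    "Passage:\n{passage}"
)

def design_task(passage: str) -> Optional[dict]:
    """Draft an (instruction, input, output) triple grounded in `passage`."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": TEMPLATE.format(passage=passage)}],
        temperature=0.7,
    )
    try:
        triple = json.loads(resp.choices[0].message.content)
    except json.JSONDecodeError:
        return None  # noise filter: drop malformed generations
    if not isinstance(triple, dict):
        return None
    # Require all three fields so incomplete tasks are filtered out.
    if all(triple.get(k) for k in ("instruction", "input", "output")):
        return triple
    return None
```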


Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling

arXiv.org Artificial Intelligence

Modeling discourse, the linguistic phenomena that go beyond individual sentences, is a fundamental yet challenging aspect of natural language processing (NLP). However, existing evaluation benchmarks primarily focus on intra-sentence properties and overlook critical discourse phenomena that cross sentences. To bridge the gap, we propose Disco-Bench, a benchmark that can evaluate inter-sentence discourse properties across a diverse set of NLP tasks covering understanding, translation, and generation. Disco-Bench consists of 9 document-level test sets in the literature domain, which contain rich discourse phenomena (e.g., cohesion and coherence) in Chinese and/or English. For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge. In total, we evaluate 20 general-purpose, in-domain, and commercial models based on Transformers, advanced pretraining architectures, and large language models (LLMs). Our results show (1) the challenge and necessity of our evaluation benchmark, and (2) that fine-grained pretraining on literary document-level training data consistently improves the modeling of discourse information. We will release the datasets, pretrained models, and leaderboard, which we hope will significantly facilitate research in this field: https://github.com/longyuewangdcu/Disco-Bench.


Sen2Pro: A Probabilistic Perspective to Sentence Embedding from Pre-trained Language Model

arXiv.org Artificial Intelligence

Sentence embedding is one of the most fundamental tasks in Natural Language Processing and plays an important role in various downstream tasks. Recent breakthroughs in sentence embedding have been achieved with pre-trained language models (PLMs). Despite this success, an embedded vector (Sen2Vec), as a point estimate, does not naturally express uncertainty in a task-agnostic way. This paper therefore proposes an efficient framework for probabilistic sentence embedding (Sen2Pro) from PLMs, which represents a sentence as a probability density distribution in an embedding space to reflect both model uncertainty and data uncertainty (i.e., the many-to-one nature of sentence representation). The proposed framework works in a plug-and-play way without retraining PLMs, is easy to implement, and can be applied on top of any PLM. The superiority of Sen2Pro over Sen2Vec is verified theoretically and illustrated practically on different NLP tasks.
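One way to realize the model-uncertainty half of such a framework, sketched below under the assumption that Monte Carlo dropout is used: keep dropout active at inference, encode the sentence several times, and treat the sample mean and variance as a Gaussian representation. Sen2Pro's actual estimators, including its data-uncertainty side, may differ.

```python
# Minimal probabilistic-embedding sketch in the spirit of Sen2Pro.
# Assumption: model uncertainty is estimated via MC dropout.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.train()  # keep dropout stochastic; no parameters are updated

@torch.no_grad()
def sen2pro(sentence: str, n_samples: int = 20):
    ids = tokenizer(sentence, return_tensors="pt")
    samples = torch.stack([
        model(**ids).last_hidden_state[:, 0].squeeze(0)  # [CLS] vector
        for _ in range(n_samples)
    ])
    # Gaussian representation: (mean, per-dimension variance).
    return samples.mean(dim=0), samples.var(dim=0)

mu, var = sen2pro("Sentence embeddings can carry uncertainty.")
print(mu.shape, var.mean().item())
```

The plug-and-play property claimed in the abstract is visible here: the PLM is used as-is, with no fine-tuning step.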


Zero-Shot Rumor Detection with Propagation Structure via Prompt Learning

arXiv.org Artificial Intelligence

The spread of rumors alongside breaking events seriously obscures the truth in the era of social media. Previous studies reveal that, due to the lack of annotated resources, rumors presented in minority languages are hard to detect. Furthermore, unforeseen breaking events not covered in yesterday's news exacerbate the scarcity of data resources. In this work, we propose a novel zero-shot framework based on prompt learning to detect rumors falling in different domains or presented in different languages. More specifically, we first represent rumors circulating on social media as diverse propagation threads, then design a hierarchical prompt encoding mechanism to learn language-agnostic contextual representations for both prompts and rumor data. To further enhance domain adaptation, we model domain-invariant structural features from the propagation threads to incorporate structural position representations of influential community responses. In addition, a new virtual response augmentation method is used to improve model training. Extensive experiments on three real-world datasets demonstrate that our proposed model achieves much better performance than state-of-the-art methods and exhibits a superior capacity for detecting rumors at early stages.


Frequency-aware Dimension Selection for Static Word Embedding by Mixed Product Distance

arXiv.org Artificial Intelligence

Static word embeddings are still useful, particularly for context-unavailable tasks, because when no context is available, pre-trained language models often perform worse than static word embeddings. Although dimension is a key factor determining the quality of static word embeddings, automatic dimension selection is rarely discussed. In this paper, we investigate the impact of word frequency on dimension selection and empirically find that word frequency is so vital that it must be taken into account during dimension selection. Based on this empirical finding, we propose a dimension selection method that uses a metric, the Mixed Product Distance (MPD), to select a proper dimension for word embedding algorithms without training any word embedding. By applying a post-processing function to oracle matrices, the MPD-based method de-emphasizes the impact of word frequency. Experiments on both context-unavailable and context-available tasks demonstrate the better efficiency-performance trade-off of our MPD-based dimension selection method over baselines.


A Simple and Plug-and-play Method for Unsupervised Sentence Representation Enhancement

arXiv.org Artificial Intelligence

Generating proper sentence embeddings in an unsupervised way is beneficial for semantic matching and retrieval problems in real-world scenarios. This paper presents Representation ALchemy (RepAL), an extremely simple post-processing method that enhances sentence representations. The basic idea of RepAL is to de-emphasize redundant information in sentence embeddings generated by pre-trained models. Through comprehensive experiments, we show that RepAL requires no training and is a plug-and-play method that can be combined with most existing unsupervised sentence learning models. We also conduct an in-depth analysis to understand RepAL.
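RepAL's precise de-emphasizing operation is not spelled out in this abstract. As a stand-in in the same spirit, the sketch below applies a well-known training-free post-processing: removing the mean and the top principal components from a batch of embeddings, the "all-but-the-top" recipe of Mu and Viswanath (2018). This is a named substitute, not RepAL itself.

```python
# Training-free post-processing sketch (all-but-the-top, not RepAL itself):
# strip the mean and dominant directions, which often carry redundant,
# frequency-like information shared across sentences.
import numpy as np

def postprocess(embeddings: np.ndarray, n_components: int = 2) -> np.ndarray:
    """embeddings: (n_sentences, dim) matrix of PLM sentence vectors."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Top singular vectors capture the dominant (often redundant) directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:n_components]                   # (k, dim)
    return centered - centered @ top.T @ top  # project those directions out

vecs = np.random.randn(100, 768)  # placeholder for real PLM embeddings
enhanced = postprocess(vecs)
```

Like RepAL, this kind of post-processing is plug-and-play: it operates on the embedding matrix alone and composes with any upstream sentence encoder.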


TextShield: Beyond Successfully Detecting Adversarial Sentences in Text Classification

arXiv.org Artificial Intelligence

Adversarial attacks are a major challenge for neural network models in NLP and preclude their deployment in safety-critical applications. A recent line of work, detection-based defense, aims to distinguish adversarial sentences from benign ones. However, the core limitation of previous detection methods is that, unlike defense methods from other paradigms, they cannot give correct predictions on adversarial sentences. To solve this issue, this paper proposes TextShield: (1) we discover a link between text attacks and saliency information, and propose a saliency-based detector that can effectively detect whether an input sentence is adversarial; (2) we design a saliency-based corrector for the detected adversarial sentences. By combining the saliency-based detector and corrector, TextShield extends the detection-only paradigm to a detection-correction paradigm, filling the gap in existing detection-based defense. Comprehensive experiments show that TextShield consistently achieves performance higher than or comparable to state-of-the-art defense methods across various attacks on different benchmarks.

Deep Neural Networks (DNNs) have made great progress in natural language processing (NLP) but are vulnerable to adversarial attacks, raising security and safety concerns; research on defense algorithms against such attacks is urgently needed. The most common attack in NLP is the word-level attack (Wang et al., 2019b; Garg & Ramakrishnan, 2020; Zang et al., 2020; Li et al., 2021), usually implemented by adding, deleting, or substituting words within a sentence. Such attacks often cause catastrophic performance degradation in DNN-based models. Although a number of defense methods exist in the NLP literature (Jia et al., 2019; Ko et al., 2019; Jones et al., 2020; Wang et al., 2020b; Zhou et al., 2021; Dong et al., 2021; Bao et al., 2021), several research problems remain unsolved. One lies in the ineffective application of the existing detection-based defense paradigm to the adversarial defense scenario, which consists of two steps: adversarial detection, which decides whether an input sentence is adversarial, and model prediction, which assigns a label to the input.
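To make the saliency-detection idea concrete, here is a minimal sketch of a gradient-based token-saliency detector: score each token by the gradient norm of the top logit with respect to its embedding, and flag sentences whose saliency mass is unusually concentrated. The concentration statistic, threshold, and model are illustrative assumptions, not TextShield's actual detector.

```python
# Hypothetical gradient-saliency detection sketch (not TextShield's detector).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name).eval()

def token_saliency(sentence: str) -> torch.Tensor:
    """Gradient norm of the top logit w.r.t. each token embedding."""
    enc = tokenizer(sentence, return_tensors="pt")
    embeds = model.get_input_embeddings()(enc.input_ids)
    embeds = embeds.detach().requires_grad_(True)
    logits = model(inputs_embeds=embeds,
                   attention_mask=enc.attention_mask).logits
    logits.max().backward()
    return embeds.grad.norm(dim=-1).squeeze(0)  # one score per token

def looks_adversarial(sentence: str, threshold: float = 0.5) -> bool:
    s = token_saliency(sentence)
    # Heuristic: adversarial edits tend to concentrate saliency on few tokens.
    return (s.max() / s.sum()).item() > threshold
```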