Yuan, Yunhao
Redefining Simplicity: Benchmarking Large Language Models from Lexical to Document Simplification
Qiang, Jipeng, Huang, Minjiang, Zhu, Yi, Yuan, Yunhao, Zhang, Chaowei, Yu, Kui
Text simplification (TS) refers to the process of reducing the complexity of a text while retaining its original meaning and key information. Existing work has shown only that large language models (LLMs) outperform supervised non-LLM methods on sentence simplification. This study offers the first comprehensive analysis of LLM performance across four TS tasks: lexical, syntactic, sentence, and document simplification. We compare lightweight, closed-source, and open-source LLMs against traditional non-LLM methods using automatic metrics and human evaluations. Our experiments reveal that LLMs not only outperform non-LLM approaches on all four tasks but also often generate outputs that exceed the quality of existing human-annotated references. Finally, we outline future directions for TS in the era of LLMs.
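As a rough illustration of how such a cross-task comparison can be set up, the sketch below queries a single LLM with task-specific instructions for each of the four TS granularities. The model name and prompt wording are assumptions for demonstration only, not the prompts used in the paper.

```python
# Illustrative sketch: one LLM, four task-specific simplification prompts.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TASK_PROMPTS = {
    "lexical":   "Replace the complex words in this text with simpler synonyms, changing nothing else:",
    "syntactic": "Simplify the syntactic structure of this text without changing its meaning:",
    "sentence":  "Rewrite this sentence so it is easier to read while keeping its meaning:",
    "document":  "Simplify this document at the discourse, sentence, and word level, keeping all key information:",
}

def simplify(text: str, task: str, model: str = "gpt-4o-mini") -> str:
    """Return the LLM's simplification of `text` for the given TS task."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"{TASK_PROMPTS[task]}\n\n{text}"}],
        temperature=0.0,
    )
    return response.choices[0].message.content.strip()
```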
New Evaluation Paradigm for Lexical Simplification
Qiang, Jipeng, Huang, Minjiang, Zhu, Yi, Yuan, Yunhao, Zhang, Chaowei, Ouyang, Xiaoye
Lexical Simplification (LS) methods use a three-step pipeline: complex word identification, substitute generation, and substitute ranking, each with separate evaluation datasets. We found that large language models (LLMs) can simplify sentences directly with a single prompt, bypassing the traditional pipeline. However, existing LS datasets are not suitable for evaluating these LLM-generated simplified sentences, as they focus on providing substitutes for single complex words without identifying all complex words in a sentence. To address this gap, we propose a new annotation method for constructing an all-in-one LS dataset through human-machine collaboration. Automated methods generate a pool of potential substitutes, which human annotators then assess, suggesting additional alternatives as needed. Additionally, we explore LLM-based methods with single prompts, in-context learning, and chain-of-thought techniques. We introduce a multi-LLM collaboration approach that simulates each step of the LS task. Experimental results demonstrate that the multi-LLM collaboration approach significantly outperforms existing baselines.
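A minimal sketch of the multi-LLM collaboration idea appears below: separate prompts play the roles of complex word identification, substitute generation, and substitute ranking. The `call_llm` helper, model name, and prompt wording are hypothetical illustrations, not the paper's implementation.

```python
# Sketch of a three-role, prompt-based LS pipeline.
from openai import OpenAI

client = OpenAI()

def call_llm(prompt: str, model: str = "gpt-4o-mini") -> str:
    out = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}], temperature=0.0
    )
    return out.choices[0].message.content.strip()

def simplify_all_complex_words(sentence: str) -> str:
    # Step 1: one LLM identifies every complex word in the sentence.
    words = call_llm(f"List every complex word in this sentence, comma-separated:\n{sentence}")
    # Step 2: a second LLM proposes simpler substitutes for each identified word.
    candidates = call_llm(
        f"Sentence: {sentence}\nComplex words: {words}\n"
        "For each complex word, suggest three simpler substitutes that fit the context."
    )
    # Step 3: a third LLM ranks the candidates and rewrites the sentence with the best ones.
    return call_llm(
        f"Sentence: {sentence}\nCandidate substitutes: {candidates}\n"
        "Pick the best substitute for each complex word and return the rewritten sentence only."
    )
```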
Progressive Document-level Text Simplification via Large Language Models
Fang, Dengzhao, Qiang, Jipeng, Zhu, Yi, Yuan, Yunhao, Li, Wei, Liu, Yan
Research on text simplification has primarily focused on lexical and sentence-level changes, while long document-level simplification (DS) remains relatively unexplored. Large Language Models (LLMs), like ChatGPT, have excelled in many natural language processing tasks. However, their performance on DS tasks is unsatisfactory, as they often treat DS as merely document summarization. In the DS task, the generated long sequences must not only remain consistent with the original document throughout but also carry out appropriate simplification operations at the discourse, sentence, and word levels. Human editors employ a hierarchical complexity simplification strategy to simplify documents. This study simulates that strategy through multi-stage collaboration among LLMs. We propose a progressive simplification method (ProgDS) that hierarchically decomposes the task into discourse-level, topic-level, and lexical-level simplification. Experimental results demonstrate that ProgDS significantly outperforms existing smaller models or direct prompting with LLMs, advancing the state of the art in document simplification.
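The sketch below illustrates the progressive decomposition under stated assumptions: three stages (discourse, topic, lexical), each prompting an LLM with the previous stage's output. The stage prompts and the `llm` callable are illustrative, not ProgDS's actual prompts.

```python
# Sketch of hierarchical, stage-by-stage document simplification.
from typing import Callable

STAGES = [
    ("discourse", "Reorganize and shorten this document at the discourse level: "
                  "merge or drop redundant paragraphs while keeping all key information.\n\n"),
    ("topic",     "Simplify each paragraph of this document: split long sentences and "
                  "remove peripheral details while keeping the main points.\n\n"),
    ("lexical",   "Replace complex words and phrases in this document with simpler "
                  "alternatives without changing the meaning.\n\n"),
]

def progressive_simplify(document: str, llm: Callable[[str], str]) -> str:
    """Run the three simplification stages in order, feeding each stage's output to the next."""
    text = document
    for _stage_name, instruction in STAGES:
        text = llm(instruction + text)
    return text
```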
Prompt-tuning for Clickbait Detection via Text Summarization
Deng, Haoxiang, Zhu, Yi, Wang, Ye, Qiang, Jipeng, Yuan, Yunhao, Li, Yun, Zhang, Runmei
Clickbaits are surprising social posts or deceptive news headlines that attempt to lure users into clicking, and they are posted at unprecedented rates for profit or commercial revenue. The spread of clickbait has significant negative impacts on users, exposing them to misleading content or even click-jacking attacks. Unlike fake news detection, the crucial problem in clickbait detection is determining whether the headline matches the corresponding content. Most existing methods compute the semantic similarity between headlines and contents to detect clickbait. However, because headlines and contents differ greatly in length and semantic features, directly computing semantic similarity often fails to capture the relationship between them. To address this problem, we propose a prompt-tuning method for clickbait detection via text summarization: text summarization is introduced to condense the contents, and clickbait detection is performed based on the similarity between the headline and the generated summary. Specifically, we first introduce a two-stage text summarization model to produce high-quality news summaries based on pre-trained language models, and then both the headlines and the newly generated summaries are incorporated as inputs for prompt-tuning. Additionally, a variety of strategies are used to incorporate external knowledge and improve clickbait detection. Extensive experiments on well-known clickbait detection datasets demonstrate that our method achieves state-of-the-art performance.
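As a rough sketch of the prompt-based formulation, the snippet below places a headline and a generated summary into a cloze template and lets a masked language model score verbalizer words at the mask position; in actual prompt-tuning the template and backbone would be fine-tuned on labeled data. The template, verbalizer words, and `bert-base-uncased` backbone are illustrative assumptions, not the paper's configuration.

```python
# Sketch: score headline/summary consistency with a cloze template and a masked LM.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
VERBALIZER = {"legitimate": "yes", "clickbait": "no"}  # assumed label words

def classify(headline: str, summary: str) -> str:
    template = (f"Headline: {headline} Summary: {summary} "
                f"Question: does the headline match the summary? Answer: {tokenizer.mask_token}.")
    inputs = tokenizer(template, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the [MASK] position and compare the verbalizer words' logits there.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
    scores = {label: logits[0, mask_pos, tokenizer.convert_tokens_to_ids(word)].item()
              for label, word in VERBALIZER.items()}
    return max(scores, key=scores.get)
```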
Clickbait Detection via Large Language Models
Wang, Han, Zhu, Yi, Wang, Ye, Li, Yun, Yuan, Yunhao, Qiang, Jipeng
Clickbait, which aims to lure users with surprising and even thrilling headlines to increase click-through rates, permeates almost all online content publishers, such as news portals and social media. Recently, Large Language Models (LLMs) have emerged as a powerful instrument and achieved tremendous success in a series of NLP downstream tasks. However, it is not yet known whether LLMs can serve as a high-quality clickbait detection system. In this paper, we analyze the performance of LLMs in few-shot and zero-shot scenarios on several English and Chinese benchmark datasets. Experimental results show that LLMs do not achieve the best results compared with state-of-the-art deep learning and fine-tuned PLM-based methods. Contrary to human intuition, the experiments also demonstrate that LLMs cannot reliably detect clickbait from the headlines alone.
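For illustration, a zero-shot probe along these lines might prompt the model once with the headline alone and once with the headline plus its content; the `ask` helper, model name, and prompt wording below are assumptions for demonstration, not the paper's setup.

```python
# Sketch of a zero-shot clickbait probe: headline-only vs. headline plus content.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}], temperature=0.0
    )
    return resp.choices[0].message.content.strip().lower()

def detect_clickbait(headline: str, content: str = "") -> str:
    prompt = f"Is the following headline clickbait? Answer 'yes' or 'no'.\nHeadline: {headline}"
    if content:
        prompt += f"\nArticle content: {content}"
    return ask(prompt)
```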
Multilingual Lexical Simplification via Paraphrase Generation
Liu, Kang, Qiang, Jipeng, Li, Yun, Yuan, Yunhao, Zhu, Yi, Hua, Kaixun
Lexical simplification (LS) methods based on pretrained language models have made remarkable progress, generating potential substitutes for a complex word through analysis of its contextual surroundings. However, these methods require separate pretrained models for different languages and disregard the preservation of sentence meaning. In this paper, we propose a novel multilingual LS method via paraphrase generation, as paraphrases provide diversity in word selection while preserving the sentence's meaning. We treat paraphrasing as a zero-shot translation task within multilingual neural machine translation that supports hundreds of languages. After feeding the input sentence into the encoder of the paraphrase model, we generate the substitutes using a novel decoding strategy that concentrates solely on lexical variations of the complex word. Experimental results demonstrate that our approach significantly surpasses BERT-based methods and a zero-shot GPT-3-based method on English, Spanish, and Portuguese.
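To illustrate the zero-shot "translate into the same language" trick, the sketch below paraphrases a sentence with a public multilingual NMT checkpoint by forcing the target language to equal the source language; the `facebook/m2m100_418M` checkpoint is a stand-in assumption, and the paper's specialized decoding strategy for the complex word is not reproduced here.

```python
# Sketch: same-language "translation" with a multilingual NMT model as a paraphraser.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

def paraphrase(sentence: str, lang: str = "en", num_outputs: int = 5):
    tokenizer.src_lang = lang
    inputs = tokenizer(sentence, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.get_lang_id(lang),  # decode into the same language
        num_beams=num_outputs * 2,
        num_return_sequences=num_outputs,
        max_new_tokens=64,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```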
ParaLS: Lexical Substitution via Pretrained Paraphraser
Qiang, Jipeng, Liu, Kang, Li, Yun, Yuan, Yunhao, Zhu, Yi
Lexical substitution (LS) aims at finding appropriate substitutes for a target word in a sentence. Recently, LS methods based on pretrained language models have made remarkable progress, generating potential substitutes for a target word through analysis of its contextual surroundings. However, these methods tend to overlook the preservation of the sentence's meaning when generating substitutes. This study explores how to generate substitute candidates from a paraphraser, as the generated paraphrases contain variations in word choice while preserving the sentence's meaning. Since we cannot directly generate the substitutes via commonly used decoding strategies, we propose two simple decoding strategies that focus on the variations of the target word during decoding. Experimental results show that our methods outperform state-of-the-art LS methods based on pretrained language models on three benchmarks.
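A simplified sketch of the underlying idea follows: force a paraphraser's decoder through the prefix of the sentence up to the target word, then read the highest-probability next tokens as substitute candidates. The paraphraser checkpoint and the assumption that the paraphrase prefix equals the original prefix are illustrative simplifications, not the paper's actual decoding strategies.

```python
# Sketch: prefix-constrained decoding from a paraphraser to collect substitute candidates.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL = "tuner007/pegasus_paraphrase"  # placeholder; any seq2seq paraphraser could be used
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)

def substitute_candidates(sentence: str, prefix: str, k: int = 10):
    """Force the decoder to reproduce `prefix` (the sentence up to the target word),
    then return the top-k next tokens as candidate substitutes."""
    enc = tokenizer(sentence, return_tensors="pt")
    start = torch.tensor([[model.config.decoder_start_token_id]])
    prefix_ids = tokenizer(prefix, add_special_tokens=False, return_tensors="pt").input_ids
    decoder_input_ids = torch.cat([start, prefix_ids], dim=-1)
    with torch.no_grad():
        logits = model(input_ids=enc.input_ids,
                       attention_mask=enc.attention_mask,
                       decoder_input_ids=decoder_input_ids).logits
    top = torch.topk(logits[0, -1], k).indices
    return [tokenizer.decode([int(t)]).strip() for t in top]

# e.g. substitute_candidates("The committee convened to discuss the issue.", "The committee")
# ranks tokens likely to follow "The committee" in a paraphrase, among which
# substitutes for "convened" appear.
```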
Minority Stress Experienced by LGBTQ Online Communities during the COVID-19 Pandemic
Yuan, Yunhao, Verma, Gaurav, Keller, Barbara, Aledavood, Talayeh
The COVID-19 pandemic has disproportionately impacted the lives of minorities, such as members of the LGBTQ community (lesbian, gay, bisexual, transgender, and queer), due to pre-existing social disadvantages and health disparities. Although extensive research has been carried out on the impact of the COVID-19 pandemic on different aspects of the general population's lives, few studies have focused on the LGBTQ population. In this paper, we develop and evaluate two sets of machine learning classifiers using a pre-pandemic and a during-pandemic dataset to identify Twitter posts exhibiting minority stress, which is a unique pressure faced by members of the LGBTQ population due to their sexual and gender identities. We demonstrate that our best pre- and during-pandemic models show strong and stable performance for detecting posts that contain minority stress. We investigate the linguistic differences in minority stress posts across the pre- and during-pandemic periods and find that anger words are strongly associated with minority stress during the COVID-19 pandemic. We explore the impact of the pandemic on the emotional states of the LGBTQ population by adopting propensity score-based matching to perform a causal analysis. The results show that the LGBTQ population exhibits a greater increase in the usage of cognitive words and a greater decline in the usage of positive emotion words than a general-population group with similar pre-pandemic behavioral attributes. Our findings have implications for the public health domain and policy-makers to provide adequate support, especially with respect to mental health, to the LGBTQ population during future crises.
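A minimal sketch of the propensity score matching step is shown below, assuming each user is described by pre-pandemic behavioral covariates and a binary group indicator; the column names and one-nearest-neighbor matching are illustrative choices, not necessarily the paper's exact procedure.

```python
# Sketch: propensity score matching on pre-pandemic covariates before comparing outcomes.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def match_users(df: pd.DataFrame, covariates: list[str], treatment_col: str = "is_lgbtq"):
    """Match each treated user to the control user with the closest propensity score."""
    X, t = df[covariates].to_numpy(), df[treatment_col].to_numpy()
    scores = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    treated, control = np.where(t == 1)[0], np.where(t == 0)[0]
    pairs = []
    for i in treated:
        j = control[np.argmin(np.abs(scores[control] - scores[i]))]  # 1-nearest neighbor
        pairs.append((i, j))
    return pairs  # compare during-pandemic outcomes within these matched pairs
```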
Sentence Simplification via Large Language Models
Feng, Yutao, Qiang, Jipeng, Li, Yun, Yuan, Yunhao, Zhu, Yi
Sentence Simplification (SS) aims to rephrase complex sentences into simpler sentences while retaining the original meaning. Large Language Models (LLMs) have demonstrated the ability to perform a variety of natural language processing tasks. However, it remains unclear how LLMs perform on the SS task compared to current SS methods. To address this gap in research, we undertake a systematic evaluation of the zero-/few-shot learning capability of LLMs by assessing their performance on existing SS benchmarks. We carry out an empirical comparison of the performance of ChatGPT and the most advanced GPT-3.5 model (text-davinci-003).
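As an illustration of how such a zero-shot evaluation can be run, the sketch below prompts an LLM to simplify each test sentence and scores the outputs with SARI, assuming the EASSE toolkit; the prompt wording and model name are placeholders rather than the paper's exact configuration.

```python
# Sketch: zero-shot sentence simplification scored with corpus-level SARI.
from easse.sari import corpus_sari
from openai import OpenAI

client = OpenAI()

def simplify_zero_shot(sentence: str, model: str = "gpt-3.5-turbo") -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": f"Rewrite this sentence so it is simpler but keeps its meaning:\n{sentence}"}],
        temperature=0.0,
    )
    return resp.choices[0].message.content.strip()

def evaluate(orig_sents, refs_sents):
    # refs_sents: one list of reference simplifications per annotator, each aligned with orig_sents.
    sys_sents = [simplify_zero_shot(s) for s in orig_sents]
    return corpus_sari(orig_sents=orig_sents, sys_sents=sys_sents, refs_sents=refs_sents)
```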
Mental Health Coping Stories on Social Media: A Causal-Inference Study of Papageno Effect
Yuan, Yunhao, Saha, Koustuv, Keller, Barbara, Isometsä, Erkki Tapio, Aledavood, Talayeh
The Papageno effect concerns how media can play a positive role in preventing and mitigating suicidal ideation and behaviors. With the increasing ubiquity and widespread use of social media, individuals often express and share lived experiences and struggles with mental health. However, there is a gap in our understanding about the existence and effectiveness of the Papageno effect in social media, which we study in this paper. In particular, we adopt a causal-inference framework to examine the impact of exposure to mental health coping stories on individuals on Twitter.