Yuan, Yunhao
Redefining Simplicity: Benchmarking Large Language Models from Lexical to Document Simplification
Qiang, Jipeng, Huang, Minjiang, Zhu, Yi, Yuan, Yunhao, Zhang, Chaowei, Yu, Kui
Text simplification (TS) refers to the process of reducing the complexity of a text while retaining its original meaning and key information. Existing work has shown only that large language models (LLMs) outperform supervised non-LLM methods on sentence simplification. This study offers the first comprehensive analysis of LLM performance across four TS tasks: lexical, syntactic, sentence, and document simplification. We compare lightweight, closed-source, and open-source LLMs against traditional non-LLM methods using automatic metrics and human evaluations. Our experiments reveal that LLMs not only outperform non-LLM approaches on all four tasks but also often generate outputs that exceed the quality of existing human-annotated references. Finally, we outline future directions for TS in the era of LLMs.
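As a rough illustration of how such a cross-task comparison can be set up, the sketch below queries a single LLM with task-specific instructions for each of the four TS granularities. The model name and prompt wording are assumptions for demonstration only, not the prompts used in the paper.

```python
# Illustrative sketch: one LLM, four task-specific simplification prompts.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TASK_PROMPTS = {
    "lexical":   "Replace the complex words in this text with simpler synonyms, changing nothing else:",
    "syntactic": "Simplify the syntactic structure of this text without changing its meaning:",
    "sentence":  "Rewrite this sentence so it is easier to read while keeping its meaning:",
    "document":  "Simplify this document at the discourse, sentence, and word level, keeping all key information:",
}

def simplify(text: str, task: str, model: str = "gpt-4o-mini") -> str:
    """Return the LLM's simplification of `text` for the given TS task."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"{TASK_PROMPTS[task]}\n\n{text}"}],
        temperature=0.0,
    )
    return response.choices[0].message.content.strip()
```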
New Evaluation Paradigm for Lexical Simplification
Qiang, Jipeng, Huang, Minjiang, Zhu, Yi, Yuan, Yunhao, Zhang, Chaowei, Ouyang, Xiaoye
Lexical Simplification (LS) methods use a three-step pipeline: complex word identification, substitute generation, and substitute ranking, each with separate evaluation datasets. We found that large language models (LLMs) can simplify sentences directly with a single prompt, bypassing the traditional pipeline. However, existing LS datasets are not suitable for evaluating these LLM-generated simplified sentences, as they focus on providing substitutes for single complex words without identifying all complex words in a sentence. To address this gap, we propose a new annotation method for constructing an all-in-one LS dataset through human-machine collaboration. Automated methods generate a pool of potential substitutes, which human annotators then assess, suggesting additional alternatives as needed. Additionally, we explore LLM-based methods with single prompts, in-context learning, and chain-of-thought techniques. We introduce a multi-LLM collaboration approach that simulates each step of the LS task. Experimental results demonstrate that the multi-LLM collaboration approach significantly outperforms existing baselines.
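A minimal sketch of the multi-LLM collaboration idea appears below: separate prompts play the roles of complex word identification, substitute generation, and substitute ranking. The `call_llm` helper, model name, and prompt wording are hypothetical illustrations, not the paper's implementation.

```python
# Sketch of a three-role, prompt-based LS pipeline.
from openai import OpenAI

client = OpenAI()

def call_llm(prompt: str, model: str = "gpt-4o-mini") -> str:
    out = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}], temperature=0.0
    )
    return out.choices[0].message.content.strip()

def simplify_all_complex_words(sentence: str) -> str:
    # Step 1: one LLM identifies every complex word in the sentence.
    words = call_llm(f"List every complex word in this sentence, comma-separated:\n{sentence}")
    # Step 2: a second LLM proposes simpler substitutes for each identified word.
    candidates = call_llm(
        f"Sentence: {sentence}\nComplex words: {words}\n"
        "For each complex word, suggest three simpler substitutes that fit the context."
    )
    # Step 3: a third LLM ranks the candidates and rewrites the sentence with the best ones.
    return call_llm(
        f"Sentence: {sentence}\nCandidate substitutes: {candidates}\n"
        "Pick the best substitute for each complex word and return the rewritten sentence only."
    )
```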
Progressive Document-level Text Simplification via Large Language Models
Fang, Dengzhao, Qiang, Jipeng, Zhu, Yi, Yuan, Yunhao, Li, Wei, Liu, Yan
Research on text simplification has primarily focused on lexical and sentence-level changes, while long document-level simplification (DS) remains relatively unexplored. Large Language Models (LLMs), like ChatGPT, have excelled in many natural language processing tasks. However, their performance on DS tasks is unsatisfactory, as they often treat DS as merely document summarization. In the DS task, the generated long sequences must not only remain consistent with the original document throughout but also carry out appropriate simplification operations at the discourse, sentence, and word levels. Human editors employ a hierarchical complexity simplification strategy to simplify documents. This study simulates that strategy through multi-stage collaboration among LLMs. We propose a progressive simplification method (ProgDS) that hierarchically decomposes the task into discourse-level, topic-level, and lexical-level simplification. Experimental results demonstrate that ProgDS significantly outperforms existing smaller models or direct prompting with LLMs, advancing the state of the art in document simplification.
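The sketch below illustrates the progressive decomposition under stated assumptions: three stages (discourse, topic, lexical), each prompting an LLM with the previous stage's output. The stage prompts and the `llm` callable are illustrative, not ProgDS's actual prompts.

```python
# Sketch of hierarchical, stage-by-stage document simplification.
from typing import Callable

STAGES = [
    ("discourse", "Reorganize and shorten this document at the discourse level: "
                  "merge or drop redundant paragraphs while keeping all key information.\n\n"),
    ("topic",     "Simplify each paragraph of this document: split long sentences and "
                  "remove peripheral details while keeping the main points.\n\n"),
    ("lexical",   "Replace complex words and phrases in this document with simpler "
                  "alternatives without changing the meaning.\n\n"),
]

def progressive_simplify(document: str, llm: Callable[[str], str]) -> str:
    """Run the three simplification stages in order, feeding each stage's output to the next."""
    text = document
    for _stage_name, instruction in STAGES:
        text = llm(instruction + text)
    return text
```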
Prompt-tuning for Clickbait Detection via Text Summarization
Deng, Haoxiang, Zhu, Yi, Wang, Ye, Qiang, Jipeng, Yuan, Yunhao, Li, Yun, Zhang, Runmei
Clickbaits are surprising social posts or deceptive news headlines that attempt to lure users into clicking, and they are posted at unprecedented rates for profit or commercial revenue. The spread of clickbait has significant negative impacts on users, exposing them to misleading content or even click-jacking attacks. Unlike fake news detection, the crucial problem in clickbait detection is determining whether the headline matches the corresponding content. Most existing methods compute the semantic similarity between headlines and contents to detect clickbait. However, because headlines and contents differ greatly in length and semantic features, directly computing semantic similarity often fails to capture the relationship between them. To address this problem, we propose a prompt-tuning method for clickbait detection via text summarization: text summarization is introduced to condense the contents, and clickbait detection is performed based on the similarity between the headline and the generated summary. Specifically, we first introduce a two-stage text summarization model to produce high-quality news summaries based on pre-trained language models, and then both the headlines and the newly generated summaries are incorporated as inputs for prompt-tuning. Additionally, a variety of strategies are used to incorporate external knowledge and improve clickbait detection. Extensive experiments on well-known clickbait detection datasets demonstrate that our method achieves state-of-the-art performance.
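As a rough sketch of the prompt-based formulation, the snippet below places a headline and a generated summary into a cloze template and lets a masked language model score verbalizer words at the mask position; in actual prompt-tuning the template and backbone would be fine-tuned on labeled data. The template, verbalizer words, and `bert-base-uncased` backbone are illustrative assumptions, not the paper's configuration.

```python
# Sketch: score headline/summary consistency with a cloze template and a masked LM.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
VERBALIZER = {"legitimate": "yes", "clickbait": "no"}  # assumed label words

def classify(headline: str, summary: str) -> str:
    template = (f"Headline: {headline} Summary: {summary} "
                f"Question: does the headline match the summary? Answer: {tokenizer.mask_token}.")
    inputs = tokenizer(template, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the [MASK] position and compare the verbalizer words' logits there.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
    scores = {label: logits[0, mask_pos, tokenizer.convert_tokens_to_ids(word)].item()
              for label, word in VERBALIZER.items()}
    return max(scores, key=scores.get)
```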
Clickbait Detection via Large Language Models
Wang, Han, Zhu, Yi, Wang, Ye, Li, Yun, Yuan, Yunhao, Qiang, Jipeng
Clickbait, which aims to lure users with surprising and even thrilling headlines to increase click-through rates, permeates almost all online content publishers, such as news portals and social media. Recently, Large Language Models (LLMs) have emerged as a powerful instrument and achieved tremendous success in a series of NLP downstream tasks. However, it is not yet known whether LLMs can serve as a high-quality clickbait detection system. In this paper, we analyze the performance of LLMs in few-shot and zero-shot scenarios on several English and Chinese benchmark datasets. Experimental results show that LLMs do not achieve the best results compared with state-of-the-art deep learning and fine-tuned PLM-based methods. Contrary to human intuition, the experiments also demonstrate that LLMs cannot reliably detect clickbait from the headlines alone.
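For illustration, a zero-shot probe along these lines might prompt the model once with the headline alone and once with the headline plus its content; the `ask` helper, model name, and prompt wording below are assumptions for demonstration, not the paper's setup.

```python
# Sketch of a zero-shot clickbait probe: headline-only vs. headline plus content.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}], temperature=0.0
    )
    return resp.choices[0].message.content.strip().lower()

def detect_clickbait(headline: str, content: str = "") -> str:
    prompt = f"Is the following headline clickbait? Answer 'yes' or 'no'.\nHeadline: {headline}"
    if content:
        prompt += f"\nArticle content: {content}"
    return ask(prompt)
```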
Multilingual Lexical Simplification via Paraphrase Generation
Liu, Kang, Qiang, Jipeng, Li, Yun, Yuan, Yunhao, Zhu, Yi, Hua, Kaixun
Lexical simplification (LS) methods based on pretrained language models have made remarkable progress, generating potential substitutes for a complex word through analysis of its contextual surroundings. However, these methods require separate pretrained models for different languages and disregard the preservation of sentence meaning. In this paper, we propose a novel multilingual LS method via paraphrase generation, as paraphrases provide diversity in word selection while preserving the sentence's meaning. We treat paraphrasing as a zero-shot translation task within multilingual neural machine translation that supports hundreds of languages. After feeding the input sentence into the encoder of the paraphrase model, we generate the substitutes using a novel decoding strategy that concentrates solely on lexical variations of the complex word. Experimental results demonstrate that our approach significantly surpasses BERT-based methods and a zero-shot GPT-3-based method on English, Spanish, and Portuguese.
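To illustrate the zero-shot "translate into the same language" trick, the sketch below paraphrases a sentence with a public multilingual NMT checkpoint by forcing the target language to equal the source language; the `facebook/m2m100_418M` checkpoint is a stand-in assumption, and the paper's specialized decoding strategy for the complex word is not reproduced here.

```python
# Sketch: same-language "translation" with a multilingual NMT model as a paraphraser.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

def paraphrase(sentence: str, lang: str = "en", num_outputs: int = 5):
    tokenizer.src_lang = lang
    inputs = tokenizer(sentence, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.get_lang_id(lang),  # decode into the same language
        num_beams=num_outputs * 2,
        num_return_sequences=num_outputs,
        max_new_tokens=64,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```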
ParaLS: Lexical Substitution via Pretrained Paraphraser
Qiang, Jipeng, Liu, Kang, Li, Yun, Yuan, Yunhao, Zhu, Yi
Lexical substitution (LS) aims at finding appropriate substitutes for a target word in a sentence. Recently, LS methods based on pretrained language models have made remarkable progress, generating potential substitutes for a target word through analysis of its contextual surroundings. However, these methods tend to overlook the preservation of the sentence's meaning when generating substitutes. This study explores how to generate substitute candidates from a paraphraser, as the generated paraphrases contain variations in word choice while preserving the sentence's meaning. Since we cannot directly generate the substitutes via commonly used decoding strategies, we propose two simple decoding strategies that focus on the variations of the target word during decoding. Experimental results show that our methods outperform state-of-the-art LS methods based on pretrained language models on three benchmarks.
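A simplified sketch of the underlying idea follows: force a paraphraser's decoder through the prefix of the sentence up to the target word, then read the highest-probability next tokens as substitute candidates. The paraphraser checkpoint and the assumption that the paraphrase prefix equals the original prefix are illustrative simplifications, not the paper's actual decoding strategies.

```python
# Sketch: prefix-constrained decoding from a paraphraser to collect substitute candidates.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL = "tuner007/pegasus_paraphrase"  # placeholder; any seq2seq paraphraser could be used
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)

def substitute_candidates(sentence: str, prefix: str, k: int = 10):
    """Force the decoder to reproduce `prefix` (the sentence up to the target word),
    then return the top-k next tokens as candidate substitutes."""
    enc = tokenizer(sentence, return_tensors="pt")
    start = torch.tensor([[model.config.decoder_start_token_id]])
    prefix_ids = tokenizer(prefix, add_special_tokens=False, return_tensors="pt").input_ids
    decoder_input_ids = torch.cat([start, prefix_ids], dim=-1)
    with torch.no_grad():
        logits = model(input_ids=enc.input_ids,
                       attention_mask=enc.attention_mask,
                       decoder_input_ids=decoder_input_ids).logits
    top = torch.topk(logits[0, -1], k).indices
    return [tokenizer.decode([int(t)]).strip() for t in top]

# e.g. substitute_candidates("The committee convened to discuss the issue.", "The committee")
# ranks tokens likely to follow "The committee" in a paraphrase, among which
# substitutes for "convened" appear.
```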
Minority Stress Experienced by LGBTQ Online Communities during the COVID-19 Pandemic
Yuan, Yunhao, Verma, Gaurav, Keller, Barbara, Aledavood, Talayeh
The COVID-19 pandemic has disproportionately impacted the lives of minorities, such as members of the LGBTQ community (lesbian, gay, bisexual, transgender, and queer), due to pre-existing social disadvantages and health disparities. Although extensive research has been carried out on the impact of the COVID-19 pandemic on different aspects of the general population's lives, few studies have focused on the LGBTQ population. In this paper, we develop and evaluate two sets of machine learning classifiers using a pre-pandemic and a during-pandemic dataset to identify Twitter posts exhibiting minority stress, which is a unique pressure faced by members of the LGBTQ population due to their sexual and gender identities. We demonstrate that our best pre- and during-pandemic models show strong and stable performance for detecting posts that contain minority stress. We investigate the linguistic differences in minority stress posts across the pre- and during-pandemic periods and find that anger words are strongly associated with minority stress during the COVID-19 pandemic. We explore the impact of the pandemic on the emotional states of the LGBTQ population by adopting propensity score-based matching to perform a causal analysis. The results show that the LGBTQ population exhibits a greater increase in the usage of cognitive words and a greater decline in the usage of positive emotion words than a general-population group with similar pre-pandemic behavioral attributes. Our findings have implications for the public health domain and policy-makers to provide adequate support, especially with respect to mental health, to the LGBTQ population during future crises.
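A minimal sketch of the propensity score matching step is shown below, assuming each user is described by pre-pandemic behavioral covariates and a binary group indicator; the column names and one-nearest-neighbor matching are illustrative choices, not necessarily the paper's exact procedure.

```python
# Sketch: propensity score matching on pre-pandemic covariates before comparing outcomes.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def match_users(df: pd.DataFrame, covariates: list[str], treatment_col: str = "is_lgbtq"):
    """Match each treated user to the control user with the closest propensity score."""
    X, t = df[covariates].to_numpy(), df[treatment_col].to_numpy()
    scores = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    treated, control = np.where(t == 1)[0], np.where(t == 0)[0]
    pairs = []
    for i in treated:
        j = control[np.argmin(np.abs(scores[control] - scores[i]))]  # 1-nearest neighbor
        pairs.append((i, j))
    return pairs  # compare during-pandemic outcomes within these matched pairs
```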
Sentence Simplification via Large Language Models
Feng, Yutao, Qiang, Jipeng, Li, Yun, Yuan, Yunhao, Zhu, Yi
Sentence Simplification (SS) aims to rephrase complex sentences into simpler sentences while retaining the original meaning. Large Language Models (LLMs) have demonstrated the ability to perform a variety of natural language processing tasks. However, it remains unclear how LLMs perform on the SS task compared to current SS methods. To address this gap in research, we undertake a systematic evaluation of the zero-/few-shot learning capability of LLMs by assessing their performance on existing SS benchmarks. We carry out an empirical comparison of the performance of ChatGPT and the most advanced GPT-3.5 model (text-davinci-003).
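As an illustration of how such a zero-shot evaluation can be run, the sketch below prompts an LLM to simplify each test sentence and scores the outputs with SARI, assuming the EASSE toolkit; the prompt wording and model name are placeholders rather than the paper's exact configuration.

```python
# Sketch: zero-shot sentence simplification scored with corpus-level SARI.
from easse.sari import corpus_sari
from openai import OpenAI

client = OpenAI()

def simplify_zero_shot(sentence: str, model: str = "gpt-3.5-turbo") -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": f"Rewrite this sentence so it is simpler but keeps its meaning:\n{sentence}"}],
        temperature=0.0,
    )
    return resp.choices[0].message.content.strip()

def evaluate(orig_sents, refs_sents):
    # refs_sents: one list of reference simplifications per annotator, each aligned with orig_sents.
    sys_sents = [simplify_zero_shot(s) for s in orig_sents]
    return corpus_sari(orig_sents=orig_sents, sys_sents=sys_sents, refs_sents=refs_sents)
```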
Mental Health Coping Stories on Social Media: A Causal-Inference Study of Papageno Effect
Yuan, Yunhao, Saha, Koustuv, Keller, Barbara, Isometsä, Erkki Tapio, Aledavood, Talayeh
The Papageno effect concerns how media can play a positive role in preventing and mitigating suicidal ideation and behaviors. With the increasing ubiquity and widespread use of social media, individuals often express and share lived experiences and struggles with mental health. However, there is a gap in our understanding about the existence and effectiveness of the Papageno effect in social media, which we study in this paper. In particular, we adopt a causal-inference framework to examine the impact of exposure to mental health coping stories on individuals on Twitter.