AITopics | Yu, Xiaoyan

Collaborating Authors

Yu, Xiaoyan

SocialED: A Python Library for Social Event Detection

Zhang, Kun, Yu, Xiaoyan, Li, Pu, Peng, Hao, Yu, Philip S.

arXiv.org Artificial Intelligence, Dec-17-2024

SocialED is a comprehensive, open-source Python library designed to support social event detection (SED) tasks, integrating 19 detection algorithms and 14 diverse datasets. It provides a unified API with detailed documentation, offering researchers and practitioners a complete solution for event detection in social media. The library is designed with modularity in mind, allowing users to easily adapt and extend components for various use cases. SocialED supports a wide range of preprocessing techniques, such as graph construction and tokenization, and includes standardized interfaces for training models and making predictions. By integrating popular deep learning frameworks, SocialED ensures high efficiency and scalability across both CPU and GPU environments. The library is built adhering to high code quality standards, including unit testing, continuous integration, and code coverage, ensuring that SocialED delivers robust, maintainable software.

data mining, machine learning, natural language, (18 more...)

arXiv:2412.13472

Country:

Asia (0.28)
North America > United States > Illinois (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Social Events (0.67)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(2 more...)

Towards Effective, Efficient and Unsupervised Social Event Detection in the Hyperbolic Space

Yu, Xiaoyan, Wei, Yifan, Zhou, Shuaishuai, Yang, Zhiwei, Sun, Li, Peng, Hao, Zhu, Liehuang, Yu, Philip S.

arXiv.org Artificial Intelligence, Dec-14-2024

The vast, complex, and dynamic nature of social message data has posed challenges to social event detection (SED). Despite considerable effort, these challenges persist, often resulting in inadequately expressive message representations (ineffective) and prolonged learning durations (inefficient). In response to these challenges, this work introduces an unsupervised framework, HyperSED (Hyperbolic SED). Specifically, the proposed framework first models social messages into semantic-based message anchors, and then leverages the structure of the anchor graph and the expressiveness of the hyperbolic space to acquire structure- and geometry-aware anchor representations. Finally, HyperSED builds the partitioning tree of the anchor message graph by incorporating differentiable structural information as the reflection of the detected events. Extensive experiments on public datasets demonstrate HyperSED's competitive performance, along with a substantial improvement in efficiency compared to the current state-of-the-art unsupervised paradigm. Statistically, HyperSED boosts incremental SED by an average of 2%, 2%, and 25% in NMI, AMI, and ARI, respectively, and improves efficiency by a factor of at least 12.10 and up to 37.41, illustrating the advancement of the proposed framework. Our code is publicly available at https://github.com/XiaoyanWork/HyperSED.
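
The abstract reports gains in NMI, AMI, and ARI, which are standard scores for comparing a predicted clustering against ground-truth event labels. As a rough illustration of what two of these scores measure, here is a minimal standard-library sketch of ARI and NMI. It is not taken from the HyperSED code. The function names are illustrative, the NMI variant assumes arithmetic-mean normalization, and AMI is omitted because its chance correction needs an expected-mutual-information term.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two flat clusterings of the same items."""
    n = len(labels_true)
    if n < 2:
        return 1.0
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Count item pairs that share a cluster in both, in truth only, in prediction only.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)   # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:                     # degenerate case, e.g. a single cluster
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


def normalized_mutual_info(labels_true, labels_pred):
    """NMI, assuming arithmetic-mean normalization of the two entropies."""
    n = len(labels_true)
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    mi = sum(
        (c / n) * log(c * n / (true_counts[t] * pred_counts[p]))
        for (t, p), c in pair_counts.items()
    )
    h_true = -sum((c / n) * log(c / n) for c in true_counts.values())
    h_pred = -sum((c / n) * log(c / n) for c in pred_counts.values())
    if h_true + h_pred == 0:                      # both sides are a single cluster
        return 1.0
    return 2 * mi / (h_true + h_pred)
```

Both scores ignore the actual label names: a prediction that recovers the same groups under different cluster IDs still scores 1, while a prediction that is independent of the ground truth scores near or below 0 on ARI.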

artificial intelligence, machine learning, natural language, (17 more...)

arXiv:2412.10712

Country:

Asia > China (0.29)
North America > United States > Illinois (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Social Events (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Communications > Social Media (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Multi-View Incongruity Learning for Multimodal Sarcasm Detection

Guo, Diandian, Cao, Cong, Yuan, Fangfang, Liu, Yanbing, Zeng, Guangjie, Yu, Xiaoyan, Peng, Hao, Yu, Philip S.

arXiv.org Artificial Intelligence, Dec-8-2024

Multimodal sarcasm detection (MSD) is essential for various downstream tasks. Existing MSD methods tend to rely on spurious correlations. These methods often mistakenly prioritize non-essential features yet still make correct predictions, demonstrating poor generalizability beyond training environments. Regarding this phenomenon, this paper undertakes several initiatives. Firstly, we identify two primary causes that lead to the reliance on spurious correlations. Secondly, we address these challenges by proposing a novel method that integrates Multimodal Incongruities via Contrastive Learning (MICL) for multimodal sarcasm detection. Specifically, we first leverage incongruity to drive multi-view learning from three views: token-patch, entity-object, and sentiment. Then, we introduce extensive data augmentation to mitigate the biased learning of the textual modality. Additionally, we construct a test set, SPMSD, which consists of potential spurious correlations, to evaluate the model's generalizability. Experimental results demonstrate the superiority of MICL on benchmark datasets, and further analyses showcase MICL's advancement in mitigating the effect of spurious correlations.

artificial intelligence, incongruity, machine learning, (16 more...)

arXiv:2412.00756

Country:

North America > United States (0.46)
Asia (0.28)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

UNO Arena for Evaluating Sequential Decision-Making Capability of Large Language Models

Qin, Zhanyue, Wang, Haochuan, Liu, Deyuan, Song, Ziyang, Fan, Cunhang, Lv, Zhao, Wu, Jinlin, Lei, Zhen, Tu, Zhiying, Chu, Dianhui, Yu, Xiaoyan, Sui, Dianbo

arXiv.org Artificial Intelligence, Jun-24-2024

Sequential decision-making refers to algorithms that take into account the dynamics of the environment, where early decisions affect subsequent decisions. With large language models (LLMs) demonstrating powerful capabilities across tasks, we can't help but ask: Can Current LLMs Effectively Make Sequential Decisions? To answer this question, we propose the UNO Arena, based on the card game UNO, to evaluate the sequential decision-making capability of LLMs, and explain in detail why we choose UNO. In UNO Arena, we evaluate the sequential decision-making capability of LLMs dynamically with novel metrics based on Monte Carlo methods. We set up random players, DQN-based reinforcement learning players, and LLM players (e.g. GPT-4, Gemini-pro) for comparison testing. Furthermore, to improve the sequential decision-making capability of LLMs, we propose the TUTRI player, which has LLMs reflect on their own actions using a summary of the game history and the game strategy. Numerous experiments demonstrate that the TUTRI player achieves a notable breakthrough in the performance of sequential decision-making compared to the vanilla LLM player.

large language model, machine learning, natural language, (16 more...)

arXiv:2406.16382

Country: Asia (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Relational Prompt-based Pre-trained Language Models for Social Event Detection

Li, Pu, Yu, Xiaoyan, Peng, Hao, Xian, Yantuan, Wang, Linqin, Sun, Li, Zhang, Jingyun, Yu, Philip S.

arXiv.org Artificial Intelligence, Apr-12-2024

Social Event Detection (SED) aims to identify significant events from social streams, and has wide applications ranging from public opinion analysis to risk management. In recent years, Graph Neural Network (GNN) based solutions have achieved state-of-the-art performance. However, GNN-based methods often struggle with noisy and missing edges between messages, which affects the quality of the learned message embeddings. Moreover, these methods statically initialize node embeddings before training, which, in turn, limits the ability to learn from message texts and relations simultaneously. In this paper, we approach social event detection from a new perspective based on Pre-trained Language Models (PLMs), and present RPLM_SED (Relational prompt-based Pre-trained Language Models for Social Event Detection). First, we propose a new pairwise message modeling strategy to construct social messages into message pairs with multi-relational sequences. Second, we propose a new multi-relational prompt-based pairwise message learning mechanism to learn more comprehensive message representations from message pairs with multi-relational prompts using PLMs. Third, we design a new clustering constraint to optimize the encoding process by enhancing intra-cluster compactness and inter-cluster dispersion, making the message representations more distinguishable. We evaluate RPLM_SED on three real-world datasets, demonstrating that the model achieves state-of-the-art performance in offline, online, low-resource, and long-tail distribution scenarios for social event detection tasks.
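
The abstract describes a clustering constraint that rewards intra-cluster compactness and inter-cluster dispersion but does not give its formula. The sketch below shows one generic way such an objective can be written. It is an assumption for illustration, not the RPLM_SED loss: the function name `cluster_constraint`, the use of Euclidean distance to centroids, and the hinge `margin` are all hypothetical choices.

```python
from collections import defaultdict
from math import dist


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def cluster_constraint(embeddings, labels, margin=1.0):
    """Hypothetical compactness + dispersion penalty (lower is better)."""
    groups = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        groups[label].append(vec)
    centers = {label: centroid(vecs) for label, vecs in groups.items()}

    # Compactness: mean distance from each embedding to its own cluster centroid.
    intra = sum(
        dist(vec, centers[label]) for vec, label in zip(embeddings, labels)
    ) / len(embeddings)

    # Dispersion: hinge penalty on every pair of centroids closer than `margin`.
    keys = list(centers)
    pairs = [(a, b) for i, a in enumerate(keys) for b in keys[i + 1:]]
    inter = sum(max(0.0, margin - dist(centers[a], centers[b])) for a, b in pairs)
    inter = inter / len(pairs) if pairs else 0.0

    return intra + inter
```

Under these assumptions, tight and well-separated clusters yield a penalty of zero, while clusters whose members are spread out or whose centroids overlap are penalized.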

artificial intelligence, machine learning, natural language, (16 more...)

arXiv:2404.08263

Country:

Asia > China (0.29)
North America > United States > Illinois (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment > Social Events (1.00)
Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent

Yu, Xiaoyan, Luo, Tongxu, Wei, Yifan, Lei, Fangyu, Huang, Yiming, Peng, Hao, Zhu, Liehuang

arXiv.org Artificial Intelligence, Mar-1-2024

Large Language Models (LLMs) have revolutionized open-domain dialogue agents but encounter challenges in multi-character role-playing (MCRP) scenarios. To address the issue, we present Neeko, an innovative framework designed for efficient multi-character imitation. Unlike existing methods, Neeko employs a dynamic low-rank adapter (LoRA) strategy, enabling it to adapt seamlessly to diverse characters. Our framework breaks down the role-playing process into agent pre-training, multiple characters playing, and character incremental learning, effectively handling both seen and unseen roles. This dynamic approach, coupled with distinct LoRA blocks for each character, enhances Neeko's adaptability to unique attributes, personalities, and speaking patterns. As a result, Neeko demonstrates superior performance in MCRP over most existing methods, offering more engaging and versatile user interaction experiences. Code and data are available at https://github.com/weiyifan1023/Neeko.
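
To make the idea of distinct LoRA blocks per character concrete, here is a minimal standard-library sketch of a linear layer with a frozen base weight W and one low-rank update B A per character, so the effective weight for a character is W + B A. It is not Neeko's implementation. The class name `PerCharacterLoRA`, the dictionary lookup by character name, and the omission of the usual alpha/r scaling are simplifying assumptions, and how Neeko actually selects or combines blocks is not stated in the abstract.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class PerCharacterLoRA:
    """Frozen base weight W plus one low-rank update B @ A per character (hypothetical layout)."""

    def __init__(self, weight, rank, seed=0):
        self.weight = weight            # shape (d_out, d_in), never modified
        self.rank = rank
        self.d_out = len(weight)
        self.d_in = len(weight[0])
        self.blocks = {}                # character name -> (B, A)
        self._rng = random.Random(seed)

    def add_character(self, name):
        # Common LoRA initialization: A small random, B zero, so the update starts at zero.
        a = [[self._rng.gauss(0.0, 0.01) for _ in range(self.d_in)]
             for _ in range(self.rank)]
        b = [[0.0] * self.rank for _ in range(self.d_out)]
        self.blocks[name] = (b, a)

    def forward(self, x, character):
        """Compute (W + B A) x for an input vector x given as a flat list."""
        b, a = self.blocks[character]
        col = [[v] for v in x]
        base = matmul(self.weight, col)
        delta = matmul(b, matmul(a, col))   # only rank * (d_in + d_out) extra parameters
        return [base_row[0] + delta_row[0] for base_row, delta_row in zip(base, delta)]
```

Adding a new character in this sketch only registers a new (B, A) pair and leaves the base weight and the other characters' blocks untouched, which is the property that makes incremental learning of new roles cheap.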

large language model, machine learning, natural language, (21 more...)

arXiv:2402.13717

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Assessing Knowledge Editing in Language Models via Relation Perspective

Wei, Yifan, Yu, Xiaoyan, Ma, Huanhuan, Lei, Fangyu, Weng, Yixuan, Song, Ran, Liu, Kang

arXiv.org Artificial Intelligence, Nov-15-2023

Knowledge Editing (KE) for modifying factual knowledge in Large Language Models (LLMs) has been receiving increasing attention. However, existing knowledge editing methods are entity-centric, and it is unclear whether this approach is suitable for a relation-centric perspective. To address this gap, this paper constructs a new benchmark named RaKE, which focuses on Relation based Knowledge Editing. We establish a suite of innovative metrics for evaluation and conduct comprehensive experiments involving various knowledge editing baselines. We find that existing knowledge editing methods have potential difficulty editing relations. Therefore, we further explore the role of relations in factual triplets within the transformer. Our results confirm that knowledge related to relations is stored not only in the FFN layers but also in the attention layers. This provides experimental support for future relation-based knowledge editing methods.

large language model, machine learning, natural language, (17 more...)

arXiv:2311.09053

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models

Wei, Yifan, Su, Yisong, Ma, Huanhuan, Yu, Xiaoyan, Lei, Fangyu, Zhang, Yuanzhe, Zhao, Jun, Liu, Kang

arXiv.org Artificial Intelligence, Oct-8-2023

Large language models (LLMs) have shown nearly saturated performance on many natural language processing (NLP) tasks. As a result, it is natural for people to believe that LLMs have also mastered abilities such as time understanding and reasoning. However, research on the temporal sensitivity of LLMs has been insufficiently emphasized. To fill this gap, this paper constructs Multiple Sensitive Factors Time QA (MenatQA), which encompasses three temporal factors (scope factor, order factor, counterfactual factor) with a total of 2,853 samples for evaluating the time comprehension and reasoning abilities of LLMs. This paper tests current mainstream LLMs with different parameter sizes, ranging from billions to hundreds of billions. The results show that most LLMs fall behind smaller temporal reasoning models to different degrees on these factors. Specifically, LLMs show a significant vulnerability to temporal biases and depend heavily on the temporal information provided in questions. Furthermore, this paper undertakes a preliminary investigation into potential improvement strategies by devising specific prompts and leveraging external tools. These approaches serve as valuable baselines or references for future research endeavors.

large language model, machine learning, natural language, (17 more...)

arXiv:2310.05157

Country:

North America > Canada (0.14)
Asia > China (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)
