AITopics | Xu, Liyan

Collaborating Authors

Xu, Liyan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

DBudgetKV: Dynamic Budget in KV Cache Compression for Ensuring Optimal Performance

Ni, Xuanfan, Xu, Liyan, Lyu, Chenyang, Wang, Longyue, Yu, Mo, Liu, Lemao, Meng, Fandong, Zhou, Jie, Li, Piji

arXiv.org Artificial IntelligenceFeb-24-2025

To alleviate memory burden during inference of large language models (LLMs), numerous studies have focused on compressing the KV cache by exploring aspects such as attention sparsity. However, these techniques often require a pre-defined cache budget; as the optimal budget varies with different input lengths and task types, it limits their practical deployment accepting open-domain instructions. To address this limitation, we propose a new KV cache compression objective: to always ensure the full-cache performance regardless of specific inputs, while maximizing KV cache pruning as much as possible. To achieve this goal, we introduce a novel KV cache compression method dubbed DBudgetKV, which features an attention-based metric to signal when the remaining KV cache is unlikely to match the full-cache performance, then halting the pruning process. Empirical evaluation spanning diverse context lengths, task types, and model sizes suggests that our method achieves lossless KV pruning effectively and robustly, exceeding 25% compression ratio on average. Furthermore, our method is easy to integrate within LLM inference, not only optimizing memory space, but also showing reduced inference time compared to existing methods.

dbudgetkv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2502.16886

Country:

Asia (0.67)
Europe > Austria > Vienna (0.14)
North America > United States > Hawaii (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Vehicles, Pedestrians, and E-bikes: a Three-party Game at Right-turn-on-red Crossroads Revealing the Dual and Irrational Role of E-bikes that Risks Traffic Safety

Zhang, Gangcheng, Shu, Yeshuo, Liu, Keyi, Wang, Yuxuan, Li, Donghang, Xu, Liyan

arXiv.org Artificial IntelligenceNov-4-2024

The widespread use of e-bikes has facilitated short-distance travel yet led to confusion and safety problems in road traffic. This study focuses on the dual characteristics of e-bikes in traffic conflicts: they resemble pedestrians when interacting with motor vehicles and behave like motor vehicles when in conflict with pedestrians, which raises the right of way concerns when potential conflicts are at stake. Using the Quantal Response Equilibrium model, this research analyzes the behavioral choice differences of three groups of road users (vehicle-pedestrian, vehicle-e-bike, e-bike-pedestrian) at right-turn-on-red crossroads in right-turning lines and straight-going lines conflict scenarios. The results show that the behavior of e-bikes is more similar to that of motor vehicles than pedestrians overall, and their interactions with either pedestrians or motor vehicles do not establish a reasonable order, increasing the likelihood of confusion and conflict. In contrast, a mutual understanding has developed between motor vehicles and pedestrians, where motor vehicles tend to yield, and pedestrians tend to cross. By clarifying the game theoretical model and introducing the rationality parameter, this study precisely locates the role of e-bikes among road users, which provides a reliable theoretical basis for optimizing traffic regulations.

artificial intelligence, machine learning, vehicle, (13 more...)

arXiv.org Artificial Intelligence

2411.02183

Country: Asia > China (0.30)

Genre: Research Report > New Finding (0.88)

Industry:

Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.68)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

Identifying Factual Inconsistencies in Summaries: Grounding Model Inference via Task Taxonomy

Xu, Liyan, Su, Zhenlin, Yu, Mo, Xu, Jin, Choi, Jinho D., Zhou, Jie, Liu, Fei

arXiv.org Artificial IntelligenceJun-19-2024

Factual inconsistencies pose a significant hurdle for the faithful summarization by generative models. While a major direction to enhance inconsistency detection is to derive stronger Natural Language Inference (NLI) models, we propose an orthogonal aspect that underscores the importance of incorporating task-specific taxonomy into the inference. To this end, we consolidate key error types of inconsistent facts in summaries, and incorporate them to facilitate both the zero-shot and supervised paradigms of LLMs. Extensive experiments on ten datasets of five distinct domains suggest that, zero-shot LLM inference could benefit from the explicit solution space depicted by the error type taxonomy, and achieves state-of-the-art performance overall, surpassing specialized non-LLM baselines, as well as recent LLM baselines. We further distill models that fuse the taxonomy into parameters through our designed prompt completions and supervised training strategies, efficiently substituting state-of-the-art zero-shot inference with much larger LLMs.

computational linguistic, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2402.12821

Country:

Europe (1.00)
Asia (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Fine-Grained Modeling of Narrative Context: A Coherence Perspective via Retrospective Questions

Xu, Liyan, Li, Jiangnan, Yu, Mo, Zhou, Jie

arXiv.org Artificial IntelligenceJun-3-2024

This work introduces an original and practical paradigm for narrative comprehension, stemming from the characteristics that individual passages within narratives tend to be more cohesively related than isolated. Complementary to the common end-to-end paradigm, we propose a fine-grained modeling of narrative context, by formulating a graph dubbed NarCo, which explicitly depicts task-agnostic coherence dependencies that are ready to be consumed by various downstream tasks. In particular, edges in NarCo encompass free-form retrospective questions between context snippets, inspired by human cognitive perception that constantly reinstates relevant events from prior context. Importantly, our graph formalism is practically instantiated by LLMs without human annotations, through our designed two-stage prompting scheme. To examine the graph properties and its utility, we conduct three studies in narratives, each from a unique angle: edge relation efficacy, local context enrichment, and broader application in QA. All tasks could benefit from the explicit coherence captured by NarCo.

computational linguistic, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2402.13551

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)
(2 more...)

Add feedback

Previously on the Stories: Recap Snippet Identification for Story Reading

Li, Jiangnan, Wang, Qiujing, Xu, Liyan, Pang, Wenjie, Yu, Mo, Lin, Zheng, Wang, Weiping, Zhou, Jie

arXiv.org Artificial IntelligenceFeb-11-2024

Similar to the "previously-on" scenes in TV shows, recaps can help book reading by recalling the readers' memory about the important elements in previous texts to better understand the ongoing plot. Despite its usefulness, this application has not been well studied in the NLP community. We propose the first benchmark on this useful task called Recap Snippet Identification with a hand-crafted evaluation dataset. Our experiments show that the proposed task is challenging to PLMs, LLMs, and proposed methods as the task requires a deep understanding of the plot correlation between snippets.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2402.07271

Country:

Asia > China (0.46)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (1.00)
Media > Television (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

SIG: Speaker Identification in Literature via Prompt-Based Generation

Su, Zhenlin, Xu, Liyan, Xu, Jin, Li, Jiangnan, Huangfu, Mingdu

arXiv.org Artificial IntelligenceDec-22-2023

Identifying speakers of quotations in narratives is an important task in literary analysis, with challenging scenarios including the out-of-domain inference for unseen speakers, and non-explicit cases where there are no speaker mentions in surrounding context. In this work, we propose a simple and effective approach SIG, a generation-based method that verbalizes the task and quotation input based on designed prompt templates, which also enables easy integration of other auxiliary tasks that further bolster the speaker identification performance. The prediction can either come from direct generation by the model, or be determined by the highest generation probability of each speaker candidate. Based on our approach design, SIG supports out-of-domain evaluation, and achieves open-world classification paradigm that is able to accept any forms of candidate input. We perform both cross-domain evaluation and in-domain evaluation on PDNC, the largest dataset of this task, where empirical results suggest that SIG outperforms previous baselines of complicated designs, as well as the zero-shot ChatGPT, especially excelling at those hard non-explicit scenarios by up to 17% improvement. Additional experiments on another dataset WP further corroborate the efficacy of SIG.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2312.1459

Country:

Europe (1.00)
Asia (0.68)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

A framework for mining lifestyle profiles through multi-dimensional and high-order mobility feature clustering

Shu, Yeshuo, Zhang, Gangcheng, Liu, Keyi, Tang, Jintong, Xu, Liyan

arXiv.org Artificial IntelligenceDec-1-2023

Human mobility demonstrates a high degree of regularity, which facilitates the discovery of lifestyle profiles. Existing research has yet to fully utilize the regularities embedded in high-order features extracted from human mobility records in such profiling. This study proposes a progressive feature extraction strategy that mines high-order mobility features from users' moving trajectory records from the spatial, temporal, and semantic dimensions. Specific features are extracted such as travel motifs, rhythms decomposed by discrete Fourier transform (DFT) of mobility time series, and vectorized place semantics by word2vec, respectively to the three dimensions, and they are further clustered to reveal the users' lifestyle characteristics. An experiment using a trajectory dataset of over 500k users in Shenzhen, China yields seven user clusters with different lifestyle profiles that can be well interpreted by common sense. The results suggest the possibility of fine-grained user profiling through cross-order trajectory feature engineering and clustering.

machine learning, natural language, trajectory, (23 more...)

arXiv.org Artificial Intelligence

2312.00411

Country: Asia > China > Guangdong Province > Shenzhen (0.24)

Genre: Research Report > New Finding (0.67)

Industry:

Education (0.46)
Consumer Products & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Data Science > Data Quality > Data Transformation (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Add feedback

Towards Open-World Product Attribute Mining: A Lightly-Supervised Approach

Xu, Liyan, Zhang, Chenwei, Li, Xian, Shang, Jingbo, Choi, Jinho D.

arXiv.org Artificial IntelligenceMay-26-2023

We present a new task setting for attribute mining on e-commerce products, serving as a practical solution to extract open-world attributes without extensive human intervention. Our supervision comes from a high-quality seed attribute set bootstrapped from existing resources, and we aim to expand the attribute vocabulary of existing seed types, and also to discover any new attribute types automatically. A new dataset is created to support our setting, and our approach Amacer is proposed specifically to tackle the limited supervision. Especially, given that no direct supervision is available for those unseen new attributes, our novel formulation exploits self-supervised heuristic and unsupervised latent attributes, which attains implicit semantic signals as additional supervision by leveraging product context. Experiments suggest that our approach surpasses various baselines by 12 F1, expanding attributes of existing types significantly by up to 12 times, and discovering values from 39% new types.

data mining, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2305.1835

Country:

Asia (0.93)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > Experimental Study (0.34)

Industry:

Information Technology (1.00)
Health & Medicine > Consumer Health (1.00)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.68)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)

Add feedback