AITopics | Zeng, Qingcheng

Collaborating Authors

Zeng, Qingcheng

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Leveraging Human Production-Interpretation Asymmetries to Test LLM Cognitive Plausibility

Lam, Suet-Ying, Zeng, Qingcheng, Wu, Jingyi, Voigt, Rob

arXiv.org Artificial IntelligenceMar-21-2025

Whether large language models (LLMs) process language similarly to humans has been the subject of much theoretical and practical debate. We examine this question through the lens of the production-interpretation distinction found in human sentence processing and evaluate the extent to which instruction-tuned LLMs replicate this distinction. Using an empirically documented asymmetry between production and interpretation in humans for implicit causality verbs as a testbed, we find that some LLMs do quantitatively and qualitatively reflect human-like asymmetries between production and interpretation. We demonstrate that whether this behavior holds depends upon both model size - with larger models more likely to reflect human-like patterns and the choice of meta-linguistic prompts used to elicit the behavior.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2503.17579

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.94)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.30)

Add feedback

MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation

Xuan, Weihao, Yang, Rui, Qi, Heli, Zeng, Qingcheng, Xiao, Yunze, Xing, Yun, Wang, Junjue, Li, Huitao, Li, Xin, Yu, Kunyu, Liu, Nan, Chen, Qingyu, Teodoro, Douglas, Marrese-Taylor, Edison, Lu, Shijian, Iwasawa, Yusuke, Matsuo, Yutaka, Li, Irene

arXiv.org Artificial IntelligenceMar-13-2025

Traditional benchmarks struggle to evaluate increasingly sophisticated language models in multilingual and culturally diverse contexts. To address this gap, we introduce MMLU-ProX, a comprehensive multilingual benchmark covering 13 typologically diverse languages with approximately 11,829 questions per language. Building on the challenging reasoning-focused design of MMLU-Pro, our framework employs a semi-automatic translation process: translations generated by state-of-the-art large language models (LLMs) are rigorously evaluated by expert annotators to ensure conceptual accuracy, terminological consistency, and cultural relevance. We comprehensively evaluate 25 state-of-the-art LLMs using 5-shot chain-of-thought (CoT) and zero-shot prompting strategies, analyzing their performance across linguistic and cultural boundaries. Our experiments reveal consistent performance degradation from high-resource languages to lower-resource ones, with the best models achieving over 70% accuracy on English but dropping to around 40% for languages like Swahili, highlighting persistent gaps in multilingual capabilities despite recent advances. MMLU-ProX is an ongoing project; we are expanding our benchmark by incorporating additional languages and evaluating more language models to provide a more comprehensive assessment of multilingual capabilities.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.10497

Country:

Asia > Japan (0.14)
Asia > Thailand (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

SCORE: Saturated Consensus Relocalization in Semantic Line Maps

Jiang, Haodong, Zheng, Xiang, Zhang, Yanglin, Zeng, Qingcheng, Li, Yiqian, Hong, Ziyang, Wu, Junfeng

arXiv.org Artificial IntelligenceMar-5-2025

This is the arxiv version for our paper submitted to IEEE/RSJ IROS 2025. We propose a scene-agnostic and light-weight visual relocalization framework that leverages semantically labeled 3D lines as a compact map representation. In our framework, the robot localizes itself by capturing a single image, extracting 2D lines, associating them with semantically similar 3D lines in the map, and solving a robust perspective-n-line problem. To address the extremely high outlier ratios~(exceeding 99.5\%) caused by one-to-many ambiguities in semantic matching, we introduce the Saturated Consensus Maximization~(Sat-CM) formulation, which enables accurate pose estimation when the classic Consensus Maximization framework fails. We further propose a fast global solver to the formulated Sat-CM problems, leveraging rigorous interval analysis results to ensure both accuracy and computational efficiency. Additionally, we develop a pipeline for constructing semantic 3D line maps using posed depth images. To validate the effectiveness of our framework, which integrates our innovations in robust estimation and practical engineering insights, we conduct extensive experiments on the ScanNet++ dataset.

machine learning, natural language, outlier ratio, (19 more...)

arXiv.org Artificial Intelligence

2503.03254

Country: Asia > China > Guangdong Province (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning

Huang, Shulin, Yang, Linyi, Song, Yan, Chen, Shuang, Cui, Leyang, Wan, Ziyu, Zeng, Qingcheng, Wen, Ying, Shao, Kun, Zhang, Weinan, Wang, Jun, Zhang, Yue

arXiv.org Artificial IntelligenceFeb-22-2025

Evaluating large language models (LLMs) poses significant challenges, particularly due to issues of data contamination and the leakage of correct answers. To address these challenges, we introduce ThinkBench, a novel evaluation framework designed to evaluate LLMs' reasoning capability robustly. ThinkBench proposes a dynamic data generation method for constructing out-of-distribution (OOD) datasets and offers an OOD dataset that contains 2,912 samples drawn from reasoning tasks. ThinkBench unifies the evaluation of reasoning models and non-reasoning models. We evaluate 16 LLMs and 4 PRMs under identical experimental conditions and show that most of the LLMs' performance are far from robust and they face a certain level of data leakage. By dynamically generating OOD datasets, ThinkBench effectively provides a reliable evaluation of LLMs and reduces the impact of data contamination.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2502.16268

Country: Asia (0.14)

Genre:

Research Report (0.50)
Workflow (0.46)

Industry: Materials > Chemicals (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Toward Equitable Access: Leveraging Crowdsourced Reviews to Investigate Public Perceptions of Health Resource Accessibility

Xue, Zhaoqian, Liu, Guanhong, Wei, Kai, Zhang, Chong, Zeng, Qingcheng, Hu, Songhua, Hua, Wenyue, Fan, Lizhou, Zhang, Yongfeng, Li, Lingyao

arXiv.org Artificial IntelligenceFeb-14-2025

Access to health resources is a critical determinant of public well-being and societal resilience, particularly during public health crises when demand for medical services and preventive care surges. However, disparities in accessibility persist across demographic and geographic groups, raising concerns about equity. Traditional survey methods often fall short due to limitations in coverage, cost, and timeliness. This study leverages crowdsourced data from Google Maps reviews, applying advanced natural language processing techniques, specifically ModernBERT, to extract insights on public perceptions of health resource accessibility in the United States during the COVID-19 pandemic. Additionally, we employ Partial Least Squares regression to examine the relationship between accessibility perceptions and key socioeconomic and demographic factors including political affiliation, racial composition, and educational attainment. Our findings reveal that public perceptions of health resource accessibility varied significantly across the U.S., with disparities peaking during the pandemic and slightly easing post-crisis. Political affiliation, racial demographics, and education levels emerged as key factors shaping these perceptions. These findings underscore the need for targeted interventions and policy measures to address inequities, fostering a more inclusive healthcare infrastructure that can better withstand future public health challenges.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.10641

Country:

North America > United States > California (0.14)
North America > United States > Massachusetts (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Public Health (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Add feedback

Sympathy over Polarization: A Computational Discourse Analysis of Social Media Posts about the July 2024 Trump Assassination Attempt

Zeng, Qingcheng, Liu, Guanhong, Xue, Zhaoqian, Ford, Diego, Voigt, Rob, Hagen, Loni, Li, Lingyao

arXiv.org Artificial IntelligenceJan-16-2025

On July 13, 2024, at the Trump rally in Pennsylvania, someone attempted to assassinate Republican Presidential Candidate Donald Trump. This attempt sparked a large-scale discussion on social media. We collected posts from X (formerly known as Twitter) one week before and after the assassination attempt and aimed to model the short-term effects of such a ``shock'' on public opinions and discussion topics. Specifically, our study addresses three key questions: first, we investigate how public sentiment toward Donald Trump shifts over time and across regions (RQ1) and examine whether the assassination attempt itself significantly affects public attitudes, independent of the existing political alignments (RQ2). Finally, we explore the major themes in online conversations before and after the crisis, illustrating how discussion topics evolved in response to this politically charged event (RQ3). By integrating large language model-based sentiment analysis, difference-in-differences modeling, and topic modeling techniques, we find that following the attempt the public response was broadly sympathetic to Trump rather than polarizing, despite baseline ideological and regional disparities.

large language model, machine learning, sentiment, (22 more...)

arXiv.org Artificial Intelligence

2501.0995

Country:

North America > United States > Pennsylvania (0.34)
North America > United States > New Jersey (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(2 more...)

Add feedback

Causal Micro-Narratives

Heddaya, Mourad, Zeng, Qingcheng, Tan, Chenhao, Voigt, Rob, Zentefis, Alexander

arXiv.org Artificial IntelligenceOct-7-2024

We present a novel approach to classify causal micro-narratives from text. These narratives are sentence-level explanations of the cause(s) and/or effect(s) of a target subject. The approach requires only a subject-specific ontology of causes and effects, and we demonstrate it with an application to inflation narratives. Using a human-annotated dataset spanning historical and contemporary US news articles for training, we evaluate several large language models (LLMs) on this multi-label classification task. The best-performing model--a fine-tuned Llama 3.1 8B--achieves F1 scores of 0.87 on narrative detection and 0.71 on narrative classification. Comprehensive error analysis reveals challenges arising from linguistic ambiguity and highlights how model errors often mirror human annotator disagreements. This research establishes a framework for extracting causal micro-narratives from real-world data, with wide-ranging applications to social science research.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2410.05252

Country:

North America > United States > California (0.14)
North America > Mexico > Mexico City (0.14)

Genre: Research Report (1.00)

Industry:

Media > News (1.00)
Government (1.00)
Banking & Finance > Economy (1.00)
Banking & Finance > Trading (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis

Li, Daoyang, Jin, Mingyu, Zeng, Qingcheng, Zhao, Haiyan, Du, Mengnan

arXiv.org Artificial IntelligenceSep-22-2024

Probing techniques for large language models (LLMs) have primarily focused on English, overlooking the vast majority of the world's languages. In this paper, we extend these probing methods to a multilingual context, investigating the behaviors of LLMs across diverse languages. We conduct experiments on several open-source LLM models, analyzing probing accuracy, trends across layers, and similarities between probing vectors for multiple languages. Our key findings reveal: (1) a consistent performance gap between high-resource and low-resource languages, with high-resource languages achieving significantly higher probing accuracy; (2) divergent layer-wise accuracy trends, where high-resource languages show substantial improvement in deeper layers similar to English; and (3) higher representational similarities among high-resource languages, with low-resource languages demonstrating lower similarities both among themselves and with high-resource languages. These results highlight significant disparities in LLMs' multilingual capabilities and emphasize the need for improved modeling of low-resource languages.

english, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2409.14459

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Add feedback

Distributed Invariant Kalman Filter for Object-level Multi-robot Pose SLAM

Li, Haoying, Zeng, Qingcheng, Li, Haoran, Zhang, Yanglin, Wu, Junfeng

arXiv.org Artificial IntelligenceSep-14-2024

Cooperative localization and target tracking are essential for multi-robot systems to implement high-level tasks. To this end, we propose a distributed invariant Kalman filter based on covariance intersection for effective multi-robot pose estimation. The paper utilizes the object-level measurement models, which have condensed information further reducing the communication burden. Besides, by modeling states on special Lie groups, the better linearity and consistency of the invariant Kalman filter structure can be stressed. We also use a combination of CI and KF to avoid overly confident or conservative estimates in multi-robot systems with intricate and unknown correlations, and some level of robot degradation is acceptable through multi-robot collaboration. The simulation and real data experiment validate the practicability and superiority of the proposed algorithm.

artificial intelligence, estimation, robot, (17 more...)

arXiv.org Artificial Intelligence

2409.0941

Country: Asia > China (0.46)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models

Zeng, Qingcheng, Jin, Mingyu, Yu, Qinkai, Wang, Zhenting, Hua, Wenyue, Zhou, Zihao, Sun, Guangyan, Meng, Yanda, Ma, Shiqing, Wang, Qifan, Juefei-Xu, Felix, Ding, Kaize, Yang, Fan, Tang, Ruixiang, Zhang, Yongfeng

arXiv.org Artificial IntelligenceJul-16-2024

Large Language Models (LLMs) are employed across various high-stakes domains, where the reliability of their outputs is crucial. One commonly used method to assess the reliability of LLMs' responses is uncertainty estimation, which gauges the likelihood of their answers being correct. While many studies focus on improving the accuracy of uncertainty estimations for LLMs, our research investigates the fragility of uncertainty estimation and explores potential attacks. We demonstrate that an attacker can embed a backdoor in LLMs, which, when activated by a specific trigger in the input, manipulates the model's uncertainty without affecting the final output. Specifically, the proposed backdoor attack method can alter an LLM's output probability distribution, causing the probability distribution to converge towards an attacker-predefined distribution while ensuring that the top-1 prediction remains unchanged. Our experimental results demonstrate that this attack effectively undermines the model's self-evaluation reliability in multiple-choice questions. For instance, we achieved a 100 attack success rate (ASR) across three different triggering strategies in four models. Further, we investigate whether this manipulation generalizes across different prompts and domains. This work highlights a significant threat to the reliability of LLMs and underscores the need for future defenses against such attacks. The code is available at https://github.com/qcznlp/uncertainty_attack.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2407.11282

Country:

Asia (0.68)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback