AITopics | Chen, Hanjie

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Sui, Yang, Chuang, Yu-Neng, Wang, Guanchu, Zhang, Jiamu, Zhang, Tianyi, Yuan, Jiayi, Liu, Hongyi, Wen, Andrew, Zhong, Shaochen, Chen, Hanjie, Hu, Xia

arXiv.org Artificial Intelligence, Mar-20-2025

Large Language Models (LLMs) have demonstrated remarkable capabilities in complex tasks. Recent advancements in Large Reasoning Models (LRMs), such as OpenAI o1 and DeepSeek-R1, have further improved performance in System-2 reasoning domains like mathematics and programming by harnessing supervised fine-tuning (SFT) and reinforcement learning (RL) techniques to enhance the Chain-of-Thought (CoT) reasoning. However, while longer CoT reasoning sequences improve performance, they also introduce significant computational overhead due to verbose and redundant outputs, known as the "overthinking phenomenon". In this paper, we provide the first structured survey to systematically investigate and explore the current progress toward achieving efficient reasoning in LLMs. Overall, relying on the inherent mechanism of LLMs, we categorize existing works into several key directions: (1) model-based efficient reasoning, which considers optimizing full-length reasoning models into more concise reasoning models or directly training efficient reasoning models; (2) reasoning output-based efficient reasoning, which aims to dynamically reduce reasoning steps and length during inference; (3) input prompts-based efficient reasoning, which seeks to enhance reasoning efficiency based on input prompt properties such as difficulty or length control. Additionally, we introduce the use of efficient data for training reasoning models, explore the reasoning capabilities of small language models, and discuss evaluation methods and benchmarking.

large language model, machine learning, natural language, (5 more...)

arXiv.org Artificial Intelligence

2503.16419

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Rethinking Diverse Human Preference Learning through Principal Component Analysis

Luo, Feng, Yang, Rui, Sun, Hao, Deng, Chunyuan, Yao, Jiarui, Shen, Jingyan, Zhang, Huan, Chen, Hanjie

arXiv.org Artificial Intelligence, Feb-18-2025

Understanding human preferences is crucial for improving foundation models and building personalized AI systems. However, preferences are inherently diverse and complex, making it difficult for traditional reward models to capture their full range. While fine-grained preference data can help, collecting it is expensive and hard to scale. In this paper, we introduce Decomposed Reward Models (DRMs), a novel approach that extracts diverse human preferences from binary comparisons without requiring fine-grained annotations. Our key insight is to represent human preferences as vectors and analyze them using Principal Component Analysis (PCA). By constructing a dataset of embedding differences between preferred and rejected responses, DRMs identify orthogonal basis vectors that capture distinct aspects of preference. These decomposed rewards can be flexibly combined to align with different user needs, offering an interpretable and scalable alternative to traditional reward models. We demonstrate that DRMs effectively extract meaningful preference dimensions (e.g., helpfulness, safety, humor) and adapt to new users without additional training. Our results highlight DRMs as a powerful framework for personalized and interpretable LLM alignment.
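The decomposition described in this abstract can be illustrated with a minimal sketch. It assumes response embeddings are already available as fixed-length vectors; the function names `decompose_preferences` and `combined_reward`, the mean-centering step, and the random toy data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def decompose_preferences(chosen, rejected, k):
    """Return the top-k principal directions of (chosen - rejected) embedding differences."""
    diffs = np.asarray(chosen, dtype=float) - np.asarray(rejected, dtype=float)
    # Centering is an assumption of this sketch; the abstract does not specify it.
    centered = diffs - diffs.mean(axis=0)
    # Rows of vt are orthonormal principal directions, sorted by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

def combined_reward(embedding, basis, weights):
    """Score one response as a weighted sum of its projections onto the basis."""
    projections = basis @ np.asarray(embedding, dtype=float)
    return float(np.asarray(weights, dtype=float) @ projections)

# Toy data standing in for real embeddings (hypothetical: 200 pairs, 8 dimensions).
rng = np.random.default_rng(0)
chosen = rng.normal(size=(200, 8))
rejected = rng.normal(size=(200, 8))

basis = decompose_preferences(chosen, rejected, k=3)
score = combined_reward(chosen[0], basis, weights=[1.0, 0.5, 0.0])
```

Changing `weights` re-mixes the same fixed directions for a different user, which is one way to read "adapt to new users without additional training" in the abstract.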

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2502.13131

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.60)

COPU: Conformal Prediction for Uncertainty Quantification in Natural Language Generation

Wang, Sean, Jiang, Yicheng, Tang, Yuxin, Cheng, Lu, Chen, Hanjie

arXiv.org Artificial Intelligence, Feb-18-2025

Uncertainty Quantification (UQ) for Natural Language Generation (NLG) is crucial for assessing the performance of Large Language Models (LLMs), as it reveals confidence in predictions, identifies failure modes, and gauges output reliability. Conformal Prediction (CP), a model-agnostic method that generates prediction sets with a specified error rate, has been adopted for UQ in classification tasks, where the size of the prediction set indicates the model's uncertainty. However, when adapting CP to NLG, the sampling-based method for generating candidate outputs cannot guarantee the inclusion of the ground truth, limiting its applicability across a wide range of error rates. To address this, we propose COPU, a method that explicitly adds the ground truth to the candidate outputs and uses logit scores to measure nonconformity. Our experiments with six LLMs on four NLG tasks show that COPU outperforms baseline methods in calibrating error rates and empirical cover rates, offering accurate UQ across a wide range of user-specified error rates.
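For readers unfamiliar with conformal prediction, the following is a minimal sketch of the generic split-conformal recipe the abstract builds on. It is not the paper's method: the nonconformity scores here are made-up numbers rather than logit-based scores, and the names `conformal_threshold` and `prediction_set` are hypothetical.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this error rate: accept every candidate.
        return math.inf
    return sorted(cal_scores)[rank - 1]

def prediction_set(candidate_scores, threshold):
    """Keep every candidate whose nonconformity score does not exceed the threshold."""
    return [c for c, s in candidate_scores.items() if s <= threshold]

# Hypothetical nonconformity scores of the true answers on a calibration set.
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80]
threshold = conformal_threshold(cal_scores, alpha=0.2)

# Hypothetical scores for candidate outputs on a new question.
candidates = {"Paris": 0.10, "Lyon": 0.65, "Rome": 0.95}
result = prediction_set(candidates, threshold)
```

A larger `result` signals higher uncertainty. The set can only contain the correct answer if that answer is among the candidates, which is the gap the abstract says sampling-based candidate generation leaves open.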

large language model, natural language, prediction, (16 more...)

arXiv.org Artificial Intelligence

2502.12601

Country:

North America > United States (1.00)
Europe (1.00)
Africa (0.68)
North America > Canada > Nunavut (0.28)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment (1.00)
Government (0.68)
Automobiles & Trucks > Manufacturer (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)

Personality Structured Interview for Large Language Model Simulation in Personality Research

Wang, Pengda, Zou, Huiqi, Chen, Hanjie, Sun, Tianjun, Xiao, Ziang, Oswald, Frederick L.

arXiv.org Artificial Intelligence, Feb-17-2025

Although psychometrics researchers have recently explored the use of large language models (LLMs) as proxies for human participants, LLMs often fail to generate heterogeneous data with human-like diversity, which diminishes their value in advancing social science research. To address these challenges, we explored the potential of the theory-informed Personality Structured Interview (PSI) as a tool for simulating human responses in personality research. In this approach, the simulation is grounded in nuanced real-human interview transcripts that target the personality construct of interest. We have provided a growing set of 357 structured interview transcripts from a representative sample, each containing an individual's response to 32 open-ended questions carefully designed to gather theory-based personality evidence. Additionally, grounded in psychometric research, we have summarized an evaluation framework to systematically validate LLM-generated psychometric data. Results from three experiments demonstrate that well-designed structured interviews could improve human-like heterogeneity in LLM-simulated personality data and predict personality-related behavioral outcomes (i.e., organizational citizenship behaviors and counterproductive work behavior). We further discuss the role of theory-informed structured interviews in LLM-based simulation and outline a general framework for designing structured interviews to simulate human-like data for psychometric research.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.12109

Genre: Personal > Interview (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

SAFR: Neuron Redistribution for Interpretability

Chang, Ruidi, Deng, Chunyuan, Chen, Hanjie

arXiv.org Artificial Intelligence, Feb-10-2025

Superposition refers to encoding representations of multiple features within a single neuron, which is common in deep neural networks. This property allows neurons to combine and represent multiple features, enabling the model to capture intricate information and handle complex tasks. Despite promising performance, the model's interpretability has been diminished. This paper presents a novel approach to enhance model interpretability by regularizing feature superposition. We introduce SAFR, which simply applies regularizations to the loss function to promote monosemantic representations for important tokens while encouraging polysemanticity for correlated token pairs, where important tokens and correlated token pairs are identified via VMASK and attention weights respectively. We evaluate SAFR with a transformer model on two classification tasks. Experiments demonstrate the effectiveness of SAFR in improving model interpretability without compromising prediction performance. Besides, SAFR provides explanations by visualizing the neuron allocation within the intermediate layers.

interpretability, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2501.16374

Country: North America > United States > Oregon (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

ECG-guided individual identification via PPG

Wei, Riling, Chen, Hanjie, Yao, Kelu, Yang, Chuanguang, Wang, Jun, Li, Chao

arXiv.org Artificial Intelligence, Dec-30-2024

Photoplethysmography (PPG)-based individual identification, which aims to recognize humans via intrinsic cardiovascular activities, has attracted extensive attention due to its high security and resistance to mimicry. However, this kind of technology yields unpromising results because of its low information density. To this end, electrocardiogram (ECG) signals are introduced as a novel modality to enhance the density of input information. Specifically, a novel cross-modal knowledge distillation framework is implemented to propagate discriminative knowledge from the ECG modality to the PPG modality without incurring additional computational demands at the inference phase. Furthermore, to ensure efficient knowledge propagation, Contrastive Language-Image Pre-training (CLIP)-based knowledge alignment and cross-knowledge assessment modules are proposed. Comprehensive experiments show that our framework outperforms the baseline model, improving overall accuracy by 2.8% and 3.0% on seen- and unseen-individual recognition, respectively.

artificial intelligence, knowledge distillation, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2501.01983

Country: Asia > China (0.49)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Political-LLM: Large Language Models in Political Science

Li, Lincan, Li, Jiaqi, Chen, Catherine, Gui, Fred, Yang, Hongjia, Yu, Chenxiao, Wang, Zhengguang, Cai, Jianing, Zhou, Junlong Aaron, Shen, Bolin, Qian, Alex, Chen, Weixin, Xue, Zhongkai, Sun, Lichao, He, Lifang, Chen, Hanjie, Ding, Kaize, Du, Zijian, Mu, Fangzhou, Pei, Jiaxin, Zhao, Jieyu, Swayamdipta, Swabha, Neiswanger, Willie, Wei, Hua, Hu, Xiyang, Zhu, Shixiang, Chen, Tianlong, Lu, Yingzhou, Shi, Yang, Qin, Lianhui, Fu, Tianfan, Tu, Zhengzhong, Yang, Yuzhe, Yoo, Jaemin, Zhang, Jiaheng, Rossi, Ryan, Zhan, Liang, Zhao, Liang, Ferrara, Emilio, Liu, Yan, Huang, Furong, Zhang, Xiangliang, Rothenberg, Lawrence, Ji, Shuiwang, Yu, Philip S., Zhao, Yue, Dong, Yushun

arXiv.org Artificial Intelligence, Dec-9-2024

In recent years, large language models (LLMs) have been widely adopted in political science tasks such as election prediction, sentiment analysis, policy impact assessment, and misinformation detection. Meanwhile, the need to systematically understand how LLMs can further revolutionize the field also becomes urgent. In this work, we--a multidisciplinary team of researchers spanning computer science and political science--present the first principled framework termed Political-LLM to advance the comprehensive understanding of integrating LLMs into computational political science. Specifically, we first introduce a fundamental taxonomy classifying the existing explorations into two perspectives: political science and computational methodologies. In particular, from the political science perspective, we highlight the role of LLMs in automating predictive and generative tasks, simulating behavior dynamics, and improving causal inference through tools like counterfactual generation; from a computational perspective, we introduce advancements in data preparation, fine-tuning, and evaluation methods for LLMs that are tailored to political contexts. We identify key challenges and future directions, emphasizing the development of domain-specific datasets, addressing issues of bias and fairness, incorporating human expertise, and redefining evaluation criteria to align with the unique requirements of computational political science. Political-LLM seeks to serve as a guidebook for researchers to foster an informed, ethical, and impactful use of Artificial Intelligence in political science. Our online resource is available at: http://political-llm.org/. 
Corresponding authors: Yushun Dong (yd24f@fsu.edu) is with the Department of Computer Science, Florida State University; Yue Zhao (yzhao010@usc.edu) is with the Department of Computer Science, University of Southern California; Fred Gui (pgui@lsu.edu) is with the Department of Political Science, Louisiana State University; Catherine Chen (catherinechen@lsu.edu) is with the Manship School of Mass Communication and the Department of Political Science, Louisiana State University.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2412.06864

Country:

North America > United States > California (0.86)
North America > United States > Louisiana > East Baton Rouge Parish > Baton Rouge (0.44)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
(2 more...)

Industry:

Media > News (1.00)
Law (1.00)
Health & Medicine (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models

Xia, Haotian, Yang, Zhengbang, Zou, Junbo, Tracy, Rhys, Wang, Yuqing, Lu, Chi, Lai, Christopher, He, Yanjun, Shao, Xun, Xie, Zhuoqing, Wang, Yuan-fang, Shen, Weining, Chen, Hanjie

arXiv.org Artificial Intelligence, Dec-3-2024

Multimodal Large Language Models (MLLMs) are advancing the ability to reason about complex sports scenarios by integrating textual and visual information. To comprehensively evaluate their capabilities, we introduce SPORTU, a benchmark designed to assess MLLMs across multi-level sports reasoning tasks. SPORTU comprises two key components. SPORTU-text features 900 multiple-choice questions with human-annotated explanations for rule comprehension and strategy understanding; it tests models' ability to reason about sports solely through question answering (QA), without requiring visual inputs. SPORTU-video consists of 1,701 slow-motion video clips across 7 different sports and 12,048 QA pairs, designed to assess multi-level reasoning, from simple sports recognition to complex tasks like foul detection and rule application. We evaluate four prevalent LLMs on SPORTU-text, mainly using few-shot learning supplemented by chain-of-thought (CoT) prompting. GPT-4o achieves the highest accuracy of 71%, but still falls short of human-level performance, highlighting room for improvement in rule comprehension and reasoning. The evaluation for the SPORTU-video part includes 7 proprietary and 6 open-source MLLMs. Experiments show that models fall short on hard tasks that require deep reasoning and rule-based understanding. Claude-3.5-Sonnet performs the best with only 52.6% accuracy on the hard task, showing large room for improvement. We hope that SPORTU will serve as a critical step toward evaluating models' capabilities in sports understanding and reasoning.

explanation, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.08474

Country: North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Leisure & Entertainment > Sports > Football (1.00)
Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MiCEval: Unveiling Multimodal Chain of Thought's Quality via Image Description and Reasoning Steps

Zhou, Xiongtao, He, Jie, Chen, Lanyu, Li, Jingyu, Chen, Haojing, Gutiérrez-Basulto, Víctor, Pan, Jeff Z., Chen, Hanjie

arXiv.org Artificial Intelligence, Nov-16-2024

Multimodal Chain of Thought (MCoT) is a popular prompting strategy for improving the performance of multimodal large language models (MLLMs) across a range of complex reasoning tasks. Despite its popularity, there is a notable absence of automated methods for evaluating the quality of reasoning steps in MCoT. To address this gap, we propose Multimodal Chain-of-Thought Evaluation (MiCEval), a framework designed to assess the correctness of reasoning chains by evaluating the quality of both the description and each reasoning step. The evaluation of the description component focuses on the accuracy of the image descriptions, while the reasoning step evaluates the quality of each step as it is conditionally generated based on the preceding steps. MiCEval is built upon a fine-grained dataset with annotations that rate each step according to correctness, relevance, and informativeness. Extensive experiments on four state-of-the-art MLLMs show that step-wise evaluations using MiCEval align more closely with human judgments compared to existing methods based on cosine similarity or fine-tuning approaches. MiCEval datasets and code can be found in https://github.com/alenai97/MiCEval.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.14668

Country:

North America > United States (0.46)
Asia > Middle East (0.28)

Genre:

Workflow (1.00)
Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(2 more...)

Language Models are Symbolic Learners in Arithmetic

Deng, Chunyuan, Li, Zhiqi, Xie, Roy, Chang, Ruidi, Chen, Hanjie

arXiv.org Artificial Intelligence, Oct-20-2024

Large Language Models (LLMs) are thought to struggle with arithmetic learning due to the inherent differences between language modeling and numerical computation, but concrete evidence has been lacking. This work responds to this claim through a two-sided experiment. We first investigate whether LLMs leverage partial products during arithmetic learning. We find that although LLMs can identify some partial products after learning, they fail, conversely, to leverage them for arithmetic tasks. We then explore how LLMs approach arithmetic symbolically by breaking tasks into subgroups, hypothesizing that difficulties arise from subgroup complexity and selection. Our results show that when subgroup complexity is fixed, LLMs treat a collection of different arithmetic operations similarly. By analyzing position-level accuracy across different training sizes, we further observe that it follows a U-shaped pattern: LLMs quickly learn the easiest patterns at the first and last positions, while progressively learning the more difficult patterns in the middle positions. This suggests that LLMs select subgroups following an easy-to-hard paradigm during learning. Our work confirms that LLMs are pure symbolic learners in arithmetic tasks and underscores the importance of understanding them deeply through subgroup-level quantification.
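As a reminder of what "partial products" means here, the sketch below splits a multiplication the way the standard schoolbook algorithm does, one term per decimal digit of the second operand. This is only one possible decomposition and may differ from the ones examined in the paper; the helper name `partial_products` is hypothetical, and it assumes a non-negative integer `b`.

```python
def partial_products(a, b):
    """Split a * b into one partial product per decimal digit of b (b >= 0)."""
    parts = []
    # Walk the digits of b from least to most significant.
    for place, digit in enumerate(reversed(str(b))):
        parts.append(a * int(digit) * 10 ** place)
    return parts

parts = partial_products(47, 36)
print(parts, sum(parts))  # [282, 1410] 1692
```

The terms sum to the full product, so a model that actually used them would need to both produce each term and add them correctly.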

large language model, machine learning, multiplication, (18 more...)

arXiv.org Artificial Intelligence

2410.1558

Country: Europe (0.46)

Genre: Research Report > New Finding (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
