AITopics | Zhou, Dian

The Reasoning-Memorization Interplay in Language Models Is Mediated by a Single Direction

Hong, Yihuai, Zhou, Dian, Cao, Meng, Yu, Lei, Jin, Zhijing

arXiv.org Artificial Intelligence, Mar-29-2025

Large language models (LLMs) excel on a variety of reasoning benchmarks, but previous studies suggest they sometimes struggle to generalize to unseen questions, potentially due to over-reliance on memorized training examples. However, the precise conditions under which LLMs switch between reasoning and memorization during text generation remain unclear. In this work, we provide a mechanistic understanding of LLMs' reasoning-memorization dynamics by identifying a set of linear features in the model's residual stream that govern the balance between genuine reasoning and memory recall. These features not only distinguish reasoning tasks from memory-intensive ones but can also be manipulated to causally influence model performance on reasoning tasks. Additionally, we show that intervening in these reasoning features helps the model more accurately activate the most relevant problem-solving capabilities during answer generation. Our findings offer new insights into the underlying mechanisms of reasoning and memory in LLMs and pave the way for the development of more robust and interpretable generative AI systems.
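The abstract above refers to linear features in the residual stream that separate reasoning from memory recall and can be manipulated. A minimal sketch of one common way to obtain and use such a linear feature is a difference-of-means direction; this is an assumption for illustration, not necessarily the procedure used in the paper. The activations here are synthetic, and the names `mean_difference_direction`, `steer`, and `ablate` are hypothetical.

```python
import numpy as np

def mean_difference_direction(acts_a, acts_b):
    """Unit vector pointing from the mean of acts_b to the mean of acts_a."""
    diff = acts_a.mean(axis=0) - acts_b.mean(axis=0)
    return diff / np.linalg.norm(diff)

def steer(acts, direction, alpha):
    """Shift every activation vector by alpha along the unit direction."""
    return acts + alpha * direction

def ablate(acts, direction):
    """Remove the component of every activation along the unit direction."""
    return acts - np.outer(acts @ direction, direction)

# Synthetic stand-ins for residual-stream activations (assumption: a real
# experiment would collect these from a model on two kinds of prompts).
rng = np.random.default_rng(0)
d_model = 16
true_dir = np.zeros(d_model)
true_dir[0] = 1.0
reasoning_acts = rng.normal(size=(200, d_model)) + 3.0 * true_dir
memory_acts = rng.normal(size=(200, d_model)) - 3.0 * true_dir

direction = mean_difference_direction(reasoning_acts, memory_acts)

# Projections onto the direction separate the two synthetic groups.
reasoning_proj = reasoning_acts @ direction
memory_proj = memory_acts @ direction

# Steering moves the projection by alpha; ablation sets it to zero.
steered_proj = steer(memory_acts, direction, 2.0) @ direction
ablated_proj = ablate(memory_acts, direction) @ direction
```

In this toy setup the recovered direction is close to the planted axis, which is the property a real analysis would need to verify on held-out prompts.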

large language model, machine learning, natural language, (17 more...)

arXiv:2503.23084

Country:

Asia (0.68)
North America > Canada (0.46)
Europe > Germany (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Projection based Active Gaussian Process Regression for Pareto Front Modeling

Gao, Zhengqi, Tao, Jun, Su, Yangfeng, Zhou, Dian, Zeng, Xuan

arXiv.org Machine Learning, Jan-20-2020

Pareto front (PF) modeling is essential in decision-making problems across domains such as economics, medicine, and engineering. In the operations research literature, this task has been addressed with multi-objective optimization algorithms. However, because these methods do not learn a model of the PF, they cannot check whether a newly provided point lies on the PF. In this paper, we reconsider the task from a data mining perspective. A novel projection-based active Gaussian process regression (P-aGPR) method is proposed for efficient PF modeling. First, P-aGPR chooses a series of projection spaces with dimensionalities ranging from low to high. Next, in each projection space, a Gaussian process regression (GPR) model is trained to represent the constraint that the PF should satisfy in that space. Moreover, to improve modeling efficacy and stability, an active learning framework is developed that exploits the uncertainty information obtained from the GPR models. Unlike all existing methods, the proposed P-aGPR method not only provides a generative PF model but can also quickly check whether a provided point lies on the PF. Numerical results demonstrate that, compared to state-of-the-art passive learning methods, the proposed P-aGPR method achieves higher modeling accuracy and stability.
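The active learning idea in the abstract above (query where the GPR model is most uncertain, then use the fitted model to check whether a point lies on the front) can be sketched as follows. This is plain uncertainty sampling with a single one-dimensional GP and omits the projection spaces entirely, so it is not P-aGPR itself. The front `f2 = (1 - f1)**2`, the kernel length scale, the tolerance, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    sq = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP with unit prior variance."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    var = 1.0 - np.sum(k_qt * np.linalg.solve(k_tt, k_qt.T).T, axis=1)
    return mean, np.maximum(var, 0.0)

def front(f1):
    """Hypothetical two-objective Pareto front used as ground truth."""
    return (1.0 - f1) ** 2

candidates = np.linspace(0.0, 1.0, 101)
x_train = np.array([0.0, 1.0])
y_train = front(x_train)

# Active learning loop: evaluate the candidate with the largest variance.
for _ in range(6):
    _, var = gp_posterior(x_train, y_train, candidates)
    x_new = candidates[np.argmax(var)]
    x_train = np.append(x_train, x_new)
    y_train = np.append(y_train, front(x_new))

mean, var = gp_posterior(x_train, y_train, candidates)
max_error = np.max(np.abs(mean - front(candidates)))

def lies_on_front(f1, f2, tol=0.05):
    """Check a point against the learned model of the front (tol is arbitrary)."""
    predicted, _ = gp_posterior(x_train, y_train, np.array([f1]))
    return bool(abs(f2 - predicted[0]) <= tol)
```

A call such as `lies_on_front(0.5, 0.25)` tests a point on the assumed front, while `lies_on_front(0.5, 0.9)` tests one far from it. A passive baseline would replace the `argmax` step with random or evenly spaced sampling.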

artificial intelligence, optimization problem, pf point, (17 more...)

arXiv:2001.07072

Country: North America > United States (0.46)

Genre: Research Report (0.70)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
