AITopics

2503.23383

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.47)

FReM: A Flexible Reasoning Mechanism for Balancing Quick and Slow Thinking in Long-Context Question Answering

Zhao, Zhengyi, Zhang, Shubo, Wang, Zezhong, Liang, Bin, Li, Binyang, Wong, Kam-Fai

Long-context question-answering (LCQA) systems have greatly benefited from the powerful reasoning capabilities of large language models (LLMs), which can be categorized into slow and quick reasoning modes. However, both modes have their limitations. Slow thinking tends to explore every possible reasoning path, which leads to heavy overthinking and wastes time. Quick thinking usually relies on pattern matching rather than truly understanding the query logic, so it can miss what the question actually asks. To address these issues, we propose FReM: Flexible Reasoning Mechanism, a method that adjusts reasoning depth according to the complexity of each question. Specifically, FReM leverages synthetic reference QA examples to provide an explicit chain of thought, enabling efficient handling of simple queries while allowing deeper reasoning for more complex ones. By doing so, FReM helps quick-thinking models move beyond superficial pattern matching and narrows the reasoning space for slow-thinking models to avoid unnecessary exploration. Experiments on seven QA datasets show that FReM improves reasoning accuracy and scalability, particularly for complex multihop questions, indicating its potential to advance LCQA methodologies.

large language model, machine learning, natural language

2503.22985

Country:

Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)

Genre: Research Report (0.82)

Industry:

Education (0.68)
Media (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?

Liang, Tuo, Hu, Zhe, Li, Jing, Zhang, Hao, Lu, Yiren, Zhou, Yunlai, Qiao, Yiran, Liu, Disheng, Peng, Jeirui, Ma, Jing, Yin, Yu

Understanding humor, particularly when it involves complex, contradictory narratives that require comparative reasoning, remains a significant challenge for large vision-language models (VLMs). This limitation hinders AI's ability to engage in human-like reasoning and cultural expression. In this paper, we investigate this challenge through an in-depth analysis of comics that juxtapose panels to create humor through contradictions. We introduce YesBut (V2), a novel benchmark with 1,262 comic images from diverse multilingual and multicultural contexts, featuring comprehensive annotations that capture various aspects of narrative understanding. Using this benchmark, we systematically evaluate a wide range of VLMs through four complementary tasks spanning from surface content comprehension to deep narrative reasoning, with particular emphasis on comparative reasoning between contradictory elements. Our extensive experiments reveal that even the most advanced models significantly underperform compared to humans, with common failures in visual perception, key element identification, comparative analysis and hallucinations. We further investigate text-based training strategies and social knowledge augmentation methods to enhance model performance. Our findings not only highlight critical weaknesses in VLMs' understanding of cultural and creative expressions but also provide pathways toward developing context-aware models capable of deeper narrative understanding through comparative reasoning.

large language model, machine learning, natural language

2503.23137

Country:

North America > United States > New York (0.04)
Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Memory-Aware and Uncertainty-Guided Retrieval for Multi-Hop Question Answering

Ji, Yuelyu, Meng, Rui, Li, Zhuochun, He, Daqing

Multi-hop question answering (QA) requires models to retrieve and reason over multiple pieces of evidence. While Retrieval-Augmented Generation (RAG) has made progress in this area, existing methods often suffer from two key limitations: (1) fixed or overly frequent retrieval steps, and (2) ineffective use of previously retrieved knowledge. We propose MIND (Memory-Informed and INteractive Dynamic RAG), a framework that addresses these challenges through: (i) prompt-based entity extraction to identify reasoning-relevant elements, (ii) dynamic retrieval triggering based on token-level entropy and attention signals, and (iii) memory-aware filtering, which stores high-confidence facts across reasoning steps to enable consistent multi-hop generation.
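The entropy-based trigger in item (ii) of the abstract above can be illustrated with a minimal sketch. It is not the MIND implementation: it covers only the token-level entropy signal (the attention signal is omitted), and the function names and the `threshold` value are hypothetical choices made for this example.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_retrieve(step_distributions, threshold=1.0):
    """Return True if any generated token's entropy exceeds the threshold.

    `threshold` is a hypothetical tuning knob, not a value from the paper.
    """
    return any(token_entropy(p) > threshold for p in step_distributions)


# One peaked distribution (model is confident) and one flat distribution
# (model is uncertain), each over a toy four-token vocabulary.
confident = [[0.97, 0.01, 0.01, 0.01]]
uncertain = [[0.25, 0.25, 0.25, 0.25]]

print(should_retrieve(confident))
print(should_retrieve(uncertain))
```

Under these assumptions, retrieval is skipped while the model's next-token distributions stay peaked and is triggered only when a distribution flattens, which is one way to avoid fixed or overly frequent retrieval steps.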

large language model, machine learning, natural language

2503.23095

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Heilongjiang Province > Daqing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Efficient Inference for Large Reasoning Models: A Survey

Liu, Yue, Wu, Jiaying, He, Yufei, Gao, Hongcheng, Chen, Hongyu, Bi, Baolong, Zhang, Jiaheng, Huang, Zhiqi, Hooi, Bryan

Large Reasoning Models (LRMs) significantly improve the reasoning ability of Large Language Models (LLMs) by learning to reason, exhibiting promising performance in complex task-solving. However, their deliberative reasoning process leads to inefficiencies in token usage, memory consumption, and inference time. Thus, this survey provides a review of efficient inference methods designed specifically for LRMs, focusing on mitigating token inefficiency while preserving the reasoning quality. First, we introduce a taxonomy to group the recent methods into two main categories: (a) explicit compact Chain-of-Thought (CoT), which reduces tokens while keeping the explicit reasoning structure, and (b) implicit latent CoT, which encodes reasoning steps within hidden representations instead of explicit tokens. Meanwhile, we discuss their strengths and weaknesses. Then, we conduct empirical analyses on existing methods from performance and efficiency aspects. Besides, we present open challenges in this field, including human-centric controllable reasoning, trade-off between interpretability and efficiency of reasoning, ensuring safety of efficient reasoning, and broader applications of efficient reasoning. In addition, we highlight key insights for enhancing LRMs' inference efficiency via techniques such as model merging, new architectures, and agent routers. We hope this work serves as a valuable guide, helping researchers overcome challenges in this vibrant field. Repository: https://github.com/yueliu1999/Awesome-Efficient-Inference-for-LRMs

arxiv preprint arxiv, large language model, machine learning

2503.23077

Country:

Asia > Singapore (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)

arXiv.org Artificial Intelligence, Mar-28-2025

Reasoning of Large Language Models over Knowledge Graphs with Super-Relations

Wang, Song, Lin, Junhong, Guo, Xiaojie, Shun, Julian, Li, Jundong, Zhu, Yada

While large language models (LLMs) have made significant progress in processing and reasoning over knowledge graphs, current methods suffer from a high non-retrieval rate. This limitation reduces the accuracy of answering questions based on these graphs. Our analysis reveals that the combination of greedy search and forward reasoning is a major contributor to this issue. To overcome these challenges, we introduce the concept of super-relations, which enables both forward and backward reasoning by summarizing and connecting various relational paths within the graph. This holistic approach not only expands the search space, but also significantly improves retrieval efficiency. In this paper, we propose the ReKnoS framework, which aims to Reason over Knowledge Graphs with Super-Relations. Our framework's key advantages include the inclusion of multiple relation paths through super-relations, enhanced forward and backward reasoning capabilities, and increased efficiency in querying LLMs. These enhancements collectively lead to a substantial improvement in the successful retrieval rate and overall reasoning performance. We conduct extensive experiments on nine real-world datasets to evaluate ReKnoS, and the results demonstrate the superior performance of ReKnoS over existing state-of-the-art baselines, with an average accuracy gain of 2.92%.

large language model, machine learning, relation

2503.22166

Country:

North America > United States > Virginia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Qu, Xiaoye, Li, Yafu, Su, Zhaochen, Sun, Weigao, Yan, Jianhao, Liu, Dongrui, Cui, Ganqu, Liu, Daizong, Liang, Shuxian, He, Junxian, Li, Peng, Wei, Wei, Shao, Jing, Lu, Chaochao, Zhang, Yue, Hua, Xian-Sheng, Zhou, Bowen, Cheng, Yu

Recent Large Reasoning Models (LRMs), such as DeepSeek-R1 and OpenAI o1, have demonstrated strong performance gains by scaling up the length of Chain-of-Thought (CoT) reasoning during inference. However, a growing concern lies in their tendency to produce excessively long reasoning traces, which are often filled with redundant content (e.g., repeated definitions), over-analysis of simple problems, and superficial exploration of multiple reasoning paths for harder tasks. This inefficiency introduces significant challenges for training, inference, and real-world deployment (e.g., in agent-based systems), where token economy is critical. In this survey, we provide a comprehensive overview of recent efforts aimed at improving reasoning efficiency in LRMs, with a particular focus on the unique challenges that arise in this new paradigm. We identify common patterns of inefficiency, examine methods proposed across the LRM lifecycle, i.e., from pretraining to inference, and discuss promising future directions for research. To support ongoing development, we also maintain a real-time GitHub repository tracking recent progress in the field. We hope this survey serves as a foundation for further exploration and inspires innovation in this rapidly evolving area.

arxiv preprint arxiv, large language model, machine learning

2503.21614

Country:

Europe > Austria > Vienna (0.14)
North America > United States (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre: Overview > Growing Problem (0.34)

Industry:

Education (0.67)
Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)

CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models

Zhao, Qingqing, Lu, Yao, Kim, Moo Jin, Fu, Zipeng, Zhang, Zhuoyang, Wu, Yecheng, Li, Zhaoshuo, Ma, Qianli, Han, Song, Finn, Chelsea, Handa, Ankur, Liu, Ming-Yu, Xiang, Donglai, Wetzstein, Gordon, Lin, Tsung-Yi

Vision-language-action models (VLAs) have shown potential in leveraging pretrained vision-language models and diverse robot demonstrations for learning generalizable sensorimotor control. While this paradigm effectively utilizes large-scale data from both robotic and non-robotic sources, current VLAs primarily focus on direct input-output mappings, lacking the intermediate reasoning steps crucial for complex manipulation tasks. As a result, existing VLAs lack temporal planning or reasoning capabilities. In this paper, we introduce a method that incorporates explicit visual chain-of-thought (CoT) reasoning into vision-language-action models (VLAs) by predicting future image frames autoregressively as visual goals before generating a short action sequence to achieve these goals. We introduce CoT-VLA, a state-of-the-art 7B VLA that can understand and generate visual and action tokens. Our experimental results demonstrate that CoT-VLA achieves strong performance, outperforming the state-of-the-art VLA model by 17% in real-world manipulation tasks and 6% in simulation benchmarks. Project website: https://cot-vla.github.io/

arxiv preprint arxiv, machine learning, natural language

2503.2202

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Towards an intelligent assessment system for evaluating the development of algorithmic thinking skills: An exploratory study in Swiss compulsory schools

Adorni, Giorgia

artificial intelligence, development and implementation figure 7, machine learning

The rapid digitalisation of contemporary society has profoundly impacted various facets of our lives, including healthcare, communication, business, and education. The ability to engage with new technologies and solve problems has become crucial, making computational thinking (CT) skills, such as pattern recognition, decomposition, and algorithm design, essential competencies. In response, Switzerland is conducting research and initiatives to integrate CT into its educational system. This study aims to develop a comprehensive framework for large-scale assessment of CT skills, particularly focusing on algorithmic thinking (AT), the ability to design algorithms. To achieve this, we first developed a competence model capturing the situated and developmental nature of CT, guiding the design of activities tailored to cognitive abilities, age, and context. This framework clarifies how activity characteristics influence CT development and how to assess these competencies. Additionally, we developed an activity for large-scale assessment of AT skills, offered in two variants: one based on non-digital artefacts (unplugged) and manual expert assessment, and the other based on digital artefacts (virtual) and automatic assessment. To provide a more comprehensive evaluation of students' competencies, we developed an intelligent assessment system (IAS) based on Bayesian networks (BNs) with noisy gates, which offers real-time probabilistic assessment for each skill rather than a single overall score. The results indicate that the proposed instrument can measure AT competencies across different age groups and educational contexts in Switzerland, demonstrating its applicability for large-scale use. AT competencies exhibit a progressive development, with no overall gender differences, though variations are observed at the school level, significantly influenced by the artefact-based environment and its context, underscoring the importance of creating accessible and adaptable assessment tools.
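The abstract above does not say which kind of noisy gate the assessment system uses. As a rough illustration only, the sketch below assumes a noisy-OR gate, a common choice in Bayesian networks. The function name, the skill labels in the comments, and all probabilities are hypothetical and are not taken from the study.

```python
def noisy_or(active_parents, inhibitors, leak=0.0):
    """Probability that the effect occurs under a noisy-OR gate.

    active_parents[i]: whether parent i (e.g. a mastered skill) is active.
    inhibitors[i]: probability that parent i, although active, fails to
        produce the effect.
    leak: probability that the effect occurs with no active parent.
    """
    fail = 1.0 - leak
    for is_active, q in zip(active_parents, inhibitors):
        if is_active:
            fail *= q  # each active parent fails independently
    return 1.0 - fail


# Hypothetical example: three skills that could each lead to a correct
# answer; the student has mastered the first and the third.
p_correct = noisy_or([True, False, True], [0.2, 0.5, 0.4])
print(round(p_correct, 2))  # 0.92
```

A gate of this kind needs only one parameter per parent skill instead of a full conditional probability table, which keeps the network small enough to update while a student is working.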

2503.22756

Country:

Europe > Ireland (0.14)
North America > United States > California > San Francisco County > San Francisco (0.13)
Europe > Austria > Vienna (0.13)

Genre:

Workflow (1.00)
Summary/Review (1.00)
Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Government > Regional Government (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Software Engineering (1.00)
Information Technology > Human Computer Interaction (1.00)

ReFeed: Multi-dimensional Summarization Refinement with Reflective Reasoning on Feedback

Yun, Taewon, Oh, Jihwan, Min, Hyangsuk, Lee, Yuho, Bang, Jihwan, Cai, Jason, Song, Hwanjun

Summarization refinement faces challenges when extended to multiple dimensions. In this paper, we introduce ReFeed, a powerful summarization refinement pipeline that enhances multiple dimensions through reflective reasoning on feedback. To achieve this, we release SumFeed-CoT, a large-scale Long-CoT-based dataset optimized for training a lightweight model with reflective reasoning. Our experiments reveal how the number of dimensions, feedback exposure, and reasoning policy influence refinement performance, highlighting that reflective reasoning and simultaneously addressing multiple feedback are crucial for mitigating trade-offs between dimensions. Furthermore, ReFeed is robust to noisy feedback and feedback order. Lastly, our findings emphasize that creating data with a proper goal and guideline constitutes a fundamental pillar of effective reasoning. The dataset and model will be released.

large language model, machine learning, natural language

2503.21332

Country:

North America > United States (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)