AITopics

2509.21027

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

arXiv.org Artificial Intelligence, Sep-26-2025

Feature Augmentation of GNNs for ILPs: Local Uniqueness Suffices

Han, Qingyu, Li, Qian, Yang, Linxin, Chen, Qian, Shi, Qingjiang, Sun, Ruoyu

Integer Linear Programs (ILPs) are central to real-world optimizations but notoriously difficult to solve. Learning to Optimize (L2O) has emerged as a promising paradigm, with Graph Neural Networks (GNNs) serving as the standard backbone. However, standard anonymous GNNs are limited in expressiveness for ILPs, and the common enhancement of augmenting nodes with globally unique identifiers (UIDs) typically introduces spurious correlations that severely harm generalization. To address this tradeoff, we propose a parsimonious Local-UID scheme based on d-hop uniqueness coloring, which ensures identifiers are unique only within each node's d-hop neighborhood. Building on this scheme, we introduce ColorGNN, which incorporates color information via color-conditioned embeddings, and ColorUID, a lightweight feature-level variant. We prove that for d-layer networks, Local-UIDs achieve the expressive power of Global-UIDs while offering stronger generalization. Extensive experiments show that our approach (i) yields substantial gains on three ILP benchmarks, (ii) exhibits strong OOD generalization on linear programming datasets, and (iii) further improves a general graph-level task when paired with a state-of-the-art method.
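The d-hop uniqueness condition described above can be illustrated with a minimal sketch. It assumes the condition means that, for every node, all nodes within d hops of it (including the node itself) carry pairwise distinct colors. The greedy assignment and the function names `ball`, `local_uid_coloring`, and `is_locally_unique` are illustrative choices, not the construction used in the paper.

```python
from collections import deque

def ball(adj, source, d):
    """Return the set of nodes within d hops of source (BFS, source included)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == d:
            continue  # do not expand beyond d hops
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

def local_uid_coloring(adj, d):
    """Greedy coloring (an assumed heuristic) such that colors are
    pairwise distinct inside every node's d-hop ball."""
    conflicts = {v: set() for v in adj}
    for v in adj:
        members = ball(adj, v, d)
        for u in members:
            conflicts[u] |= members - {u}  # u must differ from all ball-mates
    colors = {}
    for v in adj:
        used = {colors[u] for u in conflicts[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors

def is_locally_unique(adj, colors, d):
    """Check the d-hop uniqueness condition for every node."""
    for v in adj:
        members = ball(adj, v, d)
        if len({colors[u] for u in members}) != len(members):
            return False
    return True

# Toy example: a path graph on 5 nodes with d = 1.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
path_colors = local_uid_coloring(path_graph, d=1)
```

On this toy path the sketch reuses colors between distant nodes, so fewer distinct identifiers are needed than with one globally unique identifier per node.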

artificial intelligence, machine learning, optimization problem, (18 more...)

2509.21

Country: Asia > China (0.47)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

arXiv.org Artificial Intelligence, Sep-26-2025

StyleBench: Evaluating thinking styles in Large Language Models

Guo, Junyu, Gu, Shangding, Jin, Ming, Spanos, Costas, Lavaei, Javad

The effectiveness of Large Language Models (LLMs) is heavily influenced by the reasoning strategies, or styles of thought, employed in their prompts. However, the interplay between these reasoning styles, model architecture, and task type remains poorly understood. To address this, we introduce StyleBench, a comprehensive benchmark for systematically evaluating reasoning styles across diverse tasks and models. We assess five representative reasoning styles--Chain-of-Thought (CoT), Tree-of-Thought (ToT), Algorithm-of-Thought (AoT), Sketch-of-Thought (SoT), and Chain-of-Draft (CoD)--on five reasoning tasks, using 15 open-source models from major families (LLaMA, Qwen, Mistral, Gemma, GPT-OSS, Phi, and DeepSeek) ranging from 270M to 120B parameters. Our large-scale analysis reveals that no single style is universally optimal. We demonstrate that strategy efficacy is highly contingent on both model scale and task type: search-based methods (AoT, ToT) excel in open-ended problems but require large-scale models, while concise styles (SoT, CoD) achieve radical efficiency gains on well-defined tasks. Furthermore, we identify key behavioral patterns: smaller models frequently fail to follow output instructions and default to guessing, while reasoning robustness emerges as a function of scale. Our findings offer a crucial roadmap for selecting optimal reasoning strategies based on specific constraints. We open-source the benchmark at https://github.com/JamesJunyuGuo/Style_Bench. Large Language Models (LLMs) have demonstrated impressive capabilities across a diverse range of tasks, including mathematical reasoning, code generation, and complex question answering (Imani et al., 2023; Wang & Chen, 2023; Tan et al., 2023). A key insight from prior work is that their performance on challenging problems is not merely a function of scale, but is critically dependent on the methods used to guide reasoning (Huang & Yang, 2025).
This has spurred the development of sophisticated prompting techniques designed to structure the model's internal reasoning process. Notable among these are Chain-of-Thought (CoT) (Wei et al., 2022), which decomposes problems into sequential steps, and more advanced paradigms such as Tree-of-Thought (ToT) (Yao et al., 2023), which explores multiple reasoning paths in parallel, and ReasonFlux (Yang et al., 2025b), which employs high-level templates to explore potential solutions. Performance remains highly sensitive to prompt phrasing and frequently necessitates iterative feedback to achieve robust results (Sel et al., 2023). In response, recent work has sought to automate reasoning strategy selection.

large language model, machine learning, natural language, (18 more...)

2509.20868

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

arXiv.org Artificial Intelligence, Sep-25-2025

Generalist Robot Manipulation beyond Action Labeled Data

Spiridonov, Alexander, Zaech, Jan-Nico, Nikolov, Nikolay, Van Gool, Luc, Paudel, Danda Pani

Recent advances in generalist robot manipulation leverage pre-trained Vision-Language Models (VLMs) and large-scale robot demonstrations to tackle diverse tasks in a zero-shot manner. A key challenge remains: scaling high-quality, action-labeled robot demonstration data, which existing methods rely on for robustness and generalization. To address this, we propose a method that benefits from videos without action labels - featuring humans and/or robots in action - enhancing open-vocabulary performance and enabling data-efficient learning of new tasks. Our method extracts dense, dynamic 3D point clouds at the hand or gripper location and uses a proposed 3D dynamics predictor for self-supervision. This predictor is then tuned to an action predictor using a smaller labeled dataset for action alignment. We show that our method not only learns from unlabeled human and robot demonstrations - improving downstream generalist robot policies - but also enables robots to learn new tasks without action labels (i.e., out-of-action generalization) in both real-world and simulated settings.

large language model, natural language, predictor, (18 more...)

2509.19958

Country: Europe > Switzerland (0.28)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.61)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.46)

arXiv.org Artificial Intelligence, Sep-25-2025

Citrus-V: Advancing Medical Foundation Models with Unified Medical Image Grounding for Clinical Reasoning

Wang, Guoxin, Zhao, Jun, Liu, Xinyi, Liu, Yanbo, Cao, Xuyang, Li, Chao, Liu, Zhuoyun, Sun, Qintian, Zhou, Fangru, Xing, Haoqiang, Yang, Zhenhong

Medical imaging provides critical evidence for clinical diagnosis, treatment planning, and surgical decisions, yet most existing imaging models are narrowly focused and require multiple specialized networks, limiting their generalization. Although large-scale language and multimodal models exhibit strong reasoning and multi-task capabilities, real-world clinical applications demand precise visual grounding, multimodal integration, and chain-of-thought reasoning. We introduce Citrus-V, a multimodal medical foundation model that combines image analysis with textual reasoning. The model integrates detection, segmentation, and multimodal chain-of-thought reasoning, enabling pixel-level lesion localization, structured report generation, and physician-like diagnostic inference in a single framework. We propose a novel multimodal training approach and release a curated open-source data suite covering reasoning, detection, segmentation, and document understanding tasks. Evaluations demonstrate that Citrus-V outperforms existing open-source medical models and expert-level imaging systems across multiple benchmarks, delivering a unified pipeline from visual grounding to clinical reasoning and supporting precise lesion quantification, automated reporting, and reliable second opinions.

large language model, machine learning, natural language, (20 more...)

2509.1909

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

arXiv.org Artificial Intelligence, Sep-25-2025

CogAtom: From Cognitive Atoms to Olympiad-level Mathematical Reasoning in Large Language Models

Chen, Zhuofan, He, Jiyuan, Zhang, Yichi, Hu, Xing, Wen, Haoxing, Bai, Jun, Rong, Wenge

Mathematical reasoning poses significant challenges for Large Language Models (LLMs) due to its demand for multi-step reasoning and abstract conceptual integration. While recent test-time scaling techniques rely heavily on high-quality, challenging problems, the scarcity of Olympiad-level math problems remains a bottleneck. We introduce CogAtom, a novel cognitive atom-based framework for synthesizing mathematically rigorous and cognitively diverse problems. Unlike prior approaches, CogAtom models problem construction as a process of selecting and recombining fundamental reasoning units, cognitive atoms, extracted from human-authored solutions. A diversity-promoting random walk algorithm enables exploration of the cognitive atom space, while a constraint-based recombination mechanism ensures logical soundness and structural validity. The combinatorial nature of the graph structure provides a near-infinite space of reasoning paths, and the walk algorithm systematically explores this space to achieve large-scale synthesis of high-quality problems; meanwhile, by controlling the number of cognitive atoms, we can precisely adjust problem difficulty, ensuring diversity, scalability, and controllability of the generated problems. Experimental results demonstrate that CogAtom outperforms existing methods in accuracy, reasoning depth, and diversity, generating problems that closely match the difficulty of AIME while exceeding it in structural variation. Our work offers a cognitively grounded pathway toward scalable, high-quality math problem generation. Our code is publicly available at https://github.com/Icarus-1111/CogAtom.
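One way to picture a diversity-promoting random walk over a graph of cognitive atoms is sketched below. Everything in it is an assumption made for illustration: the toy graph, the atom names, the function name `diverse_walk`, and the rule that down-weights already visited atoms by 1 / (1 + visits). The abstract does not specify the actual weighting or graph construction.

```python
import random

def diverse_walk(graph, start, num_atoms, seed=0):
    """Sample a walk of up to num_atoms nodes, favoring less-visited neighbors.

    graph maps each atom to a list of related atoms (hypothetical structure).
    num_atoms plays the role of a difficulty knob: more atoms, longer path.
    """
    rng = random.Random(seed)
    visits = {node: 0 for node in graph}
    path = [start]
    visits[start] += 1
    while len(path) < num_atoms:
        neighbors = graph[path[-1]]
        if not neighbors:
            break  # dead end: stop early
        # Assumed diversity rule: weight shrinks as a neighbor is revisited.
        weights = [1.0 / (1 + visits[n]) for n in neighbors]
        nxt = rng.choices(neighbors, weights=weights, k=1)[0]
        visits[nxt] += 1
        path.append(nxt)
    return path

# Hypothetical atom graph; the names are placeholders, not taken from the paper.
atom_graph = {
    "parity": ["modular arithmetic", "invariants"],
    "modular arithmetic": ["parity", "pigeonhole"],
    "invariants": ["parity", "pigeonhole"],
    "pigeonhole": ["modular arithmetic", "invariants"],
}
sampled_atoms = diverse_walk(atom_graph, "parity", num_atoms=4, seed=1)
```

In this sketch, the sampled sequence of atoms would then be handed to a separate recombination step, which is not modeled here.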

artificial intelligence, large language model, natural language, (19 more...)

2509.17318

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.87)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

A Chain-of-thought Reasoning Breast Ultrasound Dataset Covering All Histopathology Categories

Yu, Haojun, Li, Youcheng, Niu, Zihan, Zhang, Nan, Gong, Xuantong, Li, Huan, Zou, Zhiying, Qi, Haifeng, Cao, Zhenxiao, Lan, Zijie, Yuan, Xingjian, He, Jiating, Zhang, Haokai, Zhang, Shengtao, Wang, Zicheng, Wang, Dong, Zhao, Ziwei, Chen, Congying, Wang, Yong, Qin, Wangyan, Zhu, Qingli, Wang, Liwei

Breast ultrasound (BUS) is an essential tool for diagnosing breast lesions, with millions of examinations per year. However, publicly available high-quality BUS benchmarks for AI development are limited in data scale and annotation richness. In this work, we present BUS-CoT, a BUS dataset for chain-of-thought (CoT) reasoning analysis, which contains 11,439 images of 10,019 lesions from 4,838 patients and covers all 99 histopathology types. To facilitate research on incentivizing CoT reasoning, we construct the reasoning processes based on observation, feature, diagnosis and pathology labels, annotated and verified by experienced experts. Moreover, by covering lesions of all histopathology types, we aim to facilitate robust AI systems in rare cases, which can be error-prone in clinical practice.

large language model, machine learning, natural language, (16 more...)

2509.17046

Country: Asia > China (0.75)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.50)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)

CayleyPy Growth: Efficient growth computations and hundreds of new conjectures on Cayley graphs (Brief version)

Chervov, A., Fedoriaka, D., Konstantinova, E., Naumov, A., Kiselev, I., Sheveleva, A., Koltsov, I., Lytkin, S., Smolensky, A., Soibelman, A., Levkovich-Maslyuk, F., Grimov, R., Volovich, D., Isakov, A., Kostin, A., Litvinov, M., Vilkin-Krom, N., Bidzhiev, A., Krasnyi, A., Evseev, M., Geraseva, E., Grunwald, L., Galkin, S., Koldunov, E., Diner, S., Chevychelov, A., Kudasheva, E., Sychev, A., Kravchenko, A., Kogan, Z., Natyrova, A., Shishina, L., Cheldieva, L., Zamkovoy, V., Kovalenko, D., Papulov, O., Kudashev, S., Shiltsov, D., Turtayev, R., Nikitina, O., Mamayeva, D., Nikolenko, S., Obozov, M., Titarenko, A., Dolgorukova, A., Aparnev, A., Debeaupuis, O., C., S. Alami, Isambert, H.

This is the third paper of the CayleyPy project applying artificial intelligence to problems in group theory. We announce the first public release of CayleyPy, an open source Python library for computations with Cayley and Schreier graphs. Compared with systems such as GAP and Sage, CayleyPy handles much larger graphs and performs several orders of magnitude faster. Using CayleyPy we obtained about 200 new conjectures on Cayley and Schreier graphs, focused on diameters and growth. For many Cayley graphs of symmetric groups S_n we observe quasi-polynomial diameter formulas: a small set of quadratic or linear polynomials indexed by n mod s. We conjecture that this is a general phenomenon, giving efficient diameter computation despite the problem being NP-hard. We propose a refinement of the Babai-type conjecture on diameters of S_n: an upper bound of n^2/2 + 4n in the undirected case, compared to previous O(n^2) bounds. We also provide explicit generator families, related to involutions in a "square with whiskers" pattern, conjectured to maximize the diameter; search confirms this for all n up to 15. We further conjecture an answer to a question posed by V. M. Glushkov in 1968 on directed Cayley graphs generated by a cyclic shift and a transposition. For nilpotent groups we conjecture an improvement of J. S. Ellenberg's results on upper unitriangular matrices over Z/pZ, showing linear dependence of diameter on p. Moreover, some conjectures are LLM-friendly, naturally stated as sorting problems verifiable by algorithms or Python code. To benchmark pathfinding we created more than 10 Kaggle datasets. CayleyPy works with arbitrary permutation or matrix groups and includes over 100 predefined generators. Our growth computation code outperforms GAP and Sage by up to 1000 times in speed and size.
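For readers unfamiliar with growth and diameter computations, the following minimal sketch shows the underlying idea with a plain breadth-first search in pure Python. It does not use the CayleyPy API, and the helper names `adjacent_transpositions`, `apply_generator`, and `growth` are illustrative assumptions. The growth sequence counts group elements at each word-metric distance from the identity, and the diameter is the largest such distance.

```python
def adjacent_transpositions(n):
    """Generators of S_n that swap positions i and i+1 (bubble-sort moves)."""
    gens = []
    for i in range(n - 1):
        g = list(range(n))
        g[i], g[i + 1] = g[i + 1], g[i]
        gens.append(tuple(g))
    return gens

def apply_generator(state, gen):
    """Rearrange state so that position i receives state[gen[i]]."""
    return tuple(state[i] for i in gen)

def growth(generators, n):
    """BFS from the identity; returns the number of elements in each layer."""
    start = tuple(range(n))
    seen = {start}
    frontier = [start]
    layers = []
    while frontier:
        layers.append(len(frontier))
        next_frontier = []
        for state in frontier:
            for gen in generators:
                new_state = apply_generator(state, gen)
                if new_state not in seen:
                    seen.add(new_state)
                    next_frontier.append(new_state)
        frontier = next_frontier
    return layers

layers_s4 = growth(adjacent_transpositions(4), 4)
diameter_s4 = len(layers_s4) - 1
print(layers_s4, diameter_s4)  # [1, 3, 5, 6, 5, 3, 1] 6
```

For adjacent transpositions the distance from the identity equals the number of inversions, so the layer sizes for S_4 sum to 24 and the diameter is n(n-1)/2 = 6. This naive search stores every element and is only practical for small n.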

generator, machine learning, programming language, (20 more...)

2509.19162

Country:

Europe (0.27)
Asia > Japan (0.27)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Leisure & Entertainment > Games > Computer Games (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

World4RL: Diffusion World Models for Policy Refinement with Reinforcement Learning for Robotic Manipulation

Jiang, Zhennan, Liu, Kai, Qin, Yuxin, Tian, Shuai, Zheng, Yupeng, Zhou, Mingcai, Yu, Chao, Li, Haoran, Zhao, Dongbin

Robotic manipulation policies are commonly initialized through imitation learning, but their performance is limited by the scarcity and narrow coverage of expert data. Reinforcement learning can refine policies to alleviate this limitation, yet real-robot training is costly and unsafe, while training in simulators suffers from the sim-to-real gap. Recent advances in generative models have demonstrated remarkable capabilities in real-world simulation, with diffusion models in particular excelling at generation. This raises the question of how diffusion model-based world models can be combined to enhance pre-trained policies in robotic manipulation. In this work, we propose World4RL, a framework that employs diffusion-based world models as high-fidelity simulators to refine pre-trained policies entirely in imagined environments for robotic manipulation. Unlike prior works that primarily employ world models for planning, our framework enables direct end-to-end policy optimization. World4RL is designed around two principles: pre-training a diffusion world model that captures diverse dynamics on multi-task datasets and refining policies entirely within a frozen world model to avoid online real-world interactions. We further design a two-hot action encoding scheme tailored for robotic manipulation and adopt diffusion backbones to improve modeling fidelity. Extensive simulation and real-world experiments demonstrate that World4RL provides high-fidelity environment modeling and enables consistent policy refinement, yielding significantly higher success rates compared to imitation learning and other baselines. More visualization results are available at https://world4rl.github.io/.
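The general idea behind a two-hot encoding can be sketched as follows: a continuous scalar is spread over the two nearest bins of a fixed grid, with weights proportional to proximity, so the value is recoverable as a weighted sum of bin centers. This is the generic form of the technique under assumed settings (uniform bins on [-1, 1], one scalar action dimension). The manipulation-specific variant in the paper may differ, and the function names are illustrative.

```python
def make_bins(lo, hi, num_bins):
    """Evenly spaced bin centers covering [lo, hi]."""
    step = (hi - lo) / (num_bins - 1)
    return [lo + i * step for i in range(num_bins)]

def two_hot_encode(x, bins):
    """Put weight on the two bins that bracket x; weights sum to 1."""
    x = min(max(x, bins[0]), bins[-1])  # clip to the covered range
    k = 0
    while k < len(bins) - 2 and x > bins[k + 1]:
        k += 1  # advance until bins[k] <= x <= bins[k + 1]
    upper_weight = (x - bins[k]) / (bins[k + 1] - bins[k])
    encoding = [0.0] * len(bins)
    encoding[k] = 1.0 - upper_weight
    encoding[k + 1] = upper_weight
    return encoding

def two_hot_decode(encoding, bins):
    """Recover the scalar as the weighted sum of bin centers."""
    return sum(w * b for w, b in zip(encoding, bins))

action_bins = make_bins(-1.0, 1.0, 5)          # [-1.0, -0.5, 0.0, 0.5, 1.0]
encoded = two_hot_encode(0.25, action_bins)    # [0.0, 0.0, 0.5, 0.5, 0.0]
decoded = two_hot_decode(encoded, action_bins) # 0.25
```

Compared with a one-hot discretization, this keeps the discrete target format while avoiding rounding error for values that fall between bins.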

machine learning, reinforcement learning, world model, (13 more...)

2509.1908

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Teaching Audio Models to Reason: A Unified Framework for Source- and Layer-wise Distillation

Yang, Runyan, Si, Yuke, Gao, Yingying, Feng, Junlan, Deng, Chao, Zhang, Shilei

While large audio language models excel at tasks like ASR and emotion recognition, they still struggle with complex reasoning due to the modality gap between audio and text as well as the lack of structured intermediate supervision. To address this, we propose a unified knowledge distillation framework to transfer reasoning capabilities from a high-capacity textual teacher model to a student audio model while preserving its acoustic competence. Our method introduces two key dimensions: source-wise distillation, which leverages both textual and acoustic teachers to provide complementary modality-specific supervision; and layer-wise distillation, which aligns teacher signals with appropriate student layers to improve transfer efficiency. This dual-dimensional strategy enables fine-grained control over the distillation process, effectively bridging the gap between symbolic reasoning and speech representations. Experimental results show significant improvements in audio reasoning performance, demonstrating the effectiveness of our framework as a reasoning transfer solution for audio modeling.

distillation, machine learning, natural language, (16 more...)

2509.18579

Country: Asia > China (0.15)

Genre: Research Report (0.84)

Industry: Education (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.35)