AITopics | dar

dar

Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration

Yang, Zhicheng, Guo, Zhijiang, Huang, Yinya, Wang, Yongxin, Xie, Dongchun, Wang, Yiwei, Liang, Xiaodan, Tang, Jing

arXiv.org Artificial Intelligence, Oct-7-2025

Reinforcement Learning with Verifiable Reward (RLVR) has emerged as a powerful paradigm for unlocking reasoning capabilities in large language models, yet its full potential is hindered by two under-explored dimensions: depth (the hardest problem a model can sample) and breadth (the number of instances consumed in a single iteration). We dissect the popular GRPO algorithm and reveal a systematic bias: the cumulative advantage disproportionately weights samples with medium accuracy, while down-weighting the low-accuracy instances that are crucial for pushing reasoning boundaries. To rectify the depth neglect, we introduce Difficulty Adaptive Rollout Sampling (DARS), which re-weights hard problems through targeted multi-stage rollouts, thereby increasing the number of positive rollouts for hard problems. Empirically, naively enlarging rollout size only accelerates convergence and even hurts Pass@K. Our DARS, in contrast, delivers consistent Pass@K gains without extra inference cost at convergence. Just as we adaptively expanded the depth of exploration, we now ask whether aggressively scaling the breadth of training data can further amplify reasoning gains. To this end, we intensely scale batch size and replace PPO's mini-batch iterations with full-batch updates over multiple epochs. Increasing breadth significantly enhances Pass@1 performance. Large-breadth training sustains high token-level entropy, indicating continued exploration and reduced gradient noise. We further present DARS-B, which augments DARS with large breadth, and demonstrate simultaneous gains in Pass@K and Pass@1. The results confirm that breadth and adaptive exploration across depth operate as orthogonal dimensions in RLVR, which are key to unleashing the reasoning power of RLVR.
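The bias toward medium-accuracy problems described in this abstract can be illustrated with a small sketch. It assumes the common GRPO form of the advantage (reward minus group mean, divided by the group's population standard deviation), binary 0/1 verifiable rewards, and "cumulative advantage" read as the sum of absolute advantages over one prompt's rollouts. These readings, and the function names, are assumptions for illustration and may differ from the paper's exact definitions.

```python
import math


def grpo_advantages(rewards):
    """Group-normalized advantages: (r - mean) / std over one prompt's rollouts."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / n)
    if std == 0.0:
        # Every rollout got the same reward, so the group carries no signal.
        return [0.0] * n
    return [(r - mean) / std for r in rewards]


def cumulative_advantage(num_correct, group_size):
    """Sum of |advantage| for a group with binary (0/1) rewards."""
    rewards = [1.0] * num_correct + [0.0] * (group_size - num_correct)
    return sum(abs(a) for a in grpo_advantages(rewards))


if __name__ == "__main__":
    group_size = 8
    for k in range(group_size + 1):
        acc = k / group_size
        print(f"accuracy={acc:.3f}  cumulative |A|={cumulative_advantage(k, group_size):.3f}")
```

Under these assumptions the total works out to 2N·sqrt(p(1-p)) for group size N and accuracy p. It peaks at p = 0.5 and shrinks toward zero as p approaches 0, so hard problems contribute little weight.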

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2508.13755

Country: Asia (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

f187a23c3ee681ef6913f31fd6d6446b-Paper.pdf

Neural Information Processing Systems, Aug-18-2025, 19:08:41 GMT

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

Reinforced Model Merging

Han, Jiaqi, Ye, Jingwen, Liu, Shunyu, Zhang, Haofei, Song, Jie, Feng, Zunlei, Song, Mingli

arXiv.org Artificial Intelligence, Mar-27-2025

The success of large language models has garnered widespread attention for model merging techniques, especially training-free methods which combine model capabilities within the parameter space. However, two challenges remain: (1) uniform treatment of all parameters leads to performance degradation; (2) search-based algorithms are often inefficient. In this paper, we present an innovative framework termed Reinforced Model Merging (RMM), which encompasses an environment and agent tailored for merging tasks. These components interact to execute layer-wise merging actions, aiming to search for the optimal merging architecture. Notably, RMM operates without any gradient computations on the original models, rendering it feasible for edge devices. Furthermore, by utilizing data subsets during the evaluation process, we address the bottleneck in the reward feedback phase, thereby accelerating RMM by up to 100 times. Extensive experiments demonstrate that RMM achieves state-of-the-art performance across various vision and NLP datasets and effectively overcomes the limitations of the existing baseline methods. Our code is available at https://github.com/WuDiHJQ/Reinforced-Model-Merging.
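The idea of layer-wise merging actions can be made concrete with a toy sketch. Everything here is hypothetical: the action names (`keep_a`, `keep_b`, `average`), the dictionary-of-lists stand-in for model weights, and the `merge_layerwise` helper are illustrative assumptions, not the action space or code of the RMM repository. In the framework described above an agent would choose the per-layer actions. Here they are fixed by hand.

```python
from typing import Dict, List

# Toy stand-in for a model: layer name -> flattened weights.
Params = Dict[str, List[float]]

# Hypothetical action set; the paper's actual action space may differ.
KEEP_A, KEEP_B, AVERAGE = "keep_a", "keep_b", "average"


def merge_layerwise(model_a: Params, model_b: Params, actions: Dict[str, str]) -> Params:
    """Build a merged model by applying one action per layer.

    No gradients are involved: each layer is copied or averaged directly
    in parameter space.
    """
    merged: Params = {}
    for name, weights_a in model_a.items():
        weights_b = model_b[name]
        action = actions[name]
        if action == KEEP_A:
            merged[name] = list(weights_a)
        elif action == KEEP_B:
            merged[name] = list(weights_b)
        elif action == AVERAGE:
            merged[name] = [(a + b) / 2.0 for a, b in zip(weights_a, weights_b)]
        else:
            raise ValueError(f"unknown action {action!r} for layer {name!r}")
    return merged


if __name__ == "__main__":
    model_a = {"layer1": [1.0, 2.0], "layer2": [3.0]}
    model_b = {"layer1": [3.0, 4.0], "layer2": [5.0]}
    print(merge_layerwise(model_a, model_b, {"layer1": AVERAGE, "layer2": KEEP_B}))
```

A search procedure would score each candidate action assignment, for example by evaluating the merged model on a small data subset, and use that score as the reward.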

machine learning, natural language, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2503.21272

Country:

Asia > Singapore > Central Region > Singapore (0.04)
Asia > China > Zhejiang Province > Ningbo (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

Hacia la interpretabilidad de la detección anticipada de riesgos de depresión utilizando grandes modelos de lenguaje (Towards the Interpretability of Early Depression Risk Detection Using Large Language Models)

Thompson, Horacio, Sapino, Maximiliano, Ferretti, Edgardo, Errecalde, Marcelo

arXiv.org Artificial Intelligence, Mar-26-2025

Early Detection of Risks (EDR) on the Web involves identifying at-risk users as early as possible. Although Large Language Models (LLMs) have proven to solve various linguistic tasks efficiently, assessing their reasoning ability in specific domains is crucial. In this work, we propose a method for solving depression-related EDR using LLMs on Spanish texts, with responses that can be interpreted by humans. We define a reasoning criterion to analyze users through a specialist, apply in-context learning to the Gemini model, and evaluate its performance both quantitatively and qualitatively. The results show that accurate predictions can be obtained, supported by explanatory reasoning, providing a deeper understanding of the solution. Our approach offers new perspectives for addressing EDR problems by leveraging the power of LLMs.

depresión, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.20939

Country:

South America > Argentina > Pampas > Buenos Aires Province > La Plata (0.04)
South America > Argentina > Cuyo > San Luis Province > San Luis (0.04)
North America > United States > California > Los Angeles County > El Segundo (0.04)

Genre: Research Report (0.69)

Industry: Health & Medicine (0.94)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Direction-Aware Diagonal Autoregressive Image Generation

Xu, Yijia, Ju, Jianzhong, Luan, Jian, Cui, Jinshi

arXiv.org Artificial Intelligence, Mar-14-2025

The raster-ordered image token sequence exhibits a significant Euclidean distance between index-adjacent tokens at line breaks, making it unsuitable for autoregressive generation. To address this issue, this paper proposes the Direction-Aware Diagonal Autoregressive Image Generation (DAR) method, which generates image tokens following a diagonal scanning order. The proposed diagonal scanning order ensures that tokens with adjacent indices remain in close proximity while enabling causal attention to gather information from a broader range of directions. Additionally, two direction-aware modules, 4D-RoPE and direction embeddings, are introduced, enhancing the model's capability to handle frequent changes in generation direction. To leverage the representational capacity of the image tokenizer, we use its codebook as the image token embeddings. We propose models of varying scales, ranging from 485M to 2.0B. On the 256×256 ImageNet benchmark, our DAR-XL (2.0B) outperforms all previous autoregressive image generators, achieving a state-of-the-art FID score of 1.37.
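The contrast between raster and diagonal ordering can be checked numerically. The sketch below assumes a zigzag diagonal scan (alternating direction on successive anti-diagonals, as in JPEG coefficient ordering). The abstract does not specify the exact traversal, so this is one plausible reading, not the paper's definition. It compares the largest Euclidean distance between index-adjacent grid positions under each ordering.

```python
import math


def raster_order(height, width):
    """Row-by-row scan: left to right, top to bottom."""
    return [(r, c) for r in range(height) for c in range(width)]


def zigzag_diagonal_order(height, width):
    """Visit anti-diagonals (constant r + c), flipping direction each time."""
    order = []
    for s in range(height + width - 1):
        r_min = max(0, s - width + 1)
        r_max = min(height - 1, s)
        diagonal = [(r, s - r) for r in range(r_min, r_max + 1)]
        if s % 2 == 0:
            diagonal.reverse()  # alternate direction so consecutive diagonals connect
        order.extend(diagonal)
    return order


def max_adjacent_distance(order):
    """Largest Euclidean distance between consecutive positions in the scan."""
    return max(math.dist(a, b) for a, b in zip(order, order[1:]))


if __name__ == "__main__":
    h = w = 16  # token grid size, chosen only for illustration
    print("raster :", round(max_adjacent_distance(raster_order(h, w)), 3))
    print("zigzag :", round(max_adjacent_distance(zigzag_diagonal_order(h, w)), 3))
```

In the raster scan the worst jump occurs at each line break and grows with the grid width, whereas in this zigzag scan consecutive positions are always grid neighbours, at most one diagonal step apart.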

autoregressive transformer, image token, transformer, (16 more...)

arXiv.org Artificial Intelligence

2503.11129

Country:

Europe > Austria > Vienna (0.14)
Europe > Italy > Lombardy > Milan (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Zero-Shot Interactive Text-to-Image Retrieval via Diffusion-Augmented Representations

Long, Zijun, Liang, Kangheng, Aragon-Camarasa, Gerardo, Mccreadie, Richard, Henderson, Paul

arXiv.org Artificial Intelligence, Jan-25-2025

Interactive Text-to-Image Retrieval (I-TIR) has emerged as a transformative user-interactive tool for applications in domains such as e-commerce and education. Yet, current methodologies predominantly depend on finetuned Multimodal Large Language Models (MLLMs), which face two critical limitations: (1) Finetuning imposes prohibitive computational overhead and long-term maintenance costs. (2) Finetuning narrows the pretrained knowledge distribution of MLLMs, reducing their adaptability to novel scenarios. These issues are exacerbated by the inherently dynamic nature of real-world I-TIR systems, where queries and image databases evolve in complexity and diversity, often deviating from static training distributions. To overcome these constraints, we propose Diffusion Augmented Retrieval (DAR), a paradigm-shifting framework that bypasses MLLM finetuning entirely. DAR synergizes Large Language Model (LLM)-guided query refinement with Diffusion Model (DM)-based visual synthesis to create contextually enriched intermediate representations. This dual-modality approach deciphers nuanced user intent more holistically, enabling precise alignment between textual queries and visually relevant images. Rigorous evaluations across four benchmarks reveal DAR's dual strengths: (1) Matches state-of-the-art finetuned I-TIR models on straightforward queries without task-specific training. (2) Scalable Generalization: Surpasses finetuned baselines by 7.61% in Hits@10 (top-10 accuracy) under multi-turn conversational complexity, demonstrating robustness to intricate, distributionally shifted interactions. By eliminating finetuning dependencies and leveraging generative-augmented representations, DAR establishes a new trajectory for efficient, adaptive, and scalable cross-modal retrieval systems.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.15379

Country:

Asia > Middle East > Syria > Daraa Governorate > Dar'a (0.24)
Europe > Italy (0.05)
Europe > Spain > Aragón (0.05)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Unsupervised Domain Adaptation for Action Recognition via Self-Ensembling and Conditional Embedding Alignment

Ghosh, Indrajeet, Chugh, Garvit, Faridee, Abu Zaher Md, Roy, Nirmalya

arXiv.org Artificial Intelligence, Oct-22-2024

Recent advancements in deep learning-based wearable human action recognition (wHAR) have improved the capture and classification of complex motions, but adoption remains limited due to the lack of expert annotations and domain discrepancies from user variations. Limited annotations hinder the model's ability to generalize to out-of-distribution samples. While data augmentation can improve generalizability, unsupervised augmentation techniques must be applied carefully to avoid introducing noise. Unsupervised domain adaptation (UDA) addresses domain discrepancies by aligning conditional distributions with labeled target samples, but vanilla pseudo-labeling can lead to error propagation. To address these challenges, we propose µDAR, a novel joint optimization architecture comprising three functions: (i) a consistency regularizer between augmented samples to improve model classification generalizability, (ii) a temporal ensemble for robust pseudo-label generation, and (iii) conditional distribution alignment to improve domain generalizability. The temporal ensemble works by aggregating predictions from past epochs to smooth out noisy pseudo-label predictions, which are then used in the conditional distribution alignment module to minimize the kernel-based class-wise conditional maximum mean discrepancy (kCMMD) between the source and target feature spaces to learn a domain-invariant embedding. The consistency-regularized augmentations ensure that multiple augmentations of the same sample share the same labels; this results in (a) strong generalization with limited source domain samples and (b) consistent pseudo-label generation in target samples. The novel integration of these three modules in µDAR results in an average macro-F1 score improvement of approximately 4-12% over six state-of-the-art UDA methods on four benchmark wHAR datasets.
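The temporal-ensemble step (aggregating predictions from past epochs to smooth noisy pseudo-labels) can be sketched as an exponential moving average with startup bias correction, in the style of standard temporal ensembling. The momentum value, the function names, and the use of plain Python lists are illustrative assumptions. The abstract does not state the paper's exact aggregation rule.

```python
def update_ensemble(accumulator, probs, alpha, epoch):
    """One temporal-ensembling step for a single sample.

    accumulator: running EMA of class probabilities (start with all zeros)
    probs:       this epoch's predicted class probabilities
    alpha:       EMA momentum in [0, 1)
    epoch:       1-indexed epoch number, used for bias correction
    Returns (new_accumulator, bias_corrected_probabilities).
    """
    new_acc = [alpha * z + (1.0 - alpha) * p for z, p in zip(accumulator, probs)]
    # Dividing by (1 - alpha^epoch) undoes the bias from the zero initialization.
    corrected = [z / (1.0 - alpha ** epoch) for z in new_acc]
    return new_acc, corrected


def pseudo_label(corrected):
    """Index of the most probable class under the smoothed prediction."""
    return max(range(len(corrected)), key=lambda i: corrected[i])


if __name__ == "__main__":
    # Noisy per-epoch predictions for one target sample (two classes).
    epoch_predictions = [[0.9, 0.1], [0.2, 0.8], [0.8, 0.2]]
    acc = [0.0, 0.0]
    for epoch, probs in enumerate(epoch_predictions, start=1):
        acc, smoothed = update_ensemble(acc, probs, alpha=0.6, epoch=epoch)
        print(epoch, [round(p, 4) for p in smoothed], "label:", pseudo_label(smoothed))
```

The single outlier prediction at the second epoch pulls the smoothed probabilities toward the other class, but by the third epoch the ensemble has returned to the class supported by the majority of epochs.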

artificial intelligence, discrepancy, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.17489

Country:

North America > United States > Maryland > Baltimore County (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > Spain > Galicia > Madrid (0.04)
Asia > India (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Double-Anonymous Review for Robotics

Yim, Justin K., Nadan, Paul, Zhu, James, Stutt, Alexandra, Payne, J. Joe, Pavlov, Catherine, Johnson, Aaron M.

arXiv.org Artificial Intelligence, Jun-14-2024

Prior research has investigated the benefits and costs of double-anonymous review (DAR, also known as double-blind review) in comparison to single-anonymous review (SAR) and [...]. Several review papers have attempted to compile experimental results in peer review research both broadly and in engineering and computer science specifically [1-4]. This document summarizes prior research in peer review that may inform decisions about the format of peer review in the field of robotics and makes some recommendations for potential next steps for robotics publications. [...] The presence of gender bias and effect of DAR on such bias is a common concern in research into peer review but the conclusions are varied. Many studies do conclude that gender can disadvantage authors, particularly women [5, 6] and that DAR can reduce this bias [7]. [...] However, even when reviewers self-report as having the highest level of expertise in their field, their guess accuracy is no better than those who are self-reported as less knowledgeable [17]. Increased editor burden in handling conflict of interest, author burden in anonymizing the manuscript, and reviewer burden in navigating prior work by others and by the authors are also cited as costs to DAR. Despite these challenges, numerous robotics conferences have already made the shift to DAR, including RSS and a [...]. Furthermore, top machine learning conferences such as NeurIPS and CoRL have [...]. Based on the current literature, we find that the evidence in support of double-anonymous review is not sufficient to conclusively recommend for implementation in robotics conferences [...].

double-anonymous review, gender bias, reviewer, (14 more...)

arXiv.org Artificial Intelligence

2406.10059

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre:

Research Report > New Finding (0.69)
Research Report > Experimental Study (0.69)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Distance-aware Attention Reshaping: Enhance Generalization of Neural Solver for Large-scale Vehicle Routing Problems

Wang, Yang, Jia, Ya-Hui, Chen, Wei-Neng, Mei, Yi

arXiv.org Artificial Intelligence, Jan-13-2024

Neural solvers based on the attention mechanism have demonstrated remarkable effectiveness in solving vehicle routing problems. However, when generalizing from small-scale to large-scale instances, we find that attention scores in existing neural solvers become dispersed, which leads to poor performance. To address this issue, this paper proposes a distance-aware attention reshaping method that assists neural solvers in solving large-scale vehicle routing problems. Specifically, without the need for additional training, we utilize the Euclidean distance information between current nodes to adjust attention scores. This enables a neural solver trained on small-scale instances to make rational choices when solving a large-scale problem. Experimental results show that the proposed method significantly outperforms existing state-of-the-art neural solvers on the large-scale CVRPLib dataset.
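Adjusting attention scores with Euclidean distance at inference time can be sketched as follows. The specific rule used here (subtracting a weighted distance from each raw score before the softmax) and the `weight` parameter are assumptions chosen for simplicity. The abstract does not give the paper's reshaping formula, which may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reshape_scores(scores, current_xy, candidate_xy, weight=1.0):
    """Penalize each candidate's raw attention score by its distance.

    scores:       raw attention scores from a trained solver (not retrained)
    current_xy:   coordinates of the node the vehicle is at
    candidate_xy: coordinates of the candidate next nodes
    weight:       hypothetical strength of the distance penalty
    """
    return [
        s - weight * math.dist(current_xy, xy)
        for s, xy in zip(scores, candidate_xy)
    ]


if __name__ == "__main__":
    # Dispersed (here: uniform) scores give the solver no preference.
    raw = [0.0, 0.0, 0.0]
    candidates = [(0.1, 0.0), (0.5, 0.5), (1.0, 1.0)]
    print("before:", [round(p, 3) for p in softmax(raw)])
    reshaped = reshape_scores(raw, (0.0, 0.0), candidates, weight=5.0)
    print("after :", [round(p, 3) for p in softmax(reshaped)])
```

With dispersed scores the softmax is nearly uniform. After reshaping, probability mass concentrates on nearby nodes, and no model parameters are changed.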

attention score, node, vehicle, (16 more...)

arXiv.org Artificial Intelligence

2401.06979

Country:

Asia > China (0.04)
Oceania > New Zealand > North Island > Wellington Region > Wellington (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Transportation > Freight & Logistics Services (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Enhancing the Rationale-Input Alignment for Self-explaining Rationalization

Liu, Wei, Wang, Haozhao, Wang, Jun, Deng, Zhiying, Zhang, YuanKai, Wang, Cheng, Li, Ruixuan

arXiv.org Artificial Intelligence, Dec-14-2023

Rationalization empowers deep learning models with self-explaining capabilities through a cooperative game, where a generator selects a semantically consistent subset of the input as a rationale, and a subsequent predictor makes predictions based on the selected rationale. In this paper, we discover that rationalization is prone to a problem named rationale shift, which arises from the algorithmic bias of the cooperative game. Rationale shift refers to a situation where the semantics of the selected rationale may deviate from the original input, but the predictor still produces accurate predictions based on the deviation, resulting in a compromised generator with misleading feedback. To address this issue, we first demonstrate the importance of the alignment between the rationale and the full input through both empirical observations and theoretical analysis. Subsequently, we introduce a novel approach called DAR (Discriminatively Aligned Rationalization), which utilizes an auxiliary module pretrained on the full input to discriminatively align the selected rationale and the original input. We theoretically illustrate how DAR accomplishes the desired alignment, thereby overcoming the rationale shift problem. The experiments on two widely used real-world benchmarks show that the proposed method significantly improves the explanation quality (measured by the overlap between the model-selected explanation and the human-annotated rationale) as compared to state-of-the-art techniques. Additionally, results on two synthetic settings further validate the effectiveness of DAR in addressing the rationale shift problem.

predictor, rationale, rationalization, (16 more...)

arXiv.org Artificial Intelligence

2312.04103

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China > Hubei Province > Wuhan (0.05)
(16 more...)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)