AITopics

2507.07857

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.70)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
(2 more...)

arXiv.org Artificial Intelligence Jul-11-2025

Position: We Need An Algorithmic Understanding of Generative AI

Eberle, Oliver, McGee, Thomas, Giaffar, Hamza, Webb, Taylor, Momennejad, Ida

What algorithms do LLMs actually learn and use to solve problems? Studies addressing this question are sparse, as research priorities are focused on improving performance through scale, leaving a theoretical and empirical gap in understanding emergent algorithms. This position paper proposes AlgEval: a framework for systematic research into the algorithms that LLMs learn and use. AlgEval aims to uncover algorithmic primitives, reflected in latent representations, attention, and inference-time compute, and their algorithmic composition to solve task-specific problems. We highlight potential methodological paths and a case study toward this goal, focusing on emergent search algorithms. Our case study illustrates both the formation of top-down hypotheses about candidate algorithms, and bottom-up tests of these hypotheses via circuit-level analysis of attention patterns and hidden states. The rigorous, systematic evaluation of how LLMs actually solve tasks provides an alternative to resource-intensive scaling, reorienting the field toward a principled understanding of underlying computations. Such algorithmic explanations offer a pathway to human-understandable interpretability, enabling comprehension of the model's internal reasoning performance measures. This can in turn lead to more sample-efficient methods for training and improving performance, as well as novel architectures for end-to-end and multi-agent systems.

large language model, machine learning, natural language, (18 more...)

2507.07544

Country:

Europe (1.00)
Asia (0.93)
North America > United States > California (0.46)
North America > United States > Minnesota (0.28)

Genre:

Overview (0.93)
Research Report > New Finding (0.67)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.51)

PLACE: Prompt Learning for Attributed Community Search

Fang, Shuheng, Zhao, Kangfei, Zhang, Rener, Rong, Yu, Yu, Jeffrey Xu

In this paper, we propose PLACE (Prompt Learning for Attributed Community Search), a graph prompt learning framework for attributed community search (ACS). Inspired by prompt-tuning in Natural Language Processing (NLP), where learnable prompt tokens are inserted to contextualize NLP queries, PLACE integrates structural and learnable prompt tokens into the graph as a query-dependent refinement mechanism, forming a prompt-augmented graph. Within this prompt-augmented graph structure, the learned prompt tokens serve as a bridge that strengthens connections between graph nodes for the query, enabling the GNN to more effectively identify patterns of structural cohesiveness and attribute similarity related to the specific query. We employ an alternating training paradigm to jointly optimize the prompt parameters and the GNN. Moreover, we design a divide-and-conquer strategy to enhance scalability, enabling the model to handle million-scale graphs. Extensive experiments on 9 real-world graphs demonstrate the effectiveness of PLACE for three types of ACS queries, where PLACE achieves F1 scores that are 22% higher on average than those of state-of-the-art methods.

information retrieval, machine learning, natural language, (19 more...)

2507.05311

Country:

North America > United States > Texas (0.05)
North America > United States > Wisconsin (0.04)
Asia > China > Hong Kong (0.04)
(14 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

Strongly Solving $7 \times 6$ Connect-Four on Consumer Grade Hardware

Böck, Markus

While the game Connect-Four has been solved mathematically and the best move can be effectively computed with search-based methods, a strong solution in the form of a look-up table was believed to be infeasible. In this paper, we revisit a symbolic search method based on binary decision diagrams to produce strong solutions. With our efficient implementation, we were able to produce an 89.6 GB look-up table in 47 hours on a single CPU core with 128 GB main memory for the standard $7 \times 6$ board size. In addition to this win-draw-loss evaluation, we include an alpha-beta search in our open source artifact to find the move which achieves the fastest win or slowest loss.
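To illustrate how an alpha-beta search can prefer the fastest win or the slowest loss, here is a minimal sketch. It is not the paper's implementation: it swaps Connect-Four for a toy subtraction game (players alternately remove 1 to 3 stones, and whoever takes the last stone wins), and the names `WIN`, `solve`, and `best_move` are hypothetical. The idea it shows is the depth-aware score: a win found at a shallower ply scores higher, and a loss at a deeper ply scores less negatively.

```python
WIN = 100  # assumed score bound; must exceed the longest possible game


def solve(stones, ply=0, alpha=-WIN, beta=WIN):
    """Negamax with alpha-beta pruning for the toy subtraction game.

    Returns the score for the player to move. Positive means a forced win,
    negative a forced loss; a larger value means a faster win or slower loss.
    """
    if stones == 0:
        # The opponent took the last stone, so the player to move has lost.
        # Losses reached at a larger ply are less negative (slower loss).
        return -(WIN - ply)
    best = -WIN
    for take in (1, 2, 3):
        if take > stones:
            break
        # Negate the child's score and swap the window, as in standard negamax.
        score = -solve(stones - take, ply + 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: the opponent would never allow this line
    return best


def best_move(stones):
    """Return (take, score) for the move with the best depth-aware score."""
    best_take, best_score = None, -WIN - 1
    for take in (1, 2, 3):
        if take > stones:
            break
        score = -solve(stones - take, ply=1)
        if score > best_score:
            best_take, best_score = take, score
    return best_take, best_score


print(best_move(5))  # (1, 97): taking 1 stone forces a win at ply 3
```

In a Connect-Four setting the same scoring idea would apply on top of the win-draw-loss table, with the move generator and terminal test replaced by the board logic.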

artificial intelligence, board configuration, connectfour, (14 more...)

2507.05267

Country: Europe > Austria (0.28)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (0.90)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

de Vries, Joery A., He, Jinke, Oren, Yaniv, Spaan, Matthijs T. J.

Trust-Region Twisted Policy Improvement

Monte-Carlo tree search (MCTS) has driven many recent breakthroughs in deep reinforcement learning (RL). However, scaling MCTS to parallel compute has proven challenging in practice, which has motivated alternative planners like sequential Monte-Carlo (SMC). Many of these SMC methods adopt particle filters for smoothing through a reformulation of RL as a policy inference problem. Yet, persisting design choices of these particle filters often conflict with the aim of online planning in RL, which is to obtain a policy improvement at the start of planning. Drawing inspiration from MCTS, we tailor SMC planners specifically for RL by improving data generation within the planner through constrained action sampling and explicit terminal state handling, as well as improving policy and value target estimation. This leads to our Trust-Region Twisted SMC (TRT-SMC), which shows improved runtime and sample-efficiency over baseline MCTS and SMC methods in both discrete and continuous domains.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2504.06048

Country:

Europe > Netherlands (0.28)
North America (0.28)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

Khanal, Abhish, Mathew, Joseph Prince, Nowzari, Cameron, Stein, Gregory J.

Learning-Augmented Model-Based Multi-Robot Planning for Time-Critical Search and Inspection Under Uncertainty

In disaster response or surveillance operations, quickly identifying areas needing urgent attention is critical, but deploying response teams to every location is inefficient or often impossible. Effective performance in this domain requires coordinating a multi-robot inspection team to prioritize inspecting locations more likely to need immediate response, while also minimizing travel time. This is particularly challenging because robots must directly observe the locations to determine which ones require additional attention. This work introduces a multi-robot planning framework for coordinated time-critical multi-robot search under uncertainty. Our approach uses a graph neural network to estimate, from noisy sensor data, the likelihood that points of interest (PoIs) need attention, and then uses those predictions to guide a multi-robot model-based planner toward a cost-effective plan. Simulated experiments demonstrate that our planner improves performance by at least 16.3%, 26.7%, and 26.2% for 1, 3, and 5 robots, respectively, compared to non-learned and learned baselines. In scenarios like disaster aftermath inspection or critical surveillance operations, quickly traveling to and inspecting affected areas is crucial for an efficient response.

artificial intelligence, immediate attention, machine learning, (16 more...)

2507.06129

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
(2 more...)

Rautureau, Aloïs, Piette, Éric

CogniPlay: a work-in-progress Human-like model for General Game Playing

While AI systems have equaled or surpassed human performance in a wide variety of games such as Chess, Go, or Dota 2, describing these systems as truly "human-like" remains far-fetched. Despite their success, they fail to replicate the pattern-based, intuitive decision-making processes observed in human cognition. This paper presents an overview of findings from cognitive psychology and previous efforts to model human-like behavior in artificial agents, discusses their applicability to General Game Playing (GGP), and introduces our work-in-progress model based on these observations: CogniPlay. Although AI systems have surpassed human performance in games such as Chess [5], Go [14], and competitive games like Dota 2 [2], describing them as "human-like" would be an overstatement. Despite their exceptional performance, these systems fail to accurately replicate the selective, pattern-based decision-making that characterizes human cognition [8], [12].

agent, artificial intelligence, machine learning, (16 more...)

2507.05868

Country: Europe (0.94)

Genre:

Overview (0.54)
Research Report (0.40)

Industry:

Leisure & Entertainment > Games > Chess (0.72)
Leisure & Entertainment > Games > Computer Games (0.69)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.98)

arXiv.org Artificial Intelligence Jul-8-2025

Iterative Zoom-In: Temporal Interval Exploration for Long Video Understanding

Li, Chenglin, Chen, Qianglong, fengtao, null, Zhang, Yin

Multimodal Large Language Models (MLLMs) have shown strong performance in video understanding tasks. However, they continue to struggle with long-form videos because of an inefficient perception of temporal intervals. Unlike humans, who can dynamically adjust their temporal focus to locate query-relevant moments, current MLLMs often rely on dense, uniform sampling across the video timeline, leading to high memory consumption and a risk of missing crucial information. To address this challenge, we introduce Temporal Search (TS), a training-free framework that enables MLLMs to iteratively explore temporal regions for improved long video understanding. TS is based on a key observation: the model's generation confidence across different temporal intervals is highly correlated with prediction accuracy. TS operates through two main iterative stages. First, the MLLM proposes a temporal interval that is likely to contain task-relevant information. Then, it samples a fixed number of frames from the interval, regardless of length, and feeds them into the model to produce a refined response and confidence score. TS refines the focus of the model by iteratively shifting attention to more fine-grained temporal intervals, improving its understanding of long videos. Additionally, keyframe-level descriptions are collected to facilitate cross-interval perception throughout the video. To further improve efficiency, we introduce TS-BFS, a best-first search strategy over a tree. Each node represents a candidate interval and is expanded via two methods: self-driven proposals and uniform partitioning. Nodes are scored based on confidence and self-evaluation, and the most promising one is selected for continued exploration.
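The best-first search over candidate intervals described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' TS-BFS: the function `best_first_interval_search` and the scoring function `toy_score` are hypothetical, the score stands in for the model's confidence and self-evaluation, and nodes are expanded only by uniform partitioning (the self-driven proposals are omitted because they require a model).

```python
import heapq


def best_first_interval_search(start, end, score, min_length=1.0, max_expansions=50):
    """Best-first search over a tree of temporal intervals.

    `score(s, e)` is a hypothetical stand-in for model confidence on [s, e).
    Returns (best_score, best_start, best_end).
    """
    best = (score(start, end), start, end)
    # heapq is a min-heap, so scores are stored negated.
    heap = [(-best[0], start, end)]
    expansions = 0
    while heap and expansions < max_expansions:
        neg_score, s, e = heapq.heappop(heap)
        if -neg_score > best[0]:
            best = (-neg_score, s, e)
        if e - s <= min_length:
            continue  # interval is fine-grained enough; do not split further
        expansions += 1
        mid = (s + e) / 2
        # Uniform partitioning: split the interval into two equal children.
        for child_start, child_end in ((s, mid), (mid, e)):
            heapq.heappush(heap, (-score(child_start, child_end), child_start, child_end))
    return best


def toy_score(s, e, target=42.0):
    """Assumed scoring rule: shorter intervals containing the target score higher."""
    return 1.0 / (e - s) if s <= target < e else 0.0


print(best_first_interval_search(0.0, 128.0, toy_score))  # (1.0, 42.0, 43.0)
```

The priority queue always expands the highest-scoring interval first, so the search narrows toward the query-relevant moment without scanning the whole timeline at fine granularity.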

arxiv preprint arxiv, large language model, natural language, (18 more...)

2507.02946

Country: Asia > China > Zhejiang Province > Hangzhou (0.40)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.83)

arXiv.org Artificial Intelligence Jul-8-2025

WebSynthesis: World-Model-Guided MCTS for Efficient WebUI-Trajectory Synthesis

Gao, Yifei, Ye, Junhong, Wang, Jiaqi, Sang, Jitao

Recent advancements in large language models (LLMs) have significantly improved the capabilities of web agents. However, effectively navigating complex and dynamic web environments still requires more advanced trajectory-level planning and execution. Prior studies have addressed self-improving agents by collecting extensive GUI trajectories from real-environment interactions. Despite their effectiveness, these approaches encounter two critical challenges: (1) Uncontrollable environment states, where real or sandboxed web environments often yield unstable and non-deterministic feedback, complicating the reproduction and debugging of agent behaviors; and (2) High API costs, as generating even a single interaction trajectory can involve hundreds of queries, leading to considerable API usage and computational expenses. To address these limitations and enable scalable self-improvement for agents, we propose WebSynthesis, a novel framework for trajectory synthesis and training. WebSynthesis leverages a learned world model to simulate virtual web environments, allowing a policy agent to perform efficient and reversible tree-based planning. This approach supports the large-scale generation of diverse and high-quality trajectories, which are subsequently utilized to refine the agent's policy. Experimental results demonstrate that an agent trained using WebSynthesis on a small-scale synthetic dataset achieves performance comparable to or even surpassing that of models trained on large-scale real-world data.

large language model, machine learning, trajectory, (20 more...)

2507.0437

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(3 more...)

arXiv.org Machine Learning Jul-8-2025

Minimax and Bayes Optimal Best-arm Identification: Adaptive Experimental Design for Treatment Choice

Kato, Masahiro

This study investigates adaptive experimental design for treatment choice, also known as fixed-budget best-arm identification. We consider an adaptive procedure consisting of a treatment-allocation phase followed by a treatment-choice phase, and we design an adaptive experiment for this setup to efficiently identify the best treatment arm, defined as the one with the highest expected outcome. In our designed experiment, the treatment-allocation phase consists of two stages. The first stage is a pilot phase, where we allocate each treatment arm uniformly with equal proportions to eliminate clearly suboptimal arms and estimate outcome variances. In the second stage, we allocate treatment arms in proportion to the variances estimated in the first stage. After the treatment-allocation phase, the procedure enters the treatment-choice phase, where we choose the treatment arm with the highest sample mean as our estimate of the best treatment arm. We prove that this single design is simultaneously asymptotically minimax and Bayes optimal for the simple regret, with upper bounds that match our lower bounds up to exact constants. Therefore, our designed experiment achieves the sharp efficiency limits without requiring separate tuning for minimax and Bayesian objectives.
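The two-stage allocation followed by a treatment-choice step can be sketched as below. This is a simplified illustration and not the paper's procedure: the function `two_stage_best_arm`, the `pilot_fraction` parameter, and the Gaussian example arms are assumptions, the elimination of clearly suboptimal arms in the pilot is omitted, and the second stage allocates in proportion to the estimated variances as the abstract words it.

```python
import random
import statistics


def two_stage_best_arm(arms, budget, pilot_fraction=0.3, seed=0):
    """Fixed-budget best-arm sketch: uniform pilot, then variance-based allocation.

    `arms` is a list of callables taking a random.Random and returning an outcome.
    Returns (index of chosen arm, sample means, number of draws per arm).
    """
    rng = random.Random(seed)
    k = len(arms)
    samples = [[] for _ in range(k)]

    # Stage 1 (pilot): draw every arm equally often; at least 2 draws per arm
    # so that a sample variance can be computed.
    pilot_per_arm = max(2, int(budget * pilot_fraction) // k)
    for i, draw in enumerate(arms):
        for _ in range(pilot_per_arm):
            samples[i].append(draw(rng))
    variances = [statistics.variance(s) for s in samples]

    # Stage 2: split the remaining budget in proportion to estimated variances.
    remaining = budget - pilot_per_arm * k
    total_variance = sum(variances)
    for i, draw in enumerate(arms):
        share = variances[i] / total_variance if total_variance > 0 else 1.0 / k
        for _ in range(int(remaining * share)):
            samples[i].append(draw(rng))

    # Treatment choice: pick the arm with the highest sample mean.
    means = [statistics.fmean(s) for s in samples]
    best = max(range(k), key=lambda i: means[i])
    return best, means, [len(s) for s in samples]


# Hypothetical arms: Gaussian outcomes with different means and spreads.
example_arms = [
    lambda rng: rng.gauss(0.0, 1.0),
    lambda rng: rng.gauss(0.2, 1.0),
    lambda rng: rng.gauss(1.0, 2.0),
]
best, means, counts = two_stage_best_arm(example_arms, budget=3000)
print(best, counts)
```

Noisier arms receive more of the second-stage budget, which is the intuition behind allocating by estimated variance: more draws are spent where the sample mean is least certain.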

artificial intelligence, experiment, treatment arm, (17 more...)

2506.24007

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.83)