Search
Double-Ended Synthesis Planning with Goal-Constrained Bidirectional Search
Computer-aided synthesis planning (CASP) algorithms have demonstrated expert-level abilities in planning retrosynthetic routes to molecules of low to moderate complexity. However, current search methods assume the sufficiency of reaching arbitrary building blocks, failing to address the common real-world constraint where using specific molecules is desired. To this end, we present a formulation of synthesis planning with starting material constraints. Under this formulation, we propose Double-Ended Synthesis Planning ( \texttt{DESP}), a novel CASP algorithm under a _bidirectional graph search_ scheme that interleaves expansions from the target and from the goal starting materials to ensure constraint satisfiability. The search algorithm is guided by a goal-conditioned cost network learned offline from a partially observed hypergraph of valid chemical reactions. We demonstrate the utility of \texttt{DESP} in improving solve rates and reducing the number of search expansions by biasing synthesis planning towards expert goals on multiple new benchmarks.
Reviews: Deep Active Learning with a Neural Architecture Search
This paper proposed a method for doing active learning (AL) where in each AL iteration the optimization is done over network architecture and the underlying parameters, as opposed to other methods which fixes the architecture and only optimizes the parameters. These two optimizations are done separately, by first performing a local search among models of monotonically increasing complexity and then optimizing parameters of the obtained architecture. The authors used this method with three different active learning algorithms and showed that their method improved performance of these ALs. The paper is very well-written and clear. The problem of architectural optimization is also of great importance in the field.
Reviews: Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning
My concern about generalization still remains, and I hope the authors can devote maybe a sentence or two to it in the final draft - even something to the effect of "it is a concern; experimental evidence suggests it is not a great concern."] Summary: For any given ML algorithm, e.g., random forests, the paper proposes a transfer-learning approach for selection of hyperparameters (limited to those parameters that can be ordered) wherein a bounding space is constructed from previous evaluations of that algorithm on other datasets. Two types of bounding spaces are described. The box space is the tightest bounding box covering the best known hyperparameter settings for previous datasets. The ellipsoid is found as the smallest-volume ellipsoid covering the best known settings (via convex optimization).
Review for NeurIPS paper: Evolving Normalization-Activation Layers
Weaknesses: Lack of ablation study for the two rejection protocols is my mean concern and is the principle component of my rating. While the experiments focused intensively on various architectures and normalization-activation layers, it is not clear how those two rejection protocols contribute to the final results. Although both of them are very well motivated by the two observations, the observations themselve are not sufficient to justify the two rejection protocol. Evolution is extremely creative and the more constraint we manually put on it, the more we limit its creativity. More specifically, the search space for complex problems are usually very deceptive, for example, a candidate might be numerically unstable based on the stability criterion, however this candidate may have potential to be evolved into a surprisingly powerful one later on, but based on the current protocol it might be rejected early on. In table 3, random search with rejection also achieved very good results and authors' EvoNorm only outperformed it by a small margin, which also concerns me about the effectiveness of the search method itself.
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
Recent methodologies in LLM self-training mostly rely on LLM generating responses and filtering those with correct output answers as training data. This approach often yields a low-quality fine-tuning training set (e.g., incorrect plans or intermediate reasoning). In this paper, we develop a reinforced self-training approach, called ReST-MCTS*, based on integrating process reward guidance with tree search MCTS* for collecting higher-quality reasoning traces as well as per-step value to train policy and reward models. ReST-MCTS* circumvents the per-step manual annotation typically used to train process rewards by tree-search-based reinforcement learning: Given oracle final correct answers, ReST-MCTS* is able to infer the correct process rewards by estimating the probability this step can help lead to the correct answer. These inferred rewards serve dual purposes: they act as value targets for further refining the process reward model and also facilitate the selection of high-quality traces for policy model self-training.
Minimax Estimation of Conditional Moment Models
We develop an approach for estimating models described via conditional moment restrictions, with a prototypical application being non-parametric instrumental variable regression. We introduce a min-max criterion function, under which the estimation problem can be thought of as solving a zero-sum game between a modeler who is optimizing over the hypothesis space of the target model and an adversary who identifies violating moments over a test function space. We analyze the statistical estimation rate of the resulting estimator for arbitrary hypothesis spaces, with respect to an appropriate analogue of the mean squared error metric, for ill-posed inverse problems. We show that when the minimax criterion is regularized with a second moment penalty on the test function and the test function space is sufficiently rich, then the estimation rate scales with the critical radius of the hypothesis and test function spaces, a quantity which typically gives tight fast rates. Our main result follows from a novel localized Rademacher analysis of statistical learning problems defined via minimax objectives.
Inference-time Scaling of Diffusion Models through Classical Search
Zhang, Xiangcheng, Lin, Haowei, Ye, Haotian, Zou, James, Ma, Jianzhu, Liang, Yitao, Du, Yilun
Classical search algorithms have long underpinned modern artificial intelligence. In this work, we tackle the challenge of inference-time control in diffusion models -- adapting generated outputs to meet diverse test-time objectives -- using principles from classical search. We propose a general framework that orchestrates local and global search to efficiently navigate the generative space. It employs a theoretically grounded local search via annealed Langevin MCMC and performs compute-efficient global exploration using breadth-first and depth-first tree search. We evaluate our approach on a range of challenging domains, including planning, offline reinforcement learning, and image generation. Across all tasks, we observe significant gains in both performance and efficiency. These results show that classical search provides a principled and practical foundation for inference-time scaling in diffusion models. Project page at diffusion-inference-scaling.github.io.
Global optimization of graph acquisition functions for neural architecture search
Xie, Yilin, Zhang, Shiqiang, Qing, Jixiang, Misener, Ruth, Tsay, Calvin
Graph Bayesian optimization (BO) has shown potential as a powerful and data-efficient tool for neural architecture search (NAS). Most existing graph BO works focus on developing graph surrogates models, i.e., metrics of networks and/or different kernels to quantify the similarity between networks. However, the acquisition optimization, as a discrete optimization task over graph structures, is not well studied due to the complexity of formulating the graph search space and acquisition functions. This paper presents explicit optimization formulations for graph input space including properties such as reachability and shortest paths, which are used later to formulate graph kernels and the acquisition function. We theoretically prove that the proposed encoding is an equivalent representation of the graph space and provide restrictions for the NAS domain with either node or edge labels. Numerical results over several NAS benchmarks show that our method efficiently finds the optimal architecture for most cases, highlighting its efficacy.
Humanoid Loco-manipulation Planning based on Graph Search and Reachability Maps
Murooka, Masaki, Kumagai, Iori, Morisawa, Mitsuharu, Kanehiro, Fumio, Kheddar, Abderrahmane
--In this letter, we propose an efficient and highly versatile loco-manipulation planning for humanoid robots. Loco-manipulation planning is a key technological brick enabling humanoid robots to autonomously perform object transportation by manipulating them. We formulate planning of the alternation and sequencing of footsteps and grasps as a graph search problem with a new transition model that allows for a flexible representation of loco-manipulation. Our transition model is quickly evaluated by relocating and switching the reachability maps depending on the motion of both the robot and object. We evaluate our approach by applying it to loco-manipulation use-cases, such as a bobbin rolling operation with regrasping, where the motion is automatically planned by our framework. OVING large objects is a typical task required for humanoid robots in large-scale manufacturing environments. As most of such objects are heavy, they need to be moved through manipulating them by taking advantage of the ground and any possible inertia properties.
Fooling the Watchers: Breaking AIGC Detectors via Semantic Prompt Attacks
The rise of text-to-image (T2I) models has enabled the synthesis of photorealistic human portraits, raising serious concerns about identity misuse and the robustness of AIGC detectors. In this work, we propose an automated adversarial prompt generation framework that leverages a grammar tree structure and a variant of the Monte Carlo tree search algorithm to systematically explore the semantic prompt space. Our method generates diverse, controllable prompts that consistently evade both open-source and commercial AIGC detectors. Extensive experiments across multiple T2I models validate its effectiveness, and the approach ranked first in a real-world adversarial AIGC detection competition. Beyond attack scenarios, our method can also be used to construct high-quality adversarial datasets, providing valuable resources for training and evaluating more robust AIGC detection and defense systems.