Search
Reviews: Bayesian Optimization with Unknown Search Space
This paper proposes an algorithm to expand the search space for Bayesian optimization. The reviewers thought the work tackles an important problem and would be of interest to the community. The claims are well supported by empirical evidence and the paper is clearly written. There were concerns about the practicality of the method and that the work is a combination of well-known techniques. Because the paper presents a relatively novel approach and substantiates the claims with strong supporting evidence, it seems to be above the bar of acceptance.
Reviews: Search-Guided, Lightly-Supervised Training of Structured Prediction Energy Networks
Post-feedback update: Thank you for your update. Your additional results will strengthen this paper, and I still think it should be accepted. Specifically, it combines the basic overall framework for SPEN training using a reward signal introduced by [1] with the idea of adding in random search to find reward scoring violations, which has been used in the past by various papers (which are cited appropriately in this work). However, this exact combination is novel. Quality: The motivation behind using random search to augment the generation of labels to use for training the model is sound and verified empirically. Numerous appropriate baselines are included, ranging from beam search-type approaches to more directly comparable approaches such as [1], and the introduced approach outperforms all of them.
Smart Cubing for Graph Search: A Comparative Study
Kirchweger, Markus, Xia, Hai, Peitl, Tomรกลก, Szeider, Stefan
Propositional satisfiability (SAT) solvers based on conflict-driven clause learning can solve huge instances with millions of variables and clauses [Fichte et al., 2023a]. However, for hard instances, particularly in combinatorial problems, parallelization becomes necessary. The cube-and-conquer technique has proven highly effective for such problems, most notably in resolving the Pythagorean triples conjecture [Heule et al., 2016]. In cube-and-conquer, a look-ahead solver first partitions the search space into disjoint subproblems via cubes (partial assignments), which are then solved independently by CDCL solvers. This independence enables efficient parallel solving. When encoding combinatorial problems into SAT, particularly those involving graphs, we often encounter highly symmetric search spaces. Many mutually isomorphic graphs satisfy the same constraints, but a solver needs to check only one representative, the canonical element, from each isomorphism class. Standard CDCL solvers cannot leverage these symmetries, and static symmetry breaking methods cannot break all symmetries [Codish et al., 2019]. SAT Modulo Symmetries (SMS) [Kirchweger and Szeider, 2021; Kirchweger and Szeider, 2024] addresses this limitation through dynamic symmetry breaking, using a custom propagator that learns symmetry-breaking predicates during the search.
MCTS-SQL: An Effective Framework for Text-to-SQL with Monte Carlo Tree Search
Yuan, Shuozhi, Chen, Liming, Yuan, Miaomiao, Zhao, Jin, Peng, Haoran, Guo, Wenming
Text-to-SQL is a fundamental and longstanding problem in the NLP area, aiming at converting natural language queries into SQL, enabling non-expert users to operate databases. Recent advances in LLM have greatly improved text-to-SQL performance. However, challenges persist, especially when dealing with complex user queries. Current approaches (e.g., COT prompting and multi-agent frameworks) rely on the ability of models to plan and generate SQL autonomously, but controlling performance remains difficult. In addition, LLMs are still prone to hallucinations. To alleviate these challenges, we designed a novel MCTS-SQL to guide SQL generation iteratively. The approach generates SQL queries through Monte Carlo Tree Search (MCTS) and a heuristic self-refinement mechanism are used to enhance accuracy and reliability. Key components include a schema selector for extracting relevant information and an MCTS-based generator for iterative query refinement. Experimental results from the SPIDER and BIRD benchmarks show that MCTS-SQL achieves state-of-the-art performance. Specifically, on the BIRD development dataset, MCTS-SQL achieves an Execution (EX) accuracy of 69.40% using GPT-4o as the base model and a significant improvement when dealing with challenging tasks, with an EX of 51.48%, which is 3.41% higher than the existing method.
Reviews: NAT: Neural Architecture Transformer for Accurate and Compact Architectures
However, the paper misses closely related work such as, e.g. However, it does not disentangle the effects of search space and search method; in particular, it remains unclear if the proposed and relatively complex search method (policy-gradient graph-convolutional neural networks) would outperform simpler baselines such as random search on the same search space. Moreover, the method is only applied to neural architectures that were not optimized for being resource-efficient; it remains unclear if NAT would also improve architectures such as, e.g., the MobileNet family or MnasNet. It also contains sufficient details for being able to replicate the paper. The proposed method could be seen as a post-processing step for NAS, which allows increasing the methods performance with the same or even less resources required (since the search space is very simple, a search method is more likely to find optimal configuration in it compared to the significantly larger typical NAS search spaces). However, to really show the significance of the proposed method, it would have to show that it outperforms simpler baselines such as random search and is also applicable to cells that were already optimized for being resource-efficient.
Review for NeurIPS paper: Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?
Weaknesses: Similar to this work, many recent NAS approaches perform architecture search in the continuous representation space, and gradient descent (GD) is one of the most commonly used approaches for architecture search in the continuous space. However, this work only considers RL and BO as the search algorithms. I am curious to know how competitive GD is compared to RL and BO. In Figure 4, I am not sure why the right plot has more blank space compared to the left plot. An explanation is needed to help the readers understand the insights behind those plots.
Review for NeurIPS paper: Minimax Estimation of Conditional Moment Models
Clarity: This paper is quite difficult to read, although that may be due to the fact that they are trying to cover a tremendous amount of material, and the fact that while I am familiar with the basics of the background knowledge needed to understand the paper, I am not an expert in some of those areas. In terms of reproducibility, the main body of the paper does not contain enough details to be reproducible, although many more details are provided in the supplementary material. The abstract makes some claims that are not explicitly mentioned later in the paper, e.g. that the estimation problem can be thought of as a zero-sum game between a modeler and an adversary. The article repeatedly refers to Theorem 6, but there is no Theorem 6 in the paper (just the supplementary material). The expression of instrumental variable models as a conditional moment restriction in equation (1), although used in econometrics is very different than the usual expression as conditional independence or graphical constraints on the model usually used in computer science and other fields.
Reviews: Interlaced Greedy Algorithm for Maximization of Submodular Functions in Nearly Linear Time
In this paper, the authors study the problem of maximizing a non-monotone submodular function subject to a cardinality constraint and present a deterministic algorithm that achieves (1/4 - \epsilon)-approximation for the problem. Their fastest algorithm makes O(n / \epsilon \log(n / \epsilon)) queries to the oracle. While Buchbinder et al. (2015) have designed a randomized algorithm with better approximation factor and time complexity, the algorithm presented in this paper is deterministic. Although I believe deterministic algorithms which guarantee the performance in the worst-case scenarios are interesting for many researchers in the field, the contribution level of this paper is borderline. Also, I checked almost all the proofs in this paper.