Goto

Collaborating Authors

 Search


PathFinder: Guided Search over Multi-Step Reasoning Paths

arXiv.org Artificial Intelligence

With recent advancements in large language models, methods like chain-of-thought prompting to elicit reasoning chains have been shown to improve results on reasoning tasks. However, tasks that require multiple steps of reasoning still pose significant challenges to state-of-the-art models. Drawing inspiration from the beam search algorithm, we propose PathFinder, a tree-search-based reasoning path generation approach. It enhances diverse branching and multi-hop reasoning through the integration of dynamic decoding, enabled by varying sampling methods and parameters. Using constrained reasoning, PathFinder integrates novel quality constraints, pruning, and exploration methods to enhance the efficiency and the quality of generation. Moreover, it includes scoring and ranking features to improve candidate selection. Our approach outperforms competitive baselines on three complex arithmetic and commonsense reasoning tasks by 6% on average. Our model generalizes well to longer, unseen reasoning chains, reflecting similar complexities to beam search with large branching factors.


Faster Stochastic Variance Reduction Methods for Compositional MiniMax Optimization

arXiv.org Artificial Intelligence

This paper delves into the realm of stochastic optimization for compositional minimax optimization - a pivotal challenge across various machine learning domains, including deep AUC and reinforcement learning policy evaluation. Despite its significance, the problem of compositional minimax optimization is still under-explored. Adding to the complexity, current methods of compositional minimax optimization are plagued by sub-optimal complexities or heavy reliance on sizable batch sizes. To respond to these constraints, this paper introduces a novel method, called Nested STOchastic Recursive Momentum (NSTORM), which can achieve the optimal sample complexity of $O(\kappa^3 /\epsilon^3 )$ to obtain the $\epsilon$-accuracy solution. We also demonstrate that NSTORM can achieve the same sample complexity under the Polyak-\L ojasiewicz (PL)-condition - an insightful extension of its capabilities. Yet, NSTORM encounters an issue with its requirement for low learning rates, potentially constraining its real-world applicability in machine learning. To overcome this hurdle, we present ADAptive NSTORM (ADA-NSTORM) with adaptive learning rates. We demonstrate that ADA-NSTORM can achieve the same sample complexity but the experimental results show its more effectiveness. All the proposed complexities indicate that our proposed methods can match lower bounds to existing minimax optimizations, without requiring a large batch size in each iteration. Extensive experiments support the efficiency of our proposed methods.


Multi-granularity Causal Structure Learning

arXiv.org Machine Learning

However, these algorithms simply deem causal relationships stand exclusively at the level of individual variables Data science is moving from the data-centric paradigm forward (micro-variable), ignoring the collective interactions from the science-centric paradigm, and causal revolution multiple variables (macro-variable). For instance, the brain is sweeping across various research fields. Causality learning can be characterized at a micro granularity of neurons and endeavors to unearth causal relationships among variables their synapses, but high-order synergistic subsystems are from observational data and generate causal graph, widespread, which typically sit between canonical functional that is, directed acyclic graph (DAG). Unlike correlationbased networks and may serve an integrative role (Varley study, causality analysis reveals the causal mechanism et al. 2023). Actually, observational data can be regarded of data generation. Identifying causality holds paramount as knowledge in the lowest granularity level, while knowledge significance for stable inference and rational decisions can be regarded as the abstraction of data at different in many applications, such as recommendation systems granularity levels (Wang 2017; Wang et al. 2022). Similar (Wang et al. 2020), medical diagnostics (Richens, Lee, and viewpoints appear in the research of complex systems, Johri 2020), epidemiology (Vandenbroucke, Broadbent, and which suggests that causal relationship is more pronounced Pearce 2016) and many others (Von Kügelgen et al. 2022).


Improvement in Variational Quantum Algorithms by Measurement Simplification

arXiv.org Artificial Intelligence

After the discovery of Shor's algorithm and Grover's search algorithm, there has been many researches covering the concept of quantum advantage, which insists quantum computers will exhibit specific advantages over classical computers. Google named the advantage as "Quantum Supremacy"[1] and explains for specific problems, quantum computers can surpass classical computer in computation time and required memory capacity. However, complex quantum algorithms such as Shor's algorithm requires number of qubits and gate fidelity exponentially more than currently we have, and therefore investigating executable algorithms that show quantum advantage even with noisy and few qubits have been arised as an important question in the NISQ (Noisy Intermediate-Scale Quantum) era[2]. Among them, VQAs (Variational Quantum Algorithms)[3] have been remarked as efficient algorithms that can been executed in NISQ devices with low limitation. VQA is a hybrid quantum algorithm that utilizes classical optimizer and Variational Quantum Circuit (VQC), it first measures a state's probability after quantum circuit, and passes the result to classical optimizer.


Proceedings of the 2023 XCSP3 Competition

arXiv.org Artificial Intelligence

This short paper gives an overview of the XCSP3 solver implemented in Picat. Picat provides several constraint modules, and the Picat XCSP3 solver uses the sat module. The XCSP3 solver mainly consists of a parser implemented in Picat, which converts constraints from XCSP3 format to Picat. The solver demonstrates the strengths of Picat, a logic-based language, in parsing, modeling, and encoding constraints into SAT. The solver submitted to the 2022 XCSP competition is based on the one that won the 2019 XCSP competition.


Existence and Minimax Theorems for Adversarial Surrogate Risks in Binary Classification

arXiv.org Artificial Intelligence

Adversarial training is one of the most popular methods for training methods robust to adversarial attacks, however, it is not well-understood from a theoretical perspective. We prove and existence, regularity, and minimax theorems for adversarial surrogate risks. Our results explain some empirical observations on adversarial robustness from prior work and suggest new directions in algorithm development. Furthermore, our results extend previously known existence and minimax theorems for the adversarial classification risk to surrogate risks.


Proceedings of the 2022 XCSP3 Competition

arXiv.org Artificial Intelligence

This short paper gives an overview of the XCSP3 solver implemented in Picat. Picat provides several constraint modules, and the Picat XCSP3 solver uses the sat module. The XCSP3 solver mainly consists of a parser implemented in Picat, which converts constraints from XCSP3 format to Picat. The solver demonstrates the strengths of Picat, a logic-based language, in parsing, modeling, and encoding constraints into SAT. The solver submitted to the 2022 XCSP competition is based on the one that won the 2019 XCSP competition.


Language-Conditioned Semantic Search-Based Policy for Robotic Manipulation Tasks

arXiv.org Artificial Intelligence

Reinforcement learning and Imitation Learning approaches utilize policy learning strategies that are difficult to generalize well with just a few examples of a task. In this work, we propose a language-conditioned semantic search-based method to produce an online search-based policy from the available demonstration dataset of state-action trajectories. Here we directly acquire actions from the most similar manipulation trajectories found in the dataset. Our approach surpasses the performance of the baselines on the CALVIN benchmark and exhibits strong zero-shot adaptation capabilities. This holds great potential for expanding the use of our online search-based policy approach to tasks typically addressed by Imitation Learning or Reinforcement Learning-based policies.


Exact and rapid linear clustering of networks with dynamic programming

arXiv.org Artificial Intelligence

We study the problem of clustering networks whose nodes have imputed or physical positions in a single dimension, for example prestige hierarchies or the similarity dimension of hyperbolic embeddings. Existing algorithms, such as the critical gap method and other greedy strategies, only offer approximate solutions to this problem. Here, we introduce a dynamic programming approach that returns provably optimal solutions in polynomial time -- O(n^2) steps -- for a broad class of clustering objectives. We demonstrate the algorithm through applications to synthetic and empirical networks and show that it outperforms existing heuristics by a significant margin, with a similar execution time.


Construction of Hierarchical Neural Architecture Search Spaces based on Context-free Grammars

arXiv.org Machine Learning

The discovery of neural architectures from simple building blocks is a long-standing goal of Neural Architecture Search (NAS). Hierarchical search spaces are a promising step towards this goal but lack a unifying search space design framework and typically only search over some limited aspect of architectures. In this work, we introduce a unifying search space design framework based on context-free grammars that can naturally and compactly generate expressive hierarchical search spaces that are 100s of orders of magnitude larger than common spaces from the literature. By enhancing and using their properties, we effectively enable search over the complete architecture and can foster regularity. Further, we propose an efficient hierarchical kernel design for a Bayesian Optimization search strategy to efficiently search over such huge spaces. We demonstrate the versatility of our search space design framework and show that our search strategy can be superior to existing NAS approaches. Code is available at https://github.com/automl/hierarchical_nas_construction.