AITopics

doi: 10.1109/CoG57401.2023.10333169

2405.14346

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Texas (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial IntelligenceMay-23-2024

AlphaMath Almost Zero: process Supervision without process

Chen, Guoxin, Liao, Minpeng, Li, Chengxi, Fan, Kai

Recent advancements in large language models (LLMs) have substantially enhanced their mathematical reasoning abilities. However, these models still struggle with complex problems that require multiple reasoning steps, frequently leading to logical or numerical errors. While numerical mistakes can be largely addressed by integrating a code interpreter, identifying logical errors within intermediate steps is more challenging. Moreover, manually annotating these steps for training is not only expensive but also labor-intensive, requiring the expertise of professional annotators. In our study, we introduce an innovative approach that bypasses the need for process annotations (from human or GPTs) by utilizing the Monte Carlo Tree Search (MCTS) framework. This technique automatically generates both the process supervision and the step-level evaluation signals. Our method iteratively trains the policy and value models, leveraging the capabilities of a well-pretrained LLM to progressively enhance its mathematical reasoning skills. Furthermore, we propose an efficient inference strategy--step-level beam search, where the value model is crafted to assist the policy model (i.e., LLM) in navigating more effective reasoning paths, rather than solely relying on prior probabilities. The experimental results on both in-domain and out-of-domain datasets demonstrate that even without GPT-4 or human-annotated process supervision, our AlphaMath framework achieves comparable or superior results to previous state-of-the-art methods.

arxiv preprint arxiv, equation, value model, (14 more...)

2405.03553

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.54)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceMay-23-2024

Explaining Graph Neural Networks via Structure-aware Interaction Index

Bui, Ngoc, Nguyen, Hieu Trung, Nguyen, Viet Anh, Ying, Rex

The Shapley value is a prominent tool for interpreting black-box machine learning models thanks to its strong theoretical foundation. However, for models with structured inputs, such as graph neural networks, existing Shapley-based explainability approaches either focus solely on node-wise importance or neglect the graph structure when perturbing the input instance. This paper introduces the Myerson-Taylor interaction index that internalizes the graph structure into attributing the node values and the interaction values among nodes. Unlike the Shapley-based methods, the Myerson-Taylor index decomposes coalitions into components satisfying a pre-chosen connectivity criterion. We prove that the Myerson-Taylor index is the unique one that satisfies a system of five natural axioms accounting for graph structure and high-order interaction among nodes. Leveraging these properties, we propose Myerson-Taylor Structure-Aware Graph Explainer (MAGE), a novel explainer that uses the second-order Myerson-Taylor index to identify the most important motifs influencing the model prediction, both positively and negatively. Extensive experiments on various graph datasets and models demonstrate that our method consistently provides superior subgraph explanations compared to state-of-the-art methods.

dataset, explanation, myerson-taylor index, (12 more...)

2405.14352

Country:

Europe > Austria > Vienna (0.14)
Europe > Spain (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Liang, Huidong, Wan, Xingchen, Dong, Xiaowen

Bayesian Optimization of Functions over Node Subsets in Graphs

arXiv.org Machine LearningMay-23-2024

We address the problem of optimizing over functions defined on node subsets in a graph. The optimization of such functions is often a non-trivial task given their combinatorial, black-box and expensive-to-evaluate nature. Although various algorithms have been introduced in the literature, most are either task-specific or computationally inefficient and only utilize information about the graph structure without considering the characteristics of the function. To address these limitations, we utilize Bayesian Optimization (BO), a sample-efficient black-box solver, and propose a novel framework for combinatorial optimization on graphs. More specifically, we map each $k$-node subset in the original graph to a node in a new combinatorial graph and adopt a local modeling approach to efficiently traverse the latter graph by progressively sampling its subgraphs using a recursive algorithm. Extensive experiments under both synthetic and real-world setups demonstrate the effectiveness of the proposed BO framework on various types of graphs and optimization tasks, where its behavior is analyzed in detail with ablation studies.

graph, node, optimization, (17 more...)

arXiv.org Machine Learning

2405.15119

Country:

North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > District of Columbia > Washington (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation (0.69)
Energy (0.67)
Health & Medicine > Epidemiology (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
(3 more...)

Hirsch, Eran, Uziel, Guy, Anaby-Tavor, Ateret

What's the Plan? Evaluating and Developing Planning-Aware Techniques for Language Models

Planning is a fundamental task in artificial intelligence that involves finding a sequence of actions that achieve a specified goal in a given environment. Large language models (LLMs) are increasingly employed in applications that require such planning capabilities, including web and embodied agents. In line with recent studies, we demonstrate through experimentation that LLMs lack necessary skills required for planning. We focus on their ability to function as world models, and show that they struggle to simulate the complex dynamics of classic planning domains. Based on these observations, we advocate for the potential of a hybrid approach that combines language models with classical planning methodology. We introduce SimP lan, a novel hybrid architecture, utilizing external world modeling tools and the greedy best-first search algorithm. We assess its effectiveness in a rigorous set of experiments across a variety of challenging planning domains. Our results demonstrate that SimP lan significantly outperforms existing LLM-based planners, highlighting the critical role of search strategies and world models in planning applications.

experiment, ferry, language model, (15 more...)

2402.11489

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Discretely Beyond $1/e$: Guided Combinatorial Algorithms for Submodular Maximization

Chen, Yixin, Nath, Ankur, Peng, Chunli, Kuhnle, Alan

For constrained, not necessarily monotone submodular maximization, all known approximation algorithms with ratio greater than $1/e$ require continuous ideas, such as queries to the multilinear extension of a submodular function and its gradient, which are typically expensive to simulate with the original set function. For combinatorial algorithms, the best known approximation ratios for both size and matroid constraint are obtained by a simple randomized greedy algorithm of Buchbinder et al. [9]: $1/e \approx 0.367$ for size constraint and $0.281$ for the matroid constraint in $\mathcal O (kn)$ queries, where $k$ is the rank of the matroid. In this work, we develop the first combinatorial algorithms to break the $1/e$ barrier: we obtain approximation ratio of $0.385$ in $\mathcal O (kn)$ queries to the submodular set function for size constraint, and $0.305$ for a general matroid constraint. These are achieved by guiding the randomized greedy algorithm with a fast local search algorithm. Further, we develop deterministic versions of these algorithms, maintaining the same ratio and asymptotic time complexity. Finally, we develop a deterministic, nearly linear time algorithm with ratio $0.377$.

algorithm, constraint, matroid constraint, (13 more...)

2405.05202

Country: North America > United States > Texas > Brazos County > College Station (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)

Ant Colony Sampling with GFlowNets for Combinatorial Optimization

Kim, Minsu, Choi, Sanghyeok, Kim, Hyeonah, Son, Jiwoo, Park, Jinkyoo, Bengio, Yoshua

This paper introduces the Generative Flow Ant Colony Sampler (GFACS), a neural-guided probabilistic search algorithm for solving combinatorial optimization (CO). GFACS integrates generative flow networks (GFlowNets), an emerging amortized inference method, with ant colony optimization (ACO), a promising probabilistic search algorithm. Specifically, we use GFlowNets to learn a constructive policy in combinatorial spaces for enhancing ACO by providing an informed prior distribution over decision variables conditioned on input graph instances. Furthermore, we introduce a novel off-policy training algorithm for scaling conditional GFlowNets into large-scale combinatorial spaces by leveraging local search and shared energy normalization. Our experimental results demonstrate that GFACS outperforms baseline ACO algorithms in seven CO tasks and is competitive with problem-specific heuristics for vehicle routing problems.

gflownet, international conference, optimization, (14 more...)

2403.07041

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Transportation > Freight & Logistics Services (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Cheng, Hongyu, Basu, Amitabh

Learning Cut Generating Functions for Integer Programming

The branch-and-cut algorithm is the method of choice to solve large scale integer programming problems in practice. A key ingredient of branch-and-cut is the use of cutting planes which are derived constraints that reduce the search space for an optimal solution. Selecting effective cutting planes to produce small branch-and-cut trees is a critical challenge in the branch-and-cut algorithm. Recent advances have employed a data-driven approach to select optimal cutting planes from a parameterized family, aimed at reducing the branch-and-bound tree size (in expectation) for a given distribution of integer programming instances. We extend this idea to the selection of the best cut generating function (CGF), which is a tool in the integer programming literature for generating a wide variety of cutting planes that generalize the well-known Gomory Mixed-Integer (GMI) cutting planes. We provide rigorous sample complexity bounds for the selection of an effective CGF from certain parameterized families that provably performs well for any specified distribution on the problem instances. Our empirical results show that the selected CGF can outperform the GMI cuts for certain distributions. Additionally, we explore the sample complexity of using neural networks for instance-dependent CGF selection.

cut generating function, plane, tree size, (14 more...)

2405.13992

Country:

North America > United States > New York (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.86)

Tukan, Murad, Mualem, Loay, Feldman, Moran

Practical $0.385$-Approximation for Submodular Maximization Subject to a Cardinality Constraint

Non-monotone constrained submodular maximization plays a crucial role in various machine learning applications. However, existing algorithms often struggle with a trade-off between approximation guarantees and practical efficiency. The current state-of-the-art is a recent $0.401$-approximation algorithm, but its computational complexity makes it highly impractical. The best practical algorithms for the problem only guarantee $1/e$-approximation. In this work, we present a novel algorithm for submodular maximization subject to a cardinality constraint that combines a guarantee of $0.385$-approximation with a low and practical query complexity of $O(n+k^2)$. Furthermore, we evaluate the empirical performance of our algorithm in experiments based on various machine learning applications, including Movie Recommendation, Image Summarization, and more. These experiments demonstrate the efficacy of our approach.

algorithm, inequality hold, theorem 3, (14 more...)

2405.13994

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
North America > United States > Virginia (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.48)

DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis

Shee, Yu, Li, Haote, Morgunov, Anton, Batista, Victor

Traditional computer-aided synthesis planning (CASP) methods rely on iterative single-step predictions, leading to exponential search space growth that limits efficiency and scalability. We introduce a transformer-based model that directly generates multi-step synthetic routes as a single string by conditionally predicting each molecule based on all preceding ones. The model accommodates specific conditions such as the desired number of steps and starting materials, outperforming state-of-the-art methods on the PaRoutes dataset with a 2.2x improvement in Top-1 accuracy on the n$_1$ test set and a 3.3x improvement on the n$_5$ test set. It also successfully predicts routes for FDA-approved drugs not included in the training data, showcasing its generalization capabilities. While the current suboptimal diversity of the training set may impact performance on less common reaction types, our approach presents a promising direction towards fully automated retrosynthetic planning.

compound, prediction, target compound, (15 more...)

2405.13983

Country: North America > United States > New York (0.04)

Genre:

Research Report > New Finding (0.34)
Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Public Health (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Government Relations & Public Policy (1.00)
Government > Regional Government > North America Government > United States Government > FDA (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)