Choi, Sanghyeok
Neural Genetic Search in Discrete Spaces
Kim, Hyeonah, Choi, Sanghyeok, Son, Jiwoo, Park, Jinkyoo, Kwon, Changhyun
Effective search methods are crucial for improving the performance of deep generative models at test time. In this paper, we introduce a novel test-time search method, Neural Genetic Search (NGS), which incorporates the evolutionary mechanism of genetic algorithms into the generation procedure of deep models. The core idea behind NGS is its crossover, which is defined as parent-conditioned generation using trained generative models. This approach offers a versatile and easy-to-implement search algorithm for deep generative models. We demonstrate the effectiveness and flexibility of NGS through experiments across three distinct domains: routing problems, adversarial prompt generation for language models, and molecular design.
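To make the crossover concrete, below is a minimal, self-contained Python sketch of the idea: a child sequence is generated autoregressively, with each step sampled from a policy restricted to the tokens the two parents propose at that position. The `toy_policy` function and the equal-length-parent assumption are illustrative placeholders, not the paper's actual model.

```python
import random

def toy_policy(prefix, candidates):
    """Stand-in for a trained generative model's conditional
    distribution; here it is simply uniform over the candidates."""
    return {tok: 1.0 / len(candidates) for tok in candidates}

def ngs_crossover(parent_a, parent_b, policy, rng):
    """Parent-conditioned generation: sample each child token from the
    policy restricted to the parents' tokens at that position.
    Assumes equal-length parent sequences."""
    child = []
    for tok_a, tok_b in zip(parent_a, parent_b):
        candidates = {tok_a, tok_b}
        probs = policy(tuple(child), candidates)
        r, acc = rng.random() * sum(probs.values()), 0.0
        for tok, p in probs.items():
            acc += p
            if r <= acc:
                child.append(tok)
                break
    return "".join(child)

rng = random.Random(0)
print(ngs_crossover("ABCD", "ABDC", toy_policy, rng))  # a mix of the two parents
```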
Knowledge Synthesis of Photosynthesis Research Using a Large Language Model
Yoon, Seungri, Jeon, Woosang, Choi, Sanghyeok, Kim, Taehyeong, Ahn, Tae In
The development of biological data analysis tools and large language models (LLMs) has opened up new possibilities for utilizing AI in plant science research, with the potential to contribute significantly to knowledge integration and research gap identification. Nonetheless, current LLMs struggle to handle complex biological data and theoretical models in photosynthesis research and often fail to provide accurate scientific contexts. Therefore, this study proposes a photosynthesis research assistant (PRAG) based on OpenAI's GPT-4o with retrieval-augmented generation (RAG) techniques and prompt optimization. Vector databases and an automated feedback loop were used in the prompt optimization process to enhance the accuracy and relevance of responses to photosynthesis-related queries. PRAG showed an average improvement of 8.7% across five metrics related to scientific writing, with a 25.4% increase in source transparency. Additionally, its scientific depth and domain coverage were comparable to those of photosynthesis research papers. A knowledge graph was used to structure PRAG's responses with papers within and outside the database, allowing PRAG to match key entities in 63% of database papers and 39.5% of test papers. PRAG can be applied to photosynthesis research and broader plant science domains, paving the way for more in-depth data analysis and predictive capabilities.
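As a rough illustration of the retrieve-then-generate step described above, the sketch below builds a source-grounded prompt from a toy vector store. The embeddings, the store layout, and `build_prompt` are hypothetical stand-ins for PRAG's actual vector databases and prompt-optimization loop.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u)) or 1.0
    nv = math.sqrt(sum(b * b for b in v)) or 1.0
    return dot / (nu * nv)

def retrieve(query_vec, store, k=2):
    """Return the k passages whose embeddings are closest to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item["vec"]), reverse=True)
    return [item["text"] for item in ranked[:k]]

def build_prompt(question, passages):
    """Assemble a source-grounded prompt for the LLM call."""
    context = "\n".join(f"- {p}" for p in passages)
    return ("Answer using only the sources below and cite them.\n"
            f"Sources:\n{context}\nQuestion: {question}")

store = [
    {"text": "Rubisco catalyzes CO2 fixation in the Calvin cycle.", "vec": [1.0, 0.1]},
    {"text": "Photosystem II splits water, releasing O2.", "vec": [0.2, 1.0]},
]
prompt = build_prompt("What does Rubisco do?", retrieve([0.9, 0.2], store))
print(prompt)  # in PRAG this would go to GPT-4o, then be scored by the feedback loop
```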
Improved Off-policy Reinforcement Learning in Biological Sequence Design
Kim, Hyeonah, Kim, Minsu, Yun, Taeyoung, Choi, Sanghyeok, Bengio, Emmanuel, Hernández-García, Alex, Park, Jinkyoo
Designing biological sequences with desired properties is a significant challenge due to the combinatorially vast search space and the high cost of evaluating each candidate sequence. To address these challenges, reinforcement learning (RL) methods, such as GFlowNets, utilize proxy models for rapid reward evaluation and annotated data for policy training. Although these approaches have shown promise in generating diverse and novel sequences, the limited training data relative to the vast search space often leads to misspecification of the proxy for out-of-distribution inputs. We introduce $\delta$-Conservative Search, a novel off-policy search method for training GFlowNets designed to improve robustness against proxy misspecification. The key idea is to incorporate conservativeness, controlled by the parameter $\delta$, to constrain the search to reliable regions. Specifically, we inject noise into high-score offline sequences by randomly masking tokens with a Bernoulli distribution of parameter $\delta$ and then denoise the masked tokens using the GFlowNet policy. Additionally, $\delta$ is adaptively adjusted based on the uncertainty of the proxy model for each data point, so that the level of conservativeness reflects the proxy's uncertainty. Experimental results demonstrate that our method consistently outperforms existing machine learning methods in discovering high-score sequences across diverse tasks, including DNA, RNA, protein, and peptide design, especially in large-scale scenarios.
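A minimal sketch of the masking-and-denoising step, assuming the policy can be queried one masked position at a time; `toy_policy` and the uncertainty-to-$\delta$ mapping below are illustrative placeholders, not the paper's exact formulation.

```python
import random

VOCAB = "ACGT"

def toy_policy(masked_seq, position):
    """Placeholder for the trained GFlowNet policy; real denoising
    would condition on the unmasked context."""
    return random.choice(VOCAB)

def conservative_search(offline_seq, delta, policy):
    """Mask each token with probability delta, then denoise the
    masked positions with the policy."""
    masked = [None if random.random() < delta else tok for tok in offline_seq]
    return "".join(policy(masked, i) if tok is None else tok
                   for i, tok in enumerate(masked))

def adaptive_delta(uncertainty, low=0.05, high=0.5):
    """One plausible mapping: higher proxy uncertainty -> smaller delta,
    i.e., fewer masked tokens and a more conservative search."""
    return high - (high - low) * min(max(uncertainty, 0.0), 1.0)

random.seed(0)
print(conservative_search("ACGTACGTAC", adaptive_delta(uncertainty=0.8), toy_policy))
```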
Adaptive Teachers for Amortized Samplers
Kim, Minsu, Choi, Sanghyeok, Yun, Taeyoung, Bengio, Emmanuel, Feng, Leo, Rector-Brooks, Jarrid, Ahn, Sungsoo, Park, Jinkyoo, Malkin, Nikolay, Bengio, Yoshua
Amortized inference is the task of training a parametric model, such as a neural network, to approximate a distribution with a given unnormalized density where exact sampling is intractable. When sampling is implemented as a sequential decision-making process, reinforcement learning (RL) methods, such as generative flow networks, can be used to train the sampling policy. Off-policy RL training facilitates the discovery of diverse, high-reward candidates, but existing methods still face challenges in efficient exploration. We propose to use an adaptive training distribution (the Teacher) to guide the training of the primary amortized sampler (the Student) by prioritizing high-loss regions. The Teacher, an auxiliary behavior model, is trained to sample high-error regions of the Student and can generalize across unexplored modes, thereby enhancing mode coverage by providing an efficient training curriculum. We validate the effectiveness of this approach in a synthetic environment designed to present an exploration challenge, two diffusion-based sampling tasks, and four biochemical discovery tasks, demonstrating its ability to improve sample efficiency and mode coverage.
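The sketch below caricatures the Teacher-Student loop with a handful of discrete states: the Teacher samples states in proportion to the Student's current error, and training shrinks the error wherever the Student practices. Both "models" here are toy placeholders for the amortized samplers in the paper.

```python
import random

random.seed(0)
STATES = list(range(10))
student_error = {s: random.random() for s in STATES}  # proxy for the Student's training loss

def teacher_sample(k=4):
    """Teacher behavior policy: prefer states where the Student's loss is high."""
    weights = [student_error[s] for s in STATES]
    return random.choices(STATES, weights=weights, k=k)

def student_update(batch, lr=0.5):
    """Training on a state shrinks the Student's error there."""
    for s in batch:
        student_error[s] *= (1.0 - lr)

for _ in range(20):
    student_update(teacher_sample())  # curriculum driven by high-loss regions
print({s: round(e, 3) for s, e in student_error.items()})
```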
Ant Colony Sampling with GFlowNets for Combinatorial Optimization
Kim, Minsu, Choi, Sanghyeok, Kim, Hyeonah, Son, Jiwoo, Park, Jinkyoo, Bengio, Yoshua
This paper introduces the Generative Flow Ant Colony Sampler (GFACS), a neural-guided probabilistic search algorithm for solving combinatorial optimization (CO) problems. GFACS integrates generative flow networks (GFlowNets), an emerging amortized inference method, with ant colony optimization (ACO), a promising probabilistic search algorithm. Specifically, we use GFlowNets to learn a constructive policy in combinatorial spaces, enhancing ACO by providing an informed prior distribution over decision variables conditioned on input graph instances. Furthermore, we introduce a novel off-policy training algorithm for scaling conditional GFlowNets to large-scale combinatorial spaces by leveraging local search and shared energy normalization. Our experimental results demonstrate that GFACS outperforms baseline ACO algorithms in seven CO tasks and is competitive with problem-specific heuristics for vehicle routing problems.
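A minimal sketch of the idea, assuming the learned prior is available as a matrix of edge scores (random numbers stand in for a GFlowNet's predictions here): ants construct tours with the standard ACO rule, and the learned prior plays the role of the heuristic term.

```python
import random

# Toy GFACS-style step: ACO tour construction where the heuristic
# prior comes from a learned model rather than a hand-crafted rule.
N = 5
random.seed(0)
prior = [[random.random() for _ in range(N)] for _ in range(N)]  # placeholder learned prior
pheromone = [[1.0] * N for _ in range(N)]  # ACO pheromone trails

def construct_tour(start=0, alpha=1.0, beta=2.0):
    """Build one tour; edge weight = pheromone^alpha * prior^beta."""
    tour, visited = [start], {start}
    while len(tour) < N:
        i = tour[-1]
        cands = [j for j in range(N) if j not in visited]
        weights = [(pheromone[i][j] ** alpha) * (prior[i][j] ** beta) for j in cands]
        nxt = random.choices(cands, weights=weights, k=1)[0]
        tour.append(nxt)
        visited.add(nxt)
    return tour

def deposit(tour, amount=0.1):
    """Reinforce the edges of a good tour, as in standard ACO."""
    for i, j in zip(tour, tour[1:] + tour[:1]):
        pheromone[i][j] += amount

tour = construct_tour()
deposit(tour)
print(tour)
```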
Genetic-guided GFlowNets: Advancing in Practical Molecular Optimization Benchmark
Kim, Hyeonah, Kim, Minsu, Choi, Sanghyeok, Park, Jinkyoo
The recent work of Gao et al. (2022a) proposes the practical molecular optimization (PMO) benchmark, an official benchmark emphasizing the importance of sample efficiency in de novo molecular optimization for practical applicability. The benchmark is reasonable because real-world applications of molecule optimization (e.g., drug discovery) require expensive scoring processes such as wet lab experiments. On PMO, the proposed method shows a state-of-the-art score of 16.213, significantly outperforming the reported best score in the benchmark of 15.185. Remarkably, ours exceeds all baselines, including reinforcement learning, Bayesian optimization, generative models, GFlowNets, and genetic algorithms (e.g., Jensen, 2019), in 14 out of 23 tasks. Our code is publicly available.
Solving NP-hard Min-max Routing Problems as Sequential Generation with Equity Context
Son, Jiwoo, Kim, Minsu, Choi, Sanghyeok, Park, Jinkyoo
Min-max routing problems aim to minimize the maximum tour length among agents as they collaboratively visit all cities, i.e., the completion time. These problems have impactful real-world applications but are known to be NP-hard. Existing methods face challenges, particularly in large-scale problems that require the coordination of numerous agents to cover thousands of cities. This paper proposes a new deep-learning framework to solve large-scale min-max routing problems. We model the simultaneous decision-making of multiple agents as a sequential generation process, allowing the utilization of scalable deep-learning models for sequential decision-making. For the sequentially approximated problem, we propose a scalable contextual Transformer model, Equity-Transformer, which generates sequential actions while accounting for an equitable workload among agents. The effectiveness of Equity-Transformer is demonstrated through its superior performance in two representative min-max routing tasks: the min-max multiple traveling salesman problem (min-max mTSP) and the min-max multiple pick-up and delivery problem (min-max mPDP). Notably, our method achieves a roughly 335-fold runtime reduction and a cost reduction of about 53% compared to a competitive heuristic (LKH3) on min-max mTSP with 100 vehicles and 1,000 cities. We provide reproducible source code: https://github.com/kaist-silab/equity-transformer
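As a toy illustration of the sequential-generation view, the sketch below assigns cities to agents one at a time, with a greedy "equity context" that always extends the currently shortest partial tour; this greedy rule is a hypothetical stand-in for the learned Equity-Transformer policy.

```python
import math
import random

random.seed(0)
cities = [(random.random(), random.random()) for _ in range(12)]
depot = (0.5, 0.5)
M = 3  # number of agents

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

tours = [[depot] for _ in range(M)]
lengths = [0.0] * M
for city in cities:
    # equity context: extend the agent with the shortest partial tour
    a = min(range(M), key=lambda k: lengths[k])
    lengths[a] += dist(tours[a][-1], city)
    tours[a].append(city)

print("max tour length:", round(max(lengths), 3))
```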