AITopics

2402.12483

Country:

Asia > Singapore (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Maryland (0.04)
(10 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry:

Materials (1.00)
Health & Medicine (1.00)
Education (0.85)
Information Technology (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

arXiv.org Artificial IntelligenceJun-6-2024

Two Optimizers Are Better Than One: LLM Catalyst Empowers Gradient-Based Optimization for Prompt Tuning

Guo, Zixian, Liu, Ming, Ji, Zhilong, Bai, Jinfeng, Guo, Yiwen, Zuo, Wangmeng

Learning a skill generally relies on both practical experience by doer and insightful high-level guidance by instructor. Will this strategy also work well for solving complex non-convex optimization problems? Here, a common gradient-based optimizer acts like a disciplined doer, making locally optimal update at each step. Recent methods utilize large language models (LLMs) to optimize solutions for concrete problems by inferring from natural language instructions, akin to a high-level instructor. In this paper, we show that these two optimizers are complementary to each other, suggesting a collaborative optimization approach. The gradient-based optimizer and LLM-based optimizer are combined in an interleaved manner. We instruct LLMs using task descriptions and timely optimization trajectories recorded during gradient-based optimization. Inferred results from LLMs are used as restarting points for the next stage of gradient optimization. By leveraging both the locally rigorous gradient-based optimizer and the high-level deductive LLM-based optimizer, our combined optimization method consistently yields improvements over competitive baseline prompt tuning methods. Our results demonstrate the synergistic effect of conventional gradient-based optimization and the inference ability of LLMs. The code is released at https://github.com/guozix/LLM-catalyst.

llm, optimization, optimizer, (17 more...)

2405.19732

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Washington > King County > Seattle (0.04)
Europe > Monaco (0.04)
(3 more...)

Genre: Research Report > New Finding (0.86)

Industry: Materials > Chemicals > Specialty Chemicals (0.60)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

arXiv.org Artificial IntelligenceJun-6-2024

Characterizing segregation in blast rock piles a deep-learning approach leveraging aerial image analysis

Liu, Chengeng, Liu, Sihong, Shen, Chaomin, Gao, Yupeng, Liu, Yuxuan

Blasted rock material serves a critical role in various engineering applications, yet the phenomenon of segregation-where particle sizes vary significantly along the gradient of a quarry pile-presents challenges for optimizing quarry material storage and handling. This study introduces an advanced image analysis methodology to characterize such segregation of rock fragments. The accurate delineation of detailed rock fragment size distributions was achieved through the analysis of drone-captured imagery, coupled with the application of an enhanced Unet semantic segmentation model integrated with an expansion-based post-processing technique. The quarry slope was stratified into four vertical sections, with the size distribution of each section quantified via ellipsoid shape approximations. Our results disclose pronounced vertical segregation patterns, with finer particles concentrated in the upper slope regions and coarser particles in the lower. Utilizing relative characteristic diameters, we offered insight into the degree of segregation, thereby illustrating the spatial heterogeneity in fragment size more clearly. The techniques outlined in this study deliver a scalable and accurate method for assessing fragment size distribution, with the potential to better inform resource management and operational decisions in quarry management.

particle, segregation, size distribution, (14 more...)

2406.04149

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > China > Yunnan Province (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Materials > Metals & Mining (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Kausar, Rizwana, Zayer, Fakhreddine, Viegas, Jaime, Dias, Jorge

Efficient Hybrid Neuromorphic-Bayesian Model for Olfaction Sensing: Detection and Classification

Olfaction sensing in autonomous robotics faces challenges in dynamic operations, energy efficiency, and edge processing. It necessitates a machine learning algorithm capable of managing real-world odor interference, ensuring resource efficiency for mobile robotics, and accurately estimating gas features for critical tasks such as odor mapping, localization, and alarm generation. This paper introduces a hybrid approach that exploits neuromorphic computing in combination with probabilistic inference to address these demanding requirements. Our approach implements a combination of a convolutional spiking neural network for feature extraction and a Bayesian spiking neural network for odor detection and identification. The developed algorithm is rigorously tested on a dataset for sensor drift compensation for robustness evaluation. Additionally, for efficiency evaluation, we compare the energy consumption of our model with a non-spiking machine learning algorithm under identical dataset and operating conditions. Our approach demonstrates superior efficiency alongside comparable accuracy outcomes.

algorithm, classification, neural network, (14 more...)

2407.04714

Country:

North America > United States (0.04)
Asia > Middle East > UAE (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.82)

Industry:

Energy (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.47)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.82)

Offline Multi-Objective Optimization

Xue, Ke, Tan, Rong-Xi, Huang, Xiaobin, Qian, Chao

Offline optimization aims to maximize a black-box objective function with a static dataset and has wide applications. In addition to the objective function being black-box and expensive to evaluate, numerous complex real-world problems entail optimizing multiple conflicting objectives, i.e., multi-objective optimization (MOO). Nevertheless, offline MOO has not progressed as much as offline single-objective optimization (SOO), mainly due to the lack of benchmarks like Design-Bench for SOO. To bridge this gap, we propose a first benchmark for offline MOO, covering a range of problems from synthetic to real-world tasks. This benchmark provides tasks, datasets, and open-source examples, which can serve as a foundation for method comparisons and advancements in offline MOO. Furthermore, we analyze how the current related methods can be adapted to offline MOO from four fundamental perspectives, including data, model architecture, learning algorithm, and search algorithm. Empirical results show improvements over the best value of the training set, demonstrating the effectiveness of offline MOO methods. As no particular method stands out significantly, there is still an open challenge in further enhancing the effectiveness of offline MOO. We finally discuss future challenges for offline MOO, with the hope of shedding some light on this emerging field. Our code is available at \url{https://github.com/lamda-bbo/offline-moo}.

algorithm, optimization, search space, (15 more...)

2406.03722

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(19 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Transportation (0.86)
Materials (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Radwan, Yousef A., Kronberger, Gabriel, Winkler, Stephan

A Comparison of Recent Algorithms for Symbolic Regression to Genetic Programming

Symbolic regression is a machine learning method with the goal to produce interpretable results. Unlike other machine learning methods such as, e.g. random forests or neural networks, which are opaque, symbolic regression aims to model and map data in a way that can be understood by scientists. Recent advancements, have attempted to bridge the gap between these two fields; new methodologies attempt to fuse the mapping power of neural networks and deep learning techniques with the explanatory power of symbolic regression. In this paper, we examine these new emerging systems and test the performance of an end-to-end transformer model for symbolic regression versus the reigning traditional methods based on genetic programming that have spearheaded symbolic regression throughout the years. We compare these systems on novel datasets to avoid bias to older methods who were improved on well-known benchmark datasets. Our results show that traditional GP methods as implemented e.g., by Operon still remain superior to two recently published symbolic regression methods.

dataset, genetic programming, symbolic regression, (13 more...)

2406.03585

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
Europe > Austria > Upper Austria (0.04)

Genre: Research Report > New Finding (0.86)

Industry:

Government (0.70)
Materials (0.46)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation

Cheng, Zelei, Wu, Xian, Yu, Jiahao, Yang, Sabrina, Wang, Gang, Xing, Xinyu

Deep reinforcement learning (DRL) is playing an increasingly important role in real-world applications. However, obtaining an optimally performing DRL agent for complex tasks, especially with sparse rewards, remains a significant challenge. The training of a DRL agent can be often trapped in a bottleneck without further progress. In this paper, we propose RICE, an innovative refining scheme for reinforcement learning that incorporates explanation methods to break through the training bottlenecks. The high-level idea of RICE is to construct a new initial state distribution that combines both the default initial states and critical states identified through explanation methods, thereby encouraging the agent to explore from the mixed initial states. Through careful design, we can theoretically guarantee that our refining scheme has a tighter sub-optimality bound. We evaluate RICE in various popular RL environments and real-world applications. The results demonstrate that RICE significantly outperforms existing refining schemes in enhancing agent performance.

agent, explanation method, reinforcement learning, (13 more...)

2405.03064

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Illinois > Cook County > Evanston (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Materials > Metals & Mining (0.67)
Government > Military > Cyberwarfare (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Therrien, Félix, Sargent, Edward H., Voznyy, Oleksandr

Using GNN property predictors as molecule generators

University of Toronto, Department of Electrical and Computer Engineering Graph neural networks (GNNs) have emerged as powerful tools to accurately predict materials and molecular properties in computational discovery pipelines. In this article, we exploit the invertible nature of these neural networks to directly generate molecular structures with desired electronic properties. Starting from a random graph or an existing molecule, we perform a gradient ascent while holding the GNN weights fixed in order to optimize its input, the molecular graph, towards the target property. Valence rules are enforced strictly through a judicious graph construction. The method relies entirely on the property predictor; no additional training is required on molecular structures. We demonstrate the application of this method by generating molecules with specific DFT-verified energy gaps and octanol-water partition coefficients (logP). Our approach hits target properties with rates comparable to or better than state-of-the-art generative models while consistently generating more diverse molecules.

adjacency matrix, molecule, representation, (14 more...)

2406.03278

Country: North America > Canada > Ontario > Toronto (0.54)

Genre: Research Report (1.00)

Industry:

Materials > Chemicals (0.48)
Health & Medicine (0.46)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Wang, Han, Prasad, Archiki, Stengel-Eskin, Elias, Bansal, Mohit

Soft Self-Consistency Improves Language Model Agents

Generations from large language models (LLMs) can be improved by sampling and scoring multiple solutions to select a final answer. Current "sample and select" methods such as self-consistency (SC) rely on majority voting to score answers. However, when tasks have many distinct and valid answers, selection by voting requires a large number of samples. This makes SC prohibitively expensive for interactive tasks that involve generating multiple actions (answers) sequentially. After establishing that majority voting fails to provide consistent gains on such tasks, we demonstrate how to increase success rates by softening the scoring criterion. We introduce Soft Self-Consistency (SOFT-SC), which replaces SC's discontinuous scoring with a continuous score computed from model likelihoods, allowing for selection even when actions are sparsely distributed. SOFT-SC improves both performance and efficiency on long-horizon interactive tasks, requiring half as many samples as SC for comparable or better performance. For a fixed number of samples, SOFT-SC leads to a 1.3% increase over SC in absolute success rate on writing bash programs, a 6.6% increase on online shopping (WebShop), and a 4.7% increase for an interactive household game (ALFWorld). Finally, we show that SOFT-SC can be applied to both open-source and black-box models.

language model, oft -sc, webshop, (15 more...)

2402.13212

Country:

Asia > Singapore (0.04)
North America > United States > Oregon (0.04)
Asia > Middle East > UAE (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.64)

Industry:

Education (0.46)
Materials (0.46)
Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

arXiv.org Artificial IntelligenceJun-4-2024

Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models

Liu, Songtao, Dai, Hanjun, Zhao, Yue, Liu, Peng

Molecule synthesis through machine learning is one of the fundamental problems in drug discovery. Current data-driven strategies employ one-step retrosynthesis models and search algorithms to predict synthetic routes in a top-bottom manner. Despite their effective performance, these strategies face limitations in the molecule synthetic route generation due to a greedy selection of the next molecule set without any lookahead. Furthermore, existing strategies cannot control the generation of synthetic routes based on possible criteria such as material costs, yields, and step count. In this work, we propose a general and principled framework via conditional residual energy-based models (EBMs), that focus on the quality of the entire synthetic route based on the specific criteria. By incorporating an additional energy-based function into our probabilistic model, our proposed algorithm can enhance the quality of the most probable synthetic routes (with higher probabilities) generated by various strategies in a plug-and-play fashion. Extensive experiments demonstrate that our framework can consistently boost performance across various strategies and outperforms previous state-of-the-art top-1 accuracy by a margin of 2.5%. Code is available at https://github.com/SongtaoLiu0823/CREBM.

molecule, prediction, synthetic route, (14 more...)

2406.02066

Country:

North America > United States > California (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Pennsylvania (0.04)

Genre:

Workflow (0.68)
Research Report (0.64)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.66)
Materials > Chemicals > Commodity Chemicals (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)