ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks
Heng Zhou, Hejia Geng, Xiangyuan Xue, Zhenfei Yin, Lei Bai
Multi-agent systems (MAS) have emerged as a promising approach for enhancing the reasoning capabilities of large language models in complex problem-solving. However, current MAS frameworks are limited by poor flexibility and scalability, and their optimization strategies remain underdeveloped. To address these challenges, we propose ReSo, which integrates task graph generation with a reward-driven two-stage agent selection process. The core of ReSo is the proposed Collaborative Reward Model, which provides fine-grained reward signals for optimizing MAS cooperation. We also introduce an automated data synthesis framework for generating MAS benchmarks without human annotations. Experimentally, ReSo matches or outperforms existing methods. ReSo achieves 33.7% and 32.3% accuracy on Math-MAS and SciBench-MAS, respectively, while other methods completely fail. Code is available at: https://github.com/hengzzzhou/ReSo
Gonogo: An R Implementation of Test Methods to Perform, Analyze and Simulate Sensitivity Experiments
This work documents a suite of R functions contained in gonogo.R. The functions enable sensitivity testing practitioners and researchers to conduct, analyze, and simulate various sensitivity experiments involving binary responses and a single stimulus level (e.g., drug dosage, drop height, velocity). Included are the modern Neyer and 3pod adaptive procedures, as well as the Bruceton and Langlie procedures. The latter two benchmark procedures can be performed according to generalized up-down transformed-response rules. Each procedure serves as phase one of a three-phase experiment. The goal of phase one is to achieve overlapping data. The two additional (and optional) refinement phases use the D-optimality criterion and the Robbins-Monro-Joseph procedure, respectively. The goals of the two refinement phases are to situate testing in the vicinity of the median and the tails of the latent response distribution, respectively.
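As a rough illustration of what a phase-one run looks like, the following Python sketch simulates the classic one-up, one-down Bruceton rule and checks for overlapping data. It is not the gonogo.R interface: the function names `simulate_bruceton` and `has_overlap`, the normal latent response distribution, and all parameter values are assumptions made for the example. Overlap is taken here in the usual sense that the lowest stimulus producing a response lies below the highest stimulus producing a non-response.

```python
import random
from statistics import NormalDist


def simulate_bruceton(start, step, n_trials, mu, sigma, seed=0):
    """Simulate a one-up, one-down Bruceton test (hypothetical helper).

    mu and sigma describe an assumed normal latent response distribution.
    Returns a list of (stimulus_level, response) pairs, where response
    is 1 for a "go" and 0 for a "no-go".
    """
    rng = random.Random(seed)
    latent = NormalDist(mu, sigma)
    level = start
    history = []
    for _ in range(n_trials):
        # Probability of a response at this stimulus level.
        p_go = latent.cdf(level)
        response = 1 if rng.random() < p_go else 0
        history.append((level, response))
        # Step down after a response, up after a non-response.
        level = level - step if response else level + step
    return history


def has_overlap(history):
    """True when the lowest 'go' level is below the highest 'no-go' level."""
    go = [x for x, r in history if r == 1]
    nogo = [x for x, r in history if r == 0]
    return bool(go) and bool(nogo) and min(go) < max(nogo)


if __name__ == "__main__":
    run = simulate_bruceton(start=10, step=1, n_trials=30, mu=12.0, sigma=2.0)
    print("overlap achieved:", has_overlap(run))
```

In a three-phase design of the kind described above, a run like this would continue only until overlap is reached, after which the optional refinement phases would take over.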