AITopics | wizardarena

WizardArena: Post-training Large Language Models via Simulated Offline Chatbot Arena

Neural Information Processing SystemsMar-22-2026, 12:09:07 GMT

Recent work demonstrates that, post-training large language models with open-domain instruction following data have achieved colossal success. Simultaneously, human Chatbot Arena has emerged as one of the most reasonable benchmarks for model evaluation and developmental guidance. However, the processes of manually curating high-quality training data and utilizing online human evaluation platforms are both expensive and limited. To mitigate the manual and temporal costs associated with post-training, this paper introduces a Simulated Chatbot Arena named WizardArena, which is fully based on and powered by open-source LLMs. For evaluation scenario, WizardArena can efficiently predict accurate performance rankings among different models based on offline test set. For training scenario, we simulate arena battles among various state-of-the-art models on a large scale of instruction data, subsequently leveraging the battle results to constantly enhance target model in both the supervised fine-tuning and reinforcement learning . Experimental results demonstrate that our WizardArena aligns closely with the online human arena rankings, and our models trained on offline extensive battle data exhibit significant performance improvements during SFT, DPO, and PPO stages.

artificial intelligence, large language model, natural language, (7 more...)

Neural Information Processing Systems

Genre: Research Report (0.60)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.63)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.54)

Add feedback

ca4aa9acc097e0f606af55bf986cb031-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 03:42:34 GMT

arxiv preprint arxiv, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Education (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

WizardArena: Post-training Large Language Models via Simulated Offline Chatbot Arena

Neural Information Processing SystemsOct-10-2025, 16:38:36 GMT

Recent work demonstrates that, post-training large language models with open-domain instruction following data have achieved colossal success.

arxiv preprint arxiv, language model, wizardarena, (15 more...)

Neural Information Processing Systems

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Education (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

WizardArena: Post-training Large Language Models via Simulated Offline Chatbot Arena

Neural Information Processing SystemsMay-27-2025, 16:27:04 GMT

Recent work demonstrates that, post-training large language models with open-domain instruction following data have achieved colossal success. Simultaneously, human Chatbot Arena has emerged as one of the most reasonable benchmarks for model evaluation and developmental guidance. However, the processes of manually curating high-quality training data and utilizing online human evaluation platforms are both expensive and limited. To mitigate the manual and temporal costs associated with post-training, this paper introduces a Simulated Chatbot Arena named WizardArena, which is fully based on and powered by open-source LLMs. For evaluation scenario, WizardArena can efficiently predict accurate performance rankings among different models based on offline test set. For training scenario, we simulate arena battles among various state-of-the-art models on a large scale of instruction data, subsequently leveraging the battle results to constantly enhance target model in both the supervised fine-tuning and reinforcement learning .

chatbot arena, simulated offline chatbot arena, wizardarena, (2 more...)

Neural Information Processing Systems

Genre: Research Report (0.43)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.99)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)

Add feedback

Filters

Collaborating Authors

wizardarena

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

WizardArena: Post-training Large Language Models via Simulated Offline Chatbot Arena

ca4aa9acc097e0f606af55bf986cb031-Paper-Conference.pdf

WizardArena: Post-training Large Language Models via Simulated Offline Chatbot Arena

WizardArena: Post-training Large Language Models via Simulated Offline Chatbot Arena