Achieving balanced alignment of large language models (LLMs) in terms of Helpfulness, Honesty, and Harmlessness (3H optimization) constitutes a cornerstone of responsible AI.

Neural Information Processing Systems 

Existing methods like data mixture strategies face limitations, including heavy reliance on expert knowledge and conflicting optimization signals. While model merging offers parameter-level conflict-resolution strategies by integrating the parameters of specialized models, its potential for 3H optimization remains underexplored. This paper presents the first systematic comparison of model merging and data mixture methods for constructing 3H-aligned LLMs. It reveals previously overlooked collaborative and conflicting relationships among the 3H dimensions, and discusses the advantages and drawbacks of data mixture (data-level) and model merging (parameter-level) methods in mitigating these conflicts for balanced 3H optimization.
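To make the data-level versus parameter-level distinction concrete, here is a minimal sketch under stated assumptions. It is not the paper's method: `mix_data` and `merge_models` are hypothetical helper names, parameters are toy scalars rather than tensors, and the merge rule shown (a linear, task-arithmetic-style combination of each specialized model's offset from a shared base) is only one common merging scheme.

```python
import random


def mix_data(datasets, weights, n, seed=0):
    """Data-level (hypothetical helper): draw n training examples,
    choosing the source dataset for each draw by its mixing weight."""
    rng = random.Random(seed)
    names = list(datasets)
    picks = rng.choices(names, weights=[weights[k] for k in names], k=n)
    return [(name, rng.choice(datasets[name])) for name in picks]


def merge_models(base, experts, coeffs):
    """Parameter-level (hypothetical helper): for every parameter, add the
    weighted offsets (expert - base) of each specialized model to the base."""
    merged = {}
    for name, base_value in base.items():
        delta = sum(coeffs[k] * (experts[k][name] - base_value) for k in experts)
        merged[name] = base_value + delta
    return merged


# Toy example: one specialized model per 3H dimension (values are made up).
base = {"w": 1.0, "b": 0.0}
experts = {
    "helpful": {"w": 2.0, "b": 0.5},
    "honest": {"w": 1.0, "b": 1.0},
    "harmless": {"w": 0.0, "b": -0.5},
}
coeffs = {"helpful": 0.5, "honest": 0.5, "harmless": 0.5}
merged = merge_models(base, experts, coeffs)

datasets = {"helpful": ["h1", "h2"], "honest": ["o1"], "harmless": ["s1", "s2"]}
mixed = mix_data(datasets, {"helpful": 2, "honest": 1, "harmless": 1}, n=8)
```

In this toy setup the opposing offsets of the helpful and harmless models on `w` cancel out in the merged model, which illustrates in miniature how conflicts between dimensions show up at the parameter level, whereas in the data-mixture route they are only controlled indirectly through the mixing weights.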
