Achieving balanced alignment of large language models (LLMs) in terms of Helpfulness, Honesty, and Harmlessness (3H optimization) constitutes a cornerstone

Jun-23-2026, 10:53:44 GMT–Neural Information Processing Systems

Existing methods like data mixture strategies face limitations, including heavy reliance on expert knowledge and conflicting optimization signals. While model merging offers parameter-level conflict-resolution strategies through integrating specialized models' parameters, its potential for 3H optimization remains underexplored. This paper systematically compares the effectiveness of model merging and data mixture methods in constructing 3H-aligned LLMs for the first time, revealing previously overlooked collaborative and conflict relationships among the 3H dimensions and discussing the advantages and drawbacks of data mixture (data-level) and model merging (parameter-level) methods in mitigating the conflict for balanced 3H optimization.
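To make the parameter-level idea concrete, the following is a minimal sketch of one simple merging scheme, weighted linear averaging of parameters. The abstract does not say which merging methods the paper evaluates, so this is an illustration under stated assumptions rather than the paper's method: the function name `merge_linear`, the toy checkpoints, and the uniform weights are all hypothetical, and parameters are represented as plain Python lists instead of real model tensors.

```python
from typing import Dict, List

# Assumption: a "checkpoint" is a mapping from parameter name to a flat list
# of floats. Real LLM checkpoints hold tensors, but the arithmetic is the same.
StateDict = Dict[str, List[float]]


def merge_linear(models: List[StateDict], weights: List[float]) -> StateDict:
    """Weighted average of parameters that share the same names and shapes."""
    if not models or len(models) != len(weights):
        raise ValueError("provide at least one model and one weight per model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    merged: StateDict = {}
    for name in models[0]:
        # Group the i-th entry of this parameter across all models.
        columns = zip(*(model[name] for model in models))
        merged[name] = [
            sum(w * v for w, v in zip(weights, column)) / total
            for column in columns
        ]
    return merged


# Hypothetical toy checkpoints, one per 3H dimension.
helpful = {"layer.weight": [1.0, 2.0]}
honest = {"layer.weight": [3.0, 4.0]}
harmless = {"layer.weight": [5.0, 6.0]}

merged = merge_linear([helpful, honest, harmless], [1.0, 1.0, 1.0])
print(merged)
```

By contrast, a data mixture approach would combine the helpfulness, honesty, and harmlessness training sets and train a single model on the result, so any conflict between the objectives has to be resolved during optimization rather than in parameter space.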




