The Road to Artificial SuperIntelligence: A Comprehensive Survey of Superalignment

Kim, HyunJin; Yi, Xiaoyuan; Yao, Jing; Lian, Jianxun; Huang, Muhua; Duan, Shitong; Bak, JinYeong; Xie, Xing

arXiv.org Artificial Intelligence 

The emergence of large language models (LLMs) has sparked discussion of Artificial Superintelligence (ASI), a hypothetical AI system surpassing human intelligence. Though ASI is still hypothetical and far from current AI capabilities, existing alignment methods struggle to guide such advanced AI and ensure its safety in the future, so it is essential to discuss the alignment of such AI now. Superalignment, the alignment of AI systems at superhuman levels of capability with human values and safety requirements, aims to address two primary goals: scalability in supervision to provide high-quality guidance signals, and robust governance to ensure alignment with human values. In this survey, we review the original scalable oversight problem and the corresponding methods and potential solutions for superalignment. Specifically, we introduce the challenges and limitations of current alignment paradigms in addressing the superalignment problem. Then we review scalable oversight methods for superalignment. Finally, we discuss the key challenges and propose pathways ...

Figure 1: Challenges from the perspectives of supervision and governance. While the supervision perspective focuses on providing high-quality guidance signals for enhancing system competence, the governance perspective emphasizes aligning the behavior of advanced AI with human values to prevent harmful outcomes.
