ProgressGym: Alignment with a Millennium of Moral Progress
Neural Information Processing Systems
Frontier AI systems, including large language models (LLMs), hold increasing influence over the epistemology of human users. Such influence can reinforce prevailing societal values, potentially contributing to the lock-in of misguided moral beliefs and, consequently, the perpetuation of problematic moral practices on a broad scale. We introduce progress alignment as a technical solution to mitigate this imminent risk. Progress alignment algorithms learn to emulate the mechanics of human moral progress, thereby addressing the susceptibility of existing alignment methods to contemporary moral blindspots.
Oct-9-2025, 19:56:34 GMT
- Country:
  - Asia > China
  - Europe
    - Hungary > Budapest
      - Budapest (0.04)
    - Latvia > Lubāna Municipality
      - Lubāna (0.04)
    - Netherlands (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
      - Oxfordshire > Oxford (0.04)
  - North America > United States
    - Michigan > Washtenaw County
      - Ann Arbor (0.04)
    - New Jersey > Passaic County
      - Paterson (0.04)
    - Pennsylvania (0.04)
- Genre:
  - Research Report
    - Experimental Study (0.67)
    - New Finding (1.00)
- Industry:
  - Health & Medicine (0.46)
  - Information Technology (0.67)
- Technology: