ProgressGym: Alignment with a Millennium of Moral Progress

Oct-9-2025, 19:56:34 GMT–Neural Information Processing Systems

Frontier AI systems, including large language models (LLMs), hold increasing influence over the epistemology of human users. Such influence can reinforce prevailing societal values, potentially contributing to the lock-in of misguided moral beliefs and, consequently, the perpetuation of problematic moral practices on a broad scale. We introduce progress alignment as a technical solution to mitigate this imminent risk. Progress alignment algorithms learn to emulate the mechanics of human moral progress, thereby addressing the susceptibility of existing alignment methods to contemporary moral blindspots.

Country:
- North America > United States
  - Pennsylvania (0.04)
  - New Jersey > Passaic County
    - Paterson (0.04)
  - Michigan > Washtenaw County
    - Ann Arbor (0.04)
- Europe
  - Netherlands (0.04)
  - United Kingdom > England
    - Oxfordshire > Oxford (0.04)
    - Cambridgeshire > Cambridge (0.04)
  - Latvia > Lubāna Municipality
    - Lubāna (0.04)
  - Hungary > Budapest
    - Budapest (0.04)
- Asia > China
  - Beijing > Beijing (0.04)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (0.67)

Industry:
- Information Technology (0.67)
- Health & Medicine (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents (1.00)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks > Deep Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.46)

Title
1a6d49c1a298ebb799d005b7b90ab31d-Paper-Datasets_and_Benchmarks_Track.pdf
