The AI Alignment Paradox
The release of GPT-3, and later ChatGPT, catapulted large language models from the proceedings of computer science conferences to newspaper headlines across the globe, fueling their rise to one of today's most hyped technologies. The public's awe at GPT-3's knowledge and fluency was quickly blemished by concerns about its potential to radicalize, instigate, and misinform, for example, by stating that Bill Gates aimed to "kill billions of people with vaccines" or that Hillary Clinton was a "high-level satanic priestess."4

These shortcomings, in turn, sparked a surge in research on AI alignment,7 a field that aims to "steer AI systems toward a person's or group's intended goals, preferences, and ethical principles" (definition by Wikipedia). A well-aligned AI system will "understand" what is "good" and what is "bad" and will do only the "good" while avoiding the "bad."a The resulting techniques, including instruction fine-tuning and reinforcement learning from human feedback, have contributed in major ways to improving the output quality of large language models.