Training AI to be Loyal

Oh, Sewoong, Tyagi, Himanshu, Viswanath, Pramod

arXiv.org Artificial Intelligence 

Loyal AI is loyal to the community that builds it. An AI is loyal to a community if the community has ownership, alignment, and control. Community owned models can only be used with the approval of the community and share the economic rewards communally. Community aligned models have values that are aligned with the consensus of the community. Community controlled models perform functions designed by the community. Since we would like permissionless access to the loyal AI's community, we need the AI to be open source. The key scientific question then is: how can we build models that are openly accessible (open source) and yet are owned and governed by the community. This seeming impossibility is the focus of this paper where we outline a concrete pathway to Open, Monetizable and Loyal models (OML), building on our earlier work on OML, arXiv:2411.03887(1) , and a representation via a cryptographic-ML library http://github.com/sentient-agi/oml-1.0-fingerprinting .