AITopics | good researcher

CBF-LLM: Safe Control for LLM Alignment

Miyaoka, Yuya, Inoue, Masaki

arXiv.org Artificial Intelligence, Aug-28-2024

While large language models (LLMs) are known to have strong language understanding and generation abilities, they can also generate harmful, biased, and toxic content [1][2]. Alignment of LLMs aims to ensure that they generate content that is "desirable" for the user, typically meaning content that is safe and ethical. Various approaches to LLM alignment have been presented ([1], [2], [3] and references therein). The major approach to alignment is reinforcement learning from human feedback (RLHF) [4], in which a reward model is constructed from human feedback and used to train LLMs. Variants of RLHF have also been proposed, such as Safe-RLHF [5], SENSEI [6], and f-DPG [7], along with implementations, such as training pre-trained LLMs [8][9], and applications, such as information-seeking chatbots [10].

cbf-llm, good researcher, llm

2408.15625

Country:

North America > United States > Washington > King County > Seattle (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
