AITopics | thefollowinginequalityh...

Relative Density Ratio Optimization for Stable and Statistically Consistent Model Alignment

Takahashi, Hiroshi, Iwata, Tomoharu, Kumagai, Atsutoshi, Kanai, Sekitoshi, Yamada, Masanori, Nishida, Kosuke, Shinoda, Kazutoshi

arXiv.org Machine Learning, Apr-7-2026

Aligning language models with human preferences is essential for ensuring their safety and reliability. Although most existing approaches assume specific human preference models such as the Bradley-Terry model, this assumption may fail to accurately capture true human preferences, and consequently, these methods lack statistical consistency, i.e., the guarantee that language models converge to the true human preference as the number of samples increases. In contrast, direct density ratio optimization (DDRO) achieves statistical consistency without assuming any human preference models. DDRO models the density ratio between preferred and non-preferred data distributions using the language model, and then optimizes it via density ratio estimation. However, this density ratio is unstable and often diverges, leading to training instability of DDRO. In this paper, we propose a novel alignment method that is both stable and statistically consistent. Our approach is based on the relative density ratio between the preferred data distribution and a mixture of the preferred and non-preferred data distributions. Our approach is stable since this relative density ratio is bounded above and does not diverge. Moreover, it is statistically consistent and yields significantly tighter convergence guarantees than DDRO. We experimentally show its effectiveness with Qwen 2.5 and Llama 3.
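The boundedness claim in the abstract can be illustrated with a toy example. The sketch below is not the paper's method: it uses two one-dimensional Gaussians as hypothetical stand-ins for the preferred and non-preferred distributions, and a mixing weight `ALPHA` that the abstract does not name. It only shows that the plain ratio p/q grows without bound, while the relative ratio p / (alpha*p + (1-alpha)*q) never exceeds 1/alpha.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def density_ratio(p, q):
    """Plain density ratio p/q; unbounded where q is tiny."""
    return p / q

def relative_density_ratio(p, q, alpha):
    """Ratio of p to the mixture alpha*p + (1-alpha)*q; at most 1/alpha."""
    return p / (alpha * p + (1.0 - alpha) * q)

# Hypothetical mixing weight (an assumption, not taken from the abstract).
ALPHA = 0.5

# Toy stand-ins: "preferred" ~ N(2, 1), "non-preferred" ~ N(-2, 1).
# For these two densities the plain ratio equals exp(4x).
rows = []
for x in [0.0, 2.0, 4.0, 6.0]:
    p = gaussian_pdf(x, mean=2.0, std=1.0)
    q = gaussian_pdf(x, mean=-2.0, std=1.0)
    rows.append((x, density_ratio(p, q), relative_density_ratio(p, q, ALPHA)))

for x, plain, relative in rows:
    print(f"x={x:4.1f}  plain ratio={plain:12.4e}  relative ratio={relative:.6f}")
```

The bound follows because the denominator is at least alpha*p whenever q is non-negative, so the relative ratio stays finite even where the non-preferred density vanishes.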

2604.0441

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.35)


Appendices

Neural Information Processing Systems, Feb-8-2026, 03:34:38 GMT

In detail, we choose UMAP [15] as the projection algorithm and train the projecting function in Hopper using 64000 transitions sampled by the expert agent. To evaluate a policy, we sample the same number of transitions and then project them onto a 2-dimensional space with the trained projecting function. For empirical estimation, we subsequently discretize the projected 2-dimensional state space into small grid regions and estimate the distribution via Kernel Density Estimation (KDE) [19] with a Gaussian kernel. These two hyperparameters affect the experimental results more significantly. Moreover, as mentioned in Section 6.3, they can be tuned based on the distribution of the dataset.
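The grid-plus-KDE step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the UMAP projection is skipped and the points are assumed to be already 2-dimensional, and the names `grid_size`, `bandwidth`, and the `[-3, 3]` range are hypothetical. The total variation comparison at the end is an illustrative choice of distance that the passage itself does not specify.

```python
import math
import random

def gaussian_kde_2d(points, query, bandwidth):
    """Average of isotropic Gaussian kernels centred on each sample point."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        d2 = (query[0] - px) ** 2 + (query[1] - py) ** 2
        total += math.exp(-0.5 * d2 / bandwidth ** 2)
    return norm * total / len(points)

def grid_distribution(points, grid_size, bandwidth, lo=-3.0, hi=3.0):
    """Evaluate the KDE at each cell centre and normalise to a discrete distribution."""
    cell = (hi - lo) / grid_size
    masses = []
    for i in range(grid_size):
        for j in range(grid_size):
            centre = (lo + (i + 0.5) * cell, lo + (j + 0.5) * cell)
            masses.append(gaussian_kde_2d(points, centre, bandwidth) * cell * cell)
    total = sum(masses)
    return [m / total for m in masses]

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Synthetic 2-D points standing in for projected expert and policy transitions.
rng = random.Random(0)
expert = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
policy = [(rng.gauss(0.5, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]

p_expert = grid_distribution(expert, grid_size=10, bandwidth=0.3)
p_policy = grid_distribution(policy, grid_size=10, bandwidth=0.3)
tv = total_variation(p_expert, p_policy)
print(f"estimated total variation: {tv:.4f}")
```

In this sketch the grid resolution and the kernel bandwidth are the two knobs that shape the estimated distribution, so both would need tuning to the spread of the projected data.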

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)
