Automating Dataset Updates Towards Reliable and Timely Evaluation of Large Language Models

Neural Information Processing Systems

There are two updating strategies: 1) a mimicking strategy that generates similar samples based on the original data, preserving their stylistic and contextual essence, and 2) an extending strategy that further expands existing samples at varying cognitive levels by adapting Bloom's taxonomy of educational objectives.


Embedding-Aligned Language Models

Guy Tennenholtz

Neural Information Processing Systems

In this paper, we present a novel framework which accomplishes this by exploiting latent embedding spaces to define an objective function for an LLM in an iterative RL-driven process. As an example, consider the challenge of assisting content creators in generating valuable content within a recommender ecosystem (e.g., YouTube, Reddit, Spotify) [Boutilier et al., 2024].



Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning

Hao Ma

Neural Information Processing Systems

Reinforcement learning (RL) has emerged as a pivotal technique for fine-tuning large language models (LLMs) on specific tasks. However, prevailing RL fine-tuning methods predominantly rely on PPO and its variants. Though these algorithms are effective in general RL settings, they often exhibit suboptimal performance and vulnerability to distribution collapse when applied to the fine-tuning of LLMs.


No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations

Neural Information Processing Systems

The resulting features are evaluated on k-nearest neighbor classification over 11 datasets from vision, 5 from natural language processing, and 2 from audio.
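
As a rough illustration of what k-nearest neighbor evaluation of frozen features can look like, the sketch below classifies each test feature vector by majority vote among its k most similar training vectors and reports accuracy. It is not taken from the paper: the function names `knn_predict` and `knn_accuracy`, the use of cosine similarity, and the default k=3 are all assumptions chosen for the example.

```python
import math
from collections import Counter


def _cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train_feats, train_labels, query, k=3):
    """Predict a label for `query` by majority vote over its k most similar training features.

    Assumption: similarity is cosine similarity; the source does not state the metric.
    """
    ranked = sorted(
        range(len(train_feats)),
        key=lambda i: _cosine(train_feats[i], query),
        reverse=True,
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(train_feats, train_labels, test_feats, test_labels, k=3):
    """Fraction of test features whose k-NN prediction matches the true label."""
    correct = sum(
        knn_predict(train_feats, train_labels, q, k) == y
        for q, y in zip(test_feats, test_labels)
    )
    return correct / len(test_labels)


# Toy features standing in for frozen-encoder outputs (hypothetical data).
train_x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_y = ["a", "a", "b", "b"]
test_x = [[1.0, 0.2], [0.2, 1.0]]
test_y = ["a", "b"]
accuracy = knn_accuracy(train_x, train_y, test_x, test_y, k=3)
```

Because the encoder stays frozen, this kind of evaluation needs no training step beyond storing the training features, which is why it is a common probe of representation quality.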