Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint
