A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models