Reversed Attention: On The Gradient Descent Of Attention Layers In GPT