Power Allocation for Delay Optimization in Device-to-Device Networks: A Graph Reinforcement Learning Approach