Breaking Memory Limits: Gradient Wavelet Transform Enhances LLMs Training