Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets