On the Generalization Capability of Temporal Graph Learning Algorithms: Theoretical Insights and a Simpler Method