Supplementary Materials for the Paper "L2T-DLN: Learning to Teach with Dynamic Loss Network"
Neural Information Processing Systems
BIT Special Zone, Beijing Institute of Technology, Beijing, China, 100081
{haizhaoyang, liyuan.pan,

In this supplementary material, we provide the proofs of the convergence analysis in Section 1, the 1-vs-1 transformation employed in the classification and semantic segmentation tasks in Section 2, the coordinate-wise scheme and the preprocessing method of the LSTM teacher in Section 3, the loss functions of YOLO-v3 in Section 4, additional image classification experiments in Section 5, and the inference results of semantic segmentation in Section 6.

A differentiable function $e(\cdot)$ is $L$-smooth with gradient Lipschitz constant $C$ (i.e., its gradient is uniformly Lipschitz continuous) if
$$\|\nabla e(x) - \nabla e(y)\| \le C\,\|x - y\|, \quad \forall x, y.$$
If $\|\nabla e(x)\| \le \epsilon$, then $x$ is an $\epsilon$-first-order stationary point. For a differentiable function $e(\cdot)$, if $x$ is a first-order stationary point (SS1) and there exists $\epsilon > 0$ such that $e(x) \le e(y)$ for all $y$ in the $\epsilon$-neighborhood of $x$, then $x$ is a local minimum. A saddle point $x$ is an SS1 that is not a local minimum.
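These definitions can be checked numerically. Below is a minimal sketch (not part of the paper's code) that uses central finite differences to test the $\epsilon$-first-order stationarity condition and to lower-bound the gradient Lipschitz constant $C$ from sample pairs; the function `e` and the helpers `grad_fd`, `is_eps_stationary`, and `lipschitz_lower_bound` are hypothetical names introduced only for illustration.

```python
import numpy as np

def grad_fd(e, x, h=1e-6):
    """Central finite-difference estimate of the gradient of e at x."""
    g = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        d = np.zeros_like(x, dtype=float)
        d[i] = h
        g[i] = (e(x + d) - e(x - d)) / (2.0 * h)
    return g

def is_eps_stationary(e, x, eps):
    """x is an eps-first-order stationary point if ||grad e(x)|| <= eps."""
    return np.linalg.norm(grad_fd(e, x)) <= eps

def lipschitz_lower_bound(e, xs):
    """Largest observed ratio ||grad e(x) - grad e(y)|| / ||x - y|| over
    sample pairs; this is a lower bound on the Lipschitz constant C."""
    best = 0.0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            num = np.linalg.norm(grad_fd(e, xs[i]) - grad_fd(e, xs[j]))
            den = np.linalg.norm(xs[i] - xs[j])
            if den > 0:
                best = max(best, num / den)
    return best

# Hypothetical example: e(x) = ||x||^2 has gradient 2x, so C = 2 and
# x = 0 is a first-order stationary point (here also a local minimum).
e = lambda x: float(np.dot(x, x))
print(is_eps_stationary(e, np.zeros(3), eps=1e-3))  # True
print(lipschitz_lower_bound(e, [np.random.randn(3) for _ in range(5)]))  # ~2.0
```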