Curriculum Learning by Dynamic Instance Hardness
A good teacher can adjust the curriculum based on students' learning history. By analogy, in this paper we study the dynamics of a deep neural network's (DNN) performance on individual samples during training. The observed properties allow us to develop an adaptive curriculum that leads to faster learning of more accurate models. We introduce dynamic instance hardness (DIH), the exponential moving average of a sample's instantaneous hardness (e.g., a loss or a change in outputs) over the training history. A low DIH indicates that a model retains knowledge about a sample over time, and implies a flat loss landscape for that sample. Moreover, for DNNs, we find that a sample's DIH early in training predicts its DIH in later stages.
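The EMA update behind DIH is simple enough to show concretely. Below is a minimal sketch in Python that tracks one DIH value per training sample, using per-sample loss as the instantaneous hardness; the decay factor `gamma`, the random stand-in losses, and the loop structure are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def update_dih(dih, instant_hardness, gamma=0.9):
    """EMA of instantaneous hardness: dih_t = gamma * dih_{t-1} + (1 - gamma) * r_t."""
    return gamma * dih + (1.0 - gamma) * instant_hardness

rng = np.random.default_rng(0)
n_samples = 1_000
dih = np.zeros(n_samples)          # one running DIH value per sample

for epoch in range(10):
    # stand-in for the per-sample losses a real training step would produce
    losses = rng.random(n_samples)
    dih = update_dih(dih, losses)

# a curriculum would keep revisiting the samples with the largest DIH,
# while low-DIH samples are treated as already retained by the model
hardest = np.argsort(-dih)[:10]
print(hardest)
```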
Author Feedback
We appreciate the reviewers' time and suggestions! We address them all and report new experimental results below. Although DIH can be helpful for identifying noisy data in the noisy-label setting (middle plot in Figure 1), DIHCL still achieves 90.34% test-set accuracy under 40% symmetric label noise on CIFAR10 (top plot in Figure 1). We may revise the statement that "updating in…". Is the method specific to a cyclic learning rate? DIHCL is applicable to other learning rate schedules; we report the result of DIHCL with a piecewise exponential decay learning rate in Figure 1.
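The "piecewise exponential decay" schedule the rebuttal pairs with DIHCL can be sketched as a staircase exponential decay; the base rate, decay interval, and decay rate below are illustrative assumptions, not the authors' values.

```python
def piecewise_exp_lr(step, base_lr=0.1, decay_steps=1000, decay_rate=0.5):
    """Staircase exponential decay: the learning rate is multiplied by
    `decay_rate` once every `decay_steps` steps and is constant in between."""
    return base_lr * decay_rate ** (step // decay_steps)

# the rate halves at steps 1000, 2000, 3000, ...
print([piecewise_exp_lr(s) for s in (0, 999, 1000, 2500)])
# [0.1, 0.1, 0.05, 0.025]
```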