A Appendix
Neural Information Processing Systems
A.1 PAC-Bayesian Bound

In this part, we provide a detailed PAC-Bayesian bound for the continual learning scenario. Given a "prior" distribution P over the hypothesis space (a common assumption is a zero-mean Gaussian with variance σ²), a PAC-Bayesian bound relates the expected error under a posterior distribution to its empirical error plus a complexity term. We now consider the bound in the continual learning scenario. Based on Eq. (6), the expected error of f can be bounded analogously. Note that, for simplicity, we only consider a single gradient update to v in the second equation; extending to multiple gradient updates is straightforward.

The importance of each basis is constrained to lie in [0, 1], where 0 indicates that the basis is unimportant to old tasks and can be fully released for learning new tasks. Similar to [34], we compute the bases of these subspaces for each layer by analyzing the network representations after learning each task with Singular Value Decomposition (SVD), and then use them to update v and w layer by layer.
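For reference, the generic PAC-Bayesian bound invoked above can be written in its standard McAllester-style form; the exact constants and the specialization in the paper's Eq. (6) may differ:

$$
\mathbb{E}_{f \sim Q}\left[\mathcal{L}(f)\right] \;\le\; \mathbb{E}_{f \sim Q}\left[\hat{\mathcal{L}}(f)\right] \;+\; \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{m}}{\delta}}{2m}},
$$

which holds with probability at least 1 − δ over an i.i.d. training sample of size m, simultaneously for all posterior distributions Q; here L and L̂ denote the expected and empirical losses, and KL(Q‖P) is the Kullback–Leibler divergence between the posterior and the prior.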
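The SVD-based basis-importance computation described above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the function name `important_bases`, the 95% energy threshold, and the normalization of singular values into a [0, 1] importance score are all assumptions made for the example.

```python
import numpy as np

def important_bases(activations: np.ndarray, threshold: float = 0.95):
    """Hypothetical sketch of per-layer subspace extraction via SVD.

    activations: (features, samples) matrix of a layer's representations,
                 collected after learning a task.
    Returns the leading left-singular vectors capturing `threshold` of the
    representation energy, plus an importance score in [0, 1] per basis.
    """
    # Decompose the representation matrix; columns of U are candidate bases.
    U, S, _ = np.linalg.svd(activations, full_matrices=False)

    # Keep the smallest number of bases whose cumulative squared singular
    # values reach the energy threshold (assumption: 95% as in GPM-style methods).
    energy = np.cumsum(S ** 2) / np.sum(S ** 2)
    k = int(np.searchsorted(energy, threshold)) + 1

    # Importance per kept basis, normalized to [0, 1]: the top basis gets 1,
    # a basis with negligible singular value gets a score near 0 and can be
    # released for learning new tasks.
    importance = (S ** 2 / S[0] ** 2)[:k]
    return U[:, :k], importance
```

In a layer-wise update, the returned bases and scores would then gate how strongly each direction of v and w may change when training on the next task.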