
Collaborating Authors

 nguyen


Dual Space Gradient Descent for Online Learning

Neural Information Processing Systems

One crucial goal in kernel online learning is to bound the model size. Common approaches employ budget maintenance procedures that restrict the model size via removal, projection, or merging strategies. Although projection and merging are known in the literature to be the most effective strategies, they demand extensive computation, whilst the removal strategy fails to retain the information of the removed vectors. An alternative way to address the model-size problem is to apply random features to approximate the kernel function. This allows the model to be maintained directly in the random feature space, effectively resolving the curse of kernelization. However, this approach still suffers from a serious shortcoming: it needs a high-dimensional random feature space to achieve a sufficiently accurate kernel approximation.
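The abstract contrasts kernel-space budget maintenance with learning directly in a random feature space. Below is a minimal sketch of that second idea, assuming an RBF kernel and a hinge loss; all names (make_rff, phi, etc.) are illustrative and not taken from the paper.

import numpy as np

# Minimal sketch: online learning in a fixed-size random feature space that
# approximates an RBF kernel, so the model size stays bounded by construction.

def make_rff(dim_in, dim_feat, gamma, rng):
    """Sample a random Fourier feature map approximating exp(-gamma ||x-y||^2)."""
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(dim_feat, dim_in))
    b = rng.uniform(0.0, 2.0 * np.pi, size=dim_feat)
    def transform(x):
        return np.sqrt(2.0 / dim_feat) * np.cos(W @ x + b)
    return transform

rng = np.random.default_rng(0)
phi = make_rff(dim_in=10, dim_feat=512, gamma=0.5, rng=rng)

w = np.zeros(512)           # the model lives in the fixed-size feature space
eta = 0.1                   # learning rate
for _ in range(1000):       # simulated data stream
    x = rng.normal(size=10)
    y = np.sign(x.sum())    # toy labels
    z = phi(x)
    if y * (w @ z) < 1.0:   # hinge-loss subgradient step
        w += eta * y * z

Note how the model never stores support vectors, which is exactly the trade-off the abstract describes: the cost moves into the dimension of the feature space (512 here), which must be large enough for an accurate kernel approximation.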





Supplementary Document

Neural Information Processing Systems

The pseudo-code for plugging our method into the vanilla BO is summarised in Algorithm 1; our method is therefore applicable to any other variant of BO in a plug-in manner. In this section, we present the proofs associated with the theoretical assertions from Section 2.

Lemma 1. Assume the GP employs a stationary kernel […]

Lemma 2. Given Lemma 1, determining […]

Proposition 2. Leveraging Lemma 2, suppose […]

Lemma 3. As per Srinivas et al., the optimization process in BO can be conceptualized as a sampling process satisfying

$\Pr\left[\,|f(x) - \mu(x)| \le \omega\sigma(x)\,\right] \ge 1 - \delta, \quad (24)$

where $\delta > 0$ signifies the confidence level adhered to by the UCB. This lemma is directly from Srinivas et al., and the proof can be found therein.

Theorem 1. Leveraging Corollary 1, when employing the termination method proposed in this paper, […]

As discussed in Remark 2 of Section 2.2 in the main manuscript, we suggest initializing L-BFGS […]

[Figure captions: subplots compare (a) our proposed method, (b) the Naïve method, (c) Nguyen's method, and (d) Lorenz's method.]
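Equation (24) gives a pointwise confidence band $\mu(x) \pm \omega\sigma(x)$ around the unknown objective. The sketch below is a hypothetical illustration of how such a band could drive a termination check; the stopping rule (terminate when the band is uniformly narrow) is an assumption for illustration, not the paper's actual criterion.

import numpy as np

# With probability at least 1 - delta, f(x) lies inside mu(x) +/- omega*sigma(x).
# Illustrative stopping rule: stop once that band is narrow everywhere.

def should_terminate(mu, sigma, omega, tol):
    lcb = mu - omega * sigma           # lower confidence bound
    ucb = mu + omega * sigma           # upper confidence bound
    return np.max(ucb - lcb) < tol     # band width below tolerance everywhere

# toy GP posterior over three candidate points
mu = np.array([0.10, 0.40, 0.20])
sigma = np.array([0.05, 0.02, 0.01])
print(should_terminate(mu, sigma, omega=2.0, tol=0.5))  # True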


Tree-to-tree Neural Networks for Program Translation

Xinyun Chen, Chang Liu, Dawn Song

Neural Information Processing Systems

Program translation is an important tool to migrate legacy code in one language into an ecosystem built in a different language. In this work, we are the first to employ deep neural networks toward tackling this problem.
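To make the "tree-to-tree" idea concrete, here is a toy recursive encoder that maps a parse tree to a fixed-size vector, the kind of component a tree-to-tree translation model builds on. This is a sketch only; the shapes, the combine rule, and all names are assumptions, not the paper's architecture.

import numpy as np

# Toy recursive tree encoder: leaves are token embeddings, internal nodes
# combine their two children with a shared weight matrix and a tanh.

rng = np.random.default_rng(0)
DIM = 8
EMBED = {tok: rng.normal(size=DIM) for tok in ["if", "x", "0", ">", "return"]}
W = rng.normal(scale=0.1, size=(DIM, 2 * DIM))  # combines two child encodings

def encode(tree):
    """tree is either a token string (leaf) or a (left, right) pair."""
    if isinstance(tree, str):
        return EMBED[tree]
    left, right = tree
    children = np.concatenate([encode(left), encode(right)])
    return np.tanh(W @ children)

# encode a binarized toy tree for `if x > 0: return x`
src = (("if", (">", ("x", "0"))), ("return", "x"))
print(encode(src).shape)  # (8,) -- a fixed-size encoding a tree decoder would consume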





Ferrari: Federated Feature Unlearning via Optimizing Feature Sensitivity

Neural Information Processing Systems

Existing methods employ the influence function to achieve feature unlearning, which is impractical for FL as it necessitates the participation of other clients, if not all, in the unlearning process. Furthermore, current research lacks an evaluation of the effectiveness of feature unlearning. To address these limitations, we define feature sensitivity for evaluating feature unlearning according to Lipschitz continuity. This metric characterizes the rate of change, or sensitivity, of the model output with respect to perturbations in the input feature. We then propose an effective federated feature unlearning framework called Ferrari, which minimizes feature sensitivity. Extensive experimental results and theoretical analysis demonstrate the effectiveness of Ferrari across various feature unlearning scenarios, including sensitive, backdoor, and biased features.
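As a rough illustration of the sensitivity metric described above, the sketch below estimates a local Lipschitz-style quantity for a single input feature: the average ratio of output change to perturbation size. The estimator and all names are assumptions for illustration, not Ferrari's implementation.

import numpy as np

def feature_sensitivity(model, X, feat_idx, eps=1e-2, n_draws=32, seed=0):
    """Monte-Carlo estimate of E[|model(x+delta) - model(x)| / |delta|],
    where delta perturbs only the feature at feat_idx (a local Lipschitz proxy)."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_draws):
        delta = np.zeros(X.shape[1])
        delta[feat_idx] = eps * rng.standard_normal()
        diff = np.abs(model(X + delta) - model(X))  # broadcasts over rows of X
        total += np.mean(diff) / np.abs(delta[feat_idx])
    return total / n_draws

# toy linear model: the output depends strongly on feature 0 and not at all on feature 2
w = np.array([2.0, 0.5, 0.0])
model = lambda X: X @ w
X = np.random.default_rng(1).normal(size=(100, 3))
print(feature_sensitivity(model, X, feat_idx=0))  # ~2.0 (sensitive feature)
print(feature_sensitivity(model, X, feat_idx=2))  # ~0.0 (feature has no influence)

Minimizing such a sensitivity term for a target feature, as the abstract describes, drives the model toward the second case: the unlearned feature no longer influences the output.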