A Proofs of Theorems

Neural Information Processing Systems

Under the assumption that the MDP is deterministic and all states are strongly connected, there exists at least one shortest state trajectory from s to g. We use the agent's trajectories to construct and update the adjacency matrix. Concretely, the adjacency matrix is initialized to an empty matrix at the beginning of training. Each time the agent visits a state it has never encountered before, the adjacency matrix is augmented by a new row and a new column of zeros, representing the k-step adjacency relation between the new state and the previously explored states. Whenever the temporal distance between two states within a single trajectory is at most k, the corresponding element of the adjacency matrix is set to 1, indicating adjacency.
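The update rule above can be sketched as follows; the function and variable names are illustrative, not taken from the paper:

```python
import numpy as np

def update_adjacency(adj, state_index, trajectory, k):
    """Update a k-step adjacency matrix from one trajectory.

    adj         -- square 0/1 numpy array (grows as new states appear)
    state_index -- dict mapping state -> row/column index
    trajectory  -- list of states visited in temporal order
    k           -- threshold on temporal distance for adjacency
    """
    # Augment the matrix with a zero row and column per newly visited state.
    for s in trajectory:
        if s not in state_index:
            state_index[s] = len(state_index)
            n = len(state_index)
            grown = np.zeros((n, n), dtype=int)
            grown[: n - 1, : n - 1] = adj
            adj = grown
    # Mark pairs whose temporal distance within this trajectory is <= k.
    for i in range(len(trajectory)):
        for j in range(i, min(i + k, len(trajectory) - 1) + 1):
            a, b = state_index[trajectory[i]], state_index[trajectory[j]]
            adj[a, b] = adj[b, a] = 1
    return adj

# Start from an empty matrix, as in the paper, then ingest one trajectory.
adj = np.zeros((0, 0), dtype=int)
index = {}
adj = update_adjacency(adj, index, ["s0", "s1", "s2", "s3"], k=2)
```

With k = 2, states two steps apart (s0 and s2) become adjacent, while states three steps apart (s0 and s3) do not.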


Author Feedback

Neural Information Processing Systems

We thank all reviewers for their detailed and valuable comments and will revise the paper accordingly, as described below. We thank the reviewers for pointing out these issues and will make the corrections in the revision. We agree with the reviewer and will change the wording in the revision. As noted in the HIRO paper, goal-conditioned HRL often yields better performance than HRL with Options. E.g., all graph-based works cited in the review obtain the subgoal sequence by solving a shortest-path problem. In the revision, we will add these discussions to the related work section.



Shadowcast: Stealthy Data Poisoning Attacks against Vision-Language Models

Neural Information Processing Systems

Vision-Language Models (VLMs) excel in generating textual responses from visual inputs, but their versatility raises security concerns. This study takes the first step in exposing VLMs' susceptibility to data poisoning attacks that can manipulate responses to innocuous, everyday prompts. We introduce Shadowcast, a stealthy data poisoning attack where poison samples are visually indistinguishable from benign images with matching texts. Shadowcast demonstrates effectiveness in two attack types. The first is a traditional Label Attack, tricking VLMs into misidentifying class labels, such as confusing Donald Trump for Joe Biden.
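Shadowcast's actual poison construction is not reproduced here; as a generic illustration of the "visually indistinguishable" constraint common to stealthy poisoning attacks, a candidate poison image can be projected back into a small L-infinity ball around the clean image. The helper and the epsilon value below are assumptions for illustration only:

```python
import numpy as np

def clip_poison(clean, candidate, eps=8 / 255):
    """Project a candidate poison image into an L-infinity ball of
    radius eps around the clean image, keeping pixels in [0, 1],
    so the poison stays visually indistinguishable from the original."""
    delta = np.clip(candidate - clean, -eps, eps)  # bound the perturbation
    return np.clip(clean + delta, 0.0, 1.0)       # keep a valid image

clean = np.random.default_rng(1).random((32, 32, 3))
poison = clip_poison(clean, clean + 0.5)  # large edit gets squashed to eps
```

The projection guarantees that no pixel moves more than eps from its clean value, regardless of how aggressive the underlying optimization step was.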





Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge

Neural Information Processing Systems

We consider the problem of online linear regression in the stochastic setting. We derive high-probability regret bounds for online ridge regression and the forward algorithm. This enables us to compare online regression algorithms more accurately and to eliminate assumptions of bounded observations and predictions. Our study advocates the use of the forward algorithm in lieu of ridge due to its enhanced bounds and robustness to the regularization parameter. Moreover, we explain how to integrate it into algorithms involving linear function approximation to remove a boundedness assumption without deteriorating theoretical bounds. We showcase this modification in linear bandit settings, where it yields improved regret bounds. Finally, we provide numerical experiments to illustrate our results and support our intuitions.
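As a sketch of the distinction the abstract draws: the forward algorithm (Vovk-Azoury-Warmuth) differs from online ridge only in that the current feature vector enters the regularized Gram matrix before the prediction is made. The helper below is illustrative, not the paper's code:

```python
import numpy as np

def online_predictions(X, y, lam=1.0):
    """Sequential predictions of online ridge vs the forward algorithm.

    At round t, ridge predicts with A_t = lam*I + sum_{s<t} x_s x_s^T,
    while the forward algorithm also adds x_t x_t^T to the Gram matrix
    before predicting (the label sums are identical for both)."""
    d = X.shape[1]
    A = lam * np.eye(d)   # regularized Gram matrix of past features
    b = np.zeros(d)       # sum of x_s * y_s over past rounds
    ridge, forward = [], []
    for x, target in zip(X, y):
        ridge.append(x @ np.linalg.solve(A, b))
        # Forward: include the current feature before predicting, which
        # shrinks the prediction (by Sherman-Morrison, it divides the
        # ridge prediction by 1 + x^T A^{-1} x).
        forward.append(x @ np.linalg.solve(A + np.outer(x, x), b))
        A += np.outer(x, x)
        b += target * x
    return np.array(ridge), np.array(forward)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
r, f = online_predictions(X, y)
```

The built-in shrinkage is one intuition for why the forward algorithm's predictions stay controlled without a boundedness assumption on the observations.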


Quantile Propagation for Wasserstein-Approximate Gaussian Processes, Edwin V. Bonilla

Neural Information Processing Systems

Approximate inference techniques are the cornerstone of probabilistic methods based on Gaussian process priors. Despite this, most work optimizes standard divergence measures such as the Kullback-Leibler (KL) divergence, which lack the basic desiderata for the task at hand while chiefly offering technical convenience. We develop a new approximate inference method for Gaussian process models which overcomes the technical challenges arising from abandoning these convenient divergences.
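To make the contrast concrete, both divergences have standard closed forms between univariate Gaussians: the squared L2-Wasserstein distance is (mu1 - mu2)^2 + (sigma1 - sigma2)^2, while the KL divergence is asymmetric and blows up as the second variance shrinks. The snippet below is a worked illustration of these known formulas, not code from the paper:

```python
import math

def w2_sq_gaussian(mu1, s1, mu2, s2):
    """Squared L2-Wasserstein distance between N(mu1, s1^2) and N(mu2, s2^2)."""
    return (mu1 - mu2) ** 2 + (s1 - s2) ** 2

def kl_gaussian(mu1, s1, mu2, s2):
    """KL( N(mu1, s1^2) || N(mu2, s2^2) )."""
    return math.log(s2 / s1) + (s1 ** 2 + (mu1 - mu2) ** 2) / (2 * s2 ** 2) - 0.5

# Wasserstein is a symmetric metric that stays moderate as the
# variances separate; KL is asymmetric and explodes as s2 -> 0.
w2 = w2_sq_gaussian(0.0, 1.0, 0.0, 0.1)   # (1 - 0.1)^2, about 0.81
kl = kl_gaussian(0.0, 1.0, 0.0, 0.1)      # dominated by 1/(2 * 0.01)
```

This behavior is one way to see why a Wasserstein objective can penalize over- or under-estimated variances more evenly than the KL divergence.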


Thanks to all of the reviewers for their time and effort, and both constructive and critical comments

Neural Information Processing Systems

General comments. Proving that "at convergence, the parameters of the approximate factors can be considered fixed" may ... Reviewer 1 puts it well that "replacing it [KL] by the L2 Wasserstein should strike the majority of researchers as an ...". If the CDF is accessible, Equation (5), which forms the basis of our lookup tables, is stable and avoids divergence; if the CDF is not accessible, double integrations are involved.

We thank Reviewer 1 for their supportive comments and helpful suggestions, e.g. on the broader impact.

Reviewer 2. "Found analytically for fewer distributions than EP": while we agree ... "No code is given": please see General Comment 1. "Not clear if the proposed method is worth it": please see General Comments 2.1 and 2.2.

Reviewer 3. "Empirical advantages of the method are not demonstrated": we respectfully note that in 8 out of 10 ... Besides, Figures 1.a and 1.b illustrate the effectiveness of our method in alleviating the over-estimation of variances. [... Approximate Inference, Opper & Winther] ... proving the property pointed out by Reviewer 2 that "at convergence, ...". However, we cannot yet rule out that our method is already provably convergent under appropriate assumptions.

Reviewer 4. "A lot of references to the Appendix ... the paper a little hard to digest": we will take this suggestion, and we will also use the extra page in the final version for this. "I'm not sure whether the page on the locality property is ...": we ask the reviewer to kindly consider the broader relevance outlined in General Comment 3. "Not sure whether ...": please see General Comment 3. "Marginal likelihood and its accuracy": please see General Comments 2.1 and 2.2.