
Flipping-based Policy for Chance-Constrained Markov Decision Processes

Neural Information Processing Systems

Safe reinforcement learning (RL) is a promising approach for many real-world decision-making problems where ensuring safety is critical. In safe RL research, while expected cumulative safety constraints (ECSCs) are typically the first choice, chance constraints are often more pragmatic for incorporating safety under uncertainty. This paper proposes a flipping-based policy for Chance-Constrained Markov Decision Processes (CCMDPs). The flipping-based policy selects the next action by tossing a potentially distorted coin between two action candidates. The probability of the flip and the two action candidates vary depending on the state.
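As a hedged illustration of the action-selection rule (not the authors' implementation; all function names here are assumed for the example), a flipping-based policy can be sketched as follows:

```python
# Minimal sketch of a flipping-based policy: the state-conditioned policy
# returns two candidate actions and a flip probability, and the executed
# action is chosen by a possibly distorted coin flip. Names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def flipping_policy(state, candidate_fn, flip_prob_fn):
    """Select an action by flipping a state-dependent, possibly biased coin.

    candidate_fn(state) -> (a0, a1): two action candidates for this state.
    flip_prob_fn(state) -> p in [0, 1]: probability of picking a1 over a0.
    """
    a0, a1 = candidate_fn(state)
    p = flip_prob_fn(state)
    return a1 if rng.random() < p else a0

# Toy usage: one conservative and one aggressive action, mixed 30/70.
action = flipping_policy(
    state=np.zeros(4),
    candidate_fn=lambda s: (-1.0, +1.0),
    flip_prob_fn=lambda s: 0.3,
)
```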



Evaluating Generated Text as Text Generation

Neural Information Processing Systems

A wide variety of NLP applications, such as machine translation, summarization, and dialog, involve text generation. One major challenge for these applications is how to evaluate whether such generated texts are actually fluent, accurate, or effective. In this work, we conceptualize the evaluation of generated text as a text generation problem, modeled using pre-trained sequence-to-sequence models. The general idea is that models trained to convert the generated text to/from a reference output or the source text will achieve higher scores when the generated text is better.
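Concretely, one common way to realize this idea is to score a hypothesis by the average log-likelihood a pre-trained sequence-to-sequence model assigns to generating it from the source text. The sketch below uses Hugging Face Transformers; the model choice and the source-to-hypothesis scoring direction are illustrative assumptions, not the paper's exact setup:

```python
# Sketch: score generated text by the log-likelihood a pre-trained seq2seq
# model assigns to producing it from the source. Higher scores should mean
# better generations. The model name is an assumption for illustration.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn").eval()

def seq2seq_score(source: str, hypothesis: str) -> float:
    src = tokenizer(source, return_tensors="pt", truncation=True)
    tgt = tokenizer(hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        # The loss is the mean token-level cross-entropy of the hypothesis,
        # so its negation is an average log-likelihood.
        out = model(input_ids=src["input_ids"],
                    attention_mask=src["attention_mask"],
                    labels=tgt["input_ids"])
    return -out.loss.item()

print(seq2seq_score("The cat sat on the mat.", "A cat is sitting on a mat."))
```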


Learning Sparse Distributions using Iterative Hard Thresholding

Neural Information Processing Systems

Iterative hard thresholding (IHT) is a projected gradient descent algorithm known to achieve state-of-the-art performance for a wide range of structured estimation problems, such as sparse inference. In this work, we consider IHT as a solution to the problem of learning sparse discrete distributions. We study the hardness of using IHT on the space of measures. As a practical alternative, we propose a greedy approximate projection which simultaneously captures appropriate notions of sparsity in distributions while satisfying the simplex constraint, and we investigate the convergence behavior of the resulting procedure in various settings. Our results show, both in theory and practice, that IHT can achieve state-of-the-art results for learning sparse distributions.
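One plausible instantiation of such a greedy approximate projection, assumed here for illustration and not necessarily the paper's exact operator, keeps the k largest coordinates and projects them onto the probability simplex:

```python
# Sketch of one IHT-style update for learning a k-sparse distribution:
# a gradient step, followed by a greedy projection that restricts the
# iterate to its top-k support and renormalizes on that support.
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def iht_step(p, grad, step_size, k):
    """Gradient step followed by greedy projection onto k-sparse distributions."""
    v = p - step_size * grad
    support = np.argsort(v)[-k:]              # keep the k largest coordinates
    q = np.zeros_like(v)
    q[support] = project_simplex(v[support])  # simplex projection on the support
    return q
```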


We would like to address all concerns raised; please find some details regarding the proposed methods below.

Neural Information Processing Systems

We would like to thank all of the reviewers for their valuable time and their constructive comments. Reviewer 1: We will incorporate the proposed minor corrections in the final version of the paper. On the two-stage approach, i.e., (i) running gradient descent to convergence and then (ii) projecting onto the sparsity set, and on whether the support set changes during iterations: we observe in our experiments (Subsection 4.1) that IHT does change the support. Reviewer 2: We thank the reviewer for the supportive and constructive review. Regarding the comment on lines 198-202, we apologize for any confusion. Regarding the variance in the experiments, we have observed that high variance alone is not enough for the algorithm to get "lucky".




Joint-task Self-supervised Learning for Temporal Correspondence

Neural Information Processing Systems

This paper proposes to learn reliable dense correspondence from videos in a self-supervised manner. Our learning process integrates two highly related tasks: tracking large image regions and establishing fine-grained pixel-level associations between consecutive video frames. We exploit the synergy between both tasks through a shared inter-frame affinity matrix, which simultaneously models transitions between video frames at both the region and pixel levels. Region-level localization helps reduce ambiguities in fine-grained matching by narrowing down search regions, while fine-grained matching provides bottom-up features to facilitate region-level localization. Our method outperforms state-of-the-art self-supervised methods on a variety of visual correspondence tasks, including video object and part segmentation propagation, keypoint tracking, and object tracking. Our self-supervised method even surpasses the fully supervised affinity feature representation obtained from a ResNet-18 pre-trained on ImageNet. The project website can be found at https://sites.google.com/view/uvc2019/.
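As a rough sketch of the shared inter-frame affinity idea (feature shapes, normalization, and temperature are assumptions, not the paper's exact design), per-pixel features of consecutive frames can be compared by dot product and row-normalized so that each row is a transition distribution:

```python
# Sketch of an inter-frame affinity matrix: row i gives the transition
# distribution from pixel i of frame 1 to the pixels of frame 2. Such a
# matrix can be used for both region-level and pixel-level propagation.
import torch
import torch.nn.functional as F

def interframe_affinity(feat1, feat2, temperature=0.07):
    """feat1, feat2: (C, H, W) feature maps of two consecutive frames."""
    c = feat1.shape[0]
    f1 = F.normalize(feat1.reshape(c, -1), dim=0)  # (C, H*W), unit columns
    f2 = F.normalize(feat2.reshape(c, -1), dim=0)  # (C, H*W), unit columns
    logits = f1.t() @ f2 / temperature             # (H*W, H*W) similarities
    return logits.softmax(dim=1)                   # rows sum to one

A = interframe_affinity(torch.randn(64, 8, 8), torch.randn(64, 8, 8))
# Propagate per-pixel labels (e.g., segmentation) from frame 1 to frame 2:
labels1 = torch.randn(8 * 8, 5)
labels2 = A.t() @ labels1
```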


140f6969d5213fd0ece03148e62e461e-AuthorFeedback.pdf

Neural Information Processing Systems

The shared affinity matrix bridges these tasks and facilitates iterative improvements. These contributions are significant in the field of self-supervised learning. The contributions of this work are also demonstrated by our ablation study (Table 2 in the paper). We note that these components are novel and have not been explored in prior work. Which methods should the work compare with?


Overleaf Example

Neural Information Processing Systems

A widely used algorithm for transfer learning is fine-tuning, where a pre-trained model is fine-tuned on a target task with a small amount of labeled data. When the capacity of the pre-trained model is much larger than the size of the target dataset, fine-tuning is prone to overfitting and "memorizing" the training labels. Hence, an important question is how to regularize fine-tuning and ensure its robustness to noise. To address this question, we begin by analyzing the generalization properties of fine-tuning. We present a PAC-Bayes generalization bound that depends on the distance traveled in each layer during fine-tuning and on the noise stability of the fine-tuned model.
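The bound's dependence on per-layer travel distance suggests, for instance, penalizing how far each layer moves from its pre-trained initialization during fine-tuning. The following L2-SP-style sketch is an illustration consistent with that intuition, not the paper's algorithm:

```python
# Sketch of a distance-based regularizer: penalize the squared distance
# between the fine-tuned parameters and the pre-trained initialization.
# This is an illustrative penalty, not the paper's exact method.
import torch

def layer_distance_penalty(model, pretrained_state, alpha=0.01):
    penalty = 0.0
    for name, param in model.named_parameters():
        # Per-layer squared L2 distance from the pre-trained weights.
        penalty = penalty + (param - pretrained_state[name]).pow(2).sum()
    return alpha * penalty

# Usage inside a fine-tuning loop (snapshot taken before training starts):
# pretrained_state = {k: v.detach().clone() for k, v in model.state_dict().items()}
# loss = task_loss + layer_distance_penalty(model, pretrained_state)
```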