Learning to Generate Visual Questions with Noisy Supervision

Neural Information Processing Systems

Moreover, VQG models are also particularly useful for few-shot or zero-shot learning [36, 44]. Conceptually, VQG is a very challenging task, since the generated questions must not only be consistent with the image content but also be meaningful and answerable to humans.





An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search

Neural Information Processing Systems

These have opposite properties: DRL has good sample efficiency and poor stability, while ES is the reverse. Recently, there have been attempts to combine these algorithms, but these methods rely entirely on a synchronous update scheme, which makes them ill-suited to maximizing the benefits of the parallelism in ES.



AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning

Neural Information Processing Systems

Pre-training large language models (PrLMs) on massive unlabeled corpora and fine-tuning them on downstream tasks has become a new paradigm [1-3]. Their success can be partly attributed to the self-attention mechanism [4], yet these self-attention networks are often redundant [5, 6] and tend to overfit when fine-tuned on downstream tasks, due to the mismatch between their overparameterization and the limited annotated data [7-13]. To address this issue, various regularization techniques have been developed, such as data augmentation [14, 15], adversarial training [16, 17], and dropout-based methods [11, 13, 18].