A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with Robustness to Excessive Delays

Neural Information Processing Systems

Joulani et al. (2013) have studied multi-armed bandits with delayed feedback under the assumption that the rewards are stochastic and the delays are sampled from a fixed distribution.


f593c9c251d4d7cf14d4ab9861dfb7eb-Paper-Conference.pdf

Neural Information Processing Systems

However, some recent studies have recognized that most of these approaches fail to improve the performance over empirical risk minimization, especially when applied to overparameterized neural networks.



A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning

Neural Information Processing Systems

Although intuitive, such a native label assignment strategy cannot reveal the underlying semantic similarity between a query and its positives and negatives, and impairs performance, since some negatives are semantically similar to the query or even share the same semantic class as the query.