RandAugment: Practical Automated Data Augmentation with a Reduced Search Space

Neural Information Processing Systems

Recent work on automated data augmentation strategies has led to state-of-the-art results in image classification and object detection. An obstacle to large-scale adoption of these methods is that they require a separate and expensive search phase. A common way to overcome the expense of the search phase has been to use a smaller proxy task.





Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Neural Information Processing Systems

Many problems in machine learning reduce to learning a probability distribution (or policy) over sequences of discrete actions so as to maximize a downstream utility function. Examples include generating text sequences to maximize a task-specific metric like BLEU and generating action sequences in reinforcement learning (RL) to maximize expected return.


start with common concerns and then respond to individual reviewer comments as space permits. Common: There should be a baseline using MCTS and assuming access to simulator / common random numbers

Neural Information Processing Systems

Thank you for the thoughtful and careful reviews. We hope the AC nominates some of you for reviewer awards. There should be a baseline using MCTS and assuming access to simulator / common random numbers. There appears to be some imprecision in reviews about what this means. Then environment stochasticity is re-sampled and the algorithm repeats.