Scalable Option Learning in High-Throughput Environments