Behavior Transformers: Cloning k modes with one stone

Neural Information Processing Systems

In fact, modeling multi-modal ...

[Figure: k-means encoder and k-means decoder. Panel labels: continuous action dataset (|A| x a); clustering into k bins; continuous action (1 x a); categorical action bin (1 x k); action offset (1 x a).]
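The figure labels above (continuous action, categorical action bin, action offset) suggest an encode/decode round trip, which can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes plain Lloyd's k-means on the action dataset, a one-hot bin encoding, and an offset defined as the action minus its bin center. The function names `fit_kmeans`, `encode`, and `decode` are hypothetical.

```python
import numpy as np


def fit_kmeans(actions, k, iters=50, seed=0):
    """Cluster an (|A|, a) action dataset into k bins with plain Lloyd's
    algorithm (an assumption; any k-means routine would do).
    Returns a (k, a) array of bin centers."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct dataset points.
    centers = actions[rng.choice(len(actions), size=k, replace=False)].copy()
    for _ in range(iters):
        # Distance from every action to every center: shape (|A|, k).
        dists = np.linalg.norm(actions[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = actions[labels == j]
            if len(members) > 0:  # leave empty bins where they are
                centers[j] = members.mean(axis=0)
    return centers


def encode(action, centers):
    """Continuous action (a,) -> one-hot categorical bin (k,) and offset (a,)."""
    idx = np.linalg.norm(centers - action, axis=1).argmin()
    one_hot = np.zeros(len(centers))
    one_hot[idx] = 1.0
    return one_hot, action - centers[idx]


def decode(one_hot, offset, centers):
    """Categorical bin (k,) and offset (a,) -> continuous action (a,)."""
    return centers[one_hot.argmax()] + offset


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(size=(200, 2))
    centers = fit_kmeans(data, k=4)
    bin_vec, offset = encode(data[0], centers)
    recovered = decode(bin_vec, offset, centers)
    print(np.allclose(recovered, data[0]))
```

Under these assumptions the decoder inverts the encoder up to floating-point error, so a model can predict a discrete bin and a small continuous correction instead of regressing the raw action directly.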




26657d5ff9020d2abefe558796b99584-Paper.pdf

Neural Information Processing Systems

Specifically, there now exists a tight relaxation for verifying the robustness of a neural network to ℓ∞ input perturbations, as well as efficient primal and dual solvers for the relaxation. Buoyed by this success, we consider the problem of developing similar techniques for verifying robustness to input perturbations within the probability simplex. We prove a somewhat surprising result that, in this case, not only can one design a tight relaxation that overcomes the convex barrier, but the size of the relaxation remains linear in the number of neurons, thereby leading to simpler and more efficient algorithms.


Offline Reinforcement Learning as One Big Sequence Modeling Problem

Neural Information Processing Systems

Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models, leveraging the Markov property to factorize problems in time. However, we can also view RL as a generic sequence modeling problem, with the goal being to produce a sequence of actions that leads to a sequence of high rewards.