ANPL: Towards Natural Programming with Interactive Decomposition

Di Huang

Neural Information Processing Systems

Though LLMs are capable of generating plausible programs, it's challenging to interact with the LLMs further to revise the program, especially if the user's specific requirements are different from the initial proposal.








State Regularized Policy Optimization on Data with Dynamics Shift

Neural Information Processing Systems

We then demonstrate a lower-bound performance guarantee on policies regularized by the stationary state distribution. In practice, SRPO (State Regularized Policy Optimization) can be an add-on module to context-based algorithms in both online and offline RL settings.