

A Architectures, Hyper-parameters and Algorithms

Neural Information Processing Systems

Our approach, named ORDER, uses a three-step training process. In the remainder of this section, we explain the methods, architectures, and settings used in each step; after that, we describe how we set up and carried out our experiments. We detail the design of the state encoder and how we selected the best hyper-parameters: we used a grid-search strategy to find the optimal hyper-parameters for our experiments, which allowed each observation dimension to match up with a state factor. We summarize the training process in Algorithm 1.
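A grid search like the one described above can be sketched as follows. This is an illustrative sketch only: the hyper-parameter names and the `train_and_evaluate` helper are hypothetical stand-ins, not taken from the paper.

```python
# Hypothetical grid-search sketch; hyper-parameter names and the
# train_and_evaluate() helper are illustrative, not the paper's.
from itertools import product

grid = {
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "batch_size": [64, 128],
    "hidden_dim": [128, 256],
}

def train_and_evaluate(config):
    # Stand-in for one full training run; returns a validation score.
    # A fake score is used here so the sketch is runnable.
    return -abs(config["learning_rate"] - 3e-4) - config["hidden_dim"] * 1e-6

best_score, best_config = float("-inf"), None
for values in product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    score = train_and_evaluate(config)
    if score > best_score:
        best_score, best_config = score, config
```

Each candidate configuration is trained once and the best-scoring one is kept; in practice each `train_and_evaluate` call would run the full three-step training process.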


Checklist

Neural Information Processing Systems

- Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope?
- Did you describe the limitations of your work?
- Did you specify all the training details (e.g., data splits, hyper-parameters, how they were chosen)?
- Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)?
- Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?
- If your work uses existing assets, did you cite the creators?
- Did you mention the license of the assets?
- Did you include any new assets either in the supplemental material or as a URL? [Yes]
- Did you discuss whether and how consent was obtained from people whose data you're using/curating?

We thereby state that we bear all responsibility in case of violation of rights, etc., and confirmation of the data license.

- For what purpose was the dataset created? For the novel task of data analysis explained in the paper.
- Who created the dataset and on behalf of which entity? This dataset was created during a …
- Who funded the creation of the dataset?
- What do the instances that comprise the dataset represent?





Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments

Neural Information Processing Systems

The different roles of c and P in this lower bound inspire us to design algorithms that estimate costs and transitions separately. Specifically, assuming knowledge of c and P, we develop a simple but sub-optimal algorithm and another, more involved, minimax-optimal algorithm (up to logarithmic terms). These algorithms combine the ideas of finite-horizon approximation [Chen et al., 2022a], the special Bernstein-style bonuses of the MVP algorithm [Zhang et al., 2020], and adaptive confidence widening [Wei and Luo, 2021], as well as some new techniques such as properly penalizing long-horizon policies. Finally, when c and P are unknown, we develop a variant of the MASTER algorithm [Wei and Luo, 2021] and integrate the aforementioned ideas into it to achieve O(min{B⋆S
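A Bernstein-style bonus of the kind used by MVP-like algorithms can be sketched as below. This is a generic illustration of the technique, not the paper's exact bonus: the constants and the `log_term` argument are placeholders for the usual confidence-level terms.

```python
# Generic Bernstein-style exploration bonus (illustrative constants,
# not the paper's): a variance-dependent sqrt term plus a lower-order
# 1/n term, so low-variance transitions receive tighter bonuses.
import math

def bernstein_bonus(empirical_variance, n_visits, log_term):
    """Bonus ~ sqrt(2 * Var * log / n) + O(log / n)."""
    if n_visits == 0:
        return float("inf")  # unvisited pairs get maximal optimism
    return (math.sqrt(2.0 * empirical_variance * log_term / n_visits)
            + 7.0 * log_term / (3.0 * n_visits))
```

The sqrt term dominates for well-visited, high-variance pairs, while the 1/n term handles the low-sample regime; this variance sensitivity is what distinguishes Bernstein-style bonuses from plain Hoeffding-style ones.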




e0640c93b05097a9380870aa06aa0df4-Paper.pdf

Neural Information Processing Systems

We introduce COPT, a novel distance metric between graphs defined via an optimization routine, computing a coordinated pair of optimal transport maps simultaneously. This gives an unsupervised way to learn general-purpose graph representation, applicable to both graph sketching and graph comparison.
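To give a flavor of OT-based graph comparison, the sketch below computes a plain entropic optimal-transport cost between two toy graphs. This is NOT the COPT objective (which optimizes a coordinated pair of transport maps); it is a generic Sinkhorn iteration over a hypothetical node cost built from degree differences.

```python
# Generic entropic-OT sketch, not COPT itself: transports uniform node
# mass between two small graphs, with a hypothetical cost given by
# absolute degree differences, solved by plain Sinkhorn iterations.
import numpy as np

def sinkhorn(cost, eps=0.1, iters=200):
    n, m = cost.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)  # uniform marginals
    K = np.exp(-cost / eps)                          # Gibbs kernel
    u = np.ones(n)
    for _ in range(iters):
        v = b / (K.T @ u)   # match column marginals
        u = a / (K @ v)     # match row marginals
    plan = u[:, None] * K * v[None, :]
    return float((plan * cost).sum())  # transport cost between graphs

# Degree sequences of two toy graphs: a triangle vs. a 3-node path.
deg_g1 = np.array([2.0, 2.0, 2.0])
deg_g2 = np.array([1.0, 2.0, 1.0])
cost = np.abs(deg_g1[:, None] - deg_g2[None, :])
dist = sinkhorn(cost)  # ≈ 0.667 for these two toy graphs
```

Any OT-style graph distance of this shape is permutation-invariant by construction, which is one reason transport-based metrics are attractive for graph comparison; COPT's coordinated maps add the ability to sketch and compare graphs of different sizes.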