Appendix

Neural Information Processing Systems 

We begin this Appendix with its overall organization and the notation table for the paper. We then describe the lower bounds on the warm-up duration and briefly comment on their role in achieving the regret result. We also provide Lemma A.1, which shows that any stabilizing [...]. In particular, in Appendix E.1 we show the persistence of excitation during the warm-up, in Appendix E.2 we formally define the persistence of excitation property of the controllers in M, i.e., (43), and finally in Appendix E.3 we show that the control policies of [...]. In Appendix G, we state the formal regret result of the paper, Theorem 5, and provide its proof. Appendix H briefly considers the case where the loss functions are convex. Finally, in Appendix I, we provide the supporting technical theorems and lemmas.