
Supplementary Material for Learning Energy-based Model via Dual-MCMC Teaching

Neural Information Processing Systems

We show additional image synthesis results in Fig.2. For the numbers reported in the main text, we adopt a network structure containing residual blocks (see implementation details in Tab.5). We then test our model on the task of image inpainting (Fig.1). This is the marginal version of Eqn.8 in the main text. Sec. 2.3 (Learning Algorithm): the three models are trained in an alternating, iterative manner based on the current model parameters. Compared to Eqn.3 and Eqn.6 in the main text, Eqn.5 and Eqn.6 start from MCMC chains whose initial points are supplied by the learned models. We present the learning algorithm in Alg.1.
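The alternating update of the three models can be pictured as a short training loop. Below is a minimal sketch of a dual-MCMC-style iteration, assuming short-run Langevin chains, toy MLP networks, a Gaussian posterior energy, and squared-error teaching losses; the architectures, step sizes, and objectives here are illustrative stand-ins, not the paper's Alg.1.

```python
# Minimal sketch of a dual-MCMC-style alternating update (cf. Alg.1).
# All architectures, step sizes, and losses below are illustrative assumptions.
import torch
import torch.nn as nn

D, Z = 2, 2                 # toy data / latent dimensions (assumed)
K, s = 15, 0.1              # Langevin steps and step size (assumed)

ebm = nn.Sequential(nn.Linear(D, 64), nn.SiLU(), nn.Linear(64, 1))   # energy f(x)
gen = nn.Sequential(nn.Linear(Z, 64), nn.SiLU(), nn.Linear(64, D))   # generator g(z)
enc = nn.Sequential(nn.Linear(D, 64), nn.SiLU(), nn.Linear(64, Z))   # inference model
opt_e, opt_g, opt_i = (torch.optim.Adam(m.parameters(), lr=1e-4) for m in (ebm, gen, enc))

def langevin(x0, energy):
    """Short-run Langevin chain: x <- x - (s^2/2) * grad E(x) + s * noise."""
    x = x0.detach().requires_grad_(True)
    for _ in range(K):
        g = torch.autograd.grad(energy(x).sum(), x)[0]
        x = (x - 0.5 * s ** 2 * g + s * torch.randn_like(x)).detach().requires_grad_(True)
    return x.detach()

for it in range(1000):
    x = torch.randn(128, D) * 0.5 + 1.0          # stand-in for a real data batch
    z0 = torch.randn(128, Z)

    # Marginal chain: generator-initialized samples revised by the EBM.
    x_rev = langevin(gen(z0), ebm)
    # Posterior chain: inference-initialized latents revised under the posterior energy.
    z_rev = langevin(enc(x), lambda z: 0.5 * ((x - gen(z)) ** 2).sum(-1)
                                       + 0.5 * (z ** 2).sum(-1))

    # EBM: contrast data against the MCMC-revised samples.
    le = ebm(x).mean() - ebm(x_rev).mean()
    opt_e.zero_grad(); le.backward(); opt_e.step()
    # Generator: taught by the revised samples and revised latents.
    lg = ((gen(z0) - x_rev) ** 2).mean() + ((gen(z_rev) - x) ** 2).mean()
    opt_g.zero_grad(); lg.backward(); opt_g.step()
    # Inference model: taught to match the posterior-revised latents.
    li = ((enc(x) - z_rev) ** 2).mean()
    opt_i.zero_grad(); li.backward(); opt_i.step()
```

In the actual method the MCMC-revised samples act as "teachers" for the generator and inference models while also driving the EBM update; the squared-error losses above are simplified placeholders for the paper's matching objectives.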


Almost Free: Self-concordance in Natural Exponential Families and an Application to Bandits

Neural Information Processing Systems

We study how tail properties of the base distribution of a NEF impose limits on the NEF: if the base distribution is subexponential (respectively, subgaussian), we show that the NEF is self-concordant with a stretch factor that grows inverse-quadratically (respectively, linearly).
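For context, a common (generalized) self-concordance condition for a NEF bounds the third derivative of the log-partition function by the second; the "stretch factor" is the smallest constant for which the bound holds. The formulation below is one standard convention and may differ from the paper's exact definition.

```latex
% One common form of (generalized) self-concordance for a NEF with
% log-partition A(\theta) = \log \int e^{\theta x}\, h(\mathrm{d}x).
% A''(\theta) is the variance and A'''(\theta) the third central moment of
% the NEF member at \theta; the "stretch factor" is the smallest valid M.
\[
  |A'''(\theta)| \;\le\; M \, A''(\theta)
  \qquad \text{for all } \theta \text{ in the parameter domain.}
\]
```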


Almost Free: Self-concordance in Natural Exponential Families and an Application to Bandits

Liu, Shuai, Ayoub, Alex, Sentenac, Flore, Tan, Xiaoqi, Szepesvári, Csaba

arXiv.org Machine Learning

We prove that single-parameter natural exponential families with subexponential tails are self-concordant with polynomial-sized parameters. For subgaussian natural exponential families we establish an exact characterization of the growth rate of the self-concordance parameter. Applying these findings to bandits allows us to fill gaps in the literature: We show that optimistic algorithms for generalized linear bandits enjoy regret bounds that are both second-order (scale with the variance of the optimal arm's reward distribution) and free of an exponential dependence on the bound of the problem parameter in the leading term. To the best of our knowledge, ours is the first regret bound for generalized linear bandits with subexponential tails, broadening the class of problems to include Poisson, exponential and gamma bandits.
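To make the bandit setting concrete, the sketch below implements a generic optimistic (UCB-style) rule for a generalized linear bandit with Poisson rewards and the canonical exp link, one of the subexponential cases the abstract mentions. The confidence width, ridge parameter, and Newton-based MLE are illustrative assumptions, not the paper's algorithm or its tuned constants.

```python
# Hedged sketch of an optimistic (UCB-style) generalized linear bandit with
# Poisson rewards and canonical link mu(u) = exp(u). Width beta, ridge lam,
# and the Newton MLE are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, T, lam, beta = 3, 1000, 1.0, 2.0           # dim, horizon, ridge, width (assumed)
arms = rng.normal(size=(20, d))
arms /= np.linalg.norm(arms, axis=1, keepdims=True)
theta_star = np.ones(d) / np.sqrt(d)          # unknown true parameter
mu = np.exp                                   # canonical Poisson mean function

V = lam * np.eye(d)                           # regularized design matrix
X, R = [], []
theta_hat = np.zeros(d)

def mle(th, X, R, iters=25):
    """Newton iterations for the ridge-regularized Poisson GLM likelihood."""
    Xa, Ra = np.array(X), np.array(R)
    for _ in range(iters):
        u = Xa @ th
        g = Xa.T @ (mu(u) - Ra) + lam * th                    # gradient
        H = Xa.T @ (mu(u)[:, None] * Xa) + lam * np.eye(d)    # Hessian (mu' = mu)
        th = th - np.linalg.solve(H, g)
    return th

for t in range(T):
    Vinv = np.linalg.inv(V)
    width = np.sqrt(np.einsum('ij,jk,ik->i', arms, Vinv, arms))   # ||a||_{V^{-1}}
    a = arms[np.argmax(mu(arms @ theta_hat) + beta * width)]      # optimistic arm
    r = rng.poisson(mu(a @ theta_star))                           # observe reward
    X.append(a); R.append(r)
    V += np.outer(a, a)
    theta_hat = mle(theta_hat, X, R)
```

The additive bonus `beta * width` is a linearized stand-in for the optimistic index; second-order, variance-adaptive bounds of the kind the abstract describes require a more careful confidence set than this sketch builds.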


Comparing Comparators in Generalization Bounds

Hellström, Fredrik, Guedj, Benjamin

arXiv.org Machine Learning

We derive generic information-theoretic and PAC-Bayesian generalization bounds involving an arbitrary convex comparator function, which measures the discrepancy between the training and population loss. The bounds hold under the assumption that the cumulant-generating function (CGF) of the comparator is upper-bounded by the corresponding CGF within a family of bounding distributions. We show that the tightest possible bound is obtained with the comparator being the convex conjugate of the CGF of the bounding distribution, also known as the Cramér function. This conclusion applies more broadly to generalization bounds with a similar structure. This confirms the near-optimality of known bounds for bounded and sub-Gaussian losses and leads to novel bounds under other bounding distributions.
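As a pointer to the key object: the Cramér function is the convex (Fenchel) conjugate of the centered CGF, and by the Chernoff argument it gives the best exponential tail rate, which is why it yields the tightest comparator among bounds of this structure. The notation below is schematic and may differ from the paper's.

```latex
% Centered CGF of a loss X and its convex conjugate (the Cram\'er function).
\[
  \psi(\lambda) = \ln \mathbb{E}\, e^{\lambda (X - \mathbb{E} X)},
  \qquad
  \psi^{*}(x) = \sup_{\lambda}\, \bigl( \lambda x - \psi(\lambda) \bigr).
\]
% Chernoff's bound, P(X - E[X] >= x) <= exp(-psi^*(x)), shows psi^* is the
% optimal exponential tail exponent; choosing the comparator as the conjugate
% of the bounding distribution's CGF therefore tightens the bound.
```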