Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces
Structured latent variables allow incorporating meaningful prior knowledge into deep learning models. However, learning with such variables remains challenging because of their discrete nature. Nowadays, the standard learning approach is to define a latent variable as a perturbed algorithm output and to use a differentiable surrogate for training. In general, the surrogate puts additional constraints on the model and inevitably leads to biased gradients. To alleviate these shortcomings, we extend the Gumbel-Max trick to define distributions over structured domains. We avoid the differentiable surrogates by leveraging the score function estimators for optimization. In particular, we highlight a family of recursive algorithms with a common feature we call stochastic invariant. The feature allows us to construct reliable gradient estimates and control variates without additional constraints on the model. In our experiments, we consider various structured latent variable models and achieve results competitive with relaxation-based counterparts.
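The Gumbel-Max trick the abstract builds on can be illustrated on a plain categorical variable: perturb each logit with independent Gumbel(0, 1) noise and take the argmax; the winning index is then distributed according to the softmax of the logits. A minimal sketch of this base case (the paper's recursive, structured extension is not shown):

```python
import numpy as np

def gumbel_max_sample(logits, rng):
    """Sample a category index via the Gumbel-Max trick:
    argmax_i (logit_i + G_i), with G_i ~ Gumbel(0, 1) i.i.d.,
    is distributed as softmax(logits)."""
    gumbels = rng.gumbel(size=len(logits))
    return int(np.argmax(logits + gumbels))

rng = np.random.default_rng(0)
probs = np.array([0.1, 0.2, 0.7])
logits = np.log(probs)  # softmax(log p) = p

counts = np.zeros(3)
for _ in range(20000):
    counts[gumbel_max_sample(logits, rng)] += 1
freqs = counts / counts.sum()  # empirical frequencies, close to [0.1, 0.2, 0.7]
```

Because the sample is produced by an argmax over perturbed logits rather than by a relaxation, it stays exactly discrete; this is what lets the paper pair such samples with score-function gradient estimators instead of a differentiable surrogate.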
Reviews: Combinatorial Bayesian Optimization using the Graph Cartesian Product
This manuscript proposes COMBO, a system for combinatorial Bayesian optimization aimed at problems with many categorical and/or ordinal variables. The main contribution is an effective kernel for this setting, obtained by applying a graph kernel to the graph Cartesian product of per-variable graphs; exploiting the product structure makes the kernel efficient to compute. The kernel can be further enhanced with an ARD extension and a horseshoe prior that encourages sparse variable selection. COMBO then fits a GP with this kernel and runs random local search to maximize an acquisition function such as EI over the combinatorial space. A series of experiments shows COMBO outperforming alternatives, such as systems based on one-hot encodings, on both real and synthetic tasks.
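The structure the review alludes to can be sketched concretely: the Laplacian of a graph Cartesian product is the Kronecker sum of the factor Laplacians, so a diffusion kernel on the product graph factorizes into a Kronecker product of small per-variable diffusion kernels and never requires forming the full product graph's eigendecomposition. A minimal sketch, assuming path graphs for ordinal variables (function names and the single shared `beta` are illustrative, not COMBO's actual API):

```python
import numpy as np

def path_graph_laplacian(n):
    """Graph Laplacian of a path graph on n vertices
    (a natural graph for one ordinal variable with n levels)."""
    A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return np.diag(A.sum(axis=1)) - A

def product_diffusion_kernel(laplacians, beta=1.0):
    """Diffusion kernel exp(-beta * L) on the Cartesian product graph.
    Since L_product = L_1 (+) L_2 (+) ... (Kronecker sum),
    exp(-beta * L_product) = exp(-beta L_1) (x) exp(-beta L_2) (x) ...,
    so we exponentiate each small factor and take Kronecker products."""
    K = np.array([[1.0]])
    for L in laplacians:
        evals, evecs = np.linalg.eigh(L)          # per-variable eigendecomposition
        K_i = evecs @ np.diag(np.exp(-beta * evals)) @ evecs.T
        K = np.kron(K, K_i)
    return K

# Kernel over all 3 * 2 = 6 joint configurations of two ordinal variables
K = product_diffusion_kernel([path_graph_laplacian(3), path_graph_laplacian(2)])
```

This per-factor decomposition is the kind of structure exploitation the review credits for the kernel's efficiency: eigendecompositions are done on each small per-variable graph rather than on the exponentially large product graph.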