equivalence relation
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
e8a642ed6a9ad20fb159472950db3d65-Supplemental.pdf
Methods for handling missing data has been extensively studied in the past few decades. Those methods can be roughly classified into two categories: complete case analysis (CCA) based, and imputationbasedmethods. CCAbasedmethods,suchaslistwisedeletion[1]andpairwisedeletion [31] focuses on deleting data instances that contains missing entries, and keeping those that are complete. Standardtechniquesof single imputation include mean/zero imputation, regression-based imputation [1], no-parametric methods [15,54]. For the factorized priorp(Z|U) of the i-VAE component of GINA, we used 15 a linear network with one auxiliary input (which is set to be fully observed dimension,X1).
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
- Asia > Middle East > Jordan (0.04)
- (5 more...)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
- Asia > Middle East > Jordan (0.04)
- (5 more...)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Heuristics for Combinatorial Optimization via Value-based Reinforcement Learning: A Unified Framework and Analysis
Davidovich, Orit, Shtern, Shimrit, Wasserkrug, Segev, Megiddo, Nimrod
Since the 1990s, considerable empirical work has been carried out to train statistical models, such as neural networks (NNs), as learned heuristics for combinatorial optimization (CO) problems. When successful, such an approach eliminates the need for experts to design heuristics per problem type. Due to their structure, many hard CO problems are amenable to treatment through reinforcement learning (RL). Indeed, we find a wealth of literature training NNs using value-based, policy gradient, or actor-critic approaches, with promising results, both in terms of empirical optimality gaps and inference runtimes. Nevertheless, there has been a paucity of theoretical work undergirding the use of RL for CO problems. To this end, we introduce a unified framework to model CO problems through Markov decision processes (MDPs) and solve them using RL techniques. We provide easy-to-test assumptions under which CO problems can be formulated as equivalent undiscounted MDPs that provide optimal solutions to the original CO problems. Moreover, we establish conditions under which value-based RL techniques converge to approximate solutions of the CO problem with a guarantee on the associated optimality gap. Our convergence analysis provides: (1) a sufficient rate of increase in batch size and projected gradient descent steps at each RL iteration; (2) the resulting optimality gap in terms of problem parameters and targeted RL accuracy; and (3) the importance of a choice of state-space embedding. Together, our analysis illuminates the success (and limitations) of the celebrated deep Q-learning algorithm in this problem context.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > France (0.04)
- Asia > Middle East > Israel (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.84)