Preference Based Adaptation for Learning Objectives
In many real-world learning tasks, the true performance measure is hard to optimize directly, and choosing the right surrogate objective is also difficult. In such situations, it is desirable to incorporate objective optimization into the learning loop, based on a weak model of the relationship between the true measure and the surrogate objective. In this work, we discuss the task of objective adaptation, in which the learner iteratively adapts the learning objective toward the underlying true objective based on preference feedback from an oracle. We show that when the objective can be linearly parameterized, this preference-based learning problem can be solved by utilizing the dueling bandit model.
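The setting above can be sketched in a few lines. The toy oracle, the particular comparison-based update rule, and all parameter values below are our own illustrative assumptions, not the paper's algorithm: a linearly parameterized objective w · phi is adapted by proposing a perturbed weight vector and keeping it only when the preference oracle favors it, in the spirit of dueling bandits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: the true measure is w_true . phi, but the learner only
# sees pairwise preferences between two candidate objective parameterizations.
d = 5
w_true = rng.normal(size=d)
w_true /= np.linalg.norm(w_true)

def oracle_prefers(w_a, w_b):
    """Toy preference oracle: prefers the objective better aligned with w_true."""
    return np.dot(w_a, w_true) > np.dot(w_b, w_true)

def dueling_adapt(w0, steps=200, step_size=0.3):
    """Comparison-based adaptation: duel a perturbed candidate against the
    incumbent objective and keep whichever the oracle prefers."""
    w = w0 / np.linalg.norm(w0)
    for _ in range(steps):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)
        w_cand = w + step_size * u
        w_cand /= np.linalg.norm(w_cand)
        if oracle_prefers(w_cand, w):
            w = w_cand
    return w

w_hat = dueling_adapt(rng.normal(size=d))
print(round(float(np.dot(w_hat, w_true)), 2))  # alignment with the true objective
```

Because every accepted duel strictly improves alignment with the hidden true objective, the incumbent converges toward w_true using only preference feedback.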
Supplementary Material: A Stochastic Bilevel Optimizer (PZOBO-S)
We present the algorithm specification for our proposed stochastic bilevel optimizer, PZOBO-S.

Algorithm 2: Stochastic PZOBO algorithm (PZOBO-S)

For the experiments in Sections 4.1 and 4.4, the bilevel problems are relatively simple, with quadratic objectives; it can be checked that the strong-convexity and smoothness properties are satisfied. For the experiments that involve neural networks, e.g., deep hyper-representation (Section 4.2) and meta-learning (Section 4.3), the lower-level problem optimizes the network weights.

Second, the estimator in DARTS uses an outer gradient difference evaluated at points separated by a gap of the inner gradient. The batch size is fixed to 128 for both methods.

E.1 Specifications on Baseline Bilevel Approaches in Section 4.1

We compare our algorithm PZOBO with the following baseline methods. We use the following hyperparameters for all compared methods.

Figure 8: PZOBO with different choices of Q for HR with two-layer net.
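To illustrate the flavor of a zeroth-order stochastic bilevel step, here is a minimal sketch assuming a toy quadratic bilevel problem of our own construction (not the paper's experimental setup): the response Jacobian dy*/dx is estimated from Q Gaussian perturbations of the outer variable, then chained with the outer gradient. The problem, step sizes, and Q value are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy bilevel problem (assumed for illustration only):
#   outer:  min_x  f(x, y*(x)) = 0.5 * ||y*(x) - t||^2
#   inner:  y*(x) = argmin_y  0.5 * ||y||^2 - x.T y     (so y*(x) = x)
d = 4
t = np.ones(d)

def inner_solve(x, steps=50, lr=0.2):
    """Approximate the inner minimizer by gradient descent on the inner loss."""
    y = np.zeros_like(x)
    for _ in range(steps):
        y -= lr * (y - x)  # gradient of 0.5||y||^2 - x.y with respect to y
    return y

def zo_hypergrad(x, mu=1e-2, Q=20):
    """Zeroth-order hypergradient: estimate the response change along Q random
    directions and form rank-one Jacobian-transpose-vector products."""
    y = inner_solve(x)
    gy = y - t  # grad_y f at (x, y*(x))
    g = np.zeros_like(x)
    for _ in range(Q):
        u = rng.normal(size=d)
        delta = (inner_solve(x + mu * u) - y) / mu  # finite-difference response
        g += u * (delta @ gy)
    return g / Q

x = rng.normal(size=d)
for _ in range(100):
    x -= 0.1 * zo_hypergrad(x)
print(np.round(x, 2))  # x should approach t = (1, 1, 1, 1)
```

Since y*(x) = x here, the perturbation average recovers the true hypergradient x - t in expectation, so the outer iterate drifts toward t despite never differentiating through the inner solver.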