anchor direction
Exact Pareto Optimal Search for Multi-Task Learning and Multi-Criteria Decision-Making
Mahapatra, Debabrata; Rajan, Vaibhav
Given multiple non-convex objective functions and objective-specific weights, Chebyshev scalarization (CS) is a well-known approach to obtain an Exact Pareto Optimal (EPO) solution, i.e., a solution on the Pareto front (PF) that intersects the ray defined by the inverse of the weights. First-order optimizers that use the CS formulation to find EPO solutions encounter practical problems of oscillation and stagnation that affect convergence. Moreover, when initialized with a Pareto optimal (PO) solution, they do not guarantee a controlled trajectory that lies completely on the PF. These shortcomings lead to modeling limitations and computational inefficiency in multi-task learning (MTL) and multi-criteria decision-making (MCDM) methods that use CS for their underlying non-convex multi-objective optimization (MOO). To address these shortcomings, we design a new MOO method, EPO Search. We prove that EPO Search converges to an EPO solution and empirically illustrate its computational efficiency and robustness to initialization. When initialized on the PF, EPO Search can trace the PF and converge to the required EPO solution at a linear rate. Using EPO Search, we develop new algorithms: PESA-EPO for approximating the PF in a posteriori MCDM, and GP-EPO for preference elicitation in interactive MCDM; experiments on benchmark datasets confirm their advantages over competing alternatives. EPO Search scales linearly with the number of decision variables, which enables its use for training deep networks. Empirical results on real data from personalized medicine, e-commerce and hydrometeorology demonstrate the efficacy of EPO Search for deep MTL.
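To make the CS formulation concrete, the following is a minimal sketch in Python, not the authors' EPO Search. It assumes the standard form of Chebyshev scalarization, max_j r_j * l_j, for losses l and weights r, and minimizes it by brute-force grid search on a hypothetical one-dimensional toy problem with losses x^2 and (x - 1)^2. The function names and the toy problem are illustrative assumptions. At the minimizer the weighted losses r_j * l_j are (approximately) equal, which is the sense in which the solution lies on the ray defined by the inverse of the weights.

```python
import numpy as np

def chebyshev_scalarization(losses, weights):
    """Standard Chebyshev scalarization: the largest weighted loss."""
    losses = np.asarray(losses, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.max(weights * losses, axis=-1)

def toy_losses(x):
    """Hypothetical two-objective toy problem (an assumption, not from the paper)."""
    x = np.asarray(x, dtype=float)
    return np.stack([x ** 2, (x - 1.0) ** 2], axis=-1)

def grid_search_cs(weights, num=301):
    """Minimize the scalarization over x in [0, 1] by grid search.

    This stands in for an optimizer only for illustration; it is not
    EPO Search and does not scale beyond toy problems.
    """
    xs = np.linspace(0.0, 1.0, num)
    values = chebyshev_scalarization(toy_losses(xs), weights)
    best_x = xs[np.argmin(values)]
    return best_x, toy_losses(best_x)

if __name__ == "__main__":
    weights = np.array([1.0, 4.0])
    x_star, l_star = grid_search_cs(weights)
    # The weighted losses should be nearly equal at the minimizer.
    print(x_star, weights * l_star)
```

On a non-convex problem a grid search is no longer practical, and first-order optimizers applied to this max-based objective are where the oscillation and stagnation issues described in the abstract arise.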
A General Framework for Visualizing Embedding Spaces of Neural Survival Analysis Models Based on Angular Information
We propose a general framework for visualizing any intermediate embedding representation used by any neural survival analysis model. Our framework is based on so-called anchor directions in an embedding space. We show how to estimate these anchor directions using clustering or, alternatively, using user-supplied "concepts" defined by collections of raw inputs (e.g., feature vectors all from female patients could encode the concept "female"). For tabular data, we present visualization strategies that reveal how anchor directions relate to raw clinical features and to survival time distributions. We then show how these visualization ideas extend to handling raw inputs that are images. Our framework is built on looking at angles between vectors in an embedding space, where there could be "information loss" by ignoring magnitude information. We show how this loss results in a "clumping" artifact that appears in our visualizations, and how to reduce this information loss in practice.
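The angle-based idea can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the paper's implementation: it assumes that a concept's anchor direction can be estimated as the normalized mean of the unit-normalized embeddings of the concept's inputs, and that an embedding is compared to an anchor direction by cosine similarity. The function names are hypothetical. The last lines show the magnitude "information loss" mentioned in the abstract: rescaling an embedding leaves its cosine similarity unchanged.

```python
import numpy as np

def unit(v, eps=1e-12):
    """Scale vectors along the last axis to unit Euclidean norm."""
    v = np.asarray(v, dtype=float)
    norms = np.linalg.norm(v, axis=-1, keepdims=True)
    return v / np.maximum(norms, eps)

def concept_anchor_direction(concept_embeddings):
    """Assumed estimator: normalized mean of unit-normalized embeddings."""
    return unit(unit(concept_embeddings).mean(axis=0))

def cosine_to_anchor(embeddings, anchor):
    """Cosine similarity between each embedding and one anchor direction."""
    return unit(embeddings) @ unit(anchor)

if __name__ == "__main__":
    # Hypothetical embeddings of inputs that share one concept.
    concept = np.array([[1.0, 0.0], [1.0, 0.2]])
    anchor = concept_anchor_direction(concept)

    embeddings = np.array([[2.0, 0.0], [0.0, 3.0]])
    print(cosine_to_anchor(embeddings, anchor))
    # Magnitude is ignored: scaling the embeddings gives the same cosines.
    print(cosine_to_anchor(10.0 * embeddings, anchor))
```

An anchor direction estimated by clustering could be plugged into `cosine_to_anchor` in the same way, for example by passing a cluster center in place of the concept mean.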