Supplementary Material: Structured Prediction for Conditional Meta-Learning

Neural Information Processing Systems

The Appendix is organized in two main parts: Appendix A proves the formal version of Theorem 1 and elaborates on the connection between structured prediction and conditional meta-learning investigated in this work; Appendix B details the model hyperparameters and reports additional experimental evaluation. We first recall the general formulation of the structured prediction approach of [13], and then show how the conditional meta-learning problem introduced in Section 3 can be cast within this setting.

A.1 General Structured Prediction

In this section we borrow the notation of [13]. Let X, Y and Z be three spaces: respectively, the input, label and output sets of our problem.
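For reference, a minimal sketch of the structured prediction estimator in the style of [13], assuming a kernel k on X and a structure loss △ : Z × Y → R (notation ours; the appendix's formal statement may differ):

```latex
% Sketch of the structured prediction estimator of [13] (notation ours).
% Given training pairs (x_i, y_i)_{i=1}^n, a kernel k on X and a loss
% \triangle : Z \times Y \to \mathbb{R}, the estimator selects
\hat{f}(x) = \operatorname*{argmin}_{z \in Z} \; \sum_{i=1}^{n} \alpha_i(x)\, \triangle(z, y_i),
\qquad \alpha(x) = (K + n\lambda I)^{-1} v(x),
% where K_{ij} = k(x_i, x_j), v(x)_i = k(x, x_i), and \lambda > 0 is a
% regularization parameter.
```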


Review for NeurIPS paper: Structured Prediction for Conditional Meta-Learning

Neural Information Processing Systems

In particular, more task-conditioning methods (e.g., MMAML) are now considered in this paper. However, my major concern has not been addressed: the authors still omit a discussion of multi-task learning. From my perspective, the goal of meta-learning is to generalize knowledge from previous tasks so as to benefit the training of a new task. The setting in this paper allows a new meta-testing task to access all meta-training tasks.


Structured Prediction for Conditional Meta-Learning

Neural Information Processing Systems

The goal of optimization-based meta-learning is to find a single initialization shared across a distribution of tasks to speed up the process of learning new tasks. Conditional meta-learning instead seeks task-specific initializations to better capture complex task distributions and improve performance. However, many existing conditional methods are difficult to generalize and lack theoretical guarantees. In this work, we propose a new perspective on conditional meta-learning via structured prediction. We derive task-adaptive structured meta-learning (TASML), a principled framework that yields task-specific objective functions by weighting meta-training data on target tasks. Our non-parametric approach is model-agnostic and can be combined with existing meta-learning methods to achieve conditioning. Empirically, we show that TASML improves the performance of existing meta-learning models and outperforms the state of the art on benchmark datasets.
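As a rough illustration of the weighting step described above, here is a minimal numpy sketch; the task embeddings, the Gaussian kernel, and the ridge parameter `lam` are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of weighting meta-training tasks by similarity to a
# target task, in the spirit of kernel-based structured prediction.
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    # Pairwise Gaussian kernel between rows of A and rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def task_weights(train_embs, target_emb, lam=0.1):
    """Kernel-ridge weights alpha(D) over meta-training tasks:
    alpha = (K + n*lam*I)^{-1} k(., D)."""
    n = len(train_embs)
    K = gaussian_kernel(train_embs, train_embs)
    v = gaussian_kernel(train_embs, target_emb[None, :])[:, 0]
    return np.linalg.solve(K + n * lam * np.eye(n), v)

# Usage: the weights define a task-specific meta-objective
# sum_i alpha_i * L(model, task_i), biased toward tasks similar to D.
rng = np.random.default_rng(0)
train_embs = rng.normal(size=(50, 16))  # 50 meta-training task embeddings
target_emb = rng.normal(size=16)        # embedding of the target task
alpha = task_weights(train_embs, target_emb)
```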


Review for NeurIPS paper: Structured Prediction for Conditional Meta-Learning

Neural Information Processing Systems

The reviewers agreed that this paper brings an important and relevant contribution to the NeurIPS community, and presents comprehensive experiments to validate the proposed approach. The authors are strongly encouraged to revise the submitted paper according to the feedback in the reviews, including a discussion of multi-task learning, adding the requested clarifications, and fixing typos.


INDIGO: GNN-Based Inductive Knowledge Graph Completion Using Pair-Wise Encoding

Neural Information Processing Systems

The aim of knowledge graph (KG) completion is to extend an incomplete KG with missing triples. Popular approaches based on graph embeddings typically work by first representing the KG in a vector space, and then applying a predefined scoring function to the resulting vectors to complete the KG. These approaches work well in transductive settings, where predicted triples involve only constants seen during training; however, they are not applicable in inductive settings, where the KG on which the model was trained is extended with new constants or merged with other KGs. The use of Graph Neural Networks (GNNs) has recently been proposed as a way to overcome these limitations; however, existing approaches do not fully exploit the capabilities of GNNs and still rely on heuristics and ad hoc scoring functions. In this paper, we propose a novel approach, where the KG is fully encoded into a GNN in a transparent way, and where the predicted triples can be read out directly from the last layer of the GNN without the need for additional components or scoring functions. Our experiments show that our model outperforms state-of-the-art approaches on inductive KG completion benchmarks.
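The following toy sketch contrasts "read predictions off the last layer" with a separate scoring function; the pair-wise encoding and the fixed, untrained update are illustrative simplifications, not INDIGO's actual construction.

```python
# Toy numpy sketch: one GNN node per ordered pair of constants, with
# features indicating which relations hold. NOT the paper's construction.
import numpy as np

constants = ["alice", "bob", "carol"]
relations = ["knows", "worksWith"]
triples = {("alice", "knows", "bob"), ("bob", "worksWith", "carol")}

pairs = [(a, b) for a in constants for b in constants if a != b]

# Node features: indicator of which relations hold for each pair.
H = np.array([[float((a, r, b) in triples) for r in relations]
              for (a, b) in pairs])

# Adjacency: two pair-nodes are neighbours if they share a constant.
A = np.array([[1.0 if (p != q and set(p) & set(q)) else 0.0
               for q in pairs] for p in pairs])
A = A / np.maximum(A.sum(axis=1, keepdims=True), 1.0)  # row-normalize

# One fixed message-passing round (a trained model would learn this map).
H_out = 0.5 * H + 0.5 * (A @ H)

# Read-out: predicted triples come directly from the final node features,
# with no extra scoring function on top.
predicted = [(pairs[i][0], relations[r], pairs[i][1])
             for i in range(len(pairs))
             for r in range(len(relations)) if H_out[i, r] > 0.2]
```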


Reviews: Localized Structured Prediction

Neural Information Processing Systems

The model is learned by breaking the structure into parts and performing kernel ridge regression on the parts. The authors provide an elaborate convergence-rate analysis for the resulting estimator; this theoretical analysis is the strong part of the paper. In many computer vision and NLP applications, however, the latest research is about capturing long-range dependencies. The correlation in Figure 1 is highly concentrated at the central patch because it is the average of many different images, but on individual images the correlation pattern can be very different.
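As a rough schematic of the parts-based construction the review describes (our notation, an assumption about the general shape rather than the paper's exact estimator):

```latex
% Schematic of a parts-based structured prediction estimator (our notation;
% an assumed general shape, not the paper's exact form). Parts are indexed
% by p \in P, with a kernel ridge regression per part.
\hat{f}(x) = \operatorname*{argmin}_{z \in Z} \; \sum_{p \in P} \sum_{i=1}^{n}
  \alpha_i^{p}(x)\, L\big(z_p, [y_i]_p\big),
\qquad \alpha^{p}(x) = (K_p + n\lambda I)^{-1} v_p(x),
% where z_p and [y_i]_p denote the p-th parts of z and y_i, K_p is the
% kernel matrix on the corresponding input parts, and v_p(x)_i = k([x]_p, [x_i]_p).
```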


Reviews: Localized Structured Prediction

Neural Information Processing Systems

The authors propose a general theoretical framework for structured prediction that deals with cases where the data exhibits local structure, so that the inputs and outputs can be decomposed into parts. The reviewers deemed the theoretical contributions to be original and of high quality. The author response addressed the perceived weaknesses, in particular in the empirical evaluation, in a satisfactory way.


Reviews: Linear Relaxations for Finding Diverse Elements in Metric Spaces

Neural Information Processing Systems

Although the proposed algorithm looks impressive both from the theoretical perspective and in the experimental comparison, its substantiation has considerable room for improvement. The major point is the proof of Theorem 1: it is unclear how the theorem follows from Lemmas 3 and 4, since neither lemma is related to the optimal solution of the considered diversity problem. I assume the missing proposition is one that would establish a connection between the linear program in lines 153-154 (incidentally, it is inconvenient that the main formulation is not numbered and therefore cannot easily be referenced) and the diversity problem. I believe this connection may take the following form: if the linear program is equipped with integrality constraints (that is, all variables x_{ir} \in {0,1}), the resulting ILP is equivalent to the considered diversity problem. However, the proof of such a proposition is not obvious to me either.


Linear Relaxations for Finding Diverse Elements in Metric Spaces

Neural Information Processing Systems

Choosing a diverse subset of a large collection of points in a metric space is a fundamental problem, with applications in feature selection, recommender systems, web search, data summarization, etc. Various notions of diversity have been proposed, tailored to different applications. The general algorithmic goal is to find a subset of points that maximizes diversity, while obeying a cardinality (or, more generally, matroid) constraint. The goal of this paper is to develop a novel linear programming (LP) framework that allows us to design approximation algorithms for such problems. We study an objective known as sum-min diversity, which is known to be effective in many applications, and give the first constant-factor approximation algorithm. Our LP framework allows us to easily incorporate additional constraints, as well as secondary objectives. We also prove a hardness result for two natural diversity objectives, under the so-called planted clique assumption. Finally, we study the empirical performance of our algorithm on several standard datasets.
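For concreteness, the sum-min objective mentioned above can be written as follows (notation ours): for a ground set V with metric d and cardinality budget k,

```latex
% Sum-min diversity (notation ours): each selected point contributes its
% distance to the nearest other selected point.
\max_{S \subseteq V,\; |S| = k} \;\; \sum_{u \in S} \; \min_{v \in S \setminus \{u\}} d(u, v)
```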