 Inductive Learning


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

NIPS Neural Information Processing Systems, 8-11th December 2014, Montreal, Canada. Paper ID: 157. Title: Object Localization based on Structural SVM using Privileged Information. Current Reviews: First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The method is effective for the object localization task and results in good improvements in localization accuracy. It looks like the authors' formulation of SSVM+ contains separate slack variables \xi_i for each example x_i and there are extra degrees of freedom. How many alternating iterations are required? When the parameter vectors w and w^* are far from the optimal solution, could this alternating inference procedure get stuck in bad local minima?


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

NIPS Neural Information Processing Systems, 8-11th December 2014, Montreal, Canada. Paper ID: 24. Title: Communication Efficient Distributed Machine Learning with the Parameter Server. Current Reviews: First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper presents improvements on a system for large-scale learning known as the parameter server. The parameter server is designed to perform reliable distributed machine learning in large-scale industrial systems (thousands of nodes). The architecture is based on a bipartite graph composed of servers and workers. Workers compute gradients based on subsets of the training instances, while servers aggregate the workers' gradients, update the shared parameter vector and redistribute it to the workers for the next iteration.
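
The worker/server loop summarized in this review can be sketched in a few lines. This is only a single-process illustration, not the paper's system: the function names (`worker_gradient`, `server_update`, `train`), the least-squares objective, the learning rate, and the toy data are all assumptions, and the workers run sequentially here rather than on separate nodes.

```python
from typing import List, Tuple

Example = Tuple[List[float], float]  # (features, target)


def worker_gradient(w: List[float], shard: List[Example]) -> List[float]:
    """Gradient of the summed squared error over one worker's shard."""
    grad = [0.0] * len(w)
    for x, y in shard:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj
    return grad


def server_update(w: List[float], grads: List[List[float]],
                  lr: float, n: int) -> List[float]:
    """Aggregate the workers' gradients and take one gradient step."""
    total = [sum(g[j] for g in grads) for j in range(len(w))]
    return [wj - lr * tj / n for wj, tj in zip(w, total)]


def train(shards: List[List[Example]], dim: int,
          lr: float = 0.1, iterations: int = 500) -> List[float]:
    w = [0.0] * dim                      # shared parameter vector
    n = sum(len(s) for s in shards)      # total number of training examples
    for _ in range(iterations):
        # Each worker sees the current w and only its own shard.
        grads = [worker_gradient(w, s) for s in shards]
        # The server aggregates, updates w, and w is reused next iteration.
        w = server_update(w, grads, lr, n)
    return w


# Toy data generated from y = 2*x + 1; the second feature is a constant bias.
data = [([x, 1.0], 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0]]
shards = [data[:2], data[2:]]            # two hypothetical workers
w = train(shards, dim=2)
```

Because the server sums the per-shard gradients, the update is the same as full-batch gradient descent on the union of the shards; the fault tolerance and communication savings the paper targets are outside the scope of this sketch.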


Learning to Weight Parameters for Training Data Attribution

arXiv.org Artificial Intelligence

We study gradient-based data attribution, aiming to identify which training examples most influence a given output. Existing methods for this task either treat network parameters uniformly or rely on implicit weighting derived from Hessian approximations, which do not fully model functional heterogeneity of network parameters. To address this, we propose a method to explicitly learn parameter importance weights directly from data, without requiring annotated labels. Our approach improves attribution accuracy across diverse tasks, including image classification, language modeling, and diffusion, and enables fine-grained attribution for concepts like subject and style.


SSTAG: Structure-Aware Self-Supervised Learning Method for Text-Attributed Graphs

arXiv.org Artificial Intelligence

Large-scale pretrained models have revolutionized Natural Language Processing (NLP) and Computer Vision (CV), showcasing remarkable cross-domain generalization abilities. However, in graph learning, models are typically trained on individual graph datasets, limiting their capacity to transfer knowledge across different graphs and tasks. This approach also relies heavily on large volumes of annotated data, which presents a significant challenge in resource-constrained settings. Unlike NLP and CV, graph-structured data presents unique challenges due to its inherent heterogeneity, including domain-specific feature spaces and structural diversity across various applications. To address these challenges, we propose a novel structure-aware self-supervised learning method for Text-Attributed Graphs (SSTAG). By leveraging text as a unified representation medium for graph learning, SSTAG bridges the gap between the semantic reasoning of Large Language Models (LLMs) and the structural modeling capabilities of Graph Neural Networks (GNNs). Our approach introduces a dual knowledge distillation framework that co-distills both LLMs and GNNs into structure-aware multilayer perceptrons (MLPs), enhancing the scalability of large-scale TAGs. Additionally, we introduce an in-memory mechanism that stores typical graph representations, aligning them with memory anchors in an in-memory repository to integrate invariant knowledge, thereby improving the model's generalization ability. Extensive experiments demonstrate that SSTAG outperforms state-of-the-art models on cross-domain transfer learning tasks, achieves exceptional scalability, and reduces inference costs while maintaining competitive performance.


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper proposes an incremental but very sensible and practical modification to 'curriculum learning'. Given a partition of the training examples into classes, the authors propose an additional regularising term (and an additional parameter) to ensure that the 'easy' examples selected during learning are spread across the classes, and not from one class. The partition into classes can come from a clustering algorithm, or from a priori knowledge. The idea is straightforward and sensible, and the authors propose an algorithm that looks efficient and correct.
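
One way to picture a selection step of this kind is a per-group, rank-discounted loss threshold: the easiest example in every group gets a generous threshold, and each further example from the same group gets a tighter one. The sketch below is an assumption made for illustration; the review does not state the authors' exact rule, and the function name `select_easy_diverse`, the parameters `lam` and `gamma`, and the toy losses are hypothetical.

```python
import math
from typing import Dict, List


def select_easy_diverse(losses: List[float], groups: List[str],
                        lam: float, gamma: float) -> List[int]:
    """Return indices of selected examples.

    lam   -- base 'easiness' threshold on the loss (assumed parameter)
    gamma -- strength of the diversity bonus (assumed parameter)
    """
    by_group: Dict[str, List[int]] = {}
    for idx, g in enumerate(groups):
        by_group.setdefault(g, []).append(idx)

    selected: List[int] = []
    for members in by_group.values():
        members.sort(key=lambda i: losses[i])        # easiest first
        for rank, i in enumerate(members, start=1):
            # The bonus shrinks with rank, so later picks from the same
            # group must be easier to qualify.
            threshold = lam + gamma / (math.sqrt(rank) + math.sqrt(rank - 1))
            if losses[i] < threshold:
                selected.append(i)
    return sorted(selected)


losses = [0.1, 0.3, 0.4, 0.9, 1.0]
groups = ["a", "a", "a", "b", "b"]

# No diversity term: only the low-loss group "a" is selected.
plain = select_easy_diverse(losses, groups, lam=0.5, gamma=0.0)

# With the diversity term, one example from group "b" replaces the
# third example from group "a".
diverse = select_easy_diverse(losses, groups, lam=0.05, gamma=0.9)
```

With `gamma=0.0` the rule reduces to a plain loss threshold, which matches the failure mode the review describes: all selected 'easy' examples come from one class.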




Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper focuses on how to choose optimal training examples for people learning to discriminate categories. The authors develop an optimal teacher model that selects training examples in order to minimize generalization error, assuming that people make classification decisions in accordance with the GCM, a widely used categorization model. They test their model with an experiment and find that the best teacher is one that assumes that people have a limited memory capacity that only allows them to retrieve a few previous examples to compare to a new item. This teacher chooses idealized training sets rather than representative ones.
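
For readers unfamiliar with the GCM (Generalized Context Model), the classification rule can be sketched as follows: the probability of a category is the summed similarity of the new item to that category's stored exemplars, divided by the summed similarity to all stored exemplars. The sketch assumes one-dimensional stimuli, an exponential similarity function with sensitivity `c`, and a limited-memory learner modeled as recalling only the most recent `memory` exemplars. That last choice is an assumption; the review does not say how the paper formalizes limited memory, and the function name `gcm_prob` is hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Exemplar = Tuple[float, str]  # (stimulus value, category label)


def gcm_prob(x: float, exemplars: List[Exemplar], category: str,
             c: float = 1.0, memory: Optional[int] = None) -> float:
    """Probability that stimulus x is assigned to `category`."""
    if memory is None:
        recalled = exemplars
    else:
        # Assumed memory limit: keep only the most recent exemplars.
        recalled = exemplars[max(0, len(exemplars) - memory):]
    if not recalled:
        raise ValueError("at least one recalled exemplar is required")

    summed: Dict[str, float] = {}
    for value, label in recalled:
        similarity = math.exp(-c * abs(x - value))   # decays with distance
        summed[label] = summed.get(label, 0.0) + similarity
    return summed.get(category, 0.0) / sum(summed.values())


training = [(0.0, "A"), (1.0, "A"), (4.0, "B"), (5.0, "B")]
p_full = gcm_prob(2.0, training, "A")                # all exemplars recalled
p_limited = gcm_prob(2.0, training, "A", memory=2)   # only the last two
```

In this toy run the limited-memory learner recalls only category "B" exemplars, so `p_limited` is zero while `p_full` is well above one half. That sensitivity to which few examples are retrievable is the kind of effect a teacher model would need to account for when choosing a training set.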