Goto

Collaborating Authors

 Supervised Learning


Clone-Resistant Weights in Metric Spaces: A Framework for Handling Redundancy Bias

arXiv.org Artificial Intelligence

We are given a set of elements in a metric space. The distribution of the elements is arbitrary, possibly adversarial. Can we weigh the elements in a way that is resistant to such (adversarial) manipulations? This problem arises in various contexts. For instance, the elements could represent data points, requiring robust domain adaptation. Alternatively, they might represent tasks to be aggregated into a benchmark; or questions about personal political opinions in voting advice applications. This article introduces a theoretical framework for dealing with such problems. We propose clone-proof representation functions as a solution concept. These functions distribute importance across elements of a set such that similar objects (``clones'') share (some of) their weights, thus avoiding a potential bias introduced by their multiplicity. Our framework extends the maximum uncertainty principle to accommodate general metric spaces and includes a set of axioms - symmetry, continuity, and clone-proofness - that guide the construction of representation functions. Finally, we address the existence of representation functions satisfying our axioms in the significant case of Euclidean spaces and propose a general method for their construction.


Reviews: Graph Structured Prediction Energy Networks

Neural Information Processing Systems

Based on structured SVM, the authors combine the structured prediction and the learning using hinge loss, the results is a novel model, Graph Structured Prediction Energy Networks. Overall the model is novel and the theory is mostly solid. However, I have some concerns about the inference part. 1. Marginal Polytope. The relaxation of the marginal polytope is always tricky for structured prediction. A loose relaxation might result in an efficient algorithm, but the bad quality of the solution.


Reviews: Graph Structured Prediction Energy Networks

Neural Information Processing Systems

All the reviewers thought that generalizing the structured prediction energy network (SPEN) to incorporate factored potentials (following graph structure) with proposed approximate inference schemes for structured prediction make a nice contribution to NeurIPS. The extensive experiments were lauded, but concerns were expressed with the theoretical backing of the methods. After discussion and looking at the paper, the AC agrees with R2 that the paper makes an interesting practical contribution, and that the theory could be clarified in follow-up work. The authors should include their timing results as well as additional clarification from the rebuttal in their camera ready version. Additional side notes: - [*] from the rebuttal should be mentioned in the main paper as a way to handle the entropy term over the marginal polytope in a principled manner with Frank-Wolfe.


Invariant Kernels: Rank Stabilization and Generalization Across Dimensions

arXiv.org Machine Learning

Symmetry arises often when learning from high dimensional data. For example, data sets consisting of point clouds, graphs, and unordered sets appear routinely in contemporary applications, and exhibit rich underlying symmetries. Understanding the benefits of symmetry on the statistical and numerical efficiency of learning algorithms is an active area of research. In this work, we show that symmetry has a pronounced impact on the rank of kernel matrices. Specifically, we compute the rank of a polynomial kernel of fixed degree that is invariant under various groups acting independently on its two arguments. In concrete circumstances, including the three aforementioned examples, symmetry dramatically decreases the rank making it independent of the data dimension. In such settings, we show that a simple regression procedure is minimax optimal for estimating an invariant polynomial from finitely many samples drawn across different dimensions. We complete the paper with numerical experiments that illustrate our findings.



Reviews: Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Neural Information Processing Systems

The paper is clear, and makes efforts to highlight the behavior of the proposed algorithm (value of the regret bound for some specific settings, experiments measuring the impact of the metric). The comparison to other settings may still be enforced. In the experimental part, I would also appreciate the comparison to include some state of the art algorithms. What would be the empirical results of a gaussian process-based bandit? It would also be interesting to have results on datasets used by other contextual/similarity-based bandits (except that these datasets use a context in R d). Finally, it's surprising to have a context space of dimension 1. Extending the algorithm to R d setting seems strait-forward.


Reviews: Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Neural Information Processing Systems

This paper proposed a nonparametric approach to contextual bandits that can adapt to unknown simple structure between the arms. The reviewers found the paper to be novel in how it combined existing ideas to solve an interesting problem. Additionally, the reviewers found the paper to be clearly written and of potential significance for the problem they tacked. Some minor concerns were raised, however, the authors seem to have addressed them in their response. The authors should incorporate any suggested edits by the reviewers as well as the promised updates in the author response.


Review for NeurIPS paper: Fairness constraints can help exact inference in structured prediction

Neural Information Processing Systems

My biggest concern is with the way that \epsilon_1 behaves depending on n. Firstly, it seems the choice of the value -n for \rho is arbitrary (with the choice being repercuted in the definition of \epsilon), and this should be discussed more clearly in the text. Next, it is not clear to me why the choice \rho -n is the best. Does it optimize \epsilon_1 in some way? Furthermore, as n tends to \infty, it seems that \epsilon_1 does NOT tend to infinity.


Review for NeurIPS paper: Fairness constraints can help exact inference in structured prediction

Neural Information Processing Systems

This paper is about structures output predictions analysis under'fairness' constraint. This paper shows that constraints relative to fairness can help to increases accuracies. Fairness is one of the notion whose importance is rising in our community, and this paper give interesting insights about it. One of the main issue raised by one of the reviewer that pleads for non acceptance is the "vagueness" of the definition of fairness here. I personally think that this issue should not be taken to much into account here, there is still in our community some "vagueness" according to what the good definition should be.


Review for NeurIPS paper: Provably adaptive reinforcement learning in metric spaces

Neural Information Processing Systems

Summary and Contributions: This paper studies reinforcement learning (RL) problems on large state and action spaces that are endowed with a metric. They key assumption is that the optimal state-action value function, Q*, is Lipschitz smooth with respect to that metric. The setting is that of an episodic, H-stage Markov decision process, in which the learner must choose an action for each stage of each episode while achieving low regret against an optimal policy. Previously, an algorithm was proposed for this problem based on learning the Q* function on an adaptive discretization of the state-action space that becomes steadily finer on important regions of the space. The regret of this algorithm was thought to depend on the packing number of the state-action space.