Goto

Collaborating Authors

 reduction






Sample Complexity of Forecast Aggregation

Neural Information Processing Systems

We consider a Bayesian forecast aggregation model where n experts, after observing private signals about an unknown binary event, report th eir posterior beliefs about the event to a principal, who then aggregates the repor ts into a single prediction for the event. The signals of the experts and the outcome of the event follow a joint distribution that is unknown to the principal, but th e principal has access to i.i.d. "samples" from the distribution, where each sampl e is a tuple of the experts' reports (not signals) and the realization of the even t. Using these samples, the principal aims to find an ε -approximately optimal aggregator, where optimal-ity is measured in terms of the expected squared distance bet ween the aggregated prediction and the realization of the event.





SNEkhorn: Dimension Reduction with Symmetric Entropic Affinities

Neural Information Processing Systems

Many approaches in machine learning rely on a weighted graph to encode thesimilarities between samples in a dataset. Entropic affinities (EAs), which are notably used in the popular Dimensionality Reduction (DR) algorithm t-SNE, are particular instances of such graphs. To ensure robustness to heterogeneous sampling densities, EAs assign a kernel bandwidth parameter to every sample in such a way that the entropy of each row in the affinity matrix is kept constant at a specific value, whose exponential is known as perplexity. EAs are inherently asymmetric and row-wise stochastic, but they are used in DR approaches after undergoing heuristic symmetrization methods that violate both the row-wise constant entropy and stochasticity properties. In this work, we uncover a novel characterization of EA as an optimal transport problem, allowing a natural symmetrization that can be computed efficiently using dual ascent. The corresponding novel affinity matrix derives advantages from symmetric doubly stochastic normalization in terms of clustering performance, while also effectively controlling the entropy of each row thus making it particularly robust to varying noise levels. Following, we present a new DR algorithm, SNEkhorn, that leverages this new affinity matrix. We show its clear superiority to state-of-the-art approaches with several indicators on both synthetic and real-world datasets.


On Approximate Computation of Critical Points

Ahmadi, Amir Ali, Hall, Georgina

arXiv.org Machine Learning

We show that computing even very coarse approximations of critical points is intractable for simple classes of nonconvex functions. More concretely, we prove that if there exists a polynomial-time algorithm that takes as input a polynomial in $n$ variables of constant degree (as low as three) and outputs a point whose gradient has Euclidean norm at most $2^n$ whenever the polynomial has a critical point, then P=NP. The algorithm is permitted to return an arbitrary point when no critical point exists. We also prove hardness results for approximate computation of critical points under additional structural assumptions, including settings in which existence and uniqueness of a critical point are guaranteed, the function is lower bounded, and approximation is measured in terms of distance to a critical point. Overall, our results stand in contrast to the commonly-held belief that, in nonconvex optimization, approximate computation of critical points is a tractable task.