Representation Of Examples
Review for NeurIPS paper: Provably adaptive reinforcement learning in metric spaces
This paper is about model-free RL where the state-action space is a metric space. An improved analysis of an existing algorithm (with some modifications) is shown to achieve a regret that scales with the zooming dimension of the metric space, instead of the covering dimension. A general consensus among reviewers emerged that this theoretical RL paper is well executed, and provides a reasonable though not groundbreaking contribution to the RL literature.
Reviews: Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs
Except that the presence of each edge is probabilistic rather than deterministic, the core idea is quite similar to Isomap. The novelty should be better addressed by comparing to Isomap. The independence assumption on edges is also questionable. For example, edges between words that frequently co-occur in the same contexts are not independent of each other. Edges between pixels in small coherent regions are not independent. Do we eventually need to know such dependency structures a priori to correctly represent arbitrary geometry in the data?
Reviews: Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs
The paper proposes a quite interesting idea of representing data by weighted graphs (shortest paths between nodes). Reviewers have raised concerns about edge dependency and the given similarity metric. However, I'm less worried about the independence assumption because, after all, it's a model, and it seems to work well in experiments. Likewise, it is common in variational inference to use independent distributions to approximate a graphical model, based on which learning is carried out. What interests me more is the general methodology of optimization.
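To make the "shortest path between nodes" representation concrete, the following is a minimal sketch of how a weighted graph induces distances between data points. It is an illustration only, not the paper's method: the function name `shortest_path_distances` and the toy edge list are hypothetical, edges are treated as deterministic, and no learning of edge weights or edge probabilities is modeled.

```python
import numpy as np

def shortest_path_distances(n, edges):
    """All-pairs shortest-path distances on an undirected weighted graph.

    n     -- number of nodes (one node per data point)
    edges -- iterable of (i, j, weight) triples with non-negative weights
    """
    D = np.full((n, n), np.inf)
    np.fill_diagonal(D, 0.0)
    for i, j, w in edges:
        # Keep the lightest edge if a pair is listed more than once.
        D[i, j] = D[j, i] = min(D[i, j], w)
    # Floyd-Warshall: allow paths through intermediate node k, one k at a time.
    for k in range(n):
        D = np.minimum(D, D[:, k:k + 1] + D[k:k + 1, :])
    return D

# Toy graph (hypothetical): the direct edge 0-2 is longer than the path 0-1-2.
toy_edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0)]
D = shortest_path_distances(3, toy_edges)
print(D)
```

The resulting matrix is symmetric and satisfies the triangle inequality by construction, which is why a sparse graph can stand in for a metric over the data.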
Reviews: Linear Relaxations for Finding Diverse Elements in Metric Spaces
Although the proposed algorithm looks impressive both from the theoretical perspective and in the experimental comparison, its substantiation has considerable room for improvement. The major point is the proof of Theorem 1: - it is unclear how the proof of the theorem follows from Lemmas 3 and 4, since neither of these lemmas is related to the optimal solution of the considered diversity problem. I assume that the missing proposition is one that would establish a connection between the linear program in lines 153-154 (by the way, it is very inconvenient that the main formulation is not numbered and therefore cannot be easily referenced) and the diversity problem. I believe that this connection may have the following form: if the linear program is equipped with integrality constraints (that is, all variables x_{ir} \in \{0,1\}), the resulting ILP is equivalent to the considered diversity problem. Indeed, the proof of such a proposition is not obvious to me either.
Reviews: Active Nearest-Neighbor Learning in Metric Spaces
I am not qualified to evaluate this work in terms of its relevance within the literature. Therefore my judgment is only about the paper content itself. Also, I have only reviewed the proofs contained in the main paper and the one of Lemma A.1. Theorem 3.2 guarantees a significant improvement upon the passive learner characterized by Theorem 3.1. I find the example in lines 141-143 about the 1/sqrt(m) order very helpful, and I suggest that the authors include it in the introduction as well.
A Metric Topology of Deep Learning for Data Classification
Wu, Jwo-Yuh, Huang, Liang-Chi, Li, Wen-Hsuan, Liu, Chun-Hung
Empirically, Deep Learning (DL) has demonstrated unprecedented success in practical applications. However, DL remains by and large a mysterious "black-box", spurring recent theoretical research to build its mathematical foundations. In this paper, we investigate DL for data classification through the prism of metric topology. Considering that the conventional Euclidean metric over the network parameter space typically fails to discriminate DL networks according to their classification outcomes, we propose from a probabilistic point of view a meaningful distance measure, whereby DL networks yielding similar classification performances are close. The proposed distance measure defines an equivalence relation among network parameter vectors such that networks performing equally well belong to the same equivalence class. Interestingly, our proposed distance measure can provably serve as a metric on the quotient set modulo the equivalence relation. Then, under quite mild conditions it is shown that, apart from a vanishingly small subset of networks likely to predict non-unique labels, our proposed metric space is compact, and coincides with the well-known quotient topological space. Our study contributes to fundamental understanding of DL, and opens up new ways of studying DL using fruitful metric space theory.
Is magnitude 'generically continuous' for finite metric spaces?
Katsumasa, Hirokazu, Roff, Emily, Yoshinaga, Masahiko
Magnitude is a real-valued invariant of metric spaces which, in the finite setting, can be understood as recording the 'effective number of points' in a space as the scale of the metric varies. Motivated by applications in topological data analysis, this paper investigates the stability of magnitude: its continuity properties with respect to the Gromov-Hausdorff topology. We show that magnitude is nowhere continuous on the Gromov-Hausdorff space of finite metric spaces. Yet, we find evidence to suggest that it may be 'generically continuous', in the sense that generic Gromov-Hausdorff limits are preserved by magnitude. We make the case that, in fact, 'generic stability' is what matters for applicability.
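As an illustration of the 'effective number of points' reading, here is a minimal sketch of the standard definition of magnitude for a finite metric space: form the similarity matrix Z with entries exp(-t d(i, j)), solve Z w = 1 for a weighting w, and sum the weights. The function name `magnitude` and the two-point example are assumptions made for illustration, and the sketch assumes Z is invertible, which need not hold for every finite metric space and scale.

```python
import numpy as np

def magnitude(D, t=1.0):
    """Magnitude of the finite metric space with distance matrix D at scale t.

    Assumes the similarity matrix Z = exp(-t * D) is invertible.
    """
    D = np.asarray(D, dtype=float)
    Z = np.exp(-t * D)                        # similarity matrix
    w = np.linalg.solve(Z, np.ones(len(D)))   # weighting: Z w = 1
    return float(w.sum())

# Two points at distance 2: the closed form is 1 + tanh(t * d / 2).
D_two = np.array([[0.0, 2.0],
                  [2.0, 0.0]])
for t in (0.01, 1.0, 100.0):
    print(t, magnitude(D_two, t))
```

At small scales the two points are nearly indistinguishable and the magnitude is close to 1; at large scales they are far apart and the magnitude approaches 2.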
A cohomology-based Gromov-Hausdorff metric approach for quantifying molecular similarity
Wee, JunJie, Gong, Xue, Tuschmann, Wilderich, Xia, Kelin
We introduce, for the first time, a cohomology-based Gromov-Hausdorff ultrametric method to analyze 1-dimensional and higher-dimensional (co)homology groups, focusing on loops, voids, and higher-dimensional cavity structures in simplicial complexes, to address typical clustering questions arising in molecular data analysis. The Gromov-Hausdorff distance quantifies the dissimilarity between two metric spaces. In this framework, molecules are represented as simplicial complexes, and their cohomology vector spaces are computed to capture intrinsic topological invariants encoding loop and cavity structures. These vector spaces are equipped with a suitable distance measure, enabling the computation of the Gromov-Hausdorff ultrametric to evaluate structural dissimilarities. We demonstrate the methodology using organic-inorganic halide perovskite (OIHP) structures. The results highlight the effectiveness of this approach in clustering various molecular structures. By incorporating geometric information, our method provides deeper insights compared to traditional persistent homology techniques.
The Magnitude of Categories of Texts Enriched by Language Models
Bradley, Tai-Danae, Vigneaux, Juan Pablo
The purpose of this article is twofold. Firstly, we use the next-token probabilities given by a language model to explicitly define a $[0,1]$-enrichment of a category of texts in natural language, in the sense of Bradley, Terilla, and Vlassopoulos. We consider explicitly the terminating conditions for text generation and determine when the enrichment itself can be interpreted as a probability over texts. Secondly, we compute the M\"obius function and the magnitude of an associated generalized metric space $\mathcal{M}$ of texts using a combinatorial version of these quantities recently introduced by Vigneaux. The magnitude function $f(t)$ of $\mathcal{M}$ is a sum over texts $x$ (prompts) of the Tsallis $t$-entropies of the next-token probability distributions $p(-|x)$ plus the cardinality of the model's possible outputs. The derivative of $f$ at $t=1$ recovers a sum of Shannon entropies, which justifies seeing magnitude as a partition function. Following Leinster and Schulman, we also express the magnitude function of $\mathcal{M}$ as an Euler characteristic of magnitude homology and provide an explicit description of the zeroth and first magnitude homology groups.
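The link between Tsallis and Shannon entropies mentioned in the abstract can be checked numerically. The sketch below uses the common convention $H_t(p) = (1 - \sum_i p_i^t)/(t-1)$, which tends to the Shannon entropy as $t \to 1$. The exact normalization used in the article is not given in the abstract, so this convention, the function names, and the example distribution are assumptions for illustration only.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (natural log) of a discrete distribution p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # 0 * log(0) is treated as 0
    return float(-(p * np.log(p)).sum())

def tsallis_entropy(p, t):
    """Tsallis t-entropy, assumed convention: (1 - sum_i p_i**t) / (t - 1)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if t == 1.0:
        return shannon_entropy(p)     # limiting case t -> 1
    return float((1.0 - (p ** t).sum()) / (t - 1.0))

# Hypothetical next-token distribution over three tokens.
p_next = [0.5, 0.25, 0.25]
for t in (2.0, 1.1, 1.001):
    print(t, tsallis_entropy(p_next, t))
print("Shannon:", shannon_entropy(p_next))
```

As t approaches 1 the printed Tsallis values approach the Shannon value, which is the behaviour the abstract relies on when relating magnitude to a sum of Shannon entropies.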