induction hypothesis
Convexity in Disguise: A Theoretical Framework for Nonconvex Low-Rank Matrix Estimation
Nonconvex methods have emerged as a dominant approach for low-rank matrix estimation, a problem that arises widely in machine learning and AI for learning and representing high-dimensional data. Existing analyses for these methods often require additional regularization to mitigate nonconvexity, even though such regularization is often unnecessary in practice. Moreover, most analyses rely on problem-specific arguments that are difficult to generalize to more complex settings. In this paper, we develop a theoretical framework for studying nonconvex procedures across a broad class of low-rank matrix estimation problems. Rather than focusing on a specific model, we reveal a fundamental mechanism that explains why nonconvex procedures can behave well in low-rank estimation. Our key device is a benign regularizer that does not alter the original update rule, but yields an equivalent locally strongly convex formulation of the algorithm. This perspective uncovers a disguised convexity inherent in the nonconvex procedure and provides a new route to theoretical guarantees for nonconvex low-rank matrix estimation.
Fitting trees to ℓ1-hyperbolic distances
Building trees to represent or to fit distances is a critical component of phylogenetic analysis, metric embeddings, approximation algorithms, geometric graph neural nets, and the analysis of hierarchical data. Much of the previous algorithmic work, however, has focused on generic metric spaces (i.e., those with no a priori constraints). Leveraging several ideas from the mathematical analysis of hyperbolic geometry and geometric group theory, we study the tree fitting problem as finding the relation between the hyperbolicity (ultrametricity) vector and the error of tree (ultrametric) embedding. That is, we define a vector of hyperbolicity (ultrametric) values over all triples of points and compare the ℓp norms of this vector with the ℓq norm of the distortion of the best tree fit to the distances. This formulation allows us to define the average hyperbolicity (ultrametricity) in terms of a normalized ℓ1 norm of the hyperbolicity vector. Furthermore, we can interpret the classical tree fitting result of Gromov as a p = q = ∞ result. We present an algorithm HCCROOTEDTREEFIT such that the ℓ1 error of the output embedding is analytically bounded in terms of the ℓ1 norm of the hyperbolicity vector (i.e., p = q = 1) and that this result is tight. Furthermore, this algorithm has significantly different theoretical and empirical performance as compared to Gromov's result and related algorithms.
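The hyperbolicity vector described above can be illustrated with a minimal sketch. It assumes the value attached to a triple is the gap between its two smallest Gromov products with respect to a fixed base point `w`; the paper's exact definition and normalization may differ. The function names and the nested-list distance matrix are hypothetical choices made for illustration.

```python
from itertools import combinations


def gromov_product(d, w, x, y):
    """Gromov product of x and y with respect to the base point w."""
    return 0.5 * (d[x][w] + d[y][w] - d[x][y])


def hyperbolicity_vector(d, w):
    """One value per unordered triple of points, relative to base point w.

    Assumed definition: the gap between the two smallest Gromov products
    of the triple. This gap is zero for every triple of a tree metric.
    """
    values = []
    for x, y, z in combinations(range(len(d)), 3):
        products = sorted((
            gromov_product(d, w, x, y),
            gromov_product(d, w, x, z),
            gromov_product(d, w, y, z),
        ))
        values.append(products[1] - products[0])
    return values


def average_hyperbolicity(d, w):
    """Normalized l1 norm of the hyperbolicity vector (mean over triples)."""
    values = hyperbolicity_vector(d, w)
    return sum(values) / len(values)


def max_hyperbolicity(d, w):
    """l-infinity norm of the hyperbolicity vector."""
    return max(hyperbolicity_vector(d, w))


# Four points on a line (a tree metric) and a unit 4-cycle (not a tree metric).
path = [[abs(i - j) for j in range(4)] for i in range(4)]
cycle = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
```

On these two inputs the path metric gives an all-zero vector, while the 4-cycle has a single nonzero triple, so its average (ℓ1) value is much smaller than its maximum (ℓ∞) value.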
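is as powerful as CWL with the generalised update rule $\mathrm{HASH}\bigl(c^t_\sigma, c^t_{\mathcal{B}}(\sigma), c^t_{\mathcal{C}}(\sigma), c^t_{\downarrow}(\sigma), c^t_{\uparrow}(\sigma)\bigr)$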
A.1 Cellular WL Results
In this section, we assume basic familiarity with the WL test and its higher-order variants. For an introduction to these topics, we refer the reader to the survey of Sato [62]. We begin by introducing a few useful concepts. A cellular colouring is a map $c$ that maps a cell complex $X$ and one of its cells $\sigma$ to a colour from a fixed colour palette. Let $X, Y$ be two regular cell complexes and $c$ a cellular colouring. We say that $X, Y$ are $c$-similar, denoted by $c^X = c^Y$, if the number of cells in $X$ coloured with a given colour equals the number of cells in $Y$ with the same colour. Otherwise, we have $c^X \neq c^Y$. We emphasise that in this paper we are interested only in colourings $c$ with the property that any two isomorphic cell complexes are $c$-similar. A cellular colouring $c$ refines a cellular colouring $d$, denoted by $c \sqsubseteq d$, if for all cell complexes $X$ and $Y$ and all $\sigma \in P_X$ and $\tau \in P_Y$, $c^X_\sigma = c^Y_\tau$ implies $d^X_\sigma = d^Y_\tau$. Additionally, if $d \sqsubseteq c$, we say the two colourings are equivalent and we represent it by $c \equiv d$. We state the following result from Bodnar et al. [8] about simplicial colourings, which we translate here directly to cell complexes. The proof is, however, identical, and we refer the reader to their work for that. Let $X, Y$ be any regular cellular complexes with $A \subseteq P_X$ and $B \subseteq P_Y$. Consider two cellular colourings $c, d$ such that $c \sqsubseteq d$.
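The two notions defined above, c-similarity and refinement, can be made concrete with a small sketch. It assumes a colouring of one complex is stored as a dictionary from cell identifiers to colours, which is a hypothetical representation chosen for illustration. The definition of refinement quantifies over all pairs of cell complexes, whereas the helper below checks only one given pair.

```python
from collections import Counter


def colour_histogram(colouring):
    """Count how many cells of one complex receive each colour."""
    return Counter(colouring.values())


def c_similar(colouring_x, colouring_y):
    """Two complexes are c-similar if their colour counts agree."""
    return colour_histogram(colouring_x) == colour_histogram(colouring_y)


def refines_on_pair(c_x, c_y, d_x, d_y):
    """Check the refinement condition for c over d on a single pair (X, Y).

    For every cell s of X and t of Y: equal c-colours must imply equal
    d-colours. This is only the restriction of the definition to one pair.
    """
    for s, colour_s in c_x.items():
        for t, colour_t in c_y.items():
            if colour_s == colour_t and d_x[s] != d_y[t]:
                return False
    return True


# Hypothetical toy complexes: two vertices and one edge each.
c_x = {"v0": "a", "v1": "a", "e0": "b"}
c_y = {"u0": "a", "u1": "a", "f0": "b"}
# A coarser colouring that gives every cell the same colour.
d_x = {cell: "z" for cell in c_x}
d_y = {cell: "z" for cell in c_y}
```

Here `c` separates vertices from edges while `d` does not, so `c` passes the refinement check against `d` on this pair and `d` fails it against `c`.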
Assumptions and Likelihoods in More Detail
A.1 Notation
Let $T$ be a failure time with CDF $F$. $T$'s survival function is defined by $\bar{F} = 1 - F$. We denote failure models by $F_{\theta_T}$. Let $C$ be a censoring time with CDF $G$, survival function $\bar{G}$, and model $G_{\theta_C}$. Under right-censoring, define $U = \min(T, C)$ and $\Delta = 1[T \le C]$; we observe $(X_i, U_i, \Delta_i)$. We use G(t) to denote P(C t).
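As a small illustration of the right-censoring notation, the sketch below builds the observed pairs from latent failure and censoring times. It assumes the event indicator equals 1 exactly when the failure is observed, that is, when the failure time does not exceed the censoring time. The function name is hypothetical, and covariates are omitted.

```python
def observe_right_censored(failure_times, censoring_times):
    """Return (U, Delta) pairs from latent failure and censoring times.

    U is the observed time min(T, C); Delta is 1 when the failure time
    is observed (assumed convention: T <= C) and 0 when it is censored.
    """
    observations = []
    for t, c in zip(failure_times, censoring_times):
        u = min(t, c)
        delta = 1 if t <= c else 0
        observations.append((u, delta))
    return observations


# Three subjects: observed failure, censored, and a tie counted as a failure.
observed = observe_right_censored([2.0, 5.0, 3.0], [4.0, 1.0, 3.0])
```

In practice only the pairs in `observed` (together with covariates) are available; the latent times passed in here are never seen jointly.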
Logical Characterizations of Recurrent Graph Neural Networks with Reals and Floats
In pioneering work from 2019, Barceló and coauthors identified logics that precisely match the expressive power of constant iteration-depth graph neural networks (GNNs) relative to properties definable in first-order logic. In this article, we give exact logical characterizations of recurrent GNNs in two scenarios: (1) in the setting with floating-point numbers and (2) with reals. For floats, the formalism matching recurrent GNNs is a rule-based modal logic with counting, while for reals we use a suitable infinitary modal logic, also with counting. These results give exact matches between logics and GNNs in the recurrent setting without relativising to a background logic in either case, but using some natural assumptions about floating-point arithmetic. Applying our characterizations, we also prove that, relative to graph properties definable in monadic second-order logic (MSO), our infinitary and rule-based logics are equally expressive. This implies that recurrent GNNs with reals and floats have the same expressive power over MSO-definable properties and shows that, for such properties, also recurrent GNNs with reals are characterized by a (finitary!)