to

### A Metric Space for Point Process Excitations

A multivariate Hawkes process enables self- and cross-excitations through a triggering matrix that behaves like an asymmetrical covariance structure, characterizing pairwise interactions between the event types. Full-rank estimation of all interactions is often infeasible in empirical settings. Models that specialize on a spatiotemporal application alleviate this obstacle by exploiting spatial locality, allowing the dyadic relationships between events to depend only on separation in time and relative distances in real Euclidean space. Here we generalize this framework to any multivariate Hawkes process, and harness it as a vessel for embedding arbitrary event types in a hidden metric space. Specifically, we propose a Hidden Hawkes Geometry (HHG) model to uncover the hidden geometry between event excitations in a multivariate point process. The low dimensionality of the embedding regularizes the structure of the inferred interactions. We develop a number of estimators and validate the model by conducting several experiments. In particular, we investigate regional infectivity dynamics of COVID-19 in an early South Korean record and recent Los Angeles confirmed cases. By additionally performing synthetic experiments on short records as well as explorations into options markets and the Ebola epidemic, we demonstrate that learning the embedding alongside a point process uncovers salient interactions in a broad range of applications.
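To make the idea of distance-dependent excitation concrete, the sketch below evaluates a multivariate Hawkes intensity in which the influence of a past event on each event type decays with the distance between their latent positions. It is an illustration only: the Gaussian distance kernel, the exponential time kernel, and the names `hawkes_intensity`, `mu`, `z`, and `decay` are assumptions made here, not the HHG model's actual parameterization or estimators.

```python
import numpy as np

def hawkes_intensity(t, event_times, event_types, mu, z, decay=1.0):
    """Intensity of every event type at time t (illustrative sketch).

    mu: (n_types,) baseline rates.
    z:  (n_types, d) latent positions of the event types (assumed given here).
    The excitation from a past event of type j onto type i is
    exp(-||z_i - z_j||^2 / 2) * decay * exp(-decay * (t - t_k)),
    an assumed kernel choice, not the one from the paper.
    """
    event_times = np.asarray(event_times, dtype=float)
    event_types = np.asarray(event_types, dtype=int)
    mu = np.asarray(mu, dtype=float)
    z = np.asarray(z, dtype=float)

    past = event_times < t                 # only earlier events excite
    dt = t - event_times[past]
    src = event_types[past]

    # Squared latent distance from every type to the type of each past event.
    diff = z[:, None, :] - z[src][None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)   # shape (n_types, n_past)

    spatial = np.exp(-0.5 * sq_dist)
    temporal = decay * np.exp(-decay * dt)
    return mu + (spatial * temporal[None, :]).sum(axis=1)

# Three event types on a line; one type-0 event at time 0.
mu = np.array([0.1, 0.2, 0.3])
z = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
lam = hawkes_intensity(1.0, [0.0], [0], mu, z)
print(lam - mu)  # excitation shrinks as latent distance from type 0 grows
```

Because every pairwise interaction is determined by a handful of latent coordinates rather than a free entry in a full triggering matrix, the number of parameters grows linearly in the number of event types, which is the regularizing effect the abstract describes.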

### Minimax in Geodesic Metric Spaces: Sion's Theorem and Algorithms

Determining whether saddle points exist or are approximable for nonconvex-nonconcave problems is usually intractable. We take a step towards understanding certain nonconvex-nonconcave minimax problems that do remain tractable. Specifically, we study minimax problems cast in geodesic metric spaces, which provide a vast generalization of the usual convex-concave saddle point problems. The first main result of the paper is a geodesic metric space version of Sion's minimax theorem; we believe our proof is novel and transparent, as it relies on Helly's theorem only. In our second main result, we specialize to geodesically complete Riemannian manifolds: we devise and analyze the complexity of first-order methods for smooth minimax problems.
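For reference, the classical statement of Sion's minimax theorem in linear spaces can be written as below. This is the standard textbook form, given here only as background; it is not the paper's geodesic formulation. Let $X$ be a compact convex subset of a linear topological space and $Y$ a convex subset of a linear topological space. If $f : X \times Y \to \mathbb{R}$ is lower semicontinuous and quasi-convex in $x$ for every fixed $y$, and upper semicontinuous and quasi-concave in $y$ for every fixed $x$, then

$$
\min_{x \in X} \sup_{y \in Y} f(x, y) \;=\; \sup_{y \in Y} \min_{x \in X} f(x, y).
$$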

### Smoothed Embeddings for Certified Few-Shot Learning

Randomized smoothing is considered to be the state-of-the-art provable defense against adversarial perturbations. However, it heavily exploits the fact that classifiers map input objects to class probabilities, and it does not address models that learn a metric space in which classification is performed by computing distances to embeddings of class prototypes. In this work, we extend randomized smoothing to few-shot learning models that map inputs to normalized embeddings. We provide an analysis of the Lipschitz continuity of such models and derive a robustness certificate against $\ell_2$-bounded perturbations that may be useful in few-shot learning scenarios. Our theoretical results are confirmed by experiments on different datasets.
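A minimal sketch of the smoothing idea, assuming the smoothed model is the average of a normalized embedding under Gaussian input noise and that classification picks the nearest class prototype. The toy embedding `embed`, the helper names, and the values of `sigma` and `n_samples` are hypothetical choices for illustration; the certificate itself is not reproduced here.

```python
import numpy as np

def embed(x, W):
    """Toy normalized embedding: a linear map followed by L2 normalization."""
    h = W @ x
    return h / np.linalg.norm(h)

def smoothed_embedding(f, x, sigma=0.25, n_samples=1000, seed=None):
    """Monte Carlo estimate of E[f(x + eps)] with eps ~ N(0, sigma^2 I)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(scale=sigma, size=(n_samples,) + x.shape)
    return np.mean([f(x + eps) for eps in noise], axis=0)

def nearest_prototype(g, prototypes):
    """Index of the class prototype closest to the embedding g."""
    return int(np.argmin(np.linalg.norm(prototypes - g, axis=1)))

W = np.array([[1.0, 0.5], [-0.5, 1.0]])   # hypothetical embedding weights
x = np.array([1.0, 0.0])
prototypes = np.array([embed(np.array([1.0, 0.1]), W),
                       embed(np.array([-1.0, 0.2]), W)])

g = smoothed_embedding(lambda v: embed(v, W), x, sigma=0.25, seed=0)
print(nearest_prototype(g, prototypes))
```

Since each sample is a unit vector, the averaged embedding has norm at most one, and it varies smoothly with `x` even when the underlying embedding does not.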

### Theoretical analysis and computation of the sample Frechet mean for sets of large graphs based on spectral information

To characterize the location (mean, median) of a set of graphs, one needs a notion of centrality that is adapted to metric spaces, since graph sets are not Euclidean spaces. A standard approach is to consider the Frechet mean. In this work, we equip a set of graphs with the pseudometric defined by the norm between the eigenvalues of their respective adjacency matrices. Unlike the edit distance, this pseudometric reveals structural changes at multiple scales, and is well adapted to studying various statistical problems for graph-valued data. We describe an algorithm to compute an approximation to the sample Frechet mean of a set of undirected unweighted graphs with a fixed size using this pseudometric.
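The spectral pseudometric described above can be sketched in a few lines. The example assumes the $\ell_2$ norm between sorted adjacency eigenvalues (the abstract does not say which norm is used), and the function name `spectral_pseudometric` is hypothetical. It also shows why this is a pseudometric rather than a metric: relabelling the nodes of a graph leaves the spectrum, and hence the distance, unchanged.

```python
import numpy as np

def spectral_pseudometric(A, B):
    """l2 norm between the sorted adjacency eigenvalues of two graphs.

    A and B are symmetric adjacency matrices of the same size.
    The choice of the l2 norm is an assumption made for illustration.
    """
    eig_a = np.sort(np.linalg.eigvalsh(A))
    eig_b = np.sort(np.linalg.eigvalsh(B))
    return float(np.linalg.norm(eig_a - eig_b))

# Path graph and triangle on three nodes.
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)

# The same path graph with nodes 0 and 1 swapped.
perm = [1, 0, 2]
path_relabelled = path[np.ix_(perm, perm)]

print(spectral_pseudometric(path, triangle))         # positive
print(spectral_pseudometric(path, path_relabelled))  # zero up to rounding
```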

### Generalized Shape Metrics on Neural Representations

Understanding the operation of biological and artificial networks remains a difficult and important challenge. To identify general principles, researchers are increasingly interested in surveying large collections of networks that are trained on, or biologically adapted to, similar tasks. A standardized set of analysis tools is now needed to identify how network-level covariates -- such as architecture, anatomical brain region, and model organism -- impact neural representations (hidden layer activations). Here, we provide a rigorous foundation for these analyses by defining a broad family of metric spaces that quantify representational dissimilarity. Using this framework we modify existing representational similarity measures based on canonical correlation analysis to satisfy the triangle inequality, formulate a novel metric that respects the inductive biases in convolutional layers, and identify approximate Euclidean embeddings that enable network representations to be incorporated into essentially any off-the-shelf machine learning method. We demonstrate these methods on large-scale datasets from biology (Allen Institute Brain Observatory) and deep learning (NAS-Bench-101). In doing so, we identify relationships between neural representations that are interpretable in terms of anatomical features and model performance.

### DeHIN: A Decentralized Framework for Embedding Large-scale Heterogeneous Information Networks

Modeling heterogeneity by extracting and exploiting high-order information from heterogeneous information networks (HINs) has attracted immense research attention in recent times. Such heterogeneous network embedding (HNE) methods effectively harness the heterogeneity of small-scale HINs. However, in the real world, the size of HINs grows exponentially with the continuous introduction of new nodes and different types of links, producing billion-scale networks. Learning node embeddings on such HINs creates a performance bottleneck for existing HNE methods that are commonly centralized, i.e., the complete data and the model are both on a single machine. To address large-scale HNE tasks with strong efficiency and effectiveness guarantees, we present \textit{Decentralized Embedding Framework for Heterogeneous Information Network} (DeHIN) in this paper. In DeHIN, we generate a distributed parallel pipeline that utilizes hypergraphs to infuse parallelization into the HNE task. DeHIN presents a context-preserving partition mechanism that innovatively formulates a large HIN as a hypergraph, whose hyperedges connect semantically similar nodes. Our framework then adopts a decentralized strategy to efficiently partition HINs using a tree-like pipeline. Each resulting subnetwork is then assigned to a distributed worker, which employs the deep information maximization theorem to locally learn node embeddings from the partition it receives. We further devise a novel embedding alignment scheme to precisely project independently learned node embeddings from all subnetworks onto a common vector space, thus allowing for downstream tasks like link prediction and node classification.

### CORE: A Knowledge Graph Entity Type Prediction Method via Complex Space Regression and Embedding

Research on knowledge graph (KG) construction, completion, inference, and applications has grown rapidly in recent years since it offers a powerful tool for modeling human knowledge in graph form. Nodes in KGs denote entities and links represent relations between entities. The basic building blocks of a KG are entity-relation triples in the form of (subject, predicate, object), introduced by the Resource Description Framework (RDF). Learning representations for entities and relations in low-dimensional vector spaces is one of the most active research topics in the field. Entity type offers a valuable piece of information to KG learning tasks, and better results in KG-related tasks have been achieved with its help. For example, TKRL [1] uses a hierarchical type encoder for KG completion by incorporating entity type information. AutoETER [2] adopts a similar approach but encodes the type information with projection matrices. Based on DistMult [3] and ComplEx [4] embedding, [5] propose an improved factorization model without explicit type supervision.

### Distance and Hop-wise Structures Encoding Enhanced Graph Attention Networks

Many works have shown that existing neighbor-averaging Graph Neural Networks (GNNs) cannot efficiently capture structure information; in some cases such GNNs cannot even capture degree features. The reason is intuitive: because neighbor-averaging GNNs can only combine the feature vectors of each node's neighbors, if those feature vectors contain no structure information, hop-wise neighbor-averaging GNNs can capture degree information at best ([1]; [2]; [3]). It is therefore natural to expect that injecting structure information into the feature vectors may improve the performance of GNNs. Numerous works have shown that injecting structure, distance, position, or spatial information can significantly improve the performance of neighbor-averaging GNNs ([4]; [5]; [6]; [7]; [8]; [9]; [10]). However, existing works have their problems. Some have very high computational complexity and cannot be applied to large-scale graphs (MotifNet [4]). Some simply concatenate structure information with the intrinsic feature vector (ID-GNN [6]; P-GNN [8]; DE-GNN [9]), which may confuse the signals of different features. For example, in the ogbn-arxiv dataset, the intrinsic feature is a semantic embedding of the headline or abstract, which provides a totally different signal from structure information. Some are oriented toward graph-level tasks and only deal with small graphs (Graphormer [7]; SubGNN [10]).

### A Gentle Introduction to Vector Space Models

A vector space model considers the relationships between data that are represented by vectors. Such models are popular in information retrieval systems but are also useful for other purposes. Generally, they allow us to compare the similarity of two vectors from a geometric perspective. In this tutorial, we will see what a vector space model is and what it can do.
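As a small illustration of comparing two vectors geometrically, the sketch below computes cosine similarity, one common similarity measure in vector space models. The tutorial blurb does not name a specific measure, so this choice and the example vectors are assumptions.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Vectors pointing in the same direction score 1; orthogonal vectors score 0.
print(cosine_similarity([1, 2], [2, 4]))
print(cosine_similarity([1, 0], [0, 1]))
```

Cosine similarity ignores vector length and depends only on direction, which is why it is often used when the magnitude of a vector (for example, document length) should not affect the comparison.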

### Turing approximations, toric isometric embeddings & manifold convolutions

Convolutions are fundamental elements in deep learning architectures. Here, we present a theoretical framework for combining extrinsic and intrinsic approaches to manifold convolution through isometric embeddings into tori. In this way, we define a convolution operator for a manifold of arbitrary topology and dimension. We also explain the geometric and topological conditions under which some local definitions of convolution, which rely on translating filters along geodesic paths on a manifold, become computationally intractable. A result of Alan Turing from 1938 underscores the need for such a toric isometric embedding approach to achieve a global definition of convolution on computable, finite metric space approximations to a smooth manifold.