We introduce GSimCNN (Graph Similarity Computation via Convolutional Neural Networks) for predicting the similarity score between two graphs. As the core operation of graph similarity search, pairwise graph similarity computation is a challenging problem due to the NPhard nature of computing many graph distance/similarity metrics. We demonstrate our model using the Graph Edit Distance (GED)  as the example metric. It is defined as the number of edit operations in the optimal alignments that transform one graph into the other, where an edit operation can be an insertion or a deletion of a node/edge, or relabelling of a node. It is NPhard  and costly to compute in practice . The key idea is to turn the pairwise graph distance computation problem into a learning problem.
Graph similarity search is among the most important graph-based applications, e.g. finding the chemical compounds that are most similar to a query compound. Graph similarity/distance computation, such as Graph Edit Distance (GED) and Maximum Common Subgraph (MCS), is the core operation of graph similarity search and many other applications, which is usually very costly to compute. Inspired by the recent success of neural network approaches to several graph applications, such as node classification and graph classification, we propose a novel neural network-based approach to address this challenging while classical graph problem, with the hope to alleviate the computational burden while preserving a good performance. Our model generalizes to unseen graphs, and in the worst case runs in linear time with respect to the number of nodes in two graphs. Taking GED computation as an example, experimental results on three real graph datasets demonstrate the effectiveness and efficiency of our approach. Specifically, our model achieves smaller error and great time reduction compared against several approximate algorithms on GED computation. To the best of our knowledge, we are among the first to adopt neural networks to model the similarity between two graphs, and provide a new direction for future research on graph similarity computation and graph similarity search.
Traditional semantic similarity models often fail to encapsulate the external context in which texts are situated. However, textual datasets generated on mobile platforms can help us build a truer representation of semantic similarity by introducing multimodal data. This is especially important in sparse datasets, making solely text-driven interpretation of context more difficult. In this paper, we develop new algorithms for building external features into sentence embeddings and semantic similarity scores. Then, we test them on embedding spaces on data from Twitter, using each tweet's time and geolocation to better understand its context. Ultimately, we show that applying PCA with eight components to the embedding space and appending multimodal features yields the best outcomes. This yields a considerable improvement over pure text-based approaches for discovering similar tweets. Our results suggest that our new algorithm can help improve semantic understanding in various settings.
In this paper, we propose a scalable algorithm for spectral embedding. The latter is a standard tool for graph clustering. However, its computational bottleneck is the eigendecomposition of the graph Laplacian matrix, which prevents its application to large-scale graphs. Our contribution consists of reformulating spectral embedding so that it can be solved via stochastic optimization. The idea is to replace the orthogonality constraint with an orthogonalization matrix injected directly into the criterion. As the gradient can be computed through a Cholesky factorization, our reformulation allows us to develop an efficient algorithm based on mini-batch gradient descent. Experimental results, both on synthetic and real data, confirm the efficiency of the proposed method in term of execution speed with respect to similar existing techniques.
We present a system which constructs a topological map of an environment given a sequence of images. This system includes a novel image similarity score which uses dynamic programming to match images using both the appearance and relative positions of local features simultaneously. Additionally an MRF is constructed to model the probability of loop-closures. A locally optimal labeling is found using Loopy-BP. Finally we outline a method to generate a topological map from loop closure data. Results are presented on four urban sequences and one indoor sequence.