Ding, Hao
Personalized Federated Domain Adaptation for Item-to-Item Recommendation
Fan, Ziwei, Ding, Hao, Deoras, Anoop, Hoang, Trong Nghia
Item-to-Item (I2I) recommendation is an important function in most recommendation systems, which generates replacement or complement suggestions for a particular item based on its semantic similarities to other cataloged items. Given that subsets of items in a recommendation system might be co-interacted with by the same set of customers, graph-based models, such as graph neural networks (GNNs), provide a natural framework to combine, ingest and extract valuable insights from such high-order relational interactions between cataloged items, as well as their metadata features, as has been shown in many recent studies. However, learning GNNs effectively for I2I requires ingesting a large amount of relational data, which might not always be available, especially in new, emerging market segments. To mitigate this data bottleneck, we postulate that recommendation patterns learned from existing mature market segments (with private data) could be adapted to build effective warm-start models for emerging ones. To achieve this, we propose and investigate a personalized federated modeling framework based on GNNs to summarize, assemble and adapt recommendation patterns across market segments with heterogeneous customer behaviors into effective local models. Our key contribution is a personalized graph adaptation model that bridges the gap between recent literature on federated GNNs and (non-graph) personalized federated learning, which either does not optimize for the adaptability of the federated model or is restricted to local models with homogeneous parameterization, excluding GNNs with heterogeneous local graphs.
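The paper's contribution is a graph-specific personalization model; as rough intuition only, a generic (non-graph) personalized federated round can be sketched as averaging client parameters into a global model and then interpolating each client toward it (the function name and the interpolation scheme are illustrative, not the paper's method):

```python
def federated_personalize(local_params, alpha=0.5):
    """One round of a generic personalized-federated scheme: average the
    clients' parameter vectors into a global model, then let each client
    interpolate between the global model and its own local parameters.
    alpha = 1 recovers a fully shared model, alpha = 0 keeps models local."""
    n = len(local_params)
    dim = len(local_params[0])
    global_params = [sum(p[k] for p in local_params) / n for k in range(dim)]
    personalized = [
        [alpha * g + (1 - alpha) * l for g, l in zip(global_params, p)]
        for p in local_params
    ]
    return global_params, personalized

# Two toy clients with 2-dimensional parameter vectors.
global_model, personalized = federated_personalize([[0.0, 2.0], [2.0, 0.0]])
```

The paper's actual setting replaces this homogeneous parameter averaging with an adaptation model over heterogeneous local graphs, which plain averaging cannot handle.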
Twin-S: A Digital Twin for Skull-base Surgery
Shu, Hongchao, Liang, Ruixing, Li, Zhaoshuo, Goodridge, Anna, Zhang, Xiangyu, Ding, Hao, Nagururu, Nimesh, Sahu, Manish, Creighton, Francis X., Taylor, Russell H., Munawar, Adnan, Unberath, Mathias
Purpose: Digital twins are virtual interactive models of the real world, exhibiting identical behavior and properties. In surgical applications, computational analysis from digital twins can be used, for example, to enhance situational awareness. Methods: We present a digital twin framework for skull-base surgeries, named Twin-S, which can be seamlessly integrated within various image-guided interventions. Twin-S combines high-precision optical tracking and real-time simulation. We rely on rigorous calibration routines to ensure that the digital twin representation precisely mimics all real-world processes. Twin-S models and tracks the critical components of skull-base surgery, including the surgical tool, patient anatomy, and surgical camera. Significantly, Twin-S updates and reflects real-world drilling of the anatomical model at frame rate. Results: We extensively evaluate the accuracy of Twin-S, which achieves an average error of 1.39 mm during the drilling process. We further illustrate how segmentation masks derived from the continuously updated digital twin can augment the surgical microscope view in a mixed reality setting, where bone requiring ablation is highlighted to provide surgeons additional situational awareness. Conclusion: We present Twin-S, a digital twin environment for skull-base surgery. Twin-S tracks and updates the virtual model in real time given measurements from modern tracking technologies. Future research on complementing optical tracking with higher-precision vision-based approaches may further increase the accuracy of Twin-S.
Rethinking Causality-driven Robot Tool Segmentation with Temporal Constraints
Ding, Hao, Wu, Jie Ying, Li, Zhaoshuo, Unberath, Mathias
Purpose: Vision-based robot tool segmentation plays a fundamental role in surgical robots and downstream tasks. CaRTS, based on a complementary causal model, has shown promising performance in unseen counterfactual surgical environments in the presence of smoke, blood, etc. However, CaRTS requires over 30 iterations of optimization to converge for a single image due to limited observability. Method: To address the above limitations, we take temporal relations into consideration and propose a temporal causal model for robot tool segmentation on video sequences. We design an architecture named Temporally Constrained CaRTS (TC-CaRTS). TC-CaRTS has three novel modules to complement CaRTS: a temporal optimization pipeline, a kinematics correction network, and spatial-temporal regularization. Results: Experimental results show that TC-CaRTS requires far fewer iterations to achieve the same or better performance than CaRTS. TC-CaRTS also matches or exceeds CaRTS's performance in different domains. All three modules are shown to be effective. Conclusion: We propose TC-CaRTS, which takes advantage of temporal constraints as additional observability. We show that TC-CaRTS outperforms prior work in the robot tool segmentation task with improved convergence speed on test datasets from different domains.
Context Uncertainty in Contextual Bandits with Applications to Recommender Systems
Wang, Hao, Ma, Yifei, Ding, Hao, Wang, Yuyang
Recurrent neural networks have proven effective in modeling sequential user feedback for recommender systems. However, they usually focus solely on item relevance and fail to effectively explore diverse items for users, thereby harming the system performance in the long run. To address this problem, we propose a new type of recurrent neural network, dubbed recurrent exploration networks (REN), to jointly perform representation learning and effective exploration in the latent space. REN tries to balance relevance and exploration while taking into account the uncertainty in the representations. Our theoretical analysis shows that REN can preserve the rate-optimal sublinear regret even when there exists uncertainty in the learned representations. Our empirical study demonstrates that REN can achieve satisfactory long-term rewards on both synthetic and real-world recommendation datasets, outperforming state-of-the-art models.
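The relevance-versus-exploration trade-off described above can be illustrated with a UCB-style scoring heuristic (a generic sketch under assumed names, not REN's actual latent-space formulation):

```python
import math

def ren_style_score(relevance, times_shown, total_steps, beta=1.0):
    """Score an item by relevance plus an uncertainty bonus.

    Rarely-shown items get a larger bonus, encouraging exploration;
    the bonus shrinks as evidence about the item accumulates."""
    bonus = beta * math.sqrt(math.log(total_steps + 1) / (times_shown + 1))
    return relevance + bonus

# A rarely-shown item can outrank a slightly more relevant, well-explored one.
novel = ren_style_score(relevance=0.6, times_shown=1, total_steps=1000)
known = ren_style_score(relevance=0.7, times_shown=500, total_steps=1000)
```

REN applies this kind of uncertainty-aware bonus in the learned representation space rather than per item counter, which is what enables the sublinear-regret analysis.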
Zero-Shot Recommender Systems
Ding, Hao, Ma, Yifei, Deoras, Anoop, Wang, Yuyang, Wang, Hao
Performance of recommender systems (RS) relies heavily on the amount of training data available. This poses a chicken-and-egg problem for early-stage products, whose amount of data, in turn, relies on the performance of their RS. On the other hand, zero-shot learning promises some degree of generalization from an old dataset to an entirely new dataset. In this paper, we explore the possibility of zero-shot learning in RS. We develop an algorithm, dubbed ZEro-Shot Recommenders (ZESRec), that is trained on an old dataset and generalizes to a new one where there are neither overlapping users nor overlapping items, a setting that contrasts with typical cross-domain RS that has either overlapping users or items. Different from categorical item indices, i.e., item IDs, in previous methods, ZESRec uses items' natural-language descriptions (or description embeddings) as their continuous indices, and therefore naturally generalizes to unseen items.

Many large-scale e-commerce platforms (such as Etsy, Overstock, etc.) and online content platforms (such as Spotify, Overstock, Disney, Netflix, etc.) have such a large inventory of items that showcasing all of them in front of their users is simply not practical. In particular, in the online content category of businesses, it is often seen that users of the service do not have a crisp intent in mind, unlike in the retail shopping experience where users often have a crisp intent of purchasing something. The need for personalized recommendations therefore arises from the fact that not only is it impractical to show all the items in the catalogue, but oftentimes users of such services need help discovering the next best thing, be it a new and exciting movie, a new music album, or even a piece of merchandise that they may want to consider for future buying if not immediately.
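The core mechanism, using description embeddings as continuous item indices so that never-before-seen items can still be scored, can be sketched as nearest-neighbor retrieval (the embedding values and helper names below are illustrative):

```python
def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def recommend(user_state, item_embeddings, k=1):
    """Rank items by similarity between a user-state vector and each
    item's description embedding. Because items are indexed by content,
    not by categorical IDs, items absent from training can still rank."""
    ranked = sorted(item_embeddings,
                    key=lambda i: cosine(user_state, item_embeddings[i]),
                    reverse=True)
    return ranked[:k]

# A brand-new catalog: items indexed only by their description embeddings.
catalog = {"new_item_a": [0.9, 0.1], "new_item_b": [0.1, 0.9]}
print(recommend([1.0, 0.0], catalog))  # → ['new_item_a']
```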
Unsupervised Inductive Whole-Graph Embedding by Preserving Graph Proximity
Bai, Yunsheng, Ding, Hao, Qiao, Yang, Marinovic, Agustin, Gu, Ken, Chen, Ting, Sun, Yizhou, Wang, Wei
In recent years we have witnessed the great popularity of graph representation learning, with success in not only node-level tasks such as node classification (Kipf & Welling, 2016a) and link prediction (Zhang & Chen, 2018) but also graph-level tasks such as graph classification (Ying et al., 2018) and graph similarity/distance computation (Bai et al., 2019). There has been a rich body of work (Belkin & Niyogi, 2003; Qiu et al., 2018) on node-level embeddings that turn each node in a graph into a vector preserving node-node proximity (similarity/distance). It is thus natural to raise the question: can we embed an entire graph into a vector in an unsupervised way, and how? However, most existing methods for graph-level, i.e. whole-graph, embeddings assume a supervised model (Zhang & Chen, 2019), with only a few exceptions such as GRAPH KERNELS (Yanardag & Vishwanathan, 2015), which typically count subgraphs for a given graph and can be slow, and GRAPH2VEC (Narayanan et al., 2017), which is transductive. A key challenge in designing an unsupervised graph-level embedding model is the lack of graph-level signals in the training stage. Unlike node-level embedding, which has a long history of utilizing the link structure of a graph to embed nodes, there is no such natural proximity (similarity/distance) information between graphs. Supervised methods therefore typically resort to graph labels as guidance and use aggregation-based methods, e.g. averaging node embeddings into a graph embedding. However, simple aggregation of node embeddings can only preserve limited graph-level properties, which is often insufficient for measuring graph-graph proximity ("inter-graph" information).
Inspired by the recent progress on graph proximity modeling (Ktena et al., 2017; Bai et al., 2019), we propose a novel framework, UGRAPHEMB, that employs multi-scale aggregations of node-level embeddings, guided by the graph-graph proximity defined by well-accepted and domain-agnostic graph proximity metrics such as Graph Edit Distance (GED) (Bunke, 1983), Maximum Common Subgraph (MCS) (Bunke & Shearer, 1998), etc. The goal of UGRAPHEMB is to learn high-quality graph-level representations in a completely unsupervised and inductive fashion: during training, it learns a function that maps a graph into a universal embedding space best preserving graph-graph proximity, so that after training, any new graph can be mapped to this embedding space by applying the learned function. Inspired by the recent success of pre-training methods in the text domain, such as ELMO (Peters et al., 2018), UGRAPHEMB first computes the graph-graph proximity scores, yielding a "hyper-level graph" where each node is a graph in the dataset and each edge has a proximity score associated with it, representing its weight/strength. UGRAPHEMB then trains a function that maps each graph into an embedding which preserves the proximity scores.
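The hyper-level graph construction can be sketched as follows; the exp(-d) distance-to-similarity transform is one common choice rather than necessarily the paper's, and the toy graph representation is purely illustrative:

```python
from itertools import combinations
import math

def build_hyper_level_graph(graphs, distance):
    """Each node of the hyper-level graph is a whole graph from the
    dataset; each edge weight is a proximity score derived from a
    domain-agnostic graph distance such as GED."""
    edges = {}
    for i, j in combinations(range(len(graphs)), 2):
        d = distance(graphs[i], graphs[j])
        edges[(i, j)] = math.exp(-d)  # map distance in [0, inf) to proximity in (0, 1]
    return edges

# Toy usage: stand in for each graph with its edge count, and use the
# absolute difference in edge counts as a stand-in distance.
toy_graphs = [2, 3, 5]
edges = build_hyper_level_graph(toy_graphs, lambda a, b: abs(a - b))
```

An embedding function would then be trained so that distances in the embedding space reproduce these edge weights.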
Convolutional Set Matching for Graph Similarity
Bai, Yunsheng, Ding, Hao, Sun, Yizhou, Wang, Wei
We introduce GSimCNN (Graph Similarity Computation via Convolutional Neural Networks) for predicting the similarity score between two graphs. As the core operation of graph similarity search, pairwise graph similarity computation is a challenging problem due to the NP-hard nature of computing many graph distance/similarity metrics. We demonstrate our model using the Graph Edit Distance (GED) [2] as the example metric. It is defined as the number of edit operations in the optimal alignment that transforms one graph into the other, where an edit operation can be an insertion or a deletion of a node/edge, or the relabelling of a node. It is NP-hard [3] and costly to compute in practice [4]. The key idea is to turn the pairwise graph distance computation problem into a learning problem.
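To make the GED definition concrete, here is a brute-force computation for tiny graphs of equal size, restricted to edge insertions/deletions (a simplification of the full definition above, which also allows node edits and relabelling):

```python
from itertools import permutations

def exact_ged_same_size(edges1, edges2, n):
    """Exact GED between two undirected graphs on nodes 0..n-1, with edit
    operations restricted to edge insertions/deletions. Tries every node
    alignment, so it is only feasible for tiny graphs; this factorial
    blow-up is exactly why learned approximations are attractive."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    best = None
    for perm in permutations(range(n)):
        mapped = {frozenset((perm[u], perm[v])) for u, v in e1}
        cost = len(mapped ^ e2)  # edges to delete plus edges to insert
        best = cost if best is None else min(best, cost)
    return best

# Path 0-1-2 versus a triangle: one edge insertion suffices, so GED = 1.
path = [(0, 1), (1, 2)]
triangle = [(0, 1), (1, 2), (0, 2)]
ged = exact_ged_same_size(path, triangle, 3)
```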
Convolutional Neural Networks for Fast Approximation of Graph Edit Distance
Bai, Yunsheng, Ding, Hao, Sun, Yizhou, Wang, Wei
Graph Edit Distance (GED) computation is a core operation of many widely-used graph applications, such as graph classification, graph matching, and graph similarity search. However, computing the exact GED between two graphs is NP-complete. Most current approximate algorithms are based on solving a combinatorial optimization problem, which involves complicated design and high time complexity. In this paper, we propose a novel end-to-end neural network based approach to GED approximation, aiming to alleviate the computational burden while preserving good performance. The proposed approach, named GSimCNN, turns GED computation into a learning problem. Each graph is considered as a set of nodes, represented by learnable embedding vectors. The GED computation is then considered as a two-set matching problem, where a higher matching score leads to a lower GED. A Convolutional Neural Network (CNN) based approach is proposed to tackle the set matching problem. We test our algorithm on three real graph datasets, and our model achieves significant performance enhancement against state-of-the-art approximate GED computation algorithms.
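The two-set matching view can be illustrated by the pairwise node-similarity matrix that a CNN would then consume (a minimal sketch; the actual GSimCNN architecture uses learned embeddings and multi-scale features):

```python
def node_similarity_matrix(emb1, emb2):
    """Pairwise inner products between two sets of node embeddings.
    In a GSimCNN-style model, this matrix is the image-like input the
    CNN consumes to predict the final graph similarity score."""
    return [[sum(a * b for a, b in zip(u, v)) for v in emb2] for u in emb1]

# Two toy node-embedding sets: one node versus two nodes in 2-D.
sim = node_similarity_matrix([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
```

A strong diagonal structure in this matrix indicates a good node-to-node matching, which corresponds to a low GED.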
Graph Edit Distance Computation via Graph Neural Networks
Bai, Yunsheng, Ding, Hao, Bian, Song, Chen, Ting, Sun, Yizhou, Wang, Wei
Graph similarity search is among the most important graph-based applications, e.g. finding the chemical compounds that are most similar to a query compound. Graph similarity/distance computation, such as Graph Edit Distance (GED) and Maximum Common Subgraph (MCS), is the core operation of graph similarity search and many other applications, and is usually very costly to compute. Inspired by the recent success of neural network approaches to several graph applications, such as node classification and graph classification, we propose a novel neural-network-based approach to address this challenging yet classical graph problem, with the hope of alleviating the computational burden while preserving good performance. Our model generalizes to unseen graphs and, in the worst case, runs in linear time with respect to the number of nodes in the two graphs. Taking GED computation as an example, experimental results on three real graph datasets demonstrate the effectiveness and efficiency of our approach. Specifically, our model achieves smaller error and a large reduction in running time compared with several approximate algorithms for GED computation. To the best of our knowledge, we are among the first to adopt neural networks to model the similarity between two graphs, and we provide a new direction for future research on graph similarity computation and graph similarity search.
On Group Popularity Prediction in Event-Based Social Networks
Li, Guangyu (New York University) | Liu, Yong (New York University) | Ribeiro, Bruno (Purdue University) | Ding, Hao (New York University)
Although previous work has shown that member and structural features are important to the future popularity of groups in event-based social networks (EBSN), it is not yet clear how different member roles and the interplay between them contribute to group popularity. In this paper, we study a real-world dataset from Meetup, a popular EBSN platform, and propose a deep neural network based method to predict the popularity of new Meetup groups. Our method uses group-level features specific to event-based social networks, such as the time and location of events in a group, as well as structural features internal to a group, such as the inferred member roles in a group and social substructures among members. Empirically, our approach reduces the RMSE of the popularity prediction (measured in RSVPs) of a group's future events by up to 12% against state-of-the-art baselines.