Not enough data to create a plot.
Try a different view from the menu above.
Wang, Cheng
ConTIG: Continuous Representation Learning on Temporal Interaction Graphs
Yan, Xu, Fan, Xiaoliang, Yang, Peizhen, Wu, Zonghan, Pan, Shirui, Chen, Longbiao, Zang, Yu, Wang, Cheng
Representation learning on temporal interaction graphs (TIG) is to model complex networks with the dynamic evolution of interactions arising in a broad spectrum of problems. Existing dynamic embedding methods on TIG discretely update node embeddings merely when an interaction occurs. They fail to capture the continuous dynamic evolution of embedding trajectories of nodes. In this paper, we propose a two-module framework named ConTIG, a continuous representation method that captures the continuous dynamic evolution of node embedding trajectories. With two essential modules, our model exploit three-fold factors in dynamic networks which include latest interaction, neighbor features and inherent characteristics. In the first update module, we employ a continuous inference block to learn the nodes' state trajectories by learning from time-adjacent interaction patterns between node pairs using ordinary differential equations. In the second transform module, we introduce a self-attention mechanism to predict future node embeddings by aggregating historical temporal interaction information. Experiments results demonstrate the superiority of ConTIG on temporal link prediction, temporal node recommendation and dynamic node classification tasks compared with a range of state-of-the-art baselines, especially for long-interval interactions prediction.
Detecting Mitosis against Domain Shift using a Fused Detector and Deep Ensemble Classification Model for MIDOG Challenge
Liang, Jingtang, Wang, Cheng, Cheng, Yujie, Wang, Zheng, Wang, Fang, Huang, Liyu, Yu, Zhibin, Wang, Yubo
Mitotic figure count is an important marker of tumor proliferation and has been shown to be associated with patients' prognosis. Deep learning based mitotic figure detection methods have been utilized to automatically locate the cell in mitosis using hematoxylin \& eosin (H\&E) stained images. However, the model performance deteriorates due to the large variation of color tone and intensity in H\&E images. In this work, we proposed a two stage mitotic figure detection framework by fusing a detector and a deep ensemble classification model. To alleviate the impact of color variation in H\&E images, we utilize both stain normalization and data augmentation, aiding model to learn color irrelevant features. The proposed model obtains an F1 score of 0.7550 on the preliminary testing set released by the MIDOG challenge.
Handling Long-Tail Queries with Slice-Aware Conversational Systems
Wang, Cheng, Kim, Sun, Park, Taiwoo, Choudhary, Sajal, Park, Sunghyun, Kim, Young-Bum, Sarikaya, Ruhi, Lee, Sungjin
We have been witnessing the usefulness of conversational AI systems such as Siri and Alexa, directly impacting our daily lives. These systems normally rely on machine learning models evolving over time to provide quality user experience. However, the development and improvement of the models are challenging because they need to support both high (head) and low (tail) usage scenarios, requiring fine-grained modeling strategies for specific data subsets or slices. In this paper, we explore the recent concept of slice-based learning (SBL) (Chen et al., 2019) to improve our baseline conversational skill routing system on the tail yet critical query traffic. We first define a set of labeling functions to generate weak supervision data for the tail intents. We then extend the baseline model towards a slice-aware architecture, which monitors and improves the model performance on the selected tail intents. Applied to de-identified live traffic from a commercial conversational AI system, our experiments show that the slice-aware model is beneficial in improving model performance for the tail intents while maintaining the overall performance.
Neighbor Embedding Variational Autoencoder
Tu, Renfei, Liu, Yang, Xue, Yongzeng, Wang, Cheng, Guo, Maozu
Being one of the most popular generative framework, variational autoencoders(VAE) are known to suffer from a phenomenon termed posterior collapse, i.e. the latent variational distributions collapse to the prior, especially when a strong decoder network is used. In this work, we analyze the latent representation of collapsed VAEs, and proposed a novel model, neighbor embedding VAE(NE-VAE), which explicitly constraints the encoder to encode inputs close in the input space to be close in the latent space. We observed that for VAE variants that report similar ELBO, KL divergence or even mutual information scores may still behave quite differently in the latent organization. In our experiments, NE-VAE can produce qualitatively different latent representations with majority of the latent dimensions remained active, which may benefit downstream latent space optimization tasks. NE-VAE can prevent posterior collapse to a much greater extent than it's predecessors, and can be easily plugged into any autoencoder framework, without introducing addition model components and complex training routines.
State-Regularized Recurrent Neural Networks
Wang, Cheng, Niepert, Mathias
Recurrent neural networks are a widely used class of neural architectures. They have, however, two shortcomings. First, it is difficult to understand what exactly they learn. Second, they tend to work poorly on sequences requiring long-term memorization, despite having this capacity in principle. We aim to address both shortcomings with a class of recurrent networks that use a stochastic state transition mechanism between cell applications. This mechanism, which we term state-regularization, makes RNNs transition between a finite set of learnable states. We evaluate state-regularized RNNs on (1) regular languages for the purpose of automata extraction; (2) nonregular languages such as balanced parentheses, palindromes, and the copy task where external memory is required; and (3) real-word sequence learning tasks for sentiment analysis, visual object recognition, and language modeling. We show that state-regularization (a) simplifies the extraction of finite state automata modeling an RNN's state transition dynamics; (b) forces RNNs to operate more like automata with external memory and less like finite state machines; (c) makes RNNs have better interpretability and explainability.
LRMM: Learning to Recommend with Missing Modalities
Wang, Cheng, Niepert, Mathias, Li, Hui
Multimodal learning has shown promising performance in content-based recommendation due to the auxiliary user and item information of multiple modalities such as text and images. However, the problem of incomplete and missing modality is rarely explored and most existing methods fail in learning a recommendation model with missing or corrupted modalities. In this paper, we propose LRMM, a novel framework that mitigates not only the problem of missing modalities but also more generally the cold-start problem of recommender systems. We propose modality dropout (m-drop) and a multimodal sequential autoencoder (m-auto) to learn multimodal representations for complementing and imputing missing modalities. Extensive experiments on real-world Amazon data show that LRMM achieves state-of-the-art performance on rating prediction tasks. More importantly, LRMM is more robust to previous methods in alleviating data-sparsity and the cold-start problem.
Auto-Annotation of 3D Objects via ImageNet
Luo, Huan (Xiamen University) | Wang, Cheng (Xiamen University) | Li, Jonathan (Xiamen University)
Automatic annotation of 3D objects in cluttered scenes shows its great importance to a variety of applications. Nowadays, 3D point clouds, a new 3D representation of real-world objects, can be easily and rapidly collected by mobile LiDAR systems, e.g. RIEGL VMX-450 system. Moreover, the mobile LiDAR system can also provide a series of consecutive multi-view images which are calibrated with 3D point clouds. This paper proposes to automatically annotate 3D objects of interest in point clouds of road scenes by exploiting a multitude of annotated images in image databases, such as LabelMe and ImageNet. In the proposed method, an object detector trained on the annotated images is used to locate the object regions in acquired multi-view images. Then, based on the correspondences between multi-view images and 3D point clouds, a probabilistic graphical model is used to model the temporal, spatial and geometric constraints to extract the 3D objects automatically. A new dataset was built for evaluation and the experimental results demonstrate a satisfied performance on 3D object extraction.
Towards Domain Adaptive Vehicle Detection in Satellite Image by Supervised Super-Resolution Transfer
Cao, Liujuan (Xiamen University) | Ji, Rongrong (Xiamen University) | Wang, Cheng (Xiamen University) | Li, Jonathan (Xiamen University)
Vehicle detection in satellite image has attracted extensive research attentions with various emerging applications.However, the detector performance has been significantly degenerated due to the low resolutions of satellite images, as well as the limited training data.In this paper, a robust domain-adaptive vehicle detection framework is proposed to bypass both problems.Our innovation is to transfer the detector learning to the high-resolution aerial image domain,where rich supervision exists and robust detectors can be trained.To this end, we first propose a super-resolution algorithm using coupled dictionary learning to ``augment'' the satellite image region being tested into the aerial domain.Notably, linear detection loss is embedded into the dictionary learning, which enforces the augmented region to be sensitive to the subsequent detector training.Second, to cope with the domain changes, we propose an instance-wised detection using Exemplar Support Vector Machines (E-SVMs), which well handles the intra-class and imaging variations like scales, rotations, and occlusions.With comprehensive experiments on large-scale satellite image collections, we demonstrate that the proposed framework can significantly boost the detection accuracy over several state-of-the-arts.