Not enough data to create a plot.
Try a different view from the menu above.
Yang, Cheng
Fast Graph Metric Learning via Gershgorin Disc Alignment
Yang, Cheng, Cheung, Gene, Hu, Wei
We propose a fast general projection-free metric learning framework, where the minimization objective $\min_{\M \in \cS} Q(\M)$ is a convex differentiable function of the metric matrix $\M$, and $\M$ resides in the set $\cS$ of generalized graph Laplacian matrices for connected graphs with positive edge weights and node degrees. Unlike low-rank metric matrices common in the literature, $\cS$ includes the important positive-diagonal-only matrices as a special case in the limit. The key idea for fast optimization is to rewrite the positive definite cone constraint in $\cS$ as signal-adaptive linear constraints via Gershgorin disc alignment, so that the alternating optimization of the diagonal and off-diagonal terms in $\M$ can be solved efficiently as linear programs via Frank-Wolfe iterations. We prove that the Gershgorin discs can be aligned perfectly using the first eigenvector $\v$ of $\M$, which we update iteratively using Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) with warm start as diagonal / off-diagonal terms are optimized. Experiments show that our efficiently computed graph metric matrices outperform metrics learned using competing methods in terms of classification tasks.
GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification
Zhou, Jie, Han, Xu, Yang, Cheng, Liu, Zhiyuan, Wang, Lifeng, Li, Changcheng, Sun, Maosong
Fact verification (FV) is a challenging task which requires to retrieve relevant evidence from plain text and use the evidence to verify given claims. Many claims require to simultaneously integrate and reason over several pieces of evidence for verification. However, previous work employs simple models to extract information from evidence without letting evidence communicate with each other, e.g., merely concatenate the evidence for processing. Therefore, these methods are unable to grasp sufficient relational and logical information among the evidence. To alleviate this issue, we propose a graph-based evidence aggregating and reasoning (GEAR) framework which enables information to transfer on a fully-connected evidence graph and then utilizes different aggregators to collect multi-evidence information. We further employ BERT, an effective pre-trained language representation model, to improve the performance. Experimental results on a large-scale benchmark dataset FEVER have demonstrated that GEAR could leverage multi-evidence information for FV and thus achieves the promising result with a test FEVER score of 67.10%. Our code is available at https://github.com/thunlp/GEAR.
A Unified Framework for Marketing Budget Allocation
Zhao, Kui, Hua, Junhao, Yan, Ling, Zhang, Qi, Xu, Huan, Yang, Cheng
While marketing budget allocation has been studied for decades in traditional business, nowadays online business brings much more challenges due to the dynamic environment and complex decision-making process. In this paper, we present a novel unified framework for marketing budget allocation. By leveraging abundant data, the proposed data-driven approach can help us to overcome the challenges and make more informed decisions. In our approach, a semi-black-box model is built to forecast the dynamic market response and an efficient optimization method is proposed to solve the complex allocation task. First, the response in each market-segment is forecasted by exploring historical data through a semi-black-box model, where the capability of logit demand curve is enhanced by neural networks. The response model reveals relationship between sales and marketing cost. Based on the learned model, budget allocation is then formulated as an optimization problem, and we design efficient algorithms to solve it in both continuous and discrete settings. Several kinds of business constraints are supported in one unified optimization paradigm, including cost upper bound, profit lower bound, or ROI lower bound. The proposed framework is easy to implement and readily to handle large-scale problems. It has been successfully applied to many scenarios in Alibaba Group. The results of both offline experiments and online A/B testing demonstrate its effectiveness.
Graph Neural Networks: A Review of Methods and Applications
Zhou, Jie, Cui, Ganqu, Zhang, Zhengyan, Yang, Cheng, Liu, Zhiyuan, Sun, Maosong
Lots of learning tasks require dealing with graph data which contains rich relation information among elements. Modeling physics system, learning molecular fingerprints, predicting protein interface, and classifying diseases require that a model learns from graph inputs. In other domains such as learning from non-structural data like texts and images, reasoning on extracted structures, like the dependency tree of sentences and the scene graph of images, is an important research topic which also needs graph reasoning models. Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of graphs. Unlike standard neural networks, graph neural networks retain a state that can represent information from its neighborhood with arbitrary depth. Although the primitive GNNs have been found difficult to train for a fixed point, recent advances in network architectures, optimization techniques, and parallel computation have enabled successful learning with them. In recent years, systems based on graph convolutional network (GCN) and gated graph neural network (GGNN) have demonstrated ground-breaking performance on many tasks mentioned above. In this survey, we provide a detailed review over existing graph neural network models, systematically categorize the applications, and propose four open problems for future research.
Learning and Transferring IDs Representation in E-commerce
Zhao, Kui, Li, Yuechuan, Shuai, Zhaoqian, Yang, Cheng
Many machine intelligence techniques are developed in E-commerce and one of the most essential components is the representation of IDs, including user ID, item ID, product ID, store ID, brand ID, category ID etc. The classical encoding based methods (like one-hot encoding) are inefficient in that it suffers sparsity problems due to its high dimension, and it cannot reflect the relationships among IDs, either homogeneous or heterogeneous ones. In this paper, we propose an embedding based framework to learn and transfer the representation of IDs. As the the implicit feedbacks of users, a tremendous amount of item ID sequences can be easily collected from the interactive sessions. By jointly using these informative sequences and the structural connections among IDs, all types of IDs can be embedded into one low-dimensional semantic space. Subsequently, the learned representations are utilized and transferred in four scenarios: (i) measuring the similarity between items, (ii) transferring from seen items to unseen items, (iii) transferring across different domains, (iv) transferring across different tasks. We deploy and evaluate the proposed approach in Hema App and the results validate its effectiveness.
Chinese LIWC Lexicon Expansion via Hierarchical Classification of Word Embeddings with Sememe Attention
Zeng, Xiangkai (Beihang University) | Yang, Cheng (Tsinghua University) | Tu, Cunchao (Tsinghua University) | Liu, Zhiyuan (Tsinghua University) | Sun, Maosong (Tsinghua University)
Linguistic Inquiry and Word Count (LIWC) is a word counting software tool which has been used for quantitative text analysis in many fields. Due to its success and popularity, the core lexicon has been translated into Chinese and many other languages. However, the lexicon only contains several thousand of words, which is deficient compared with the number of common words in Chinese. Current approaches often require manually expanding the lexicon, but it often takes too much time and requires linguistic experts to extend the lexicon. To address this issue, we propose to expand the LIWC lexicon automatically. Specifically, we consider it as a hierarchical classification problem and utilize the Sequence-to-Sequence model to classify words in the lexicon. Moreover, we use the sememe information with the attention mechanism to capture the exact meanings of a word, so that we can expand a more precise and comprehensive lexicon. The experimental results show that our model has a better understanding of word meanings with the help of sememes and achieves significant and consistent improvements compared with the state-of-the-art methods. The source code of this paper can be obtained from https://github.com/thunlp/Auto_CLIWC.
PRNN: Recurrent Neural Network with Persistent Memory
Zhao, Kui, Li, Yuechuan, Zhang, Chi, Yang, Cheng, Zhu, Shenghuo
Although Recurrent Neural Network (RNN) has been a powerful tool for modeling sequential data, its performance is inadequate when processing sequences with multiple patterns. In this paper, we address this challenge by introducing an external memory and constructing a novel persistent memory augmented RNN (termed as PRNN). The PRNN captures the principle patterns in training sequences and stores them in an external memory. By leveraging the persistent memory, the proposed method can adaptively update states according to the similarities between encoded inputs and memory slots, leading to a stronger capacity in assimilating sequences with multiple patterns. Content-based addressing is suggested in memory accessing, and gradient descent is utilized for implicitly updating the memory. Our approach can be further extended by combining the prior knowledge of data. Experiments on several datasets demonstrate the effectiveness of the proposed method.