Not enough data to create a plot.
Try a different view from the menu above.
Yang, Min
PI-VEGAN: Physics Informed Variational Embedding Generative Adversarial Networks for Stochastic Differential Equations
Gao, Ruisong, Wang, Yufeng, Yang, Min, Chen, Chuanjun
We present a new category of physics-informed neural networks called physics informed variational embedding generative adversarial network (PI-VEGAN), that effectively tackles the forward, inverse, and mixed problems of stochastic differential equations. In these scenarios, the governing equations are known, but only a limited number of sensor measurements of the system parameters are available. We integrate the governing physical laws into PI-VEGAN with automatic differentiation, while introducing a variational encoder for approximating the latent variables of the actual distribution of the measurements. These latent variables are integrated into the generator to facilitate accurate learning of the characteristics of the stochastic partial equations. Our model consists of three components, namely the encoder, generator, and discriminator, each of which is updated alternatively employing the stochastic gradient descent algorithm. We evaluate the effectiveness of PI-VEGAN in addressing forward, inverse, and mixed problems that require the concurrent calculation of system parameters and solutions. Numerical results demonstrate that the proposed method achieves satisfactory stability and accuracy in comparison with the previous physics-informed generative adversarial network (PI-WGAN).
PaCE: Unified Multi-modal Dialogue Pre-training with Progressive and Compositional Experts
Li, Yunshui, Hui, Binyuan, Yin, ZhiChao, Yang, Min, Huang, Fei, Li, Yongbin
Perceiving multi-modal information and fulfilling dialogues with humans is a long-term goal of artificial intelligence. Pre-training is commonly regarded as an effective approach for multi-modal dialogue. However, due to the limited availability of multi-modal dialogue data, there is still scarce research on multi-modal dialogue pre-training. Yet another intriguing challenge emerges from the encompassing nature of multi-modal dialogue, which involves various modalities and tasks. Moreover, new forms of tasks may arise at unpredictable points in the future. Hence, it is essential for designed multi-modal dialogue models to possess sufficient flexibility to adapt to such scenarios. This paper proposes \textbf{PaCE}, a unified, structured, compositional multi-modal dialogue pre-training framework. It utilizes a combination of several fundamental experts to accommodate multiple dialogue-related tasks and can be pre-trained using limited dialogue and extensive non-dialogue multi-modal data. Furthermore, we propose a progressive training method where old experts from the past can assist new experts, facilitating the expansion of their capabilities. Experimental results demonstrate that PaCE achieves state-of-the-art results on eight multi-modal dialog benchmarks.
Speech-Text Dialog Pre-training for Spoken Dialog Understanding with Explicit Cross-Modal Alignment
Yu, Tianshu, Gao, Haoyu, Lin, Ting-En, Yang, Min, Wu, Yuchuan, Ma, Wentao, Wang, Chao, Huang, Fei, Li, Yongbin
Recently, speech-text pre-training methods have shown remarkable success in many speech and natural language processing tasks. However, most previous pre-trained models are usually tailored for one or two specific tasks, but fail to conquer a wide range of speech-text tasks. In addition, existing speech-text pre-training methods fail to explore the contextual information within a dialogue to enrich utterance representations. In this paper, we propose Speech-text dialog Pre-training for spoken dialog understanding with ExpliCiT cRoss-Modal Alignment (SPECTRA), which is the first-ever speech-text dialog pre-training model. Concretely, to consider the temporality of speech modality, we design a novel temporal position prediction task to capture the speech-text alignment. This pre-training task aims to predict the start and end time of each textual word in the corresponding speech waveform. In addition, to learn the characteristics of spoken dialogs, we generalize a response selection task from textual dialog pre-training to speech-text dialog pre-training scenarios. Experimental results on four different downstream speech-text tasks demonstrate the superiority of SPECTRA in learning speech-text alignment and multi-turn dialog context.
A Comprehensive Survey on Deep Learning for Relation Extraction: Recent Advances and New Frontiers
Zhao, Xiaoyan, Deng, Yang, Yang, Min, Wang, Lingzhi, Zhang, Rui, Cheng, Hong, Lam, Wai, Shen, Ying, Xu, Ruifeng
Relation extraction (RE) involves identifying the relations between entities from unstructured texts. RE serves as the foundation for many natural language processing (NLP) applications, such as knowledge graph completion, question answering, and information retrieval. In recent years, deep neural networks have dominated the field of RE and made noticeable progress. Subsequently, the large pre-trained language models (PLMs) have taken the state-of-the-art of RE to a new level. This survey provides a comprehensive review of existing deep learning techniques for RE. First, we introduce RE resources, including RE datasets and evaluation metrics. Second, we propose a new taxonomy to categorize existing works from three perspectives (text representation, context encoding, and triplet prediction). Third, we discuss several important challenges faced by RE and summarize potential techniques to tackle these challenges. Finally, we outline some promising future directions and prospects in this field. This survey is expected to facilitate researchers' collaborative efforts to tackle the challenges of real-life RE systems.
Iterative Forward Tuning Boosts In-context Learning in Language Models
Yang, Jiaxi, Hui, Binyuan, Yang, Min, Li, Binhua, Huang, Fei, Li, Yongbin
Large language models (LLMs) have exhibited an emergent in-context learning (ICL) ability. However, the ICL models that can solve ordinary cases are hardly extended to solve more complex tasks by processing the demonstration examples once. This single-turn ICL is incoordinate with the decision making process of humans by learning from analogy. In this paper, we propose an effective and efficient two-stage framework to boost ICL in LLMs by exploiting a dual form between Transformer attention and gradient descent-based optimization. Concretely, we divide the ICL process into "Deep-Thinking" and inference stages. The "Deep-Thinking" stage performs iterative forward optimization of demonstrations, which is expected to boost the reasoning abilities of LLMs at test time by "thinking" demonstrations multiple times. It produces accumulated meta-gradients by manipulating the Key-Value matrices in the self-attention modules of the Transformer. Then, the inference stage only takes the test query as input without concatenating demonstrations and applies the learned meta-gradients through attention for output prediction. In this way, demonstrations are not required during the inference stage since they are already learned and stored in the definitive meta-gradients. LLMs can be effectively and efficiently adapted to downstream tasks. Extensive experiments on ten classification and multiple-choice datasets show that our method achieves substantially better performance than standard ICL in terms of both accuracy and efficiency.
Topic-Guided Self-Introduction Generation for Social Media Users
Xu, Chunpu, Li, Jing, Li, Piji, Yang, Min
Millions of users are active on social media. To allow users to better showcase themselves and network with others, we explore the auto-generation of social media self-introduction, a short sentence outlining a user's personal interests. While most prior work profiles users with tags (e.g., ages), we investigate sentence-level self-introductions to provide a more natural and engaging way for users to know each other. Here we exploit a user's tweeting history to generate their self-introduction. The task is non-trivial because the history content may be lengthy, noisy, and exhibit various personal interests. To address this challenge, we propose a novel unified topic-guided encoder-decoder (UTGED) framework; it models latent topics to reflect salient user interest, whose topic mixture then guides encoding a user's history and topic words control decoding their self-introduction. For experiments, we collect a large-scale Twitter dataset, and extensive results show the superiority of our UTGED to the advanced encoder-decoder models without topic modeling.
Self-Distillation with Meta Learning for Knowledge Graph Completion
Li, Yunshui, Liu, Junhao, Li, Chengming, Yang, Min
In this paper, we propose a selfdistillation framework with meta learning(MetaSD) for knowledge graph completion with dynamic pruning, which aims to learn compressed graph embeddings and tackle the longtail samples. Specifically, we first propose a dynamic pruning technique to obtain a small pruned model from a large source model, where the pruning mask of the pruned model could be updated adaptively per epoch after the model weights are updated. The pruned model is supposed to be more sensitive to difficult to memorize samples(e.g., longtail samples) than the source model. Then, we propose a onestep meta selfdistillation method for distilling comprehensive knowledge from the source model to the pruned model, where the two models coevolve in a dynamic manner during training. In particular, we exploit the performance of the pruned model, which is trained alongside the source model in one iteration, to improve the source models knowledge transfer ability for the next iteration via meta learning. Extensive experiments show that MetaSD achieves competitive performance compared to strong baselines, while being 10x smaller than baselines.
Unsupervised Dialogue Topic Segmentation with Topic-aware Utterance Representation
Gao, Haoyu, Wang, Rui, Lin, Ting-En, Wu, Yuchuan, Yang, Min, Huang, Fei, Li, Yongbin
Dialogue Topic Segmentation (DTS) plays an essential role in a variety of dialogue modeling tasks. Previous DTS methods either focus on semantic similarity or dialogue coherence to assess topic similarity for unsupervised dialogue segmentation. However, the topic similarity cannot be fully identified via semantic similarity or dialogue coherence. In addition, the unlabeled dialogue data, which contains useful clues of utterance relationships, remains underexploited. In this paper, we propose a novel unsupervised DTS framework, which learns topic-aware utterance representations from unlabeled dialogue data through neighboring utterance matching and pseudo-segmentation. Extensive experiments on two benchmark datasets (i.e., DialSeg711 and Doc2Dial) demonstrate that our method significantly outperforms the strong baseline methods. For reproducibility, we provide our code and data at:https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/dial-start.
Type-enhanced Ensemble Triple Representation via Triple-aware Attention for Cross-lingual Entity Alignment
Zhang, Zhishuo, Tan, Chengxiang, Wang, Haihang, Zhao, Xueyan, Yang, Min
Entity alignment(EA) is a crucial task for integrating cross-lingual and cross-domain knowledge graphs(KGs), which aims to discover entities referring to the same real-world object from different KGs. Most existing methods generate aligning entity representation by mining the relevance of triple elements via embedding-based methods, paying little attention to triple indivisibility and entity role diversity. In this paper, a novel framework named TTEA -- Type-enhanced Ensemble Triple Representation via Triple-aware Attention for Cross-lingual Entity Alignment is proposed to overcome the above issues considering ensemble triple specificity and entity role features. Specifically, the ensemble triple representation is derived by regarding relation as information carrier between semantic space and type space, and hence the noise influence during spatial transformation and information propagation can be smoothly controlled via specificity-aware triple attention. Moreover, our framework uses triple-ware entity enhancement to model the role diversity of triple elements. Extensive experiments on three real-world cross-lingual datasets demonstrate that our framework outperforms state-of-the-art methods.
OTIEA:Ontology-enhanced Triple Intrinsic-Correlation for Cross-lingual Entity Alignment
Zhang, Zhishuo, Tan, Chengxiang, Zhao, Xueyan, Yang, Min, Jiang, Chaoqun
Cross-lingual and cross-domain knowledge alignment without sufficient external resources is a fundamental and crucial task for fusing irregular data. As the element-wise fusion process aiming to discover equivalent objects from different knowledge graphs (KGs), entity alignment (EA) has been attracting great interest from industry and academic research recent years. Most of existing EA methods usually explore the correlation between entities and relations through neighbor nodes, structural information and external resources. However, the complex intrinsic interactions among triple elements and role information are rarely modeled in these methods, which may lead to the inadequate illustration for triple. In addition, external resources are usually unavailable in some scenarios especially cross-lingual and cross-domain applications, which reflects the little scalability of these methods. To tackle the above insufficiency, a novel universal EA framework (OTIEA) based on ontology pair and role enhancement mechanism via triple-aware attention is proposed in this paper without introducing external resources. Specifically, an ontology-enhanced triple encoder is designed via mining intrinsic correlations and ontology pair information instead of independent elements. In addition, the EA-oriented representations can be obtained in triple-aware entity decoder by fusing role diversity. Finally, a bidirectional iterative alignment strategy is deployed to expand seed entity pairs. The experimental results on three real-world datasets show that our framework achieves a competitive performance compared with baselines.