Goto

Collaborating Authors

 Wang, Ping


Semantics-Preserved Distortion for Personal Privacy Protection in Information Management

arXiv.org Artificial Intelligence

Although machine learning and especially deep learning methods have played an important role in the field of information management, privacy protection is an important and concerning topic in current machine learning models. In information management field, a large number of texts containing personal information are produced by users every day. As the model training on information from users is likely to invade personal privacy, many methods have been proposed to block the learning and memorizing of the sensitive data in raw texts. In this paper, we try to do this more linguistically via distorting the text while preserving the semantics. In practice, we leverage a recently our proposed metric, Neighboring Distribution Divergence, to evaluate the semantic preservation during the distortion. Based on the metric, we propose two frameworks for semantics-preserved distortion, a generative one and a substitutive one. We conduct experiments on named entity recognition, constituency parsing, and machine reading comprehension tasks. Results from our experiments show the plausibility and efficiency of our distortion as a method for personal privacy protection. Moreover, we also evaluate the attribute attack on three privacy-related tasks in the current natural language processing field, and the results show the simplicity and effectiveness of our data-based improvement approach compared to the structural improvement approach. Further, we also investigate the effects of privacy protection in specific medical information management in this work and show that the medical information pre-training model using our approach can effectively reduce the memory of patients and symptoms, which fully demonstrates the practicality of our approach.


A Bidirectional Tree Tagging Scheme for Joint Medical Relation Extraction

arXiv.org Artificial Intelligence

Joint medical relation extraction refers to extracting triples, composed of entities and relations, from the medical text with a single model. One of the solutions is to convert this task into a sequential tagging task. However, in the existing works, the methods of representing and tagging the triples in a linear way failed to the overlapping triples, and the methods of organizing the triples as a graph faced the challenge of large computational effort. In this paper, inspired by the tree-like relation structures in the medical text, we propose a novel scheme called Bidirectional Tree Tagging (BiTT) to form the medical relation triples into two two binary trees and convert the trees into a word-level tags sequence. Based on BiTT scheme, we develop a joint relation extraction model to predict the BiTT tags and further extract medical triples efficiently. Our model outperforms the best baselines by 2.0\% and 2.5\% in F1 score on two medical datasets. What's more, the models with our BiTT scheme also obtain promising results in three public datasets of other domains.


A Sequence Tagging based Framework for Few-Shot Relation Extraction

arXiv.org Artificial Intelligence

Relation Extraction (RE) refers to extracting the relation triples in the input text. Existing neural work based systems for RE rely heavily on manually labeled training data, but there are still a lot of domains where sufficient labeled data does not exist. Inspired by the distance-based few-shot named entity recognition methods, we put forward the definition of the few-shot RE task based on the sequence tagging joint extraction approaches, and propose a few-shot RE framework for the task. Besides, we apply two actual sequence tagging models to our framework (called Few-shot TPLinker and Few-shot BiTT), and achieves solid results on two few-shot RE tasks constructed from a public dataset.


Node Selection Toward Faster Convergence for Federated Learning on Non-IID Data

arXiv.org Artificial Intelligence

Federated Learning (FL) is a distributed learning paradigm that enables a large number of resource-limited nodes to collaboratively train a model without data sharing. The non-independent-and-identically-distributed (non-i.i.d.) data samples invoke discrepancy between global and local objectives, making the FL model slow to converge. In this paper, we proposed Optimal Aggregation algorithm for better aggregation, which finds out the optimal subset of local updates of participating nodes in each global round, by identifying and excluding the adverse local updates via checking the relationship between the local gradient and the global gradient. Then, we proposed a Probabilistic Node Selection framework (FedPNS) to dynamically change the probability for each node to be selected based on the output of Optimal Aggregation. FedPNS can preferentially select nodes that propel faster model convergence. The unbiasedness of the proposed FedPNS design is illustrated and the convergence rate improvement of FedPNS over the commonly adopted Federated Averaging (FedAvg) algorithm is analyzed theoretically. Experimental results demonstrate the effectiveness of FedPNS in accelerating the FL convergence rate, as compared to FedAvg with random node selection.


RAGA: Relation-aware Graph Attention Networks for Global Entity Alignment

arXiv.org Artificial Intelligence

Entity alignment (EA) is the task to discover entities referring to the same real-world object from different knowledge graphs (KGs), which is the most crucial step in integrating multi-source KGs. The majority of the existing embeddings-based entity alignment methods embed entities and relations into a vector space based on relation triples of KGs for local alignment. As these methods insufficiently consider the multiple relations between entities, the structure information of KGs has not been fully leveraged. In this paper, we propose a novel framework based on Relation-aware Graph Attention Networks to capture the interactions between entities and relations. Our framework adopts the self-attention mechanism to spread entity information to the relations and then aggregate relation information back to entities. Furthermore, we propose a global alignment algorithm to make one-to-one entity alignments with a fine-grained similarity matrix. Experiments on three real-world cross-lingual datasets show that our framework outperforms the state-of-the-art methods.


Few-shot Image Classification with Multi-Facet Prototypes

arXiv.org Artificial Intelligence

The aim of few-shot learning (FSL) is to learn how to recognize image categories from a small number of training examples. A central challenge is that the available training examples are normally insufficient to determine which visual features are most characteristic of the considered categories. To address this challenge, we organize these visual features into facets, which intuitively group features of the same kind (e.g. features that are relevant to shape, color, or texture). This is motivated from the assumption that (i) the importance of each facet differs from category to category and (ii) it is possible to predict facet importance from a pre-trained embedding of the category names. In particular, we propose an adaptive similarity measure, relying on predicted facet importance weights for a given set of categories. This measure can be used in combination with a wide array of existing metric-based methods. Experiments on miniImageNet and CUB show that our approach improves the state-of-the-art in metric-based FSL.


Fast-Convergent Federated Learning with Adaptive Weighting

arXiv.org Artificial Intelligence

Federated learning (FL) enables resource-constrained edge nodes to collaboratively learn a global model under the orchestration of a central server while keeping privacy-sensitive data locally. The non-independent-and-identically-distributed (non-IID) data samples across participating nodes slow model training and impose additional communication rounds for FL to converge. In this paper, we propose Federated Adaptive Weighting (FedAdp) algorithm that aims to accelerate model convergence under the presence of nodes with non-IID dataset. We observe the implicit connection between the node contribution to the global model aggregation and data distribution on the local node through theoretical and empirical analysis. We then propose to assign different weights for updating the global model based on node contribution adaptively through each training round. The contribution of participating nodes is first measured by the angle between the local gradient vector and the global gradient vector, and then, weight is quantified by a designed non-linear mapping function subsequently. The simple yet effective strategy can reinforce positive (suppress negative) node contribution dynamically, resulting in communication round reduction drastically. Its superiority over the commonly adopted Federated Averaging (FedAvg) is verified both theoretically and experimentally. With extensive experiments performed in Pytorch and PySyft, we show that FL training with FedAdp can reduce the number of communication rounds by up to 54.1% on MNIST dataset and up to 45.4% on FashionMNIST dataset, as compared to FedAvg algorithm.


A Concept-based Abstraction-Aggregation Deep Neural Network for Interpretable Document Classification

arXiv.org Machine Learning

Using attention weights to identify information that is important for models' decision making is a popular approach to interpret attention-based neural networks, which is commonly realized via creating a heat-map for every single document based on attention weights. However, this interpretation method is fragile. In this paper, we propose a corpus-level explanation approach, which aims to capture causal relationships between keywords and model predictions via learning importance of keywords for predicted labels across a training corpus based on attention weights. Using this idea as the fundamental building block, we further propose a concept-based explanation method that can automatically learn higher-level concepts and their importance to model prediction task. Our concept-based explanation method is built upon a novel Abstraction-Aggregation Network, which can automatically cluster important keywords during an end-to-end training process. We apply these methods to the document classification task and show that they are powerful in extracting semantically meaningful keywords and concepts. Our consistency analysis results based on an attention-based Na\"ive Bayes Classifier also demonstrate these keywords and concepts are important for model predictions.


Self-Supervised Learning of Contextual Embeddings for Link Prediction in Heterogeneous Networks

arXiv.org Machine Learning

In practice, however, downstream tasks such as has gained a lot of attention in recent years [1, 5, 10, 31, 35, 37], link prediction require specific contextual information that can where a low-dimensional vector representation of each node in be extracted from the subgraphs related to the nodes provided as the graph is used for downstream applications such as link prediction input to the task. To tackle this challenge, we develop SLiCE, a [1, 5, 39] or multi-hop reasoning [8, 13, 42]. Many of the framework bridging static representation learning methods using existing methods focus on obtaining a static vector representation global information from the entire graph with localized attention per node that is agnostic to any specific context and is typically driven mechanisms to learn contextual node representations. We obtained by learning the importance of all of the node's immediate first pre-train our model in a self-supervised manner by introducing and multi-hop neighbors in the graph. However, we argue higher-order semantic associations and masking nodes, and that nodes in a heterogeneous network exhibit a different behavior, then fine-tune our model for a specific link prediction task. Instead based on different relation types and their participation in diverse of training node representations by aggregating information from network communities. Further, most downstream tasks such as link all semantic neighbors connected via metapaths, we automatically prediction are dependent on the specific contextual information learn the composition of different metapaths that characterize the related to the input nodes that can be extracted in the form of task context for a specific task without the need for any predefined specific subgraphs.


Platoon trajectories generation: A unidirectional interconnected LSTM-based car following model

arXiv.org Machine Learning

Car following models have been widely applied and made remarkable achievements in traffic engineering. However, the traffic micro-simulation accuracy of car following models in a platoon level, especially during traffic oscillations, still needs to be enhanced. Rather than using traditional individual car following models, we proposed a new trajectory generation approach to generate platoon level trajectories given the first leading vehicle's trajectory. In this paper, we discussed the temporal and spatial error propagation issue for the traditional approach by a car following block diagram representation. Based on the analysis, we pointed out that error comes from the training method and the model structure. In order to fix that, we adopt two improvements on the basis of the traditional LSTM based car following model. We utilized a scheduled sampling technique during the training process to solve the error propagation in the temporal dimension. Furthermore, we developed a unidirectional interconnected LSTM model structure to extract trajectories features from the perspective of the platoon. As indicated by the systematic empirical experiments, the proposed novel structure could efficiently reduce the temporal and spatial error propagation. Compared with the traditional LSTM based car following model, the proposed model has almost 40% less error. The findings will benefit the design and analysis of micro-simulation for platoon level car following models.