Goto

Collaborating Authors

 Jin, Di


Calling Out Bluff: Attacking the Robustness of Automatic Scoring Systems with Simple Adversarial Testing

arXiv.org Artificial Intelligence

A significant progress has been made in deep-learning based Automatic Essay Scoring (AES) systems in the past two decades. The performance commonly measured by the standard performance metrics like Quadratic Weighted Kappa (QWK), and accuracy points to the same. However, testing on common-sense adversarial examples of these AES systems reveal their lack of natural language understanding capability. Inspired by common student behaviour during examinations, we propose a task agnostic adversarial evaluation scheme for AES systems to test their natural language understanding capabilities and overall robustness.


Hooks in the Headline: Learning to Generate Headlines with Controlled Styles

arXiv.org Artificial Intelligence

Current summarization systems only produce plain, factual headlines, but do not meet the practical needs of creating memorable titles to increase exposure. We propose a new task, Stylistic Headline Generation (SHG), to enrich the headlines with three style options (humor, romance and clickbait), in order to attract more readers. With no style-specific article-headline pair (only a standard headline summarization dataset and mono-style corpora), our method TitleStylist generates style-specific headlines by combining the summarization and reconstruction tasks into a multitasking framework. We also introduced a novel parameter sharing scheme to further disentangle the style from the text. Through both automatic and human evaluation, we demonstrate that TitleStylist can generate relevant, fluent headlines with three target styles: humor, romance, and clickbait. The attraction score of our model generated headlines surpasses that of the state-of-the-art summarization model by 9.68%, and even outperforms human-written references.


Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment

arXiv.org Artificial Intelligence

Machine learning algorithms are often vulnerable to adversarial examples that have imperceptible alterations from the original counterparts but can fool the state-of-the-art models. It is helpful to evaluate or even improve the robustness of these models by exposing the maliciously crafted adversarial examples. In this paper, we present the TextFooler, a general attack framework, to generate natural adversarial texts. By successfully applying it to two fundamental natural language tasks, text classification and textual entailment, against various target models, convolutional and recurrent neural networks as well as the most powerful pre-trained BERT, we demonstrate the advantages of this framework in three ways: (i) effective---it outperforms state-of-the-art attacks in terms of success rate and perturbation rate; (ii) utility-preserving---it preserves semantic content and grammaticality, and remains correctly classified by humans; and (iii) efficient---it generates adversarial text with computational complexity linear in the text length.


Unsupervised Text Style Transfer via Iterative Matching and Translation

arXiv.org Artificial Intelligence

Text style transfer seeks to learn how to automatically rewrite sentences from a source domain to the target domain in different styles, while simultaneously preserving their semantic contents. A major challenge in this task stems from the lack of parallel data that connects the source and target styles. Existing approaches try to disentangle content and style, but this is quite difficult and often results in poor content-preservation and grammaticality. In contrast, we propose a novel approach by first constructing a pseudo-parallel resource that aligns a subset of sentences with similar content between source and target corpus. And then a standard sequence-to-sequence model can be applied to learn the style transfer. Subsequently, we iteratively refine the learned style transfer function while improving upon the imperfections in our original alignment. Our method is applied to the tasks of sentiment modification and formality transfer, where it outperforms state-of-the-art systems by a large margin. As an auxiliary contribution, we produced a publicly-available test set with human-generated style transfers for future community use.


Advancing PICO Element Detection in Medical Text via Deep Neural Networks

arXiv.org Artificial Intelligence

In evidence-based medicine (EBM), structured medical questions are always favored for efficient search of the best available evidence for treatments. PICO element detection is widely used to help structurize the clinical studies and question by identifying the sentences in a given medical text that belong to one of the four components: Participants (P), Intervention (I), Comparison (C), and Outcome (O). In this work, we propose a hierarchical deep neural network (DNN) architecture that contains dual bi-directional long short-term memory (bi-LSTM) layers to automatically detect the PICO element in medical texts. Within the model, the lower layer of bi-LSTM is for sentence encoding while the upper one is to contextualize the encoded sentence representation vector. In addition, we adopt adversarial and virtual adversarial training to regularize the model. Overall, we advance the PICO element detection to new state-of-the-art performance, outperforming the previous works by at least 4\% in F1 score for all P/I/O categories.


Robust Detection of Link Communities in Large Social Networks by Exploiting Link Semantics

AAAI Conferences

Community detection has been extensively studied for various applications, focusing primarily on network topologies. Recent research has started to explore node contents to identify semantically meaningful communities and interpret their structures using selected words. However, links in real networks typically have semantic descriptions, e.g., comments and emails in social media, supporting the notion of communities of links. Indeed, communities of links can better describe multiple roles that nodes may play and provide a richer characterization of community behaviors than communities of nodes. The second issue in community finding is that most existing methods assume network topologies and descriptive contents to be consistent and to carry the compatible information of node group membership, which is generally violated in real networks. These methods are also restricted to interpret one community with one topic. The third problem is that the existing methods have used top ranked words or phrases to label topics when interpreting communities. However, it is often difficult to comprehend the derived topics using words or phrases, which may be irrelevant. To address these issues altogether, we propose a new unified probabilistic model that can be learned by a dual nested expectation-maximization algorithm. Our new method explores the intrinsic correlation between communities and topics to discover link communities robustly and extract adequate community summaries in sentences instead of words for topic labeling at the same time. It is able to derive more than one topical summary per community to provide rich explanations. We present experimental results to show the effectiveness of our new approach, and evaluate the quality of the results by a case study.


A Network-Specific Markov Random Field Approach to Community Detection

AAAI Conferences

Markov Random Field (MRF) is a powerful framework for developing probabilistic models of complex problems. MRF models possess rich structures to represent properties and constraints of a problem. It has been successful on many application problems, particularly those of computer vision and image processing, where data are structured, e.g., pixels are organized on grids. The problem of identifying communities in networks, which is essential for network analysis, is in principle analogous to finding objects in images. It is surprising that MRF has not yet been explored for network community detection. It is challenging to apply MRF to network analysis problems where data are organized on graphs with irregular structures. Here we present a network-specific MRF approach to community detection. The new method effectively encodes the structural properties of an irregular network in an energy function (the core of an MRF model) so that the minimization of the function gives rise to the best community structures. We analyzed the new MRF-based method on several synthetic benchmarks and real-world networks, showing its superior performance over the state-of-the-art methods for community identification.


Joint Identification of Network Communities and Semantics via Integrative Modeling of Network Topologies and Node Contents

AAAI Conferences

The objective of discovering network communities, an essential step in complex systems analysis, is two-fold: identification of functional modules and their semantics at the same time. However, most existing community-finding methods have focused on finding communities using network topologies, and the problem of extracting module semantics has not been well studied and node contents, which often contain semantic information of nodes and networks, have not been fully utilized. We considered the problem of identifying network communities and module semantics at the same time. We introduced a novel generative model with two closely correlated parts, one for communities and the other for semantics. We developed a co-learning strategy to jointly train the two parts of the model by combining a nested EM algorithm and belief propagation. By extracting the latent correlation between the two parts, our new method is not only robust for finding communities and semantics, but also able to provide more than one semantic explanation to a community. We evaluated the new method on artificial benchmarks and analyzed the semantic interpretability by a case study. We compared the new method with eight state-of-the-art methods on ten real-world networks, showing its superior performance over the existing methods.


Detect Overlapping Communities via Ranking Node Popularities

AAAI Conferences

Detection of overlapping communities has drawn much attention lately as they are essential properties of real complex networks. Despite its influence and popularity, the well studied and widely adopted stochastic model has not been made effective for finding overlapping communities. Here we extend the stochastic model method to detection of overlapping communities with the virtue of autonomous determination of the number of communities. Our approach hinges upon the idea of ranking node popularities within communities and using a Bayesian method to shrink communities to optimize an objective function based on the stochastic generative model. We evaluated the novel approach, showing its superior performance over five state-of-the-art methods, on large real networks and synthetic networks with ground-truths of overlapping communities.


Semantic Community Identification in Large Attribute Networks

AAAI Conferences

Identification of modular or community structures of a network is a key to understanding the semantics and functions of the network. While many network community detection methods have been developed, which primarily explore network topologies, they provide little semantic information of the communities discovered. Although structures and semantics are closely related, little effort has been made to discover and analyze these two essential network properties together. By integrating network topology and semantic information on nodes, e.g., node attributes, we study the problems of detection of communities and inference of their semantics simultaneously. We propose a novel nonnegative matrix factorization (NMF) model with two sets of parameters, the community membership matrix and community attribute matrix, and present efficient updating rules to evaluate the parameters with a convergence guarantee. The use of node attributes improves upon community detection and provides a semantic interpretation to the resultant network communities. Extensive experimental results on synthetic and real-world networks not only show the superior performance of the new method over the state-of-the-art approaches, but also demonstrate its ability to semantically annotate the communities.