Goto

Collaborating Authors

 Information Extraction


UnifiedABSA: A Unified ABSA Framework Based on Multi-task Instruction Tuning

arXiv.org Artificial Intelligence

Aspect-Based Sentiment Analysis (ABSA) aims to provide fine-grained aspect-level sentiment information. There are many ABSA tasks, and the current dominant paradigm is to train task-specific models for each task. However, application scenarios of ABSA tasks are often diverse. This solution usually requires a large amount of labeled data from each task to perform excellently. These dedicated models are separately trained and separately predicted, ignoring the relationship between tasks. To tackle these issues, we present UnifiedABSA, a general-purpose ABSA framework based on multi-task instruction tuning, which can uniformly model various tasks and capture the inter-task dependency with multi-task learning. Extensive experiments on two benchmark datasets show that UnifiedABSA can significantly outperform dedicated models on 11 ABSA tasks and show its superiority in terms of data efficiency.


Fixing Model Bugs with Natural Language Patches

arXiv.org Artificial Intelligence

Current approaches for fixing systematic problems in NLP models (e.g. regex patches, finetuning on more data) are either brittle, or labor-intensive and liable to shortcuts. In contrast, humans often provide corrections to each other through natural language. Taking inspiration from this, we explore natural language patches -- declarative statements that allow developers to provide corrective feedback at the right level of abstraction, either overriding the model (``if a review gives 2 stars, the sentiment is negative'') or providing additional information the model may lack (``if something is described as the bomb, then it is good''). We model the task of determining if a patch applies separately from the task of integrating patch information, and show that with a small amount of synthetic data, we can teach models to effectively use real patches on real data -- 1 to 7 patches improve accuracy by ~1-4 accuracy points on different slices of a sentiment analysis dataset, and F1 by 7 points on a relation extraction dataset. Finally, we show that finetuning on as many as 100 labeled examples may be needed to match the performance of a small set of language patches.


Enhancing Intra-class Information Extraction for Heterophilous Graphs: One Neural Architecture Search Approach

arXiv.org Artificial Intelligence

In recent years, Graph Neural Networks (GNNs) have been popular in graph representation learning which assumes the homophily property, i.e., the connected nodes have the same label or have similar features. However, they may fail to generalize into the heterophilous graphs which in the low/medium level of homophily. Existing methods tend to address this problem by enhancing the intra-class information extraction, i.e., either by designing better GNNs to improve the model effectiveness, or re-designing the graph structures to incorporate more potential intra-class nodes from distant hops. Despite the success, we observe two aspects that can be further improved: (a) enhancing the ego feature information extraction from node itself which is more reliable in extracting the intra-class information; (b) designing node-wise GNNs can better adapt to the nodes with different homophily ratios. In this paper, we propose a novel method IIE-GNN (Intra-class Information Enhanced Graph Neural Networks) to achieve two improvements. A unified framework is proposed based on the literature, in which the intra-class information from the node itself and neighbors can be extracted based on seven carefully designed blocks. With the help of neural architecture search (NAS), we propose a novel search space based on the framework, and then provide an architecture predictor to design GNNs for each node. We further conduct experiments to show that IIE-GNN can improve the model performance by designing node-wise GNNs to enhance intra-class information extraction.


Building for Tomorrow: Assessing the Temporal Persistence of Text Classifiers

arXiv.org Artificial Intelligence

A supervised text classification model relies on labelled datasets to train the model (Sebastiani, 2002). From an experimental perspective, the design and evaluation of classification models typically rely on data pertaining to fixed periods of time. Recent research demonstrates that such models, while showing competitive performance in their experimental environment, underperform when they need to classify new data that is distant in time from that observed during training (Alkhalifa and Zubiaga, 2022). This deterioration of performance has been demonstrated for different classification tasks, including topic classification (Rocha, Mourão, Pereira, Gonçalves, and Meira, 2008), sentiment classification (Lukes and Søgaard, 2018), hate speech detection (Florio, Basile, Polignano, Basile, and Patti, 2020), stance detection (Alkhalifa, Kochkina, and Zubiaga, 2021) and political ideology detection (Röttger and Pierrehumbert, 2021). This performance drop can happen for multiple reasons, including among others the evolution in language use (Smith, 2004) or the evolution of public opinion (Bonilla and Mo, 2019) and its extent may vary (Alkhalifa et al., 2021). This poses an important challenge and limitation on such models when one plans to continue using the model over a long period of time to classify new, incoming data, as can be the case with a stream of user-generated contents (Cheng, Chen, Lee, and Li, 2021).


Natural Language Processing for Systems Engineering: Automatic Generation of Systems Modelling Language Diagrams

arXiv.org Artificial Intelligence

The design of complex engineering systems is an often long and articulated process that highly relies on engineers' expertise and professional judgment. As such, the typical pitfalls of activities involving the human factor often manifest themselves in terms of lack of completeness or exhaustiveness of the analysis, inconsistencies across design choices or documentation, as well as an implicit degree of subjectivity. An approach is proposed to assist systems engineers in the automatic generation of systems diagrams from unstructured natural language text. Natural Language Processing (NLP) techniques are used to extract entities and their relationships from textual resources (e.g., specifications, manuals, technical reports, maintenance reports) available within an organisation, and convert them into Systems Modelling Language (SysML) diagrams, with particular focus on structure and requirement diagrams. The intention is to provide the users with a more standardised, comprehensive and automated starting point onto which subsequently refine and adapt the diagrams according to their needs. The proposed approach is flexible and open-domain. It consists of six steps which leverage open-access tools, and it leads to an automatic generation of SysML diagrams without intermediate modelling requirement, but through the specification of a set of parameters by the user. The applicability and benefits of the proposed approach are shown through six case studies having different textual sources as inputs, and benchmarked against manually defined diagram elements.


Detecting Conspiracy Theory Against COVID-19 Vaccines

arXiv.org Artificial Intelligence

Since the beginning of the vaccination trial, social media has been flooded with anti-vaccination comments and conspiracy beliefs. As the day passes, the number of COVID- 19 cases increases, and online platforms and a few news portals entertain sharing different conspiracy theories. The most popular conspiracy belief was the link between the 5G network spreading COVID-19 and the Chinese government spreading the virus as a bioweapon, which initially created racial hatred. Although some disbelief has less impact on society, others create massive destruction. For example, the 5G conspiracy led to the burn of the 5G Tower, and belief in the Chinese bioweapon story promoted an attack on the Asian-Americans. Another popular conspiracy belief was that Bill Gates spread this Coronavirus disease (COVID-19) by launching a mass vaccination program to track everyone. This Conspiracy belief creates distrust issues among laypeople and creates vaccine hesitancy. This study aims to discover the conspiracy theory against the vaccine on social platforms. We performed a sentiment analysis on the 598 unique sample comments related to COVID-19 vaccines. We used two different models, BERT and Perspective API, to find out the sentiment and toxicity of the sentence toward the COVID-19 vaccine.


Social Media Sentiment Analysis Using Twitter Datasets - DataScienceCentral.com

#artificialintelligence

Several hundreds of thousands of raw data files are uploaded by users every day to social media sites. Online user data provides access to an enormous amount of information regarding products, services, places, and events, which makes it suitable for sentiment analysis. Valuable information can be extracted by analyzing the sentiment of the data. It is a method for interpreting opinions within a text that uses Natural Language Processing (NLP) to extract positive, negative, and natural meanings from user-generated content shared on social media platforms. Sentiment analysis has been previously applied to products or movie reviews to understand customers' interests better and, thus, improve outcomes and service offerings.


Human Review Workflow with AWS A2I

#artificialintelligence

Some deep learning and machine learning applications need to ensure accuracy through human oversight of sensitive data. This provides the collection of health data, increases the model accuracy, and helps continuous improvements with updated predictions. Data augmentation is a critical process for data companies that spend tons of dollars on this. Today, we will create a sentiment analysis workflow on Amazon Human Review Workflow, an Amazon A2I service.


BERT-ASC: Implicit Aspect Representation Learning through Auxiliary-Sentence Construction for Sentiment Analysis

arXiv.org Artificial Intelligence

Aspect-based sentiment analysis (ABSA) task aim at associating a piece of text with a set of aspects and meanwhile infer their respective sentimental polarities. The state-of-the-art approaches are built upon fine-tuning of various pre-trained language models. They commonly attempt to learn aspect-specific representation from the corpus. Unfortunately, the aspect is often expressed implicitly through a set of representatives and thus renders implicit mapping process unattainable unless sufficient labeled examples are available. However, high-quality labeled examples may not be readily available in real-world scenarios. In this paper, we propose to jointly address aspect categorization and aspect-based sentiment subtasks in a unified framework. Specifically, we first introduce a simple but effective mechanism to construct an auxiliary-sentence for the implicit aspect based on the semantic information in the corpus. Then, we encourage BERT to learn the aspect-specific representation in response to the automatically constructed auxiliary-sentence instead of the aspect itself. Finally, we empirically evaluate the performance of the proposed solution by a comparative study on real benchmark datasets for both ABSA and Targeted-ABSA tasks. Our extensive experiments show that it consistently achieves state-of-the-art performance in terms of aspect categorization and aspect-based sentiment across all datasets and the improvement margins are considerable. The code of BERT-ASC is available in GitHub: https://github.com/amurtadha/BERT-ASC.


When to Use What: An In-Depth Comparative Empirical Analysis of OpenIE Systems for Downstream Applications

arXiv.org Artificial Intelligence

Open Information Extraction (OpenIE) has been used in the pipelines of various NLP tasks. Unfortunately, there is no clear consensus on which models to use in which tasks. Muddying things further is the lack of comparisons that take differing training sets into account. In this paper, we present an application-focused empirical survey of neural OpenIE models, training sets, and benchmarks in an effort to help users choose the most suitable OpenIE systems for their applications. We find that the different assumptions made by different models and datasets have a statistically significant effect on performance, making it important to choose the most appropriate model for one's applications. We demonstrate the applicability of our recommendations on a downstream Complex QA application.