 Information Extraction


Weakly-supervised Domain Adaption for Aspect Extraction via Multi-level Interaction Transfer

arXiv.org Artificial Intelligence

Fine-grained aspect extraction is an essential sub-task in aspect-based opinion analysis. It aims to identify the aspect terms (a.k.a. opinion targets) of a product or service in each sentence. However, an expensive annotation process is usually needed to acquire sufficient token-level labels for each domain. To address this limitation, some previous works propose domain adaptation strategies to transfer knowledge from a sufficiently labeled source domain to unlabeled target domains. But due to both the difficulty of fine-grained prediction problems and the large gap between domains, the performance remains unsatisfactory. This work conducts a pioneering study on leveraging sentence-level aspect category labels, which are usually available in commercial services such as review sites, to promote token-level transfer for extraction. Specifically, the aspect category information is used to construct pivot knowledge for transfer, under the assumption that the interactions between sentence-level aspect categories and token-level aspect terms are invariant across domains. To this end, we propose a novel multi-level reconstruction mechanism that aligns both the fine-grained and coarse-grained information at multiple levels of abstraction. Comprehensive experiments demonstrate that our approach can fully utilize sentence-level aspect category labels to improve cross-domain aspect extraction with a large performance gain.


Part-1: Introduction to Natural Language Processing (NLP)

#artificialintelligence

Natural language processing (NLP) is a field of artificial intelligence in which computers analyze, understand, and derive meaning from human language in a smart and useful way. By utilizing NLP, developers can organize and structure knowledge to perform tasks such as automatic summarization, translation, named entity recognition, relationship extraction, sentiment analysis, speech recognition, and topic segmentation. NLP is characterized as a difficult problem in computer science. Human language is rarely precise or plainly spoken. To understand human language is to understand not only the words but also the concepts and how they're linked together to create meaning.


Social Sentiment Analysis Toward the Clean Energy Transition

#artificialintelligence

The world is in the midst of an energy transition. This massive shift aims to move away from reliance on fuels that are destructive to the climate, the environment, and people's well-being. The goal established by the UN is to "ensure access to affordable, reliable, sustainable and modern energy for all" by 2030. While governments, energy companies, and activists dominate the headlines, the progress with infrastructure and technology won't be sufficient. A successful energy transition for the good of all humanity depends on the action of individuals.


What is NLP and Why is it Important?

#artificialintelligence

Natural Language Processing (NLP) is a subfield of artificial intelligence that helps computers understand human language. Using NLP, machines can process unstructured online information so that we can gain significant insights from it. As computer technology advances, companies are searching for better ways to exploit it. A sharp increase in computing speed and capacity has led to new and highly intelligent software systems, some of which are ready to supplant or augment human services. The rise of natural language processing (NLP) is probably the best example, with intelligent chatbots poised to change the world of customer service and beyond.


GeoCoV19: A Dataset of Hundreds of Millions of Multilingual COVID-19 Tweets with Location Information

arXiv.org Artificial Intelligence

The past several years have witnessed a huge surge in the use of social media platforms during mass convergence events such as health emergencies, natural or human-induced disasters. These non-traditional data sources are becoming vital for disease forecasts and surveillance when preparing for epidemic and pandemic outbreaks. In this paper, we present GeoCoV19, a large-scale Twitter dataset containing more than 524 million multilingual tweets posted over a period of 90 days since February 1, 2020. Moreover, we employ a gazetteer-based approach to infer the geolocation of tweets. We postulate that this large-scale, multilingual, geolocated social media data can empower the research communities to evaluate how societies are collectively coping with this unprecedented global crisis as well as to develop computational methods to address challenges such as identifying fake news, understanding communities' knowledge gaps, building disease forecast and surveillance models, among others.
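As a rough illustration of what a gazetteer-based approach to geolocation can look like, the sketch below matches place names from a small lookup table against tweet text. The table entries, the function name `infer_location`, and the longest-match rule are all assumptions made for this example; the abstract does not describe the dataset's actual pipeline at this level of detail.

```python
import re

# Hypothetical toy gazetteer: place name -> (country, latitude, longitude).
# A real gazetteer would hold millions of entries; these are illustrative only.
GAZETTEER = {
    "new york": ("United States", 40.7128, -74.0060),
    "london": ("United Kingdom", 51.5074, -0.1278),
    "doha": ("Qatar", 25.2854, 51.5310),
}


def infer_location(text):
    """Return (place, (country, lat, lon)) for the first gazetteer match, or None."""
    # Lowercase, keep alphabetic tokens only, and pad with spaces so that
    # only whole words (or whole multi-word names) can match.
    normalized = " ".join(re.findall(r"[a-z]+", text.lower()))
    padded = f" {normalized} "
    # Try longer names first so multi-word places win over shorter ones.
    for name in sorted(GAZETTEER, key=len, reverse=True):
        if f" {name} " in padded:
            return name, GAZETTEER[name]
    return None


if __name__ == "__main__":
    print(infer_location("Lockdown starts in New York tomorrow"))
    print(infer_location("Stay home, stay safe"))
```

Whole-word matching keeps "Londoners" from being resolved to London, but a lookup this simple cannot disambiguate places that share a name.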


New Cognitive Services capabilities are now generally available | Azure updates | Microsoft Azure

#artificialintelligence

Computer Vision--Advanced text extraction: The most advanced text extraction capability for Computer Vision, Read 3.0, is now generally available and expanding its language coverage beyond English and Spanish to include French, German, Portuguese, Italian, and Dutch. Read 3.0 in containers is also available in preview. Language Understanding and Text Analytics sentiment analysis in containers are now generally available. Language Understanding--Enhanced portal experience: The Language Understanding service has revamped the labeling experience, making it easier to build apps and bots that can understand the complex language structures people tend to use. For example, in this order: "I want a large chicken pizza without sauce and a medium pizza with olives," there are two different language structures within the same order.


How to Build Emotion Text Analyzer with Python (NLP)

#artificialintelligence

In this tutorial I will guide you through detecting the emotion associated with textual data, classifying it as either positive or negative, and show how you can apply that knowledge in a variety of applications depending on what you want to do. For instance, if you want to automatically classify customer feedback as positive or negative, you will need a sentiment analyzer to check the negativity or positivity of the text.
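To make the idea concrete, here is a minimal lexicon-based sketch of a positive/negative classifier. It is not the tutorial's own code: the word lists, the function name `classify_sentiment`, and the extra "neutral" label for ties are assumptions made purely for illustration.

```python
import re

# Tiny hand-made lexicons (assumptions for this sketch, not taken from the tutorial).
POSITIVE_WORDS = {"good", "great", "excellent", "love", "happy", "fast"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate", "slow", "broken"}


def classify_sentiment(text):
    """Label text by counting positive and negative lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(token in POSITIVE_WORDS for token in tokens)
    negative_hits = sum(token in NEGATIVE_WORDS for token in tokens)
    score = positive_hits - negative_hits
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # Tie or no lexicon words found: this sketch reports "neutral".
    return "neutral"


if __name__ == "__main__":
    for feedback in ["Great service, I love it", "The app is slow and broken"]:
        print(feedback, "->", classify_sentiment(feedback))
```

A word-counting approach like this ignores negation ("not good") and context, so it is only a starting point next to a trained model or an established library.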


Neural Learning for Aspect Phrase Extraction and Classification in Sentiment Analysis

AAAI Conferences

In this study, we present an approach and a dataset for aspect-based sentiment analysis, showing how we extract and classify aspect phrases. The research field of aspect-based sentiment analysis aims at finding opinions expressed for individual characteristics of products or services in natural language texts. In the literature, reviews for common products or services such as smartphones or restaurants were mostly investigated. We describe our newly annotated dataset of German physician reviews, which presents a sensitive and linguistically complex domain, taking care to describe the annotation process and the functionality of our neural network approach. Finally, we introduce a model that can extract and classify aspect phrases in one step while obtaining an F1 score of 80%. As we employ our algorithm in a more complex domain, we believe that our study outperforms other studies.


A Survey on Temporal Reasoning for Temporal Information Extraction from Text (Extended Abstract)

arXiv.org Artificial Intelligence

Time is deeply woven into how people perceive and communicate about the world. Almost unconsciously, we provide our language utterances with temporal cues, like verb tenses, and we can hardly produce sentences without such cues. Extracting temporal cues from text and constructing a global temporal view of the order of described events is a major challenge of automatic natural language understanding. Temporal reasoning, the process of combining different temporal cues into a coherent temporal view, plays a central role in temporal information extraction. This article presents a comprehensive survey of the research from the past decades on temporal reasoning for automatic temporal information extraction from text, providing a case study on the integration of symbolic reasoning with machine learning-based information extraction systems.
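One simple form of temporal reasoning is to combine pairwise "before" relations into a global ordering by taking their transitive closure and checking it for contradictions. The sketch below illustrates only that idea. The event names and the function names `before_closure` and `is_consistent` are hypothetical, and the survey covers far richer formalisms than this.

```python
from itertools import product


def before_closure(pairs):
    """Return the transitive closure of a set of (earlier, later) event pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Snapshot the current relations, then add every implied pair:
        # a before b and b before d implies a before d.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_consistent(closure):
    """A closure is contradictory if some event ends up before itself."""
    return all(earlier != later for earlier, later in closure)


if __name__ == "__main__":
    extracted = {("wake", "eat"), ("eat", "leave")}
    closed = before_closure(extracted)
    print(("wake", "leave") in closed, is_consistent(closed))
```

Here the relation between "wake" and "leave" is never stated in the input but follows from the two extracted cues, which is the kind of inference that lets local cues add up to a coherent timeline.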


Building A User-Centric and Content-Driven Socialbot

arXiv.org Artificial Intelligence

To build Sounding Board, we develop a system architecture that is capable of accommodating dialog strategies that we designed for socialbot conversations. The architecture consists of a multi-dimensional language understanding module for analyzing user utterances, a hierarchical dialog management framework for dialog context tracking and complex dialog control, and a language generation process that realizes the response plan and makes adjustments for speech synthesis. Additionally, we construct a new knowledge base to power the socialbot by collecting social chat content from a variety of sources. An important contribution of the system is the synergy between the knowledge base and the dialog management, i.e., the use of a graph structure to organize the knowledge base that makes dialog control very efficient in bringing related content to the discussion. Using the data collected from Sounding Board during the competition, we carry out in-depth analyses of socialbot conversations and user ratings, which provide valuable insights into evaluation methods for socialbots. We additionally investigate a new approach for system evaluation and diagnosis that allows scoring individual dialog segments in the conversation. Finally, observing that socialbots suffer from the issue of shallow conversations about topics associated with unstructured data, we study the problem of enabling extended socialbot conversations grounded on a document. To bring together machine reading and dialog control techniques, a graph-based document representation is proposed, together with methods for automatically constructing the graph. Using the graph-based representation, dialog control can be carried out by retrieving nodes or moving along edges in the graph. To illustrate the usage, a mixed-initiative dialog strategy is designed for socialbot conversations on news articles.
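The idea of steering a conversation by moving along edges of a content graph can be sketched in a few lines. The graph contents, the name `next_topic`, and the "first undiscussed neighbor" policy below are assumptions for illustration only; they are not Sounding Board's actual data structures or strategy.

```python
# Hypothetical content graph: each topic node lists related topic nodes.
CONTENT_GRAPH = {
    "space": ["mars", "nasa"],
    "mars": ["space", "rovers"],
    "nasa": ["space", "rovers"],
    "rovers": ["mars", "nasa"],
}


def next_topic(current, discussed):
    """Follow an edge from the current node to a related topic not yet discussed."""
    for neighbor in CONTENT_GRAPH.get(current, []):
        if neighbor not in discussed:
            return neighbor
    # No fresh neighbor (or unknown topic): the caller must pick a new entry point.
    return None


if __name__ == "__main__":
    discussed = {"space"}
    topic = "space"
    while topic is not None:
        print("talking about:", topic)
        topic = next_topic(topic, discussed)
        if topic is not None:
            discussed.add(topic)
```

Because related content sits one edge away, choosing the next thing to say becomes a local lookup rather than a search over the whole knowledge base.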