AITopics | Information Extraction

Information Extraction

Mining Twitter Data with Python Part 1: Collecting Data

@machinelearnbot Feb-14-2018, 22:31:18 GMT

Twitter is a popular social network where users share short SMS-like messages called tweets. Users share thoughts, links and pictures on Twitter, journalists comment on live events, and companies promote products and engage with customers. The list of ways to use Twitter is long, and with 500 million tweets per day, there is a lot of data to analyse and play with. This is the first in a series of articles dedicated to mining data on Twitter using Python. In this first part, we'll see different options for collecting data from Twitter.

artificial intelligence, natural language, tweet, (14 more...)

@machinelearnbot

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.53)

Making use of sentiment analysis

#artificialintelligence Feb-11-2018, 17:26:15 GMT

Sentiment analysis is the analysis of texts to determine the opinion and attitude their writers or speakers express, and how the results can be used. It is also known as opinion mining. In its simplest form, it's a way of determining how positive or negative the content of a text document is, based on the relative numbers of words it contains that are classified as either positive or negative. Positive words would include words such as 'amazing', 'friendly', 'clean', 'exceeded', and 'prompt'. Negative words could be words like 'scam', 'unprofessional', 'rude', 'refund', and 'incompetent'.
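The word-counting idea described above can be sketched in a few lines of Python. This is only a minimal illustration: the two word sets reuse the examples from the text, while the function name `polarity` and the normalisation to a score between -1 and 1 are assumptions made for this sketch, not part of the original article.

```python
import re

# Illustrative word lists taken from the examples in the text;
# a practical lexicon would be far larger.
POSITIVE = {"amazing", "friendly", "clean", "exceeded", "prompt"}
NEGATIVE = {"scam", "unprofessional", "rude", "refund", "incompetent"}


def polarity(text):
    """Return (positives - negatives) / (positives + negatives), or 0.0
    when the text contains no lexicon words (hypothetical scoring rule)."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


if __name__ == "__main__":
    print(polarity("The staff were friendly and the room was clean."))
    print(polarity("Rude and unprofessional, I want a refund."))
```

A score near 1 suggests mostly positive vocabulary and a score near -1 mostly negative vocabulary. Counting words this way ignores negation and context, which is why it is only the simplest form of the technique.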

artificial intelligence, natural language, sentiment, (10 more...)

#artificialintelligence

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Mars Target Encyclopedia: Rock and Soil Composition Extracted From the Literature

Wagstaff, Kiri L. (California Institute of Technology) | Francis, Raymond (California Institute of Technology) | Gowda, Thamme (California Institute of Technology) | Lu, You (Information Sciences Institute, University of Southern California) | Riloff, Ellen (California Institute of Technology) | Singh, Karanjeet (University of Utah) | Lanza, Nina L. (California Institute of Technology)

AAAI Conferences Feb-8-2018

We have constructed an information extraction system called the Mars Target Encyclopedia that takes in planetary science publications and extracts scientific knowledge about target compositions. The extracted knowledge is stored in a searchable database that can greatly accelerate the ability of scientists to compare new discoveries with what is already known. To date, we have applied this system to ~6000 documents and achieved 41-56% precision in the extracted information.

relation, upstream oil & gas, us government, (26 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

North America > United States > California (1.00)
Europe > France (0.29)
North America > United States > New Mexico (0.29)
(8 more...)

Genre: Research Report > New Finding (0.89)

Industry:

Health & Medicine (1.00)
Energy > Oil & Gas > Upstream (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.95)
Information Technology > Information Management (0.93)
(2 more...)

Multi-Entity Aspect-Based Sentiment Analysis With Context, Entity and Aspect Memory

Yang, Jun (Nanjing University) | Yang, Runqi (Nanjing University) | Wang, Chongjun (Nanjing University) | Xie, Junyuan (Nanjing University)

AAAI Conferences Feb-8-2018

Inspired by recent work in Aspect-Based Sentiment Analysis (ABSA) on product reviews and faced with more complex posts on social media platforms mentioning multiple entities as well as multiple aspects, we define a novel task called Multi-Entity Aspect-Based Sentiment Analysis (ME-ABSA). This task aims at fine-grained sentiment analysis of (entity, aspect) combinations, making the well-studied ABSA task a special case of it. To address the task, we propose an innovative method, called the CEA method, that models Context memory, Entity memory and Aspect memory. Our experimental results show that our CEA method achieves a significant gain over several baselines, including the state-of-the-art method for the ABSA task, and their enhanced versions, on datasets for ME-ABSA and ABSA tasks. The in-depth analysis illustrates the significant advantage of the CEA method over baseline methods for several hard-to-predict post types. Furthermore, we show that the CEA method is capable of generalizing to new (entity, aspect) combinations with little loss of accuracy. This observation indicates that data annotation in real applications can be largely simplified.

dataset, sentiment analysis, vector, (14 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(4 more...)

Genre:

Research Report > Promising Solution (0.54)
Research Report > New Finding (0.34)

Industry: Education (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Cross-Lingual Propagation for Deep Sentiment Analysis

Dong, Xin (Rutgers University) | Melo, Gerard de (Rutgers University)

AAAI Conferences Feb-8-2018

Given such valuable data, modern deep learning-based sentiment analysis methods excel at determining the sentiment polarity of what is being said about companies, products, etc. (Wang et al. 2015). Unfortunately, such deep methods require substantial amounts of training data, because multiple levels of computation, each with additional weights and parameters, [...] For many languages and domains, there is a paucity of available data and resources. In some cases, it may be challenging to obtain sufficient in-domain training data, both because there may be less data available online and because it may be somewhat harder to find annotators. Hence, a question that arises is whether one can assist deep networks by incorporating external cues that enable the model to generalize better.

proc, sentiment, vector, (15 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Industry: Media (0.94)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Sentiment Lexicon Enhanced Attention-Based LSTM for Sentiment Classification

Lei, Zeyang (Tsinghua University) | Yang, Yujiu (Tsinghua University) | Yang, Min (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)

AAAI Conferences Feb-8-2018

Deep neural networks have recently achieved great success in sentiment classification. However, these approaches do not fully exploit linguistic knowledge. In this paper, we propose a novel sentiment lexicon enhanced attention-based LSTM (SLEA-LSTM) model to improve the performance of sentence-level sentiment classification. Our method integrates a sentiment lexicon into deep neural networks via single-head or multi-head attention mechanisms. We conduct extensive experiments on the MR and SST datasets. The experimental results show that our model achieves performance comparable to or better than the state-of-the-art methods.

artificial intelligence, machine learning, natural language, (14 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.18)

Genre: Research Report > New Finding (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Improving Review Representations With User Attention and Product Attention for Sentiment Classification

Wu, Zhen (Nanjing University) | Dai, Xin-Yu (Nanjing University) | Yin, Cunyan (Nanjing University) | Huang, Shujian (Nanjing University) | Chen, Jiajun (Nanjing University)

AAAI Conferences Feb-8-2018

Neural network methods have achieved great success in review sentiment classification. Recently, some works achieved improvements by incorporating user and product information to generate a review representation. However, in reviews, we observe that some words or sentences strongly reflect a user's preferences, while others tend to indicate a product's characteristics. The two kinds of information play different roles in determining the sentiment label of a review. Therefore, it is not reasonable to encode user and product information together into one representation. In this paper, we propose a novel framework to encode user and product information. First, we apply two individual hierarchical neural networks to generate two representations, one with user attention and one with product attention. Then, we design a combined strategy to make full use of the two representations for training and final prediction. The experimental results show that our model clearly outperforms other state-of-the-art methods on the IMDB and Yelp datasets. Through the visualization of attention over words related to user or product, we validate the observation mentioned above.

artificial intelligence, machine learning, natural language, (17 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Content and Context: Two-Pronged Bootstrapped Learning for Regex-Formatted Entity Extraction

Simoes, Stanley (Indian Institute of Technology Madras) | P, Deepak (Queen's University Belfast) | Sairamesh, Munu (Indian Institute of Technology Madras) | Khemani, Deepak (Indian Institute of Technology Madras) | Mehta, Sameep (IBM Research - India)

AAAI Conferences Feb-8-2018

Regular expressions are an important building block of rule-based information extraction systems. Regexes can encode rules to recognize instances of simple entities which can then feed into the identification of more complex cross-entity relationships. Manually crafting a regex that recognizes all possible instances of an entity is difficult since an entity can manifest in a variety of different forms. Thus, the problem of automatically generalizing manually crafted seed regexes to improve the recall of IE systems has attracted research attention. In this paper, we propose a bootstrapped approach to improve the recall for extraction of regex-formatted entities, with the only source of supervision being the seed regex. Our approach starts from a manually authored high precision seed regex for the entity of interest, and uses the matches of the seed regex and the context around these matches to identify more instances of the entity. These are then used to identify a set of diverse, high recall regexes that are representative of this entity. Through an empirical evaluation over multiple real world document corpora, we illustrate the effectiveness of our approach.
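The general idea of using seed matches and their surrounding context to find further instances can be illustrated with a toy Python sketch. This is not the authors' algorithm: the function name `bootstrap_candidates`, the whitespace tokenisation, the one-word left-context window, and the phone-number example are all assumptions chosen to keep the illustration small.

```python
import re
from collections import Counter


def bootstrap_candidates(text, seed_regex, window=1):
    """Toy illustration: learn the left contexts of seed matches, then
    return tokens that share such a context but that the seed misses."""
    seed = re.compile(seed_regex)
    tokens = text.split()

    # Step 1: record the words immediately preceding each seed match.
    contexts = Counter()
    for i, token in enumerate(tokens):
        if i >= window and seed.fullmatch(token):
            contexts[tuple(tokens[i - window:i])] += 1

    # Step 2: collect tokens found in a known context that the seed
    # regex does not match; these are candidate new instances.
    candidates = []
    for i, token in enumerate(tokens):
        if i < window or seed.fullmatch(token):
            continue
        if tuple(tokens[i - window:i]) in contexts:
            candidates.append(token)
    return candidates


# Hypothetical example: the seed only recognises dash-separated numbers.
text = "call 555-123-4567 today or call 555.987.6543 tomorrow"
found = bootstrap_candidates(text, r"\d{3}-\d{3}-\d{4}")
print(found)
```

In this sketch the dot-separated number is surfaced because it follows the same context word as a seed match. A fuller approach along the lines of the abstract would go on to generalise such candidates into new regexes rather than stopping at a list of strings.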

artificial intelligence, natural language, text processing, (16 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.82)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.68)

Cognition-Cognizant Sentiment Analysis With Multitask Subjectivity Summarization Based on Annotators' Gaze Behavior

Mishra, Abhijit (IBM Research AI) | Tamilselvam, Srikanth (IBM Research AI) | Dasgupta, Riddhiman (IBM Research AI) | Nagar, Seema (IBM Research AI) | Dey, Kuntal (IBM Research AI)

AAAI Conferences Feb-8-2018

For document-level sentiment analysis (SA), Subjectivity Extraction, i.e., extracting the relevant subjective portions of the text that cover the overall sentiment expressed in the document, is an important step. Subjectivity Extraction, however, is a hard problem for systems, as it demands a great deal of world knowledge and reasoning. Humans, on the other hand, are good at extracting relevant subjective summaries from an opinionated document (say, a movie review), while inferring the sentiment expressed in it. This capability is manifested in their eye-movement behavior while reading: words pertaining to the subjective summary of the text attract a lot more attention in the form of gaze-fixations and/or saccadic patterns. We propose a multi-task deep neural framework for document-level sentiment analysis that learns to predict the overall sentiment expressed in the given input document, by simultaneously learning to predict human gaze behavior and auxiliary linguistic tasks like part-of-speech and syntactic properties of words in the document. For this, a multi-task learning algorithm based on multi-layer shared LSTM augmented with task-specific classifiers is proposed. With this composite multi-task network, we obtain performance competitive with or better than state-of-the-art approaches in SA. Moreover, the availability of gaze predictions as an auxiliary output helps interpret the system better; for instance, gaze predictions reveal that the system indeed performs subjectivity extraction better, which accounts for improvement in document-level sentiment analysis performance.

machine learning, natural language, sentiment analysis, (17 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia (0.28)

Genre:

Research Report (0.48)
Overview (0.48)

Industry:

Media > Film (1.00)
Leisure & Entertainment (0.90)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hierarchical Attention Transfer Network for Cross-Domain Sentiment Classification

Li, Zheng (Hong Kong University of Science and Technology) | Wei, Ying (Hong Kong University of Science and Technology) | Zhang, Yu (Hong Kong University of Science and Technology) | Yang, Qiang (Hong Kong University of Science and Technology)

AAAI Conferences Feb-8-2018

Cross-domain sentiment classification aims to leverage useful information in a source domain to aid sentiment classification in a target domain that has little or no supervised information. Existing cross-domain sentiment classification methods cannot automatically capture non-pivots, i.e., the domain-specific sentiment words, and pivots, i.e., the domain-shared sentiment words, simultaneously. In order to solve this problem, we propose a Hierarchical Attention Transfer Network (HATN) for cross-domain sentiment classification. The proposed HATN provides a hierarchical attention transfer mechanism which can transfer attentions for emotions across domains by automatically capturing pivots and non-pivots. In addition, the hierarchy of the attention mechanism mirrors the hierarchical structure of documents, which can help locate the pivots and non-pivots better. The proposed HATN consists of two hierarchical attention networks, with one named P-net aiming to find the pivots and the other named NP-net aligning the non-pivots by using the pivots as a bridge. Specifically, P-net first conducts individual attention learning to provide positive and negative pivots for NP-net. Then, P-net and NP-net conduct joint attention learning such that the HATN can simultaneously capture pivots and non-pivots and transfer attentions for emotions across domains. Experiments on the Amazon review dataset demonstrate the effectiveness of HATN.

classification, natural language, text classification, (17 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.28)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
