AITopics | Information Extraction

Opinion Mining for Relating Subjective Expressions and Annual Earnings in US Financial Statements

Chen, Chien-Liang, Liu, Chao-Lin, Chang, Yuan-Chen, Tsai, Hsiang-Ping

arXiv.org Artificial Intelligence | Oct-14-2012

Financial statements contain quantitative information and managers' subjective evaluations of a firm's financial status. This study uses information released in U.S. 10-K filings. Both qualitative and quantitative appraisals are crucial for quality financial decisions. To extract such opinionated statements from the reports, we built tagging models based on conditional random field (CRF) techniques, considering a variety of combinations of linguistic factors including morphology, orthography, predicate-argument structure, syntax, and simple semantics. Our results show that the CRF models are reasonably effective at finding opinion holders in experiments in which we adopted the popular MPQA corpus for training and testing. The contribution of our paper is to identify opinion patterns in the form of multiword expressions (MWEs) rather than single words. We find that the managers of corporations attempt to use more optimistic words to obfuscate negative financial performance and to accentuate positive financial performance. Our results also show that decreasing earnings were often accompanied by ambiguous and mild statements in the reporting year and that increasing earnings were stated in an assertive and positive way.
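
As a rough illustration of the kind of per-token linguistic factors the abstract mentions, the sketch below builds simple orthographic, morphological, and context features for each token. It is only a sketch under stated assumptions: the feature names, the `token_features` and `sentence_features` helpers, and the example sentence are hypothetical and not taken from the paper, and the predicate-argument, syntactic, and semantic factors are omitted because they would require a parser. The output is a list of feature dictionaries, one per token, which is a shape that CRF toolkits commonly accept.

```python
def token_features(tokens, i):
    """Build a feature dict for tokens[i] (hypothetical feature set)."""
    word = tokens[i]
    return {
        # Morphology: lowercased form and a short suffix.
        "lower": word.lower(),
        "suffix3": word[-3:].lower(),
        # Orthography: capitalization and presence of digits.
        "is_capitalized": word[:1].isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        # Local context: neighbouring words, with boundary markers.
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def sentence_features(tokens):
    """Return one feature dict per token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Illustrative sentence (made up for this sketch, not from a 10-K filing).
sentence = "Management believes revenue will improve in 2013".split()
features = sentence_features(sentence)
```

A sequence tagger would then be trained on such feature lists paired with per-token labels (for example, marking opinion holders); that training step is not shown here.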

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

1210.3865

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Banking & Finance > Trading (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(2 more...)

Sentiment Classification Using the Meaning of Words

Amiri, Hadi (National University of Singapore) | Chua, Tat-Seng (National University of Singapore)

AAAI Conferences | Jul-21-2012

Sentiment Classification (SC) is about assigning a positive, negative, or neutral label to a piece of text based on its overall opinion. This paper describes our in-progress work on extracting the meaning of words for SC. In particular, we investigate the utility of sense-level polarity information for SC. We first show that methods based on common classification features are not robust and that their performance varies widely across different domains. We then show that sense-level polarity features can significantly improve the performance of SC. We use datasets in different domains to study the robustness of the designated features. Our preliminary results show that using the most common sense of each word yields the most robust results across different domains. In addition, our observations show that sense-level polarity information is useful for producing a set of high-quality seed words, which can be used to further improve the SC task.

information, natural language, text classification, (17 more...)

AAAI Conferences

Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence

Country: Asia > Singapore > Central Region > Singapore (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.87)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.87)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.71)

Emoticon Smoothed Language Models for Twitter Sentiment Analysis

Liu, Kun-Lin (Shanghai Jiao Tong University) | Li, Wu-Jun (Shanghai Jiao Tong University) | Guo, Minyi (Shanghai Jiao Tong University)

AAAI Conferences | Jul-21-2012

Twitter sentiment analysis (TSA) has become a hot research topic in recent years. The goal of this task is to discover the attitude or opinion of tweets, which is typically formulated as a machine-learning-based text classification problem. Some methods use manually labeled data to train fully supervised models, while others use noisy labels, such as emoticons and hashtags, for model training. In general, we can only get a limited amount of training data for the fully supervised models because it is very labor-intensive and time-consuming to manually label tweets. As for the models with noisy labels, it is hard for them to achieve satisfactory performance because of the noise in the labels, although it is easy to get a large amount of data for training. Hence, the best strategy is to utilize both manually labeled data and noisily labeled data for training. However, how to seamlessly integrate these two different kinds of data into the same learning framework is still a challenge. In this paper, we present a novel model, called emoticon smoothed language model (ESLAM), to handle this challenge. The basic idea is to train a language model based on the manually labeled data, and then use the noisy emoticon data for smoothing. Experiments on real data sets demonstrate that ESLAM can effectively integrate both kinds of data to outperform those methods using only one of them.
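
The basic idea described above can be sketched as a per-class unigram language model estimated from manually labeled tweets and then smoothed with a model estimated from emoticon-labeled tweets. This is a minimal sketch, not the authors' implementation: the abstract does not give the smoothing formula, so linear interpolation with a weight `alpha` is an assumption here, and the helper names and the toy tweets are hypothetical.

```python
import math
from collections import Counter


def unigram_model(docs, vocab, k=1.0):
    """Add-k smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts[w] for w in vocab) + k * len(vocab)
    return {w: (counts[w] + k) / total for w in vocab}


def interpolate(p_manual, p_emoticon, alpha):
    """Assumed smoothing: alpha * manual model + (1 - alpha) * emoticon model."""
    return {w: alpha * p_manual[w] + (1 - alpha) * p_emoticon[w] for w in p_manual}


def classify(tokens, models):
    """Return the class whose model gives the highest log-likelihood.

    Tokens outside the vocabulary are ignored.
    """
    def score(model):
        return sum(math.log(model[t]) for t in tokens if t in model)
    return max(models, key=lambda c: score(models[c]))


# Toy data (made up): a few manually labeled tweets and more emoticon-labeled ones.
manual = {"pos": [["great", "day"]], "neg": [["bad", "day"]]}
emoticon = {
    "pos": [["love", "this"], ["great", "fun"]],
    "neg": [["hate", "this"], ["bad", "awful"]],
}
vocab = sorted({t for data in (manual, emoticon) for docs in data.values()
                for doc in docs for t in doc})

models = {
    c: interpolate(unigram_model(manual[c], vocab),
                   unigram_model(emoticon[c], vocab),
                   alpha=0.5)
    for c in ("pos", "neg")
}
prediction = classify(["love", "fun"], models)
```

In this toy setup the words "love" and "fun" never occur in the manually labeled tweets, so the manual models alone cannot separate the two classes; the emoticon data supplies the missing evidence through the interpolation term.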

machine learning, natural language, tweet, (18 more...)

AAAI Conferences

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.48)

Industry: Information Technology > Services (0.65)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.86)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.86)

Modeling Textual Cohesion for Event Extraction

Huang, Ruihong (University of Utah) | Riloff, Ellen (University of Utah)

AAAI Conferences | Jul-21-2012

Event extraction systems typically locate the role fillers for an event by analyzing sentences in isolation and identifying each role filler independently of the others. We argue that more accurate event extraction requires a view of the larger context to decide whether an entity is related to a relevant event. We propose a bottom-up approach to event extraction that initially identifies candidate role fillers independently and then uses that information as well as discourse properties to model textual cohesion. The novel component of the architecture is a sequentially structured sentence classifier that identifies event-related story contexts. The sentence classifier uses lexical associations and discourse relations across sentences, as well as domain-specific distributions of candidate role fillers within and across sentences. This approach yields state-of-the-art performance on the MUC-4 data set, achieving substantially higher precision than previous systems.

machine learning, natural language, role filler, (17 more...)

AAAI Conferences

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Israel (0.04)

Industry:

Government (0.93)
Law Enforcement & Public Safety (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Table Header Detection and Classification

Fang, Jing (Peking University) | Mitra, Prasenjit (The Pennsylvania State University) | Tang, Zhi (Peking University) | Giles, C. Lee (The Pennsylvania State University)

AAAI Conferences | Jul-21-2012

In digital libraries, a table, as a specific document component as well as a condensed way to present structured and relational data, contains rich information and is often the only source of that information. In order to explore, retrieve, and reuse that data, tables should be identified and the data extracted. Table recognition is an old field of research. However, due to the diversity of table styles, the results are still far from satisfactory, and no single algorithm performs well on all types of tables. In this paper, we randomly take samples from CiteSeerX to investigate diverse table styles for automatic table extraction. We find that table headers are one of the main characteristics of complex table styles. We identify a set of features that can be used to segregate headers from tabular data and build a classifier to detect table headers. Our empirical evaluation on PDF documents shows that a Random Forest classifier achieves an accuracy of 92%.

header, machine learning, natural language, (21 more...)

AAAI Conferences

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania > Centre County > University Park (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Asia > China > Beijing > Beijing (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

SenticNet 2: A Semantic and Affective Resource for Opinion Mining and Sentiment Analysis

Cambria, Erik (National University of Singapore) | Havasi, Catherine (MIT Media Lab) | Hussain, Amir (University of Stirling)

AAAI Conferences | May-20-2012

Web 2.0 has changed the ways people communicate, collaborate, and express their opinions and sentiments. But despite social data on the Web being perfectly suitable for human consumption, they remain hardly accessible to machines. To bridge the cognitive and affective gap between word-level natural language data and the concept-level sentiments conveyed by them, we developed SenticNet 2, a publicly available semantic and affective resource for opinion mining and sentiment analysis. SenticNet 2 is built by means of sentic computing, a new paradigm that exploits both AI and Semantic Web techniques to better recognize, interpret, and process natural language opinions. By providing the semantics and sentics (that is, the cognitive and affective information) associated with over 14,000 concepts, SenticNet 2 represents one of the most comprehensive semantic resources for the development of affect-sensitive applications in fields such as social data mining, multimodal affective HCI, and social media marketing.

information, proceedings, senticnet 2, (16 more...)

AAAI Conferences

Twenty-Fifth International FLAIRS Conference

Country:

Europe > United Kingdom (0.29)
Asia > Singapore (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Health & Medicine > Health Care Providers & Services (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Influenza Patients Are Invisible in the Web: Traditional Model Still Improves the State of the Art Web Based Influenza Surveillance

Aramaki, Eiji (University of Tokyo) | Maskawa, Sachiko (University of Tokyo) | Morita, Mizuki

AAAI Conferences | Mar-25-2012

Although web-based information extraction systems draw much attention, most such systems assume that the web directly reflects the real world. For instance, Google Flu Trends, which is one of the state-of-the-art influenza surveillance systems, relies on the basic idea that the number of influenza-related search queries directly correlates with the number of influenza patients. However, the real patients suffering from influenza symptoms are invisible on the web, because they do not use the Internet. Considering this gap, this paper employs an infectious model, assuming that a potential patient uses the Internet at the first sign of flu. The proposed model improves two types of state-of-the-art systems, a Google-based system (from 0.837 correlation to 0.928) and a Twitter-based system (from 0.898 correlation to 0.918). This study demonstrates that a simple model can easily improve web-based surveillance.
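
The correlation figures quoted above compare a web-derived signal with reported patient counts. The sketch below shows how such a score could be computed, assuming Pearson correlation, which the abstract does not specify. The `pearson` helper and the weekly numbers are hypothetical and purely illustrative; they are not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Made-up weekly series: flu-related web signal vs. reported patients.
web_signal = [10, 20, 40, 30]
patients = [12, 18, 35, 33]
r = pearson(web_signal, patients)
```

A surveillance model would be judged by how close `r` gets to 1 when its estimated series is compared with the official patient counts.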

artificial intelligence, information management, natural language, (19 more...)

AAAI Conferences

2012 AAAI Spring Symposium Series

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)
North America > United States (0.04)
Asia > Southeast Asia (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report > New Finding (0.89)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (0.99)
Information Technology > Information Management > Search (0.92)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.69)

FoodMood: Measuring Global Food Sentiment One Tweet at a Time

Dixon, Natalie (Affect Lab Foundation) | Jakic, Bruno (AI Applied) | Lagerweij, Roderick (AI Applied) | Mooij, Mark (AI Applied) | Yudin, Ekaterina (Affect Lab Foundation)

AAAI Conferences | Feb-22-2012

Do Happy Meals really make us happy? Do salads make us blue? Is cake our comfort? FoodMood is an interactive data visualisation project that gives citizens a rare opportunity to engage with, reflect on, acknowledge, and understand the connection between emotion, obesity, and food. The project explores the opportunities presented by the data-sharing world of today’s cities using global English-language tweets about food coupled with sentiment analysis. It aims to gain a better understanding of global food consumption patterns and their impact on the daily emotional well-being of people against the backdrop of country data such as Gross Domestic Product (GDP) and obesity levels. A key finding is that tweets can be used to find a relationship between certain foods, food sentiment, and obesity levels in countries. Overall, FoodMood shows a majority positive sentiment towards food. Other findings, although constantly evolving, indicate trends such as: globally, meat enjoys a high sentiment rating and is often tweeted about; fast-food companies dominate the food consumption landscapes of most countries’ tweets, although not all of them enjoy equal sentiment ratings across countries. Ultimately, FoodMood reveals a hidden layer of meaningful digital, social, and cultural data that provides a basis for further analysis.

machine learning, natural language, tweet, (19 more...)

AAAI Conferences

Sixth International AAAI Conference on Weblogs and Social Media

Country:

Africa > South Africa (0.05)
Africa > Zimbabwe (0.04)
South America > Argentina (0.04)
(4 more...)

Industry:

Health & Medicine > Consumer Health (1.00)
Consumer Products & Services (1.00)
Education > Health & Safety > School Nutrition (0.55)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.50)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.36)

Emotional Divergence Influences Information Spreading in Twitter

Pfitzner, Rene (ETH Zurich) | Garas, Antonios (ETH Zurich) | Schweitzer, Frank (ETH Zurich)

AAAI Conferences | Feb-22-2012

We analyze data about the micro-blogging site Twitter using sentiment extraction techniques. From an information perspective, Twitter users are involved mostly in two processes: information creation and subsequent distribution (tweeting), and pure information distribution (retweeting), with a pronounced preference for the first. However, a rather substantial fraction of tweets are retweeted. Here, we address the role of the sentiment expressed in tweets for their potential aftermath. We find that although the overall sentiment (polarity) does not influence the probability of a tweet being retweeted, a new measure called "emotional divergence" does have an impact. In general, tweets with high emotional diversity have a better chance of being retweeted, hence influencing the distribution of information.

artificial intelligence, natural language, tweet, (14 more...)

AAAI Conferences

Sixth International AAAI Conference on Weblogs and Social Media

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Hawaii (0.04)
Europe > Middle East > Malta > Port Region > Southern Harbour District > Valletta (0.04)

Industry:

Information Technology > Services (0.50)
Media > News (0.35)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.89)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.89)

Do You Feel What I Feel? Social Aspects of Emotions in Twitter Conversations

Kim, Suin (KAIST) | Bak, JinYeong (KAIST) | Oh, Alice Haeyun (KAIST)

AAAI Conferences | Feb-22-2012

We present a computational framework for understanding the social aspects of emotions in Twitter conversations. Using unannotated data and semi-supervised machine learning, we look at emotional transitions, emotional influences among the conversation partners, and patterns in the overall emotional exchanges. We find that conversational partners usually express the same emotion, which we name emotion accommodation, but when they do not, one of the conversational partners tends to respond with a positive emotion. We also show that tweets containing sympathy, apology, and complaint are significant emotion influencers. We verify the emotion classification part of our framework against a human-annotated corpus.

artificial intelligence, natural language, social media, (16 more...)

AAAI Conferences

Sixth International AAAI Conference on Weblogs and Social Media

Country:

Asia > South Korea (0.05)
North America > United States > Hawaii (0.05)
North America > United States > New York (0.04)

Industry: Information Technology > Services (0.48)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.47)
