AITopics

2012 AAAI Fall Symposium Series

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Maryland > Baltimore (0.04)
North America > Puerto Rico (0.04)
Europe > United Kingdom (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.48)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (0.97)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.96)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.51)

AAAI Conferences, Nov-5-2012

Discovering Health Beliefs in Twitter

Bhattacharya, Sanmitra (The University of Iowa) | Tran, Hung (The University of Iowa) | Srinivasan, Padmini (The University of Iowa)

Social networking websites such as Twitter have invigorated a wide range of studies in recent years, from analyzing consumer opinions on products to tracking the spread of diseases. While sentiment analysis and opinion mining from tweets have been studied extensively, surveillance of beliefs, especially those related to public health, has received considerably less attention. In our previous work, we proposed a model for surveillance of health beliefs on Twitter relying on the use of hand-picked probe statements expressing various health-related propositions. In this work we extend our model to automatically discover various probes related to public health beliefs. We present a data-driven approach based on two distinct datasets and study the prevalence of public belief, disbelief or doubt for newly discovered probe statements.

artificial intelligence, natural language, social media, (18 more...)

2012 AAAI Fall Symposium Series

Country:

North America > United States > Iowa > Johnson County > Iowa City (0.14)
South America (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
(9 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.68)

Virtanen, Seppo | Jia, Yangqing | Klami, Arto | Darrell, Trevor

Factorized Multi-Modal Topic Model

arXiv.org Machine Learning, Oct-16-2012

Multi-modal data collections, such as corpora of paired images and text snippets, require analysis methods beyond single-view component and topic models. For continuous observations the current dominant approach is based on extensions of canonical correlation analysis, factorizing the variation into components shared by the different modalities and those private to each of them. For count data, multiple variants of topic models attempting to tie the modalities together have been presented. All of these, however, lack the ability to learn components private to one modality, and consequently will try to force dependencies even between minimally correlating modalities. In this work we combine the two approaches by presenting a novel HDP-based topic model that automatically learns both shared and private topics. The model is shown to be especially useful for querying the contents of one domain given samples of the other.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1210.4920

Country:

Europe (0.93)
Asia > Middle East (0.15)

Genre: Research Report (0.40)

Industry: Transportation (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine Learning, Oct-16-2012

Latent Dirichlet Allocation Uncovers Spectral Characteristics of Drought Stressed Plants

Wahabzada, Mirwaes | Kersting, Kristian | Bauckhage, Christian | Roemer, Christoph | Ballvora, Agim | Pinto, Francisco | Rascher, Uwe | Leon, Jens | Ploemer, Lutz

Understanding how plants adapt to drought stress is essential for improving management practices and breeding strategies, as well as for engineering viable crops for sustainable agriculture in the coming decades. Hyper-spectral imaging provides a particularly promising approach to gaining such understanding, since it allows the non-destructive discovery of spectral characteristics of plants, which are governed primarily by the scattering and absorption characteristics of the leaf's internal structure and biochemical constituents. Several drought stress indices have been derived using hyper-spectral imaging. However, they are typically based on only a few hyper-spectral images, rely on the interpretations of experts, and consider only a few wavelengths. In this study, we present the first data-driven approach to discovering spectral drought stress indices, treating it as an unsupervised labeling problem at massive scale. To make use of short-range dependencies between spectral wavelengths, we develop an online variational Bayes algorithm for latent Dirichlet allocation with a convolved Dirichlet regularizer. This approach scales to massive datasets and, hence, provides a more objective complement to plant physiological practices. The spectral topics found conform to plant physiological knowledge and can be computed in a fraction of the time required by existing LDA approaches.

artificial intelligence, machine learning, natural language, (22 more...)

arXiv.org Machine Learning

1210.4919

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Food & Agriculture > Agriculture (1.00)
Education (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.71)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Chen, Chien-Liang | Liu, Chao-Lin | Chang, Yuan-Chen | Tsai, Hsiang-Ping

Opinion Mining for Relating Subjective Expressions and Annual Earnings in US Financial Statements

arXiv.org Artificial Intelligence, Oct-14-2012

Financial statements contain quantitative information and managers' subjective evaluations of a firm's financial status. When using information released in U.S. 10-K filings, both qualitative and quantitative appraisals are crucial for quality financial decisions. To extract such opinionated statements from the reports, we built tagging models based on conditional random field (CRF) techniques, considering a variety of combinations of linguistic factors including morphology, orthography, predicate-argument structure, syntax, and simple semantics. Our results show that the CRF models are reasonably effective at finding opinion holders in experiments that used the popular MPQA corpus for training and testing. The contribution of our paper is to identify opinion patterns in the form of multiword expressions (MWEs) rather than single words. We find that the managers of corporations attempt to use more optimistic words to obfuscate negative financial performance and to accentuate positive financial performance. Our results also show that decreasing earnings were often accompanied by ambiguous and mild statements in the reporting year and that increasing earnings were stated in an assertive and positive way.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

1210.3865

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Banking & Finance > Trading (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(2 more...)

Dubey, Avinava | Hefny, Ahmed | Williamson, Sinead | Xing, Eric P.

A non-parametric mixture model for topic modeling over time

arXiv.org Machine Learning, Aug-21-2012

A single, stationary topic model such as latent Dirichlet allocation is inappropriate for modeling corpora that span long time periods, as the popularity of topics is likely to change over time. A number of models that incorporate time have been proposed, but in general they either exhibit limited forms of temporal variation or require computationally expensive inference methods. In this paper we propose nonparametric Topics over Time (npTOT), a model for time-varying topics that allows an unbounded number of topics and a flexible distribution over the temporal variations in those topics' popularity. We develop a collapsed Gibbs sampler for the proposed model and compare it against existing models on synthetic and real document sets.

dirichlet process, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

1208.4411

Country: North America > United States (0.47)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Epstein, Susan L. (Hunter College and The Graduate Center of The City University of New York) | Passonneau, Rebecca J. (Center for Computational Learning Systems, Columbia University) | Ligorio, Tiziana (Hunter College of The City University of New York) | Gordon, Joshua (Columbia University)

Toward Habitable Assistance from Spoken Dialogue Systems

Spoken dialogue is increasingly central to systems that assist people. As the tasks that people and machines speak about together become more complex, however, users’ dissatisfaction with those systems is an important concern. This paper presents a novel approach to learning for spoken dialogue systems. It describes embedded wizardry, a methodology for learning from skilled people, and applies it to a library whose patrons order books by telephone. To address the challenges inherent in this application, we introduce RFW+, a domain-independent, feature-selection method that considers feature categories. Models learned with RFW+ on embedded-wizard data improve the performance of a traditional spoken dialogue system.

checkitout, dialogue system, wizard, (16 more...)

Twenty-Fourth IAAI Conference

Country:

North America > United States > New York > New York County > New York City (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(3 more...)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Heart Rate Topic Models

Esbroeck, Alexander Van (University of Michigan) | Chia, Chih-Chun (University of Michigan) | Syed, Zeeshan (University of Michigan)

A key challenge in reducing the burden of cardiovascular disease is matching patients to treatments that are most appropriate for them. Different cardiac assessment tools have been developed to address this goal. Recent research has focused on heart rate motifs, i.e., short-term heart rate sequences that are over- or under-represented in long-term electrocardiogram (ECG) recordings of patients experiencing cardiovascular outcomes, which provide novel and valuable information for risk stratification. However, this approach can leverage only a small number of motifs for prediction and results in difficult-to-interpret models. We address these limitations by identifying latent structure in the large numbers of motifs found in long-term ECG recordings. In particular, we explore the application of topic models to heart rate time series to identify functional sets of heart rate sequences and to concisely describe patients using task-independent features for various cardiovascular outcomes. We evaluate the approach on a large collection of real-world ECG data, and investigate the performance of topic mixture features for the prediction of cardiovascular mortality. The topics provided an interpretable representation of the recordings and maintained valuable information for clinical assessment when compared with motif frequencies, even after accounting for commonly used clinical risk scores.

heart rate motif, motif, topic model, (11 more...)

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Asia > Middle East > Jordan (0.05)

Genre: Research Report > Experimental Study (0.94)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Amiri, Hadi (National University of Singapore) | Chua, Tat-Seng (National University of Singapore)

Sentiment Classification Using the Meaning of Words

Sentiment Classification (SC) assigns a positive, negative or neutral label to a piece of text based on its overall opinion. This paper describes our in-progress work on extracting the meaning of words for SC. In particular, we investigate the utility of sense-level polarity information for SC. We first show that methods based on common classification features are not robust and that their performance varies widely across different domains. We then show that sense-level polarity features can significantly improve the performance of SC. We use datasets in different domains to study the robustness of the designated features. Our preliminary results show that using the most common sense of each word yields the most robust results across different domains. In addition, our observations show that sense-level polarity information is useful for producing a set of high-quality seed words that can be used to further improve the SC task.

information, natural language, text classification, (17 more...)

Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence

Country: Asia > Singapore > Central Region > Singapore (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.87)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.87)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.71)

Emoticon Smoothed Language Models for Twitter Sentiment Analysis

Liu, Kun-Lin (Shanghai Jiao Tong University) | Li, Wu-Jun (Shanghai Jiao Tong University) | Guo, Minyi (Shanghai Jiao Tong University)

Twitter sentiment analysis (TSA) has become a hot research topic in recent years. The goal of this task is to discover the attitude or opinion expressed in tweets, which is typically formulated as a machine-learning-based text classification problem. Some methods use manually labeled data to train fully supervised models, while others use noisy labels, such as emoticons and hashtags, for model training. In general, only a limited amount of training data is available for the fully supervised models because manually labeling tweets is labor-intensive and time-consuming. Models trained with noisy labels can draw on a large amount of data, but the noise in the labels makes it hard for them to achieve satisfactory performance. Hence, the best strategy is to utilize both manually labeled data and noisy labeled data for training. However, how to seamlessly integrate these two different kinds of data into the same learning framework is still a challenge. In this paper, we present a novel model, called emoticon smoothed language model (ESLAM), to handle this challenge. The basic idea is to train a language model based on the manually labeled data, and then use the noisy emoticon data for smoothing. Experiments on real data sets demonstrate that ESLAM can effectively integrate both kinds of data to outperform those methods using only one of them.
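
The basic idea stated in the abstract (a language model trained on manually labeled data, smoothed with noisy emoticon data) can be illustrated with a minimal sketch. This is not the ESLAM formulation itself, which the abstract does not spell out: the sketch assumes unigram models and plain linear interpolation, and every name in it (`unigram_probs`, `smoothed_model`, `classify`, the weight `alpha`, the toy tweets) is hypothetical.

```python
import math
from collections import Counter


def unigram_probs(docs, vocab):
    """Maximum-likelihood unigram estimates over a fixed vocabulary."""
    counts = Counter(tok for doc in docs for tok in doc if tok in vocab)
    total = sum(counts.values())
    return {w: (counts[w] / total if total else 0.0) for w in vocab}


def smoothed_model(labeled_docs, emoticon_docs, vocab, alpha=0.5):
    """Interpolate labeled-data estimates with noisy emoticon-data estimates.

    Assumption: linear interpolation with weight alpha stands in for
    whatever smoothing scheme the paper actually uses.
    """
    p_labeled = unigram_probs(labeled_docs, vocab)
    p_noisy = unigram_probs(emoticon_docs, vocab)
    return {w: alpha * p_labeled[w] + (1 - alpha) * p_noisy[w] for w in vocab}


def log_likelihood(tokens, model, floor=1e-9):
    """Log-probability of a token sequence; unseen words get a small floor."""
    return sum(math.log(max(model.get(t, 0.0), floor)) for t in tokens)


def classify(tokens, pos_model, neg_model):
    """Pick the class whose language model assigns the higher likelihood."""
    pos_ll = log_likelihood(tokens, pos_model)
    neg_ll = log_likelihood(tokens, neg_model)
    return "positive" if pos_ll >= neg_ll else "negative"


# Toy data (invented for illustration): a few hand-labeled tweets and
# tweets whose label is inferred from an emoticon.
labeled_pos = [["great", "phone"]]
labeled_neg = [["bad", "phone"]]
emoticon_pos = [["love", "it"], ["great", "day"]]
emoticon_neg = [["hate", "it"], ["bad", "day"]]

vocab = {
    tok
    for docs in (labeled_pos, labeled_neg, emoticon_pos, emoticon_neg)
    for doc in docs
    for tok in doc
}

pos_model = smoothed_model(labeled_pos, emoticon_pos, vocab)
neg_model = smoothed_model(labeled_neg, emoticon_neg, vocab)

print(classify(["love", "phone"], pos_model, neg_model))
print(classify(["hate", "phone"], pos_model, neg_model))
```

In this toy setup the words "love" and "hate" never occur in the hand-labeled tweets, so a model built from labeled data alone could not separate the two test inputs; the emoticon-derived estimates supply the missing evidence.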

machine learning, natural language, tweet, (18 more...)

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.48)

Industry: Information Technology > Services (0.65)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.86)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.86)