AITopics

Microblog sentiment classification is an important research topic with wide applications in both academia and industry. Because microblog messages are short, noisy and contain masses of acronyms and informal words, microblog sentiment classification is a very challenging task. Fortunately, the contextual information about these idiosyncratic words collectively provides knowledge about their sentiment orientations. In this paper, we propose to use the microblogs' contextual knowledge mined from a large amount of unlabeled data to help improve microblog sentiment classification. We define two kinds of contextual knowledge: word-word association and word-sentiment association. The contextual knowledge is formulated as regularization terms in supervised learning algorithms. An efficient optimization procedure is proposed to learn the model. Experimental results on benchmark datasets show that our method can consistently and significantly outperform the state-of-the-art methods.
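The abstract does not give the form of its regularization terms, so the following is only an illustrative sketch of one common way to turn word-word association into a penalty: associated words are pushed toward similar classifier weights. The function name, the squared-difference form, and the example words and scores are all assumptions, not the authors' formulation.

```python
def association_penalty(weights, associations):
    """Hypothetical penalty: sum of s * (w_a - w_b)^2 over associated word pairs.

    weights: dict mapping word -> classifier weight
    associations: dict mapping (word_a, word_b) -> association strength s >= 0
    """
    return sum(
        strength * (weights[a] - weights[b]) ** 2
        for (a, b), strength in associations.items()
    )


# Made-up weights and a single made-up association for illustration.
weights = {"gr8": 0.2, "great": 0.9, "awful": -0.8}
associations = {("gr8", "great"): 1.0}

penalty = association_penalty(weights, associations)
```

In a training loop, a term like this would be added to the supervised loss with a tunable coefficient, so that an informal word such as "gr8" inherits sentiment evidence from the words it co-occurs with.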

artificial intelligence, machine learning, natural language, (18 more...)

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.66)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Mining User Interests from Personal Photos

Xie, Pengtao (Carnegie Mellon University) | Pei, Yulong (Carnegie Mellon University) | Xie, Yuan (Carnegie Mellon University) | Xing, Eric (Carnegie Mellon University)

Personal photos are enjoying explosive growth with the popularity of photo-taking devices and social media. The vast number of online photos largely exhibits users' interests, emotions and opinions. Mining user interests from personal photos can boost a number of utilities, such as advertising, interest-based community detection and photo recommendation. In this paper, we study the problem of mining user interests from personal photos. We propose a User Image Latent Space Model to jointly model user interests and image contents. User interests are modeled as latent factors and each user is assumed to have a distribution over them. By inferring the latent factors and users' distributions, we can discover what the users are interested in. We model image contents with a four-level hierarchical structure where the layers correspond to themes, semantic regions, visual words and pixels respectively. Users' latent interests are embedded in the theme layer. Given image contents, users' interests can be discovered by doing posterior inference. We use variational inference to approximate the posteriors of latent variables and learn model parameters. Experiments on 180K Flickr photos demonstrate the effectiveness of our model.

artificial intelligence, machine learning, natural language, (17 more...)

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Asia > Middle East > Jordan (0.05)

Industry: Information Technology > Services (0.50)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.48)

Exploring Social Context for Topic Identification in Short and Noisy Texts

Wang, Xin (Jilin University; Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education) | Wang, Ying (Changchun Institute of Tech) | Zuo, Wanli (Jilin University) | Cai, Guoyong (Jilin University)

With the pervasiveness of social media, topic identification in short texts has attracted increasing attention in recent years. However, social media texts are by nature short and noisy, and their structures are sparse and dynamic, which makes it difficult to identify topic categories exactly from online social media. Inspired by social science findings that preference consistency and social contagion are observed in social media, we investigate topic identification in short and noisy texts by exploring social context from the perspective of social sciences. In particular, we present a mathematical optimization formulation that incorporates the preference consistency and social contagion theories into a supervised learning method, and conduct feature selection to tackle short and noisy texts in social media, which results in a Sociological framework for Topic Identification (STI). Experimental results on real-world datasets from Twitter and Citation Network demonstrate the effectiveness of the proposed framework. Further experiments are conducted to understand the importance of social context in topic identification.

machine learning, natural language, text classification, (22 more...)

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: Asia > China > Jilin Province (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Scalable and Interpretable Data Representation for High-Dimensional, Complex Data

Kim, Been (Massachusetts Institute of Technology) | Patel, Kayur (Google) | Rostamizadeh, Afshin (Google) | Shah, Julie (Massachusetts Institute of Technology)

The majority of machine learning research has been focused on building models and inference techniques with sound mathematical properties and cutting-edge performance. Little attention has been devoted to the development of data representation that can be used to improve a user's ability to interpret the data and machine learning models to solve real-world problems. In this paper, we quantitatively and qualitatively evaluate an efficient, accurate and scalable feature-compression method using latent Dirichlet allocation for discrete data. This representation can effectively communicate the characteristics of high-dimensional, complex data points. We show that the improvement of a user's interpretability through the use of a topic modeling-based compression technique is statistically significant, according to a number of metrics, when compared with other representations. Also, we find that this representation is scalable: it maintains alignment with human classification accuracy as an increasing number of data points are shown. In addition, the learned topic layer can semantically deliver meaningful information to users that could potentially aid human reasoning about data characteristics in connection with compressed topic space.

artificial intelligence, machine learning, natural language, (20 more...)

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.35)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Bayesian Affect Control Theory of Self

Hoey, Jesse (University of Waterloo) | Schroeder, Tobias (Potsdam University of Applied Sciences)

Notions of identity and of the self have long been studied in social psychology and sociology as key guiding elements of social interaction and coordination. In the AI of the future, these notions will also play a role in producing natural, socially appropriate artificially intelligent agents that encompass subtle and complex human social and affective skills. We propose here a Bayesian generalization of the sociological affect control theory of self as a theoretical foundation for socio-affectively skilled artificial agents. This theory posits that each human maintains an internal model of his or her deep sense of "self" that captures their emotional, psychological, and socio-cultural sense of being in the world. The "self" is then externalised as an identity within any given interpersonal and institutional situation, and this situational identity is the person's local (in space and time) representation of the self. Situational identities govern the actions of humans according to affect control theory. Humans will seek situations that allow them to enact identities consistent with their sense of self. This consistency is cumulative over time: if some parts of a person's self are not actualized regularly, the person will have a growing feeling of inauthenticity that they will seek to resolve. In our present generalisation, the self is represented as a probability distribution, allowing it to be multi-modal (a person can maintain multiple different identities), uncertain (a person can be unsure about who they really are), and learnable (agents can learn the identities and selves of other agents). We show how the Bayesian affect control theory of self can underpin artificial agents that are socially intelligent.

artificial intelligence, machine learning, natural language, (19 more...)

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.49)

Industry:

Education (0.68)
Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.86)

Representation Learning for Aspect Category Detection in Online Reviews

Zhou, Xinjie (Peking University) | Wan, Xiaojun (Peking University) | Xiao, Jianguo (Peking University)

User-generated reviews are valuable resources for decision making. Identifying the aspect categories discussed in a given review sentence (e.g., “food” and “service” in restaurant reviews) is an important task of sentiment analysis and opinion mining. Given a predefined aspect category set, most previous research leverages hand-crafted features and a classification algorithm to accomplish the task. The crucial step to achieve better performance is feature engineering, which consumes much human effort and may be unstable when the product domain changes. In this paper, we propose a representation learning approach to automatically learn useful features for aspect category detection. Specifically, a semi-supervised word embedding algorithm is first proposed to obtain continuous word representations on a large set of reviews with noisy labels. Afterwards, we propose to generate deeper and hybrid features through neural networks stacked on the word vectors. A logistic regression classifier is finally trained with the hybrid features to predict the aspect category. The experiments are carried out on a benchmark dataset released by SemEval-2014. Our approach achieves state-of-the-art performance and outperforms the best participating team as well as a few strong baselines.

category, machine learning, natural language, (18 more...)

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.15)

Genre:

Research Report > New Finding (0.49)
Research Report > Experimental Study (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)

Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks

You, Quanzeng (University of Rochester) | Luo, Jiebo (University of Rochester) | Jin, Hailin (Adobe Research) | Yang, Jianchao (Adobe Research)

Sentiment analysis of online user-generated content is important for many social media analytics tasks. Researchers have largely relied on textual sentiment analysis to develop systems to predict political elections, measure economic indicators, and so on. Social media users are increasingly using images and videos to express their opinions and share their experiences. Sentiment analysis of such large scale visual content can help better extract user sentiments toward events or topics, such as those in image tweets, so that prediction of sentiment from visual content is complementary to textual sentiment analysis. Motivated by the need to leverage large scale yet noisy training data to solve the extremely challenging problem of image sentiment analysis, we employ Convolutional Neural Networks (CNN). We first design a suitable CNN architecture for image sentiment analysis. We obtain half a million training samples by using a baseline sentiment algorithm to label Flickr images. To make use of such noisy machine labeled data, we employ a progressive strategy to fine-tune the deep network. Furthermore, we improve the performance on Twitter images by inducing domain transfer with a small number of manually labeled Twitter images. We have conducted extensive experiments on manually labeled Twitter images. The results show that the proposed CNN can achieve better performance in image sentiment analysis than competing algorithms.

machine learning, natural language, sentiment analysis, (18 more...)

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology (0.69)
Government > Voting & Elections (0.66)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Probabilistic Model for Bursty Topic Discovery in Microblogs

Yan, Xiaohui (Institute of Computing Technology, Chinese Academy of Science) | Guo, Jiafeng (Institute of Computing Technology, Chinese Academy of Science) | Lan, Yanyan (Institute of Computing Technology, Chinese Academy of Science) | Xu, Jun (Institute of Computing Technology, Chinese Academy of Science) | Cheng, Xueqi (Institute of Computing Technology, Chinese Academy of Science)

Bursty topic discovery in microblogs is important for people to grasp essential and valuable information. However, the task is challenging since microblog posts are particularly short and noisy. This work develops a novel probabilistic model, namely Bursty Biterm Topic Model (BBTM), to deal with the task. BBTM extends the Biterm Topic Model (BTM) by incorporating the burstiness of biterms as prior knowledge for bursty topic modeling, which enjoys the following merits: 1) It addresses the data sparsity problem in topic modeling over short texts in the same way as BTM; 2) It can automatically discover high quality bursty topics in microblogs in a principled and efficient way. Extensive experiments on a standard Twitter dataset show that our approach outperforms the state-of-the-art baselines significantly.
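For readers unfamiliar with biterms, the sketch below shows only the preprocessing step that BTM-style models build on: a biterm is taken here to be an unordered pair of words co-occurring in the same short text. It does not implement BBTM, its burstiness prior, or any inference; the function names and the toy posts are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair co-occurring in one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def biterm_counts(docs):
    """Count biterms over a collection of tokenized posts."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_biterms(tokens))
    return counts


# Toy, made-up posts that are already tokenized.
posts = [["earthquake", "japan", "tsunami"], ["japan", "earthquake"]]
counts = biterm_counts(posts)
```

Counting such pairs per time slice is one plausible starting point for measuring how sharply a biterm's frequency rises, which is the kind of signal a burstiness prior could draw on.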

bursty topic, machine learning, natural language, (16 more...)

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

Asia (0.29)
North America > United States (0.28)
Europe (0.28)

Industry:

Law Enforcement & Public Safety (0.46)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.70)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.58)

A Tri-Role Topic Model for Domain-Specific Question Answering

Ma, Zongyang (Nanyang Technological University) | Sun, Aixin (Nanyang Technological University) | Yuan, Quan (Nanyang Technological University) | Cong, Gao (Nanyang Technological University)

Stack Overflow and MedHelp are examples of domain-specific community-based question answering (CQA) systems. Different from CQA systems for general topics (e.g., Yahoo! Answers, Baidu Knows), questions and answers in domain-specific CQA systems are mostly in the same topical domain, enabling more comprehensive interaction between users on fine-grained topics. In such systems, users are more likely to ask questions on unfamiliar topics and to answer questions matching their expertise. Users can also vote on answers based on their judgements. In this paper, we propose a Tri-Role Topic Model (TRTM) to model the tri-roles of users (i.e., as askers, answerers, and voters, respectively) and the activities of each role, including composing questions, selecting questions to answer, contributing answers, and voting on answers. The proposed model can be used to enhance CQA systems from many perspectives. As a case study, we conducted experiments on ranking answers for questions on Stack Overflow, a CQA system for professional and enthusiast programmers. Experimental results show that TRTM is effective in helping users obtain ideal rankings of answers, particularly for new and less popular questions. Evaluated on nDCG, TRTM outperforms state-of-the-art methods.
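The abstract reports results in nDCG without stating which variant was used or how relevance grades were derived. As a reminder of what the metric measures, here is a minimal sketch using the linear-gain form with a log2 position discount; the relevance grades in the usage lines are made up.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain for relevance grades listed in ranked order."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades of three answers, in the order a ranker returned them.
perfect = ndcg([3, 2, 1])   # already in ideal order
reversed_order = ndcg([1, 2, 3])  # best answer ranked last
```

A ranking that places the most relevant answers first scores 1.0, and the score drops as relevant answers are pushed further down the list.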

artificial intelligence, natural language, question answering, (18 more...)

Twenty-Ninth AAAI Conference on Artificial Intelligence

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.72)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.61)

Estimating Temporal Dynamics of Human Emotions

Kim, Seungyeon (Georgia Institute of Technology) | Lee, Joonseok (Georgia Institute of Technology) | Lebanon, Guy (Amazon) | Park, Haesun (Georgia Institute of Technology)

Sentiment analysis predicts a one-dimensional quantity describing the positive or negative emotion of an author. Mood analysis extends the one-dimensional sentiment response to a multi-dimensional quantity, describing a diverse set of human emotions. In this paper, we extend sentiment and mood analysis temporally and model emotions as a function of time based on temporal streams of blog posts authored by a specific author. The model is useful for constructing predictive models and discovering scientific models of human emotions.

artificial intelligence, machine learning, natural language, (20 more...)

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.46)

Genre:

Research Report > Experimental Study (0.47)
Research Report > New Finding (0.47)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.34)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.91)