Goto

Collaborating Authors

 Gamon, Michael


Modeling Tag Prediction based on Question Tagging Behavior Analysis of CommunityQA Platform Users

arXiv.org Artificial Intelligence

In community question-answering platforms, tags play essential roles in effective information organization and retrieval, better question routing, faster response to questions, and assessment of topic popularity. Hence, automatic assistance for predicting and suggesting tags for posts is of high utility to users of such platforms. To develop better tag prediction across diverse communities and domains, we performed a thorough analysis of users' tagging behavior in 17 StackExchange communities. We found various common inherent properties of this behavior in those diverse domains. We used the findings to develop a flexible neural tag prediction architecture, which predicts both popular tags and more granular tags for each question. Our extensive experiments and obtained performance show the effectiveness of our model


MS-LaTTE: A Dataset of Where and When To-do Tasks are Completed

arXiv.org Artificial Intelligence

Tasks are a fundamental unit of work in the daily lives of people, who are increasingly using digital means to keep track of, organize, triage and act on them. These digital tools -- such as task management applications -- provide a unique opportunity to study and understand tasks and their connection to the real world, and through intelligent assistance, help people be more productive. By logging signals such as text, timestamp information, and social connectivity graphs, an increasingly rich and detailed picture of how tasks are created and organized, what makes them important, and who acts on them, can be progressively developed. Yet the context around actual task completion remains fuzzy, due to the basic disconnect between actions taken in the real world and telemetry recorded in the digital world. Thus, in this paper we compile and release a novel, real-life, large-scale dataset called MS-LaTTE that captures two core aspects of the context surrounding task completion: location and time. We describe our annotation framework and conduct a number of analyses on the data that were collected, demonstrating that it captures intuitive contextual properties for common tasks. Finally, we test the dataset on the two problems of predicting spatial and temporal task co-occurrence, concluding that predictors for co-location and co-time are both learnable, with a BERT fine-tuned model outperforming several other baselines. The MS-LaTTE dataset provides an opportunity to tackle many new modeling challenges in contextual task understanding and we hope that its release will spur future research in task intelligence more broadly.


Characterizing Stage-Aware Writing Assistance in Collaborative Document Authoring

arXiv.org Artificial Intelligence

Writing is a complex non-linear process that begins with a mental model of intent, and progresses through an outline of ideas, to words on paper (and their subsequent refinement). Despite past research in understanding writing, Web-scale consumer and enterprise collaborative digital writing environments are yet to greatly benefit from intelligent systems that understand the stages of document evolution, providing opportune assistance based on authors' situated actions and context. In this paper, we present three studies that explore temporal stages of document authoring. We first survey information workers at a large technology company about their writing habits and preferences, concluding that writers do in fact conceptually progress through several distinct phases while authoring documents. We also explore, qualitatively, how writing stages are linked to document lifespan. We supplement these qualitative findings with an analysis of the longitudinal user interaction logs of a popular digital writing platform over several million documents. Finally, as a first step towards facilitating an intelligent digital writing assistant, we conduct a preliminary investigation into the utility of user interaction log data for predicting the temporal stage of a document. Our results support the benefit of tools tailored to writing stages, identify primary tasks associated with these stages, and show that it is possible to predict stages from anonymous interaction logs. Together, these results argue for the benefit and feasibility of more tailored digital writing assistance.


SemEval-2020 Task 7: Assessing Humor in Edited News Headlines

arXiv.org Artificial Intelligence

This paper describes the SemEval-2020 shared task "Assessing Humor in Edited News Headlines." The task's dataset contains news headlines in which short edits were applied to make them funny, and the funniness of these edited headlines was rated using crowdsourcing. This task includes two subtasks, the first of which is to estimate the funniness of headlines on a humor scale in the interval 0-3. The second subtask is to predict, for a pair of edited versions of the same original headline, which is the funnier version. To date, this task is the most popular shared computational humor task, attracting 48 teams for the first subtask and 31 teams for the second.


Actionable Email Intent Modeling With Reparametrized RNNs

AAAI Conferences

Emails in the workplace are often intentional calls to action for its recipients. We propose to annotate these emails for what action its recipient will take. We argue that our approach of action-based annotation is more scalable and theory-agnostic than traditional speech-act-based email intent annotation, while still carrying important semantic and pragmatic information. We show that our action-based annotation scheme achieves good inter-annotator agreement. We also show that we can leverage threaded messages from other domains, which exhibit comparable intents in their conversation, with domain adaptive RAINBOW (Recurrently AttentIve Neural Bag-Of-Words). On a collection of datasets consisting of IRC, Reddit, and email, our reparametrized RNNs outperform common multitask/multidomain approaches on several speech act related tasks. We also experiment with a minimally supervised scenario of email recipient action classification, and find the reparametrized RNNs learn a useful representation.


Happy, Nervous or Surprised? Classification of Human Affective States in Social Media

AAAI Conferences

Sentiment classification has been a well-investigated research area in the computational linguistics community. However, most of the research is primarily focused on detecting simply the polarity in text, often needing extensive manual labeling of ground truth. Additionally, little attention has been directed towards a finer analysis of human moods and affective states. Motivated by research in psychology, we propose and develop a classifier of several human affective states in social media. Starting with about 200 moods, we utilize mechanical turk studies to derive naturalistic signals from posts shared on Twitter about a variety of affects of individuals. This dataset is then deployed in an affect classification task with promising results. Our findings indicate that different types of affect involve different emotional content and usage styles; hence the performance of the classifier on various affects can differ considerably.


Not All Moods Are Created Equal! Exploring Human Emotional States in Social Media

AAAI Conferences

Emotional states of individuals, also known as moods, are central to the expression of thoughts, ideas and opinions, and in turn impact attitudes and behavior. As social media tools are increasingly used by individuals to broadcast their day-to-day happenings, or to report on an external event of interest, understanding the rich ‘landscape’ of moods will help us better interpret and make sense of the behavior of millions of individuals. Motivated by literature in psychology, we study a popular representation of human mood landscape, known as the ‘circumplex model’ that characterizes affective experience through two dimensions: valence and activation. We identify more than 200 moods frequent on Twitter, through mechanical turk studies and psychology literature sources, and report on four aspects of mood expression: the relationship between (1) moods and usage levels, including linguistic diversity of shared content (2) moods and the social ties individuals form, (3) moods and amount of network activity of individuals, and (4) moods and participatory patterns of individuals such as link sharing and conversational engagement. Our results provide at-scale naturalistic assessments and extensions of existing conceptualizations of human mood in social media contexts.


Predicting the Importance of Newsfeed Posts and Social Network Friends

AAAI Conferences

As users of social networking websites expand their network of friends, they are often flooded with newsfeed posts and status updates, most of which they consider to be "unimportant" and not newsworthy. In order to better understand how people judge the importance of their newsfeed, we conducted a study in which Facebook users were asked to rate the importance of their newsfeed posts as well as their friends. We learned classifiers of newsfeed and friend importance to identify predictive sets of features related to social media properties, the message text, and shared background information. For classifying friend importance, the best performing model achieved 85% accuracy and 25% error reduction. By leveraging this model for classifying newsfeed posts, the best newsfeed classifier achieved 64% accuracy and 27% error reduction.