Grief-Stricken in a Crowd: The Language of Bereavement and Distress in Social Media

AAAI Conferences

People turn to social media to express their emotions surrounding major life events. The death of a loved one is one scenario in which people share their feelings in the semi-public space of social networking sites. In this paper, we present the results of a two-part investigation of grief and distress in the context of messages posted to the profiles of deceased MySpace users. We present a coding system for identifying emotionally distressed content, followed by a detailed analysis of language use that lays a foundation for natural language processing (NLP) tasks, such as automatic detection of bereavement-related distress. Our findings suggest that in addition to words bearing positive or negative sentiment, linguistic style can be an indicator of messages that demonstrate distress in the space of post-mortem social media content. These results contribute to research in computational linguistics by identifying linguistic features that can be used for automatic classification, as well as to research on death and bereavement by enumerating attributes of distressed self-expression in a post-mortem context.


Modeling Polarizing Topics: When Do Different Political Communities Respond Differently to the Same News?

AAAI Conferences

Political discourse in the United States is becoming increasingly polarized. This polarization frequently causes different communities to react very differently to the same news events. Political blogs, as a form of social media, provide a unique insight into this phenomenon. We present a multitarget, semisupervised latent variable model, MCR-LDA, to model this process by jointly analyzing political blog posts and their comment sections from different political communities to predict the degree of polarization that news topics cause. Inspecting the model after inference reveals topics and the degree to which they trigger polarization. In this approach, community responses to news topics are observed using sentiment polarity and comment volume, the latter serving as a proxy for the level of interest in the topic. In this context, we also present computational methods to assign sentiment polarity to the comments, which serve as targets for latent variable models that predict the polarity based on the topics in the blog content. Our results show that the joint modeling of communities with different political beliefs using MCR-LDA does not sacrifice accuracy in sentiment polarity prediction when compared to approaches that are tailored to specific communities, and additionally provides a view of the polarization in responses from the different communities.


People Are Strange When You're a Stranger: Impact and Influence of Bots on Social Networks

AAAI Conferences

Bots are, for many Web and social media users, the source of many dangerous attacks or the carrier of unwanted messages, such as spam. Nevertheless, crawlers and software agents are a precious tool for analysts, and they are continuously executed to collect data or to test distributed applications. However, no one knows the real potential of a bot whose purpose is to control a community, to manipulate consensus, or to influence user behavior. It is commonly believed that the better an agent simulates human behavior in a social network, the more it can succeed in generating an impact in that community. We help shed light on this issue through an online social experiment aimed at studying to what extent a bot with no trust, no profile, and no aim to reproduce human behavior can become popular and influential in social media. Results show that a basic social probing activity can be used to acquire social relevance on the network and that the so-acquired popularity can be effectively leveraged to drive users in their social connectivity choices. We also observe that our bot activity unveiled hidden social polarization patterns in the community and triggered an emotional response of individuals that brings to light subtle privacy hazards perceived by the user base.


Tutorials

AAAI Conferences

The ICWSM 2012 conference tutorials will be How to Analyze Massive Social Network Datasets without a Cluster, presented by Derek Ruths; Charting Collections of Connections in Social Media: Creating Maps and Measures with NodeXL, presented by Marc Smith; Evidenced-Based Social Design of Online Communities: Getting to Critical Mass and Encouraging Contributions, presented by Paul Resnick and Robert Kraut; Sentiment Mining from User Generated Content, presented by Lyle Ungar and Ronen Feldman; and Information Extraction for Social Media Analysis, presented by Denilson Barbosa.


Isabelle/PIDE as Platform for Educational Tools

arXiv.org Artificial Intelligence

The Isabelle/PIDE platform addresses the question of whether proof assistants of the LCF family are suitable as a technological basis for educational tools. The traditionally strong logical foundations of systems like HOL, Coq, or Isabelle have so far been counter-balanced by somewhat inaccessible interaction via the TTY (or minor variations like the well-known Proof General / Emacs interface). Thus the fundamental question of math education tools with fully-formal background theories has often been answered negatively due to accidental weaknesses of existing proof engines. The idea of "PIDE" (which means "Prover IDE") is to integrate existing provers like Isabelle into a larger environment that facilitates access by end-users and other tools. We use Scala to expose the proof engine in ML to the JVM world, where many user-interfaces, editor frameworks, and educational tools already exist. This shall ultimately lead to combined mathematical assistants, where the logical engine is in the background, without obstructing the view on applications of formal methods, formalized mathematics, and math education in particular.


Towards an Intelligent Tutor for Mathematical Proofs

arXiv.org Artificial Intelligence

Computer-supported learning is an increasingly important form of study since it allows for independent learning and individualized instruction. In this paper, we discuss a novel approach to developing an intelligent tutoring system for teaching textbook-style mathematical proofs. We characterize the particularities of the domain and discuss common ITS design models. Our approach is motivated by phenomena found in a corpus of tutorial dialogs that were collected in a Wizard-of-Oz experiment. We show how an intelligent tutor for textbook-style mathematical proofs can be built on top of an adapted assertion-level proof assistant by reusing representations and proof search strategies originally developed for automated and interactive theorem proving. The resulting prototype was successfully evaluated on a corpus of tutorial dialogs and yields good results.


Backdoors to Acyclic SAT

arXiv.org Artificial Intelligence

Backdoor sets, a notion introduced by Williams et al. in 2003, are certain sets of key variables of a CNF formula F that make it easy to solve the formula; by assigning truth values to the variables in a backdoor set, the formula gets reduced to one or several polynomial-time solvable formulas. More specifically, a weak backdoor set of F is a set X of variables such that there exists a truth assignment t to X that reduces F to a satisfiable formula F[t] that belongs to a polynomial-time decidable base class C. A strong backdoor set is a set X of variables such that for all assignments t to X, the reduced formula F[t] belongs to C. We study the problem of finding backdoor sets of size at most k with respect to the base class of CNF formulas with acyclic incidence graphs, taking k as the parameter. We show that (1) the detection of weak backdoor sets is W[2]-hard in general but fixed-parameter tractable for r-CNF formulas, for any fixed r >= 3, and (2) the detection of strong backdoor sets is fixed-parameter approximable. Result 1 is the first positive one for a base class that does not have a characterization with obstructions of bounded size. Result 2 is the first positive one for a base class for which strong backdoor sets are more powerful than deletion backdoor sets. Not only SAT, but also #SAT can be solved in polynomial time for CNF formulas with acyclic incidence graphs. Hence Result 2 establishes a new structural parameter that makes #SAT fixed-parameter tractable and that is incomparable with known parameters such as treewidth and clique-width. We obtain the algorithms by a combination of an algorithmic version of the Erdős-Pósa Theorem, Courcelle's model checking for monadic second order logic, and new combinatorial results on how disjoint cycles can interact with the backdoor set.
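
To make the definitions of weak and strong backdoor sets concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm (which relies on fixed-parameter techniques); it simply enumerates all assignments to a candidate set X and tests whether each reduced formula F[t] has an acyclic incidence graph. The clause encoding (signed integers, DIMACS-style) and every function name are assumptions made for illustration only.

```python
from itertools import product

def reduce_formula(clauses, assignment):
    """Return F[t]: drop satisfied clauses, remove falsified literals, merge duplicates."""
    reduced = set()
    for clause in clauses:
        satisfied = False
        rest = []
        for lit in clause:
            var = abs(lit)
            if var in assignment:
                if assignment[var] == (lit > 0):
                    satisfied = True
                    break
            else:
                rest.append(lit)
        if not satisfied:
            reduced.add(frozenset(rest))
    return list(reduced)

def incidence_graph_is_acyclic(clauses):
    """Union-find over the bipartite variable/clause graph; a repeated link closes a cycle."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for index, clause in enumerate(clauses):
        for var in {abs(lit) for lit in clause}:
            a, b = find(("clause", index)), find(("var", var))
            if a == b:
                return False
            parent[a] = b
    return True

def is_satisfiable(clauses):
    """Exponential-time check, adequate only for toy inputs (an assumption of this sketch)."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        t = dict(zip(variables, values))
        if all(any(t[abs(lit)] == (lit > 0) for lit in clause) for clause in clauses):
            return True
    return False

def is_strong_backdoor(clauses, candidate):
    """Every assignment to the candidate set must yield an acyclic reduced formula."""
    xs = sorted(candidate)
    return all(
        incidence_graph_is_acyclic(reduce_formula(clauses, dict(zip(xs, values))))
        for values in product([False, True], repeat=len(xs))
    )

def is_weak_backdoor(clauses, candidate):
    """Some assignment must yield a satisfiable, acyclic reduced formula."""
    xs = sorted(candidate)
    for values in product([False, True], repeat=len(xs)):
        reduced = reduce_formula(clauses, dict(zip(xs, values)))
        if incidence_graph_is_acyclic(reduced) and is_satisfiable(reduced):
            return True
    return False

# Toy formula: (x1 or x2 or x3) and (not x1 or x2 or not x3)
formula = [[1, 2, 3], [-1, 2, -3]]
print(is_strong_backdoor(formula, {1}))    # every assignment to x1 leaves one clause
print(is_strong_backdoor(formula, set()))  # the formula itself contains a cycle
```

In the toy formula, both clauses share x2 and x3, so the incidence graph has a cycle; assigning x1 either way satisfies one clause and leaves a single clause behind, which is why {x1} passes the strong check.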


MAV Stabilization using Machine Learning and Onboard Sensors

arXiv.org Artificial Intelligence

Past automation work with miniature aerial vehicles (MAVs) at Cornell has produced interesting results [1] and presented additional challenges. During past projects, results have often been limited not by insufficiencies in planning algorithms, but by navigation errors stemming from inadequate control in the face of realistic, breezy operating environments. In many cases the MAVs will simply drift off the desired path (Figure 1). Thus, this project focuses on refining the basic motion of the same platform, and in particular, minimizing its drift. Our work focuses on reduction of low-frequency drift in GPS-denied environments. Similar work has been done, some using neural networks [4] or adaptive-fuzzy control methods [5] to stabilize a quadrotor. Though this research has produced promising results, these methods were demonstrated only in simulation, not via live testing.

Figure 1: Desired path vs. actual path due to drift.


The CQC Algorithm: Cycling in Graphs to Semantically Enrich and Enhance a Bilingual Dictionary

Journal of Artificial Intelligence Research

Bilingual machine-readable dictionaries are knowledge resources useful in many automatic tasks. However, compared to monolingual computational lexicons like WordNet, bilingual dictionaries typically provide less structured information, such as lexical and semantic relations, and often do not cover the entire range of possible translations for a word of interest. In this paper we present Cycles and Quasi-Cycles (CQC), a novel algorithm for the automated disambiguation of ambiguous translations in the lexical entries of a bilingual machine-readable dictionary. The dictionary is represented as a graph, and cyclic patterns are sought in this graph to assign an appropriate sense tag to each translation in a lexical entry. Further, we use the algorithm's output to improve the quality of the dictionary itself, by suggesting accurate solutions to structural problems such as misalignments, partial alignments and missing entries. Finally, we successfully apply CQC to the task of synonym extraction.


Beneath the valley of the noncommutative arithmetic-geometric mean inequality: conjectures, case-studies, and consequences

arXiv.org Machine Learning

Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress at theoretically evaluating the difference in performance between sampling with and without replacement in such algorithms. Focusing on least mean squares optimization, we formulate a noncommutative arithmetic-geometric mean inequality that would prove that the expected convergence rate of without-replacement sampling is faster than that of with-replacement sampling. We demonstrate that this inequality holds for many classes of random matrices and for some pathological examples as well. We provide a deterministic worst-case bound on the discrepancy between the two sampling models, and explore some of the impediments to proving this inequality in full generality. We detail the consequences of this inequality for stochastic gradient descent and the randomized Kaczmarz algorithm for solving linear systems.
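
The two sampling models compared in the abstract can be illustrated with a small Kaczmarz iteration on a consistent linear system. The sketch below is an illustration under stated assumptions, not the paper's experimental setup: rows are drawn uniformly (the usual randomized Kaczmarz method weights rows by squared norm), the without-replacement order is a fresh shuffle per epoch, and the problem sizes, seed, and function names are all hypothetical. A single run says nothing about the conjectured inequality, which concerns expected convergence rates.

```python
import random

def kaczmarz(rows, rhs, order, x0):
    """Project the iterate onto the hyperplane of each selected row, in the given order."""
    x = list(x0)
    for i in order:
        a = rows[i]
        residual = rhs[i] - sum(aj * xj for aj, xj in zip(a, x))
        scale = residual / sum(aj * aj for aj in a)
        x = [xj + scale * aj for aj, xj in zip(a, x)]
    return x

def error(x, x_true):
    """Euclidean distance between an iterate and the true solution."""
    return sum((u - v) ** 2 for u, v in zip(x, x_true)) ** 0.5

rng = random.Random(0)           # fixed seed so the run is repeatable
n_rows, dim, epochs = 20, 5, 3   # toy sizes chosen for illustration

# Build a consistent system: rhs is computed from a known solution.
x_true = [rng.gauss(0, 1) for _ in range(dim)]
rows = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_rows)]
rhs = [sum(aj * xj for aj, xj in zip(a, x_true)) for a in rows]

# With replacement: each step draws an independent uniform row index.
with_repl = [rng.randrange(n_rows) for _ in range(epochs * n_rows)]
# Without replacement: each epoch visits every row once in a random order.
without_repl = [i for _ in range(epochs) for i in rng.sample(range(n_rows), n_rows)]

x0 = [0.0] * dim
err_initial = error(x0, x_true)
err_with = error(kaczmarz(rows, rhs, with_repl, x0), x_true)
err_without = error(kaczmarz(rows, rhs, without_repl, x0), x_true)

print(f"initial error:       {err_initial:.3e}")
print(f"with replacement:    {err_with:.3e}")
print(f"without replacement: {err_without:.3e}")
```

Averaging the two final errors over many seeds would be the natural next step for anyone who wants an empirical feel for the expected gap that the abstract describes.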