
 Massachusetts Institute of Technology


Sequential Voting Promotes Collective Discovery in Social Recommendation Systems

AAAI Conferences

One goal of online social recommendation systems is to harness the wisdom of crowds in order to identify high-quality content. Yet the sequential voting mechanisms that are commonly used by these systems are at odds with existing theoretical and empirical literature on optimal aggregation. This literature suggests that sequential voting will promote herding---the tendency for individuals to copy the decisions of others around them---and hence lead to suboptimal content recommendation. Is there a problem with our practice, or a problem with our theory? Previous attempts at answering this question have been limited by a lack of objective measurements of content quality. Quality is typically defined endogenously as the popularity of content in the absence of social influence. The flaw of this metric is its presupposition that the preferences of the crowd are aligned with underlying quality. Domains in which content quality can be defined exogenously and measured objectively are thus needed in order to better assess the design choices of social recommendation systems. In this work, we look to the domain of education, where content quality can be measured via how well students are able to learn from the material presented to them. Through a behavioral experiment involving a simulated massive open online course (MOOC) run on Amazon Mechanical Turk, we show that sequential voting systems can surface better content than systems that elicit independent votes.
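
The contrast between independent and sequential voting can be sketched as a toy simulation. The voter model, noise level, and `social_weight` parameter below are illustrative assumptions, not the paper's experimental design:

```python
import random

def simulate_votes(quality, n_voters=200, noise=1.0, social_weight=0.0, seed=0):
    """Simulate voters who each see a noisy private signal of content
    quality plus (optionally) the running vote tally.

    social_weight=0.0 -> independent voting;
    social_weight>0   -> sequential voting with social influence."""
    rng = random.Random(seed)
    tally = 0
    votes = []
    for _ in range(n_voters):
        private = quality + rng.gauss(0, noise)            # noisy quality signal
        social = social_weight * (tally / max(1, len(votes)))
        vote = 1 if private + social > 0 else -1
        votes.append(vote)
        tally += vote
    return tally

# Higher-quality content should, on average, accumulate a higher tally.
good = simulate_votes(quality=0.5, seed=1)
bad = simulate_votes(quality=-0.5, seed=1)
```

With `social_weight > 0` the running tally feeds back into each decision, which is the mechanism through which herding can arise in a sequential system.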


Tweet Acts: A Speech Act Classifier for Twitter

AAAI Conferences

Speech acts are a way to conceptualize speech as action. This holds true for communication on any platform, including social media platforms such as Twitter. In this paper, we explored speech act recognition on Twitter by treating it as a multi-class classification problem. We created a taxonomy of six speech acts for Twitter and proposed a set of semantic and syntactic features. We trained and tested a logistic regression classifier using a data set of manually labelled tweets. Our method achieved state-of-the-art performance with an average F1 score of more than 0.70. We also explored classifiers with three different granularities (Twitter-wide, type-specific and topic-specific) in order to find the right balance between generalization and overfitting for our task.
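
A minimal version of this setup can be sketched as multinomial logistic regression over bag-of-words features plus one syntactic cue. The toy tweets, three-class label set, and `_starts_wh` feature here are invented for illustration (the paper uses six speech acts and a richer feature set):

```python
import math

# Toy labeled tweets: (text, speech act). An illustrative subset of classes.
DATA = [
    ("what time is the game tonight", "question"),
    ("does anyone know a good pizza place", "question"),
    ("breaking news the senate passed the bill", "statement"),
    ("the weather in boston is awful today", "statement"),
    ("please retweet this to your followers", "request"),
    ("check out my new blog post", "request"),
]
LABELS = sorted({y for _, y in DATA})

def featurize(text):
    feats = {tok: 1.0 for tok in text.split()}             # bag of words
    wh = {"what", "who", "when", "does", "how"}            # syntactic cue
    feats["_starts_wh"] = 1.0 if text.split()[0] in wh else 0.0
    return feats

def train(data, epochs=200, lr=0.5):
    """Softmax (multinomial logistic) regression via gradient descent."""
    w = {y: {} for y in LABELS}                            # weights per class
    for _ in range(epochs):
        for text, y in data:
            x = featurize(text)
            scores = {c: sum(w[c].get(f, 0.0) * v for f, v in x.items())
                      for c in LABELS}
            m = max(scores.values())
            exps = {c: math.exp(s - m) for c, s in scores.items()}
            z = sum(exps.values())
            for c in LABELS:
                g = (1.0 if c == y else 0.0) - exps[c] / z  # softmax gradient
                for f, v in x.items():
                    w[c][f] = w[c].get(f, 0.0) + lr * g * v
    return w

def predict(w, text):
    x = featurize(text)
    return max(LABELS, key=lambda c: sum(w[c].get(f, 0.0) * v
                                         for f, v in x.items()))

w = train(DATA)
```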


Topic Modeling in Twitter: Aggregating Tweets by Conversations

AAAI Conferences

We propose a new pooling technique for topic modeling in Twitter, which groups together tweets occurring in the same user-to-user conversation. Under this scheme, tweets and their replies are aggregated into a single document and the users who posted them are considered co-authors. To compare this new scheme against existing ones, we train topic models using Latent Dirichlet Allocation (LDA) and the Author-Topic Model (ATM) on datasets consisting of tweets pooled according to the different methods. Using the underlying categories of the tweets in this dataset as a noisy ground truth, we show that this new technique outperforms other pooling methods in terms of clustering quality and document retrieval.
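
The pooling step itself (prior to running LDA or the ATM) can be sketched as follows; the tweet fields and the `reply_to` chain-following are assumptions about how the conversation structure is represented:

```python
def pool_by_conversation(tweets):
    """Group tweets into pseudo-documents by conversation.

    Each tweet is a dict with 'id', 'user', 'text', and 'reply_to' (the id of
    the tweet it replies to, or None). A tweet plus all replies in its chain
    form one document; all posters become its co-authors (for the ATM)."""
    by_id = {t["id"]: t for t in tweets}

    def root(t):                      # follow the reply chain up to the root
        while t["reply_to"] is not None and t["reply_to"] in by_id:
            t = by_id[t["reply_to"]]
        return t["id"]

    docs = {}
    for t in tweets:
        r = root(t)
        doc = docs.setdefault(r, {"text": [], "authors": set()})
        doc["text"].append(t["text"])
        doc["authors"].add(t["user"])
    return {r: {"text": " ".join(d["text"]), "authors": sorted(d["authors"])}
            for r, d in docs.items()}

tweets = [
    {"id": 1, "user": "alice", "text": "LDA is neat", "reply_to": None},
    {"id": 2, "user": "bob",   "text": "agreed, try ATM too", "reply_to": 1},
    {"id": 3, "user": "carol", "text": "unrelated tweet", "reply_to": None},
]
docs = pool_by_conversation(tweets)
```

The resulting pseudo-documents can then be fed to any standard LDA or author-topic implementation in place of individual tweets.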


Mobility Sequence Extraction and Labeling Using Sparse Cell Phone Data

AAAI Conferences

Human mobility modeling for either transportation system development or individual location based services has a tangible impact on people's everyday experience. In recent years cell phone data has received a lot of attention as a promising data source because of the wide coverage, long observation period, and low cost. The challenge in utilizing such data is how to robustly extract people's trip sequences from sparse and noisy cell phone data and endow the extracted trips with semantic meaning, i.e., trip purposes. In this study we reconstruct trip sequences from sparse cell phone records. Next we propose a Bayesian trip purpose classification method and compare it to a Markov random field based trip purpose clustering method, representing scenarios with and without labeled training data respectively. This procedure shows how cell phone data, despite their coarse granularity and sparsity, can be turned into a low cost, long term, and ubiquitous sensor network for mobility related services.
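
A Bayesian trip-purpose classifier of the kind compared here can be sketched as naive Bayes over discretized trip features. The feature names and toy labeled trips below are hypothetical, chosen only to show the shape of the computation:

```python
from collections import Counter, defaultdict

def train_nb(trips):
    """Fit a naive Bayes trip-purpose classifier from labeled trips.
    Each trip is (features, purpose); features is a dict of discrete
    attributes, e.g. arrival-hour bin and dominant land use near the cell."""
    priors = Counter(p for _, p in trips)
    counts = defaultdict(Counter)
    for feats, p in trips:
        for f, v in feats.items():
            counts[(p, f)][v] += 1
    return priors, counts, len(trips)

def classify(model, feats):
    priors, counts, n = model
    def score(p):
        s = priors[p] / n                               # class prior
        for f, v in feats.items():
            c = counts[(p, f)]
            s *= (c[v] + 1) / (sum(c.values()) + 1 + len(c))  # Laplace smoothing
        return s
    return max(priors, key=score)

# Hypothetical labeled trips (the scenario *with* training data).
trips = [
    ({"hour": "morning", "land_use": "office"}, "work"),
    ({"hour": "morning", "land_use": "office"}, "work"),
    ({"hour": "evening", "land_use": "residential"}, "home"),
    ({"hour": "evening", "land_use": "residential"}, "home"),
    ({"hour": "noon", "land_use": "commercial"}, "shopping"),
]
model = train_nb(trips)
```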


How Important Is Weight Symmetry in Backpropagation?

AAAI Conferences

Gradient backpropagation (BP) requires symmetric feedforward and feedback connections — the same weights must be used for forward and backward passes. This "weight transport problem" (Grossberg 1987) is thought to be one of the main reasons to doubt BP's biological plausibility. Using 15 different classification datasets, we systematically investigate to what extent BP really depends on weight symmetry. In a study that turned out to be surprisingly similar in spirit to Lillicrap et al.'s demonstration (Lillicrap et al. 2014) but orthogonal in its results, our experiments indicate that: (1) the magnitudes of feedback weights do not matter to performance; (2) the signs of feedback weights do matter — the more concordant signs between feedforward and their corresponding feedback connections, the better; (3) with feedback weights having random magnitudes and 100% concordant signs, we were able to achieve the same or even better performance than SGD; and (4) some normalizations/stabilizations are indispensable for such asymmetric BP to work, namely Batch Normalization (BN) (Ioffe and Szegedy 2015) and/or a "Batch Manhattan" (BM) update rule.
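
Finding (3) can be sketched with a toy two-layer network trained on XOR, where the backward pass routes error through a feedback matrix B that shares only the signs of W2 (concordant at initialization, random magnitudes). This is a simplified sketch, not the paper's protocol — it omits BN/BM and uses a tiny hand-rolled network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 2-layer network trained on XOR with asymmetric backprop.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.normal(0, 1, (2, 8))
W2 = rng.normal(0, 1, (8, 1))
B = np.sign(W2) * np.abs(rng.normal(0, 1, W2.shape))  # sign-concordant feedback

def forward(X):
    h = np.tanh(X @ W1)
    return h, 1 / (1 + np.exp(-(h @ W2)))

def loss(X, Y):
    _, out = forward(X)
    return float(np.mean((out - Y) ** 2))

before = loss(X, Y)
for _ in range(3000):
    h, out = forward(X)
    d_out = (out - Y) * out * (1 - out)   # sigmoid + MSE gradient
    d_h = (d_out @ B.T) * (1 - h ** 2)    # error routed through B, not W2.T
    W2 -= 0.2 * h.T @ d_out
    W1 -= 0.2 * X.T @ d_h
after = loss(X, Y)
```

Despite the backward pass never using the true transpose W2.T, the training loss still drops, which is the qualitative effect the paper quantifies at scale.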


Modeling Human Ad Hoc Coordination

AAAI Conferences

Whether in groups of humans or groups of computer agents, collaboration is most effective between individuals who have the ability to coordinate on a joint strategy for collective action. However, in general a rational actor will only intend to coordinate if that actor believes the other group members have the same intention. This circular dependence makes rational coordination difficult in uncertain environments if communication between actors is unreliable and no prior agreements have been made. An important normative question with regard to coordination in these ad hoc settings is therefore how one can come to believe that other actors will coordinate, and with regard to systems involving humans, an important empirical question is how humans arrive at these expectations. We introduce an exact algorithm for computing the infinitely recursive hierarchy of graded beliefs required for rational coordination in uncertain environments, and we introduce a novel mechanism for multiagent coordination that uses it. Our algorithm is valid in any environment with a finite state space, and extensions to certain countably infinite state spaces are likely possible. We test our mechanism for multiagent coordination as a model for human decisions in a simple coordination game using existing experimental data. We then explore via simulations whether modeling humans in this way may improve human-agent collaboration.
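
The flavor of the graded-belief computation can be illustrated with a drastically simplified fixed-point iteration for a symmetric two-player coordination game. The payoff values and belief-update rule are illustrative assumptions, not the paper's exact algorithm for the full belief hierarchy:

```python
def coordination_belief(p_signal, payoff_coord=2.0, payoff_safe=1.0, iters=100):
    """Iterate a graded-belief hierarchy for a symmetric two-player
    coordination game toward a fixed point.

    p_signal: an agent's belief that its partner received the 'coordinate'
    signal. At each level, an agent coordinates when the expected payoff of
    coordinating, given its belief that the partner coordinates, beats the
    safe action; the next level's belief is updated accordingly."""
    b = p_signal                          # level-1 belief: partner coordinates
    for _ in range(iters):
        coordinate = b * payoff_coord > payoff_safe   # best response here
        b = p_signal if coordinate else 0.0           # next belief level
    return b * payoff_coord > payoff_safe
```

Under these toy payoffs, coordination is rational only when the belief hierarchy stabilizes above the expected-payoff threshold; with weak beliefs the hierarchy collapses to never coordinating.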


Apprenticeship Scheduling for Human-Robot Teams

AAAI Conferences

Resource optimization and scheduling is a costly, challenging problem that affects almost every aspect of our lives. One example that affects each of us is health care: Poor systems design and scheduling of resources can lead to higher rates of patient noncompliance and burnout of health care providers, as highlighted by the Institute of Medicine (Brandenburg et al. 2015). In aerospace manufacturing, every minute of re-scheduling in response to dynamic disruptions in the build process of a Boeing 747 can cost up to $100,000. The military is also highly invested in the effective use of resources. In missile defense, for example, operators must solve a challenging weapon-to-target problem, balancing the cost of expendable, defensive weapons while hedging against uncertainty in adversaries’ tactics. Researchers in artificial intelligence (AI) planning and scheduling strive to develop algorithms to improve resource allocation. However, there are two primary challenges. First, optimal task allocation and sequencing with upper- and lower-bound temporal constraints (i.e., deadlines and wait constraints) is NP-Hard (Bertsimas and Weismantel 2005). Approximation techniques for scheduling exist and typically rely on the algorithm designer crafting heuristics based on domain expertise to decompose or structure the scheduling problem and prioritize the manner in which resources are allocated and tasks are sequenced (Tang and Parker 2005; Jones, Dias, and Stentz 2011). The second challenge is this aforementioned reliance on crafting clever heuristics based on domain knowledge. Manually capturing domain knowledge within a scheduling algorithm remains a challenging process and leaves much to be desired (Ryan et al. 2013).
The aim of my thesis is to develop an autonomous system that 1) learns the heuristics and implicit rules-of-thumb developed by domain experts from years of experience, 2) embeds and leverages this knowledge within a scalable resource optimization framework, and 3) provides decision support in a way that engages users and benefits them in their decision-making process. By intelligently leveraging the ability of humans to learn heuristics and the speed of modern computation, we can improve the ability to coordinate resources in these time and safety-critical domains.
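
Goal (1), learning an expert's rules-of-thumb, can be sketched as pairwise ranking over scheduling decisions: whenever the expert chose one task over the available alternatives, a learned priority function should score the chosen task higher. The features and demonstrations below are hypothetical:

```python
def learn_priority(demonstrations, epochs=50, lr=0.1):
    """Learn an expert's scheduling heuristic as a linear priority function.

    Each demonstration is (chosen_task, other_tasks), where tasks are
    feature dicts (e.g. deadline slack, duration). Trained as pairwise
    ranking with perceptron-style updates: the chosen task should
    outscore every alternative that was available at that decision point."""
    w = {}
    def score(t):
        return sum(w.get(f, 0.0) * v for f, v in t.items())
    for _ in range(epochs):
        for chosen, others in demonstrations:
            for other in others:
                if score(chosen) <= score(other):      # ranking violated
                    for f, v in chosen.items():
                        w[f] = w.get(f, 0.0) + lr * v
                    for f, v in other.items():
                        w[f] = w.get(f, 0.0) - lr * v
    return w

# Hypothetical demos: the expert always schedules the task with least slack.
demos = [
    ({"slack": 1.0, "duration": 3.0}, [{"slack": 5.0, "duration": 1.0}]),
    ({"slack": 2.0, "duration": 1.0}, [{"slack": 6.0, "duration": 2.0}]),
]
w = learn_priority(demos)
```

The learned weights can then serve as the prioritization heuristic inside a larger resource-optimization loop, which is the embedding step described in goal (2).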


Learning for Decentralized Control of Multiagent Systems in Large, Partially-Observable Stochastic Environments

AAAI Conferences

Decentralized partially observable Markov decision processes (Dec-POMDPs) provide a general framework for multiagent sequential decision-making under uncertainty. Although Dec-POMDPs are typically intractable to solve for real-world problems, recent research on macro-actions (i.e., temporally-extended actions) has significantly increased the size of problems that can be solved. However, current methods assume the underlying Dec-POMDP model is known a priori or a full simulator is available during planning time. To accommodate more realistic scenarios, when such information is not available, this paper presents a policy-based reinforcement learning approach, which learns the agent policies based solely on trajectories generated by previous interaction with the environment (e.g., demonstrations). We show that our approach is able to generate valid macro-action controllers and develop an expectation-maximization (EM) algorithm (called Policy-based EM or PoEM), which has convergence guarantees for batch learning. Our experiments show PoEM is a scalable learning method that can learn optimal policies and improve upon hand-coded “expert” solutions.


Exploiting Anonymity in Approximate Linear Programming: Scaling to Large Multiagent MDPs

AAAI Conferences

Many solution methods for Markov Decision Processes (MDPs) exploit structure in the problem and are based on value function factorization. Multiagent settings in particular, however, are known to suffer from an exponential increase in value component sizes as interactions become denser, restricting problem sizes and types that can be handled. We present an approach to mitigate this limitation for certain types of multiagent systems, exploiting a property that can be thought of as "anonymous influence" in the factored MDP. We show how representational benefits from anonymity translate into computational efficiencies, both for variable elimination in a factor graph and for the approximate linear programming solution to factored MDPs. Our methods scale to factored MDPs that were previously unsolvable, such as the control of a stochastic disease process over densely connected graphs with 50 nodes and 25 agents.
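
The computational benefit of "anonymous influence" can be illustrated directly: when a factor over n binary neighbors depends only on how many are active (not which ones), expectations cost O(n) instead of O(2^n). The infection-style factor below is an illustrative stand-in for the disease-process domain, not the paper's model:

```python
from itertools import product
from math import comb

def marginal_via_counts(n, p, count_factor):
    """Expected value of a factor that depends only on how many of n
    i.i.d. binary neighbors are 'on' (each with probability p) -- the
    anonymous-influence property. Sums over n+1 counts, not 2^n states."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * count_factor(k)
               for k in range(n + 1))

def marginal_by_enumeration(n, p, count_factor):
    """Brute-force equivalent over all 2^n joint neighbor states."""
    total = 0.0
    for state in product([0, 1], repeat=n):
        prob = 1.0
        for s in state:
            prob *= p if s else 1 - p
        total += prob * count_factor(sum(state))
    return total

# Illustrative factor: chance of infection given k sick neighbors.
infect = lambda k: 1 - 0.8 ** k
```

Both routes compute the same expectation; only the count-based one remains feasible once neighborhoods grow dense, which is the source of the scaling reported above.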


Affective Personalization of a Social Robot Tutor for Children’s Second Language Skills

AAAI Conferences

Though substantial research has been dedicated to using technology to improve education, no current methods are as effective as one-on-one tutoring. A critical, though relatively understudied, aspect of effective tutoring is modulating the student's affective state throughout the tutoring session in order to maximize long-term learning gains. We developed an integrated experimental paradigm in which children play a second-language learning game on a tablet, in collaboration with a fully autonomous social robotic learning companion. As part of the system, we measured children's valence and engagement via an automatic facial expression analysis system. These signals were combined into a reward signal that fed into the robot's affective reinforcement learning algorithm. Over several sessions, the robot played the game and personalized its motivational strategies (using verbal and non-verbal actions) to each student. We evaluated this system with 34 children in preschool classrooms for a duration of two months. We saw that (1) children learned new words from the repeated tutoring sessions, (2) the affective policy personalized to students over the duration of the study, and (3) students who interacted with a robot that personalized its affective feedback strategy showed a significant increase in valence, as compared to students who interacted with a non-personalizing robot. This integrated system of tablet-based educational content, affective sensing, affective policy learning, and an autonomous social robot holds great promise for a more comprehensive approach to personalized tutoring.
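
The affective reinforcement learning loop can be sketched as a simple bandit over motivational actions with a combined valence-plus-engagement reward. The action names, reward magnitudes, and epsilon-greedy rule are illustrative assumptions, not the deployed system's algorithm:

```python
import random

class AffectivePolicy:
    """Minimal sketch of affective policy personalization: a bandit over
    motivational actions, rewarded by a (valence + engagement) signal
    such as one derived from automatic facial expression analysis."""
    def __init__(self, actions, epsilon=0.1, lr=0.2, seed=0):
        self.q = {a: 0.0 for a in actions}    # estimated value per action
        self.epsilon, self.lr = epsilon, lr
        self.rng = random.Random(seed)

    def act(self):
        if self.rng.random() < self.epsilon:          # explore
            return self.rng.choice(list(self.q))
        return max(self.q, key=self.q.get)            # exploit

    def update(self, action, valence, engagement):
        reward = valence + engagement                 # combined affective reward
        self.q[action] += self.lr * (reward - self.q[action])

# Hypothetical child who responds best to verbal encouragement.
policy = AffectivePolicy(["verbal_praise", "nod", "cheer_gesture"])
true_mean = {"verbal_praise": 0.8, "nod": 0.2, "cheer_gesture": 0.4}
for _ in range(200):
    a = policy.act()
    policy.update(a, valence=true_mean[a] + policy.rng.gauss(0, 0.1),
                  engagement=0.0)
```

Over repeated sessions the estimated values drift toward each student's actual affective responses, which is the personalization effect measured in the study.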