Goto

Collaborating Authors

 Massachusetts Institute of Technology


Predicting Latent Narrative Mood Using Audio and Physiologic Data

AAAI Conferences

Inferring the latent emotive content of a narrative requires consideration of para-linguistic cues (e.g. pitch), linguistic content (e.g. vocabulary) and the physiological state of the narrator (e.g. heart-rate). In this study we utilized a combination of auditory, text, and physiological signals to predict the mood (happy or sad) of 31 narrations from subjects engaged in personal story-telling. We extracted 386 audio and 222 physiological features (using the Samsung Simband) from the data. A subset of 4 audio, 1 text, and 5 physiologic features were identified using Sequential Forward Selection (SFS) for inclusion in a Neural Network (NN). These features included subject movement, cardiovascular activity, energy in speech, probability of voicing, and linguistic sentiment (i.e. negative or positive). We explored the effects of introducing our selected features at various layers of the NN and found that the location of these features in the network topology had a significant impact on model performance. To ensure the real-time utility of the model, classification was performed over 5 second intervals. We evaluated our model’s performance using leave-one-subject-out crossvalidation and compared the performance to 20 baseline models and a NN with all features included in the input layer.


Active Video Summarization: Customized Summaries via On-line Interaction with the User

AAAI Conferences

To facilitate the browsing of long videos, automatic video summarization provides an excerpt that represents its content. In the case of egocentric and consumer videos, due to their personal nature, adapting the summary to specific user's preferences is desirable. Current approaches to customizable video summarization obtain the user's preferences prior to the summarization process. As a result, the user needs to manually modify the summary to further meet the preferences. In this paper, we introduce Active Video Summarization (AVS), an interactive approach to gather the user's preferences while creating the summary. AVS asks questions about the summary to update it on-line until the user is satisfied. To minimize the interaction, the best segment to inquire next is inferred from the previous feedback. We evaluate AVS in the commonly used UTEgo dataset. We also introduce a new dataset for customized video summarization (CSumm) recorded with a Google Glass. The results show that AVS achieves an excellent compromise between usability and quality. In 41% of the videos, AVS is considered the best over all tested baselines, including summaries manually generated. Also, when looking for specific events in the video, AVS provides an average level of satisfaction higher than those of all other baselines after only six questions to the user.


Associative Memory Using Dictionary Learning and Expander Decoding

AAAI Conferences

An associative memory is a framework of content-addressable memory that stores a collection of message vectors (or a dataset) over a neural network while enabling a neurally feasible mechanism to recover any message in the dataset from its noisy version. Designing an associative memory requires addressing two main tasks: 1) learning phase: given a dataset, learn a concise representation of the dataset in the form of a graphical model (or a neural network), 2) recall phase: given a noisy version of a message vector from the dataset, output the correct message vector via a neurally feasible algorithm over the network learnt during the learning phase. This paper studies the problem of designing a class of neural associative memories which learns a network representation for a large dataset that ensures correction against a large number of adversarial errors during the recall phase. Specifically, the associative memories designed in this paper can store dataset containing exp( n ) n -length message vectors over a network with O ( n ) nodes and can tolerate Ω( n / polylog) adversarial errors. This paper carries out this memory design by mapping the learning phase and recall phase to the tasks of dictionary learning with a square dictionary and iterative error correction in an expander code, respectively.


Distributed Negative Sampling for Word Embeddings

AAAI Conferences

Word2Vec recently popularized dense vector word representations as fixed-length features for machine learning algorithms and is in widespread use today. In this paper we investigate one of its core components, Negative Sampling, and propose efficient distributed algorithms that allow us to scale to vocabulary sizes of more than 1 billion unique words and corpus sizes of more than 1 trillion words.


Mixed Discrete-Continuous Planning with Convex Optimization

AAAI Conferences

Robots operating in the real world must be able to handle both discrete and continuous change. Many robot behaviors can be controlled through numeric parameters (called control variables), which affect the rate of the continuous change. Previous approaches capable of reasoning efficiently with control variables impose severe restrictions that limit the expressivity of the problems that can be solved. A broad class of robotic applications require, for example, convex quadratic constraints on state variables and control variables that are jointly constrained and that affect multiple state variables simultaneously. However, extensions to prior approaches are not straightforward, since these characteristics are non-linear and hard to scale. We introduce cqScotty, a heuristic forward search planner that solves these problems efficiently. While naive formulations of consistency checks are not convex and do not scale, cqScotty uses an efficient convex formulation, in the form of a Second Order Cone Program (SOCP), that is very fast to solve. We demonstrate the scalability of our approach on three new realistic domains.


Collaborative Planning with Encoding of Users' High-Level Strategies

AAAI Conferences

The generation of near-optimal plans for multi-agent systems with numerical states and temporal actions is computationally challenging. Current off-the-shelf planners can take a very long time before generating a near-optimal solution. In an effort to reduce plan computation time, increase the quality of the resulting plans, and make them more interpretable by humans, we explore collaborative planning techniques that actively involve human users in plan generation. Specifically, we explore a framework in which users provide high-level strategies encoded as soft preferences to guide the low-level search of the planner. Through human subject experimentation, we empirically demonstrate that this approach results in statistically significant improvements to plan quality, without substantially increasing computation time. We also show that the resulting plans achieve greater similarity to those generated by humans with regard to the produced sequences of actions, as compared to plans that do not incorporate user-provided strategies.


Logical Filtering and Smoothing: State Estimation in Partially Observable Domains

AAAI Conferences

State estimation is the task of estimating the state of a partially observable dynamical system given a sequence of executed actions and observations. In logical settings, state estimation can be realized via logical filtering, which is exact but can be intractable. We propose logical smoothing, a form of backwards reasoning that works in concert with approximated logical filtering to refine past beliefs in light of new observations.  We characterize the notion of logical smoothing together with an algorithm for backwards-forwards state estimation.  We also present an approximation of our smoothing algorithm that is space efficient. We prove properties of our algorithms, and experimentally demonstrate their behaviour, contrasting them with state estimation methods for planning. Smoothing and backwards-forwards reasoning are important techniques for reasoning about partially observable dynamical systems, introducing the logical analogue of effective techniques from control theory and dynamic programming.


Model AI Assignments 2017

AAAI Conferences

The Model AI Assignments session seeks to gather and disseminate the best assignment designs of the Artificial Intelligence (AI) Education community. Recognizing that assignments form the core of student learning experience, we here present abstracts of six AI assignments from the 2017 session that are easily adoptable, playfully engaging, and flexible for a variety of instructor needs.


When and Why Are Deep Networks Better Than Shallow Ones?

AAAI Conferences

While the universal approximation property holds both for hierarchical and shallow networks, deep networks can approximate the class of compositional functions as well as shallow networks but with exponentially lower number of training parameters and sample complexity. Compositional functions are obtained as a hierarchy of local constituent functions, where "local functions'' are functions with low dimensionality. This theorem proves an old conjecture by Bengio on the role of depth in networks, characterizing precisely the conditions under which it holds. It also suggests possible answers to the the puzzle of why high-dimensional deep networks trained on large training sets often do not seem to show overfit.


Non-Deterministic Planning with Temporally Extended Goals: LTL over Finite and Infinite Traces

AAAI Conferences

Temporally extended goals are critical to the specification of a diversity of real-world planning problems. Here we examine the problem of non-deterministic planning with temporally extended goals specified in linear temporal logic (LTL), interpreted over either finite or infinite traces. Unlike existing LTL planners, we place no restrictions on our LTL formulae beyond those necessary to distinguish finite from infinite interpretations. We generate plans by compiling LTL temporally extended goals into problem instances described in the Planning Domain Definition Language that are solved by a state-of-the-art fully observable non-deterministic planner. We propose several different compilations based on translations of LTL to (Büchi) alternating or (Büchi) non-deterministic finite state automata, and evaluate various properties of the competing approaches. We address a diverse spectrum of LTL planning problems that, to this point, had not been solvable using AI planning techniques, and do so in a manner that demonstrates highly competitive performance.