Integrating the Cognitive with the Physical: Musical Path Planning for an Improvising Robot

AAAI Conferences

Embodied cognition is a theory stating that the processes and functions comprising the human mind are influenced by a person's physical body. Embodied musical cognition is a theory of the musical mind stating that a person's body largely influences his or her musical experiences and actions (such as performing, learning, or listening to music). This work describes a proof of concept demonstrating the utility of embodied musical cognition for robotic musicianship. Though alternative theories attempting to explain human musical cognition exist (such as cognitivism and connectionism), this work contends that integrating physical constraints with musical knowledge is vital for a robot, both to optimize note-generating decisions under the limitations of sound-generating motion and to enable more engaging performance through increased coherence between the generated music and sound-accompanying motion. Moreover, such a system allows for efficient and autonomous exploration of the relationship between music and physicality and of the music that is contingent on such a connection.


Leveraging Video Descriptions to Learn Video Question Answering

AAAI Conferences

We propose a scalable approach to learn video-based question answering (QA): to answer a free-form natural language question about the contents of a video. Our approach automatically harvests a large number of videos and descriptions freely available online. Then, a large number of candidate QA pairs are automatically generated from descriptions rather than manually annotated. Next, we use these candidate QA pairs to train a number of video-based QA methods extended from MN (Sukhbaatar et al. 2015), VQA (Antol et al. 2015), SA (Yao et al. 2015), and SS (Venugopalan et al. 2015). In order to handle non-perfect candidate QA pairs, we propose a self-paced learning procedure to iteratively identify them and mitigate their effects in training. Finally, we evaluate performance on manually generated video-based QA pairs. The results show that our self-paced learning procedure is effective, and the extended SS model outperforms various baselines.
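
The self-paced idea mentioned above can be illustrated with a small sketch. This is a generic self-paced loop on a toy one-dimensional regression, not the procedure from the paper: the function name `self_paced_fit`, its parameters, and the keep-fraction schedule are all assumptions made for illustration. Each round fits on the currently trusted samples, then re-selects the lowest-loss fraction, so that noisy samples (standing in for imperfect QA pairs) have little influence on training.

```python
import numpy as np

def self_paced_fit(x, y, rounds=5, keep_start=0.5, keep_step=0.1, keep_max=0.9):
    """Fit y ~ w * x while gradually admitting harder (higher-loss) samples.

    Hypothetical toy example: the highest-loss samples are never trusted
    because the kept fraction is capped at `keep_max`.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = np.ones(len(x), dtype=bool)   # start by trusting every sample
    keep = keep_start
    w = 0.0
    for _ in range(rounds):
        # Closed-form least squares on the currently trusted samples.
        w = float(np.dot(x[mask], y[mask]) / np.dot(x[mask], x[mask]))
        # Per-sample squared error under the current model.
        losses = (y - w * x) ** 2
        # Trust samples whose loss is within the k smallest losses.
        k = max(1, int(round(keep * len(x))))
        threshold = np.sort(losses)[k - 1]
        mask = losses <= threshold
        # Admit a larger fraction next round, up to the cap.
        keep = min(keep_max, keep + keep_step)
    return w, mask
```

With `x = [1, ..., 8]` and `y = 2 * x` except for one corrupted label, the loop should settle on a slope close to 2 and leave the corrupted sample outside the trusted set.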


Unit Dependency Graph and Its Application to Arithmetic Word Problem Solving

AAAI Conferences

Math word problems provide a natural abstraction to a range of natural language understanding problems that involve reasoning about quantities, such as interpreting election results, news about casualties, and the financial section of a newspaper. Units associated with the quantities often provide information that is essential to support this reasoning. This paper proposes a principled way to capture and reason about units and shows how it can benefit an arithmetic word problem solver. It presents the concept of Unit Dependency Graphs (UDGs), which provide a compact representation of the dependencies between units of numbers mentioned in a given problem. Inducing the UDG alleviates the brittleness of the unit extraction system and allows for a natural way to leverage domain knowledge about unit compatibility for word problem solving. We introduce a decomposed model for inducing UDGs with minimal additional annotations, and use it to augment the expressions used in the arithmetic word problem solver of Roy and Roth (2015) via a constrained inference framework. We show that the introduction of UDGs reduces the error of the solver by over 10%, surpassing all existing systems for solving arithmetic word problems. In addition, it makes the system more robust to adaptation to new vocabulary and equation forms.


Visual Object Tracking for Unmanned Aerial Vehicles: A Benchmark and New Motion Models

AAAI Conferences

Despite recent advances in the visual tracking community, most studies so far have focused on the observation model. The motion model, another important component of a tracking system, is much less well explored, especially for some extreme scenarios. In this paper, we consider one such scenario in which the camera is mounted on an unmanned aerial vehicle (UAV) or drone. We build a benchmark dataset of high diversity, consisting of 70 videos captured by drone cameras. To address the challenging issue of severe camera motion, we devise simple baselines that model the camera motion by a geometric transformation based on background feature points. An extensive comparison of recent state-of-the-art trackers and their motion model variants on our drone tracking dataset validates both the necessity of the dataset and the effectiveness of the proposed methods. Our aim for this work is to lay the foundation for further research in the UAV tracking area.
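
As a rough illustration of modeling camera motion with a geometric transformation from background points, the sketch below fits a 2D affine transform by least squares and uses it to shift a target's previous position into the new frame. The abstract does not say which transformation or estimator the authors use, so the affine model, the function names `estimate_affine` and `compensate`, and the assumption that matched background points are already available are all illustrative choices.

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2x3 affine matrix A with dst ~ A @ [x, y, 1].

    `src` and `dst` are (n, 2) arrays of matched background points in the
    previous and current frame; at least 3 non-collinear pairs are needed.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Homogeneous coordinates: one row [x, y, 1] per source point.
    X = np.hstack([src, np.ones((len(src), 1))])
    # Solve X @ sol ~ dst for the (3, 2) parameter matrix.
    sol, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return sol.T

def compensate(point, A):
    """Map a previous-frame position into the current frame using A."""
    x, y = point
    return A @ np.array([x, y, 1.0])
```

A tracker could then centre its search window on `compensate(previous_position, A)` rather than on the previous position itself, which is one plausible way to absorb abrupt camera movement.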


Video Recovery via Learning Variation and Consistency of Images

AAAI Conferences

Matrix completion algorithms have been widely used to recover images with missing entries, and they have proved to be very effective. Recent works have applied tensor completion models to video recovery, assuming that all video frames are homogeneous and correlated. However, real videos are made up of different episodes or scenes, i.e., they are heterogeneous. Therefore, a video recovery model that utilizes both the spatiotemporal consistency and the variation of video is necessary. To solve this problem, we propose a new video recovery method, Sectional Trace Norm with Variation and Consistency Constraints (STN-VCC). In our model, capped L1-norm regularization is utilized to learn the spatiotemporal consistency and variation between consecutive frames in video clips. Meanwhile, we introduce a new low-rank model to capture the low-rank structure in video frames with a better approximation of rank minimization than the traditional trace norm. An efficient optimization algorithm is proposed, and we also provide a proof of convergence in the paper. We evaluate the proposed method on several video recovery tasks, and experimental results show that our new method consistently outperforms other related approaches.
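
The two standard building blocks named above can be written down in a few lines. The sketch uses the usual definitions: the capped L1 norm sums min(|x_i|, cap), so a very large difference (such as a scene change between consecutive frames) contributes at most `cap`, and the trace norm is the sum of singular values. It does not reproduce the paper's sectional trace norm or its optimization algorithm; the function names and the parameter `cap` are illustrative.

```python
import numpy as np

def capped_l1(x, cap):
    """Capped L1 norm: sum of min(|x_i|, cap) over all entries."""
    return float(np.minimum(np.abs(x), cap).sum())

def trace_norm(M):
    """Trace (nuclear) norm: sum of the singular values of M."""
    return float(np.linalg.svd(M, compute_uv=False).sum())
```

Applied to the difference between two consecutive frames, `capped_l1` penalizes small within-scene changes linearly while bounding the penalty for any single large change.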


Variational Autoencoder for Semi-Supervised Text Classification

AAAI Conferences

Although the semi-supervised variational autoencoder (SemiVAE) works in image classification tasks, it fails in text classification tasks when a vanilla LSTM is used as its decoder. From the perspective of reinforcement learning, it is verified that the decoder's capability to distinguish between different categorical labels is essential. Therefore, the Semi-supervised Sequential Variational Autoencoder (SSVAE) is proposed, which increases this capability by feeding the label into its decoder RNN at each time step. Two specific decoder structures are investigated, and both are verified to be effective. In addition, to reduce the computational complexity of training, a novel optimization method is proposed, which estimates the gradient of the unlabeled objective function by sampling, along with two variance reduction techniques. Experimental results on the Large Movie Review Dataset (IMDB) and the AG's News corpus show that the proposed approach significantly improves classification accuracy compared with purely supervised classifiers, and achieves competitive performance against previous advanced methods. State-of-the-art results can be obtained by integrating other pretraining-based methods.


SSP: Semantic Space Projection for Knowledge Graph Embedding with Text Descriptions

AAAI Conferences

Knowledge graph embedding represents entities and relations in a knowledge graph as low-dimensional, continuous vectors, and thus makes knowledge graphs compatible with machine learning models. Though there have been a variety of models for knowledge graph embedding, most methods concentrate only on the fact triples, while supplementary textual descriptions of entities and relations have not been fully employed. To this end, this paper proposes the semantic space projection (SSP) model, which jointly learns from the symbolic triples and textual descriptions. Our model builds interaction between the two information sources, and employs textual descriptions to discover semantic relevance and offer precise semantic embedding. Extensive experiments show that our method achieves substantial improvements over baselines on the tasks of knowledge graph completion and entity classification.
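
To make "entities and relations as vectors" concrete, here is a minimal sketch of a translation-based triple score in the style of TransE, a well-known triple-only baseline. It is not the SSP model, whose use of textual descriptions is not specified in the abstract; the function name and the toy vectors are illustrative.

```python
import numpy as np

def transe_score(head, relation, tail):
    """TransE-style score ||head + relation - tail||; lower is more plausible."""
    head, relation, tail = (np.asarray(v, dtype=float) for v in (head, relation, tail))
    return float(np.linalg.norm(head + relation - tail))
```

For a plausible triple the translated head lands near the tail, so the score is small; knowledge graph completion then amounts to ranking candidate tails by this score.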