Media
Bach in 2014: Music Composition with Recurrent Neural Network
Liu, I-Ting, Ramakrishnan, Bhiksha
We propose a framework for computer music composition that uses resilient propagation (RProp) and a long short-term memory (LSTM) recurrent neural network. In this paper, we show that an LSTM network learns the structure and characteristics of music pieces by demonstrating its ability to recreate music. We also show that, when predicting existing music, training with RProp outperforms backpropagation through time (BPTT).
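As a rough illustration of the resilient propagation idea mentioned above, the sketch below applies a sign-based update with per-weight step sizes to a toy one-dimensional problem. It is a minimal sketch of one common variant (often called iRprop-), not the authors' training code; the function name `rprop_step` and the hyperparameter defaults are assumptions chosen for illustration.

```python
def rprop_step(weights, grads, prev_grads, steps,
               eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """Update weights in place using only the sign of each gradient.

    All names and default values here are illustrative assumptions.
    """
    for i in range(len(weights)):
        g = grads[i]
        agreement = g * prev_grads[i]
        if agreement > 0:
            # Same sign as last time: grow the step size.
            steps[i] = min(steps[i] * eta_plus, step_max)
        elif agreement < 0:
            # Sign flipped, so we overshot: shrink the step and skip this move.
            steps[i] = max(steps[i] * eta_minus, step_min)
            g = 0.0
        if g > 0:
            weights[i] -= steps[i]
        elif g < 0:
            weights[i] += steps[i]
        prev_grads[i] = g


def minimize_quadratic(target=3.0, iterations=200):
    """Minimize f(w) = (w - target)^2 with RProp, starting from w = 0."""
    weights, prev_grads, steps = [0.0], [0.0], [0.1]
    for _ in range(iterations):
        grads = [2.0 * (weights[0] - target)]
        rprop_step(weights, grads, prev_grads, steps)
    return weights[0]
```

Because only the gradient sign is used, the update is insensitive to gradient magnitude, which is one commonly cited reason for trying RProp on recurrent networks.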
Deep Multi-Instance Transfer Learning
Kotzias, Dimitrios, Denil, Misha, Blunsom, Phil, de Freitas, Nando
We present a new approach for transferring knowledge from groups to the individuals that comprise them. We evaluate our method on text, by inferring the ratings of individual sentences from full-review ratings. This approach, which combines ideas from transfer learning, deep learning and multi-instance learning, reduces the need for laborious human labelling of fine-grained data when abundant labels are available at the group level.
Matrix Completion on Graphs
Kalofolias, Vassilis, Bresson, Xavier, Bronstein, Michael, Vandergheynst, Pierre
The problem of finding the missing values of a matrix given a few of its entries, called matrix completion, has gathered a lot of attention in recent years. Although the problem under the standard low-rank assumption is NP-hard, Candès and Recht showed that it can be exactly relaxed if the number of observed entries is sufficiently large. In this work, we introduce a novel matrix completion model that makes use of proximity information about rows and columns by assuming they form communities. This assumption makes sense in several real-world problems, such as recommender systems, where there are communities of people sharing preferences, while products form clusters that receive similar ratings. Our main goal is thus to find a low-rank solution that is structured by the proximities of rows and columns encoded by graphs. We borrow ideas from manifold learning to constrain our solution to be smooth on these graphs, in order to implicitly enforce row and column proximities. Our matrix recovery model is formulated as a convex non-smooth optimization problem, for which a well-posed iterative scheme is provided. We study and evaluate the proposed matrix completion model on synthetic and real data, showing that the proposed structured low-rank recovery model outperforms the standard matrix completion model in many situations.
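To make "smooth on these graphs" concrete, the sketch below computes a standard graph smoothness measure from manifold learning: the Laplacian quadratic form, which sums weighted squared differences between rows joined by an edge. The abstract does not state the exact penalty used, so treat this as an assumed, illustrative choice; `graph_smoothness` and its arguments are hypothetical names.

```python
def graph_smoothness(rows, weighted_edges):
    """Return sum over edges (i, j, w) of w * ||rows[i] - rows[j]||^2.

    For an undirected graph with Laplacian L = D - W, this equals
    trace(X^T L X), where X stacks `rows`. Names are illustrative.
    """
    total = 0.0
    for i, j, w in weighted_edges:
        total += w * sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
    return total


# Three users on a path graph 0 - 1 - 2 with unit edge weights.
path_edges = [(0, 1, 1.0), (1, 2, 1.0)]
smooth_ratings = [[4.0, 2.0], [4.0, 2.0], [4.0, 2.0]]   # identical neighbours
rough_ratings = [[1.0], [2.0], [4.0]]                   # neighbours disagree
```

A small value means neighbouring rows (for example, users in the same community) have similar entries, so adding such a term to a completion objective favours solutions that respect the graph.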
A Generative Product-of-Filters Model of Audio
Liang, Dawen, Hoffman, Matthew D., Mysore, Gautham J.
We propose the product-of-filters (PoF) model, a generative model that decomposes audio spectra as sparse linear combinations of "filters" in the log-spectral domain. PoF makes similar assumptions to those used in the classic homomorphic filtering approach to signal processing, but replaces hand-designed decompositions built of basic signal processing operations with a learned decomposition based on statistical inference. This paper formulates the PoF model and derives a mean-field method for posterior inference and a variational EM algorithm to estimate the model's free parameters. We demonstrate PoF's potential for audio processing on a bandwidth expansion task, and show that PoF can serve as an effective unsupervised feature extractor for a speaker identification task.
Noise Benefits in Expectation-Maximization Algorithms
This dissertation shows that careful injection of noise into sample data can substantially speed up Expectation-Maximization algorithms. Expectation-Maximization algorithms are a class of iterative algorithms for extracting maximum likelihood estimates from corrupted or incomplete data. The convergence speed-up is an example of a noise benefit or "stochastic resonance" in statistical signal processing. The dissertation presents derivations of sufficient conditions for such noise benefits and demonstrates the speed-up in some ubiquitous signal-processing algorithms. These algorithms include parameter estimation for mixture models, the k-means clustering algorithm, the Baum-Welch algorithm for training hidden Markov models, and backpropagation for training feedforward artificial neural networks. This dissertation also analyses the effects of data and model corruption on the more general Bayesian inference estimation framework. The main finding is a theorem guaranteeing that uniform approximators for Bayesian model functions produce uniform approximators for the posterior pdf via Bayes' theorem. This result also applies to hierarchical and multidimensional Bayesian models.
Learning to Act Greedily: Polymatroid Semi-Bandits
Kveton, Branislav, Wen, Zheng, Ashkan, Azin, Valko, Michal
Many important optimization problems, such as the minimum spanning tree and minimum-cost flow, can be solved optimally by a greedy method. In this work, we study a learning variant of these problems, where the model of the problem is unknown and has to be learned by interacting repeatedly with the environment in the bandit setting. We formalize our learning problem quite generally, as learning how to maximize an unknown modular function on a known polymatroid. We propose a computationally efficient algorithm for solving our problem and bound its expected cumulative regret. Our gap-dependent upper bound is tight up to a constant and our gap-free upper bound is tight up to polylogarithmic factors. Finally, we evaluate our method on three problems and demonstrate that it is practical.
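The greedy principle referred to above can be sketched on the minimum spanning tree example with known edge weights: sort the items by weight and keep each one that preserves feasibility. This is Kruskal's algorithm with a union-find structure, shown only as an illustration of the offline greedy method; it is not the bandit learning algorithm proposed in the paper, and the function name is an assumption.

```python
def greedy_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm: edges are (u, v, weight) tuples."""
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent pointers to the component root, halving paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    # Greedy step: consider the cheapest edges first.
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # adding the edge creates no cycle
            parent[root_u] = root_v
            chosen.append((u, v, w))
    return chosen


example_edges = [(0, 2, 3), (2, 3, 4), (0, 1, 1), (1, 2, 2)]
tree = greedy_spanning_tree(4, example_edges)
```

In the learning variant described in the abstract, the weights would be unknown and estimated from repeated interaction; the sketch assumes they are given.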
DUM: Diversity-Weighted Utility Maximization for Recommendations
Ashkan, Azin, Kveton, Branislav, Berkovsky, Shlomo, Wen, Zheng
The need for diversification of recommendation lists arises in a number of recommender-system use cases. However, an increase in diversity may undermine the utility of the recommendations, as relevant items in the list may be replaced by more diverse ones. In this work we propose a novel method for maximizing the utility of the recommended items subject to the diversity of the user's tastes, and show that an optimal solution to this problem can be found greedily. We evaluate the proposed method in two online user studies as well as in an offline analysis incorporating a number of evaluation metrics. The evaluation results show that our method outperforms a number of baselines.
Deep Exponential Families
Ranganath, Rajesh, Tang, Linpeng, Charlin, Laurent, Blei, David M.
We describe deep exponential families (DEFs), a class of latent variable models that are inspired by the hidden structures used in deep neural networks. DEFs capture a hierarchy of dependencies between latent variables, and are easily generalized to many settings through exponential families. We perform inference using recent "black box" variational inference techniques. We then evaluate various DEFs on text and combine multiple DEFs into a model for pairwise recommendation data. In an extensive study, we show that going beyond one layer improves predictions for DEFs. We demonstrate that DEFs find interesting exploratory structure in large data sets, and give better predictive performance than state-of-the-art models.
Capturing Triadic Conversations – A Visual Director System for Dynamic Interactive Narratives
Xue, Bingjie (Drexel University) | Rank, Stefan (Drexel University)
Film cinematography has been developed and applied for more than a century to involve and engage the viewer in visual storytelling. Interactive storytelling games can benefit from these cinematic conventions to enhance visual experience. However, even conversation scenes in games are highly dynamic, and pre-authoring camera parameters using cinematography principles is often insufficient. This paper proposes an automatic Visual Director System focused on dynamic conversation scenes involving three characters and reports on work in progress on a prototype applied to the recreation of a movie scene. Based on principles of cinematography and the study of film scenes, cinematic conventions for triadic conversations are encoded modularly as an artificial intelligence game component that selects suitable shots for dynamic scenes.
Narrative Causal Impetus: Governance through Situational Shift in Game of Thrones
Cardier, Beth (Sirius-Beta Inc.)
As a story unfolds, it constructs a depiction of events, and at the same time, it also builds conceptual structure at a higher, interpretive level. This higher-level structure provides the terms for understanding the unfolding story, indicating what kinds of features and consequences characterize it: a story ontology. The process by which a tale constructs a story ontology is not straightforward, and in many ways is just as complex as the action at the event level. It involves an interaction between inferred situations and contexts, each with their own networks of terms and structures, which jostle for dominance. I refer to this interaction as governance. In this work, I demonstrate an example of governance at both levels, using a scene from the series Game of Thrones. When the interpretive terms of a story emerge, an understanding of what kinds of events might come next (the possible causal implications) is also conveyed, even if those events are unexpected.