Media
Pseudo-Supervised Training Improves Unsupervised Melody Segmentation
Lattner, Stefan (Austrian Research Institute for Artificial Intelligence) | Chacón, Carlos Eduardo Cancino (Austrian Research Institute for Artificial Intelligence) | Grachten, Maarten (Austrian Research Institute for Artificial Intelligence)
An important aspect of music perception in humans is the ability to segment streams of musical events into structural units such as motifs and phrases. A promising approach to the computational modeling of music segmentation employs the statistical and information-theoretic properties of musical data, based on the hypothesis that these properties can (at least partly) account for music segmentation in humans. Prior work has shown that in particular the information content of music events, as estimated from a generative probabilistic model of those events, is a good indicator for segment boundaries. In this paper we demonstrate that, remarkably, a substantial increase in segmentation accuracy can be obtained by not using information content estimates directly, but rather in a bootstrapping fashion. More specifically, we use information content estimates computed from a generative model of the data as a target for a feed-forward neural network that is trained to estimate the information content directly from the data. We hypothesize that the improved segmentation accuracy of this bootstrapping approach may be evidence that the generative model provides noisy estimates of the information content, which are smoothed by the feed-forward neural network, yielding more accurate information content estimates.
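As a rough illustration of the idea that high information content marks likely segment boundaries, the sketch below scores each event by -log2 P(event | previous event) and picks local peaks above a threshold. The add-alpha smoothed bigram model is only a stand-in assumption for the generative model mentioned in the abstract, and the function names, the `alpha` parameter, and the peak-picking rule are hypothetical choices made for this example, not the authors' method.

```python
import math
from collections import Counter


def information_content(seq, alpha=1.0):
    """Return -log2 P(e_t | e_{t-1}) for every event after the first.

    Assumption: a bigram model with add-alpha smoothing, fitted on the
    sequence itself, stands in for a real generative model.
    Entry i of the result describes event seq[i + 1].
    """
    vocab_size = len(set(seq))
    bigrams = Counter(zip(seq, seq[1:]))
    contexts = Counter(seq[:-1])
    ic = []
    for prev, cur in zip(seq, seq[1:]):
        p = (bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * vocab_size)
        ic.append(-math.log2(p))
    return ic


def boundary_candidates(ic, threshold):
    """Indices of interior local maxima of ic that exceed threshold."""
    return [
        i
        for i in range(1, len(ic) - 1)
        if ic[i] > threshold and ic[i] >= ic[i - 1] and ic[i] > ic[i + 1]
    ]
```

In this toy setting, a rare transition receives a higher score than a frequent one, so it is more likely to be proposed as a boundary.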
Computational Invention of Cadences and Chord Progressions by Conceptual Chord-Blending
Eppe, Manfred (IIIA-CSIC, ICSI) | Confalonieri, Roberto (IIIA-CSIC) | MacLean, Ewen (University of Edinburgh) | Kaliakatsos, Maximos (University of Thessaloniki) | Cambouropoulos, Emilios (University of Thessaloniki) | Schorlemmer, Marco (IIIA-CSIC) | Codescu, Mihai (University of Magdeburg) | Kühnberger, Kai-Uwe (University of Osnabrück)
We present a computational framework for chord invention based on a cognitive-theoretic perspective on conceptual blending. The framework builds on algebraic specifications, and solves two musicological problems. It automatically finds transitions between chord progressions of different keys or idioms, and it substitutes chords in a chord progression by other chords of a similar function, as a means to create novel variations. The approach is demonstrated with several examples where jazz cadences are invented by blending chords in cadences from earlier idioms, and where novel chord progressions are generated by inventing transition chords.
Artificial Intelligence in the Concertgebouw
Arzt, Andreas (Johannes Kepler University Linz) | Frostel, Harald (Johannes Kepler University Linz) | Gadermaier, Thassilo (Austrian Research Institute for Artificial Intelligence) | Gasser, Martin (Austrian Research Institute for Artificial Intelligence) | Grachten, Maarten (Austrian Research Institute for Artificial Intelligence) | Widmer, Gerhard (Johannes Kepler University Linz)
In this paper we present a real-world application (the first of its kind) of machine listening in the context of a live concert in a world-famous concert hall: the Concertgebouw in Amsterdam. A real-time music tracking algorithm listens to the Royal Concertgebouw Orchestra performing Richard Strauss' Alpensinfonie and follows the progress in the sheet music, i.e., continuously tracks the most likely position of the live music in the printed score. This information, in turn, is used to enrich the concert experience for members of the audience by streaming synchronised visual content (the sheet music, explanatory text and videos) onto tablet computers in the concert hall. The main focus of this paper is on the challenges involved in tracking live orchestral music, i.e., how to deal with heavily polyphonic music, how to prepare the data needed, and how to achieve the necessary robustness and precision.
Tackling Data Sparseness in Recommendation using Social Media based Topic Hierarchy Modeling
Zhu, Xingwei (Tsinghua University) | Ming, Zhao-Yan (DigiPen Institute of Technology) | Hao, Yu (Tsinghua University) | Zhu, Xiaoyan (Tsinghua University)
Recommendation systems play an important role in E-Commerce. However, their potential usefulness in real-world applications is greatly limited by the availability of historical rating records from the customers. This paper presents a novel method to tackle the problem of data sparseness in user ratings with rich and timely domain information from social media. We first extract multiple kinds of side information for products from their relevant social media content. Next, we convert the information into weighted topic-item ratings and inject them into an extended latent factor based recommendation model in an optimized approach. Our evaluation on two real-world datasets demonstrates the superiority of our method over state-of-the-art methods.
A Unified Probabilistic Model of User Activities and Relations on Social Networking Sites
Yu, Xiaofeng (HP Labs China) | Xie, Junqing (HP Labs China) | Wang, Shuai (HP Labs China)
In this work, we investigate the bidirectional mutual interactions (BMI) between users' activities and user-user relationships on social networking sites. We analyze and study BMI and find that the fundamental mechanism driving its characteristics and dynamics is the underlying social influence. We make an attempt at a unified probabilistic approach, called joint activity and relation (JAR), for modeling and predicting users' activities and user-user relationships simultaneously in a single coherent framework. Instead of incorporating social influence in an ad hoc manner, we show that social influence can be captured quantitatively. Based on JAR, we learn social influence between users and users' personal preferences for both user activity prediction and user-user relation discovery through statistical inference. To address the challenges posed by the multiple layers of hidden variables introduced in JAR, we propose a new learning algorithm based on expectation maximization (EM), and we further propose a powerful and efficient generalization of the EM-based algorithm for model fitting. We show that JAR exploits mutual interactions and benefits, by taking advantage of the learned social influence and users' personal preferences, for enhanced user activity prediction and user-user relation discovery. We further experiment with a real-world dataset to verify the claimed advantages, achieving substantial performance gains.
Towards Domain-Specific Semantic Relatedness: A Case Study from Geography
Sen, Shilad (Macalester College) | Johnson, Isaac (University of Minnesota) | Harper, Rebecca (Willamette College) | Mai, Huy (Brandeis University) | Olsen, Samuel Horlbeck (Macalester College) | Mathers, Benjamin (Macalester College) | Vonessen, Laura Souza (University of Arizona) | Wright, Matthew (University of Minnesota) | Hecht, Brent (University of Minnesota)
Semantic relatedness (SR) measures form the algorithmic foundation of intelligent technologies in domains ranging from artificial intelligence to human-computer interaction. Although SR has been researched for decades, this work has focused on developing general SR measures rooted in graph and text mining algorithms that perform reasonably well for many different types of concepts. This paper introduces domain-specific SR, which augments general SR by identifying, capturing, and synthesizing domain-specific relationships between concepts. Using the domain of geography as a case study, we show that domain-specific SR — and even geography-specific signals alone (e.g. distance, containment) without sophisticated graph or text mining algorithms — significantly outperform the SR state-of-the-art for geographic concepts. In addition to substantially improving SR measures for geospatial technologies, an area that is rapidly increasing in importance, this work also unlocks an important new direction for SR research: SR measures that incorporate domain-specific customizations to increase accuracy.
Large Scale Homophily Analysis in Twitter Using a Twixonomy
Faralli, Stefano (Università di Roma "La Sapienza") | Stilo, Giovanni (Università di Roma "La Sapienza") | Velardi, Paola (Università di Roma "La Sapienza")
In this paper we perform a large-scale homophily analysis on Twitter using a hierarchical representation of users' interests which we call a Twixonomy. In order to build a population, community, or single-user Twixonomy we first associate "topical" friends in users' friendship lists (i.e. friends representing an interest rather than a social relation between peers) with Wikipedia categories. A word-sense disambiguation algorithm is used to select the appropriate wikipage for each topical friend. Starting from the set of wikipages representing "primitive" interests, we extract all paths connecting these pages with topmost Wikipedia category nodes, and we then prune the resulting graph G efficiently so as to induce a directed acyclic graph. This graph is the Twixonomy. Then, to analyze homophily, we compare different methods to detect communities in a peer friends Twitter network, and then for each community we compute the degree of homophily on the basis of a measure of pairwise semantic similarity. We show that the Twixonomy provides a means for describing users' interests in a compact and readable way and allows for a fine-grained homophily analysis. Furthermore, we show that mid-low level categories in the Twixonomy represent the best balance between informativeness and compactness of the representation.
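The last step described above, scoring a community by the pairwise similarity of its members' interests, can be sketched as follows. This is a minimal illustration under stated assumptions: Jaccard overlap of category sets is substituted for the semantic similarity measure, which the abstract does not specify, and the names `jaccard`, `community_homophily`, and the `interests` mapping are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two category collections (an assumed stand-in
    for the pairwise semantic similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def community_homophily(interests, community):
    """Mean pairwise similarity over all unordered member pairs.

    interests: dict mapping user id -> iterable of category labels.
    community: iterable of user ids.
    Returns 0.0 for communities with fewer than two members.
    """
    pairs = list(combinations(community, 2))
    if not pairs:
        return 0.0
    total = sum(jaccard(interests[u], interests[v]) for u, v in pairs)
    return total / len(pairs)
```

A community whose members share all their categories scores 1.0, and adding a member with disjoint interests lowers the score.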
Trailer Generation via a Point Process-Based Visual Attractiveness Model
Xu, Hongteng (Georgia Institute of Technology) | Zhen, Yi (Georgia Institute of Technology) | Zha, Hongyuan (Georgia Institute of Technology)
Producing attractive trailers for videos requires human expertise and creativity, and hence is challenging and costly. Unlike video summarization, which focuses on capturing storylines or important scenes, trailer generation aims at producing trailers that are attractive so that viewers will be eager to watch the original video. In this work, we study the problem of automatic trailer generation, in which an attractive trailer is produced given a video and a piece of music. We propose a surrogate measure of video attractiveness named fixation variance, and learn a novel self-correcting point process-based attractiveness model that can effectively describe the dynamics of attractiveness of a video. Furthermore, based on the attractiveness model learned from existing training trailers, we propose an efficient graph-based trailer generation algorithm to produce a max-attractiveness trailer. Experiments demonstrate that our approach outperforms the state-of-the-art trailer generators in terms of both quality and efficiency.
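For readers unfamiliar with self-correcting point processes, the sketch below evaluates the textbook conditional intensity, lambda(t) = exp(mu * t - alpha * N(t)), where N(t) counts events strictly before t. The intensity grows over time and drops after each event. This is the generic form only; the abstract does not give the paper's parameterization, so the function name and the `mu` and `alpha` parameters are assumptions for illustration.

```python
import math


def self_correcting_intensity(t, event_times, mu=1.0, alpha=0.5):
    """Conditional intensity exp(mu * t - alpha * N(t)) of a generic
    self-correcting point process.

    N(t) is the number of events strictly before time t. The values of
    mu and alpha are illustrative defaults, not fitted parameters.
    """
    n_past = sum(1 for s in event_times if s < t)
    return math.exp(mu * t - alpha * n_past)
```

With no past events the intensity at t = 2 is exp(2); a single earlier event reduces it by a factor of exp(alpha).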
Face Clustering in Videos with Proportion Prior
Tang, Zhiqiang (Chinese Academy of Sciences) | Zhang, Yifan (Chinese Academy of Sciences) | Li, Zechao (Nanjing University of Science and Technology) | Lu, Hanqing (Chinese Academy of Sciences)
In this paper, we investigate the problem of face clustering in real-world videos. In many cases, the distribution of the face data is unbalanced. In movies or TV series, the leading cast members appear quite often and the others appear much less. However, many clustering algorithms cannot handle such severely unbalanced data distributions well: large classes are split apart, and small classes are merged into the large ones and thus lost. On the other hand, the data distribution proportion information may be known beforehand. For example, we can obtain such information by counting the spoken lines of the characters in the script text. Hence, we propose to make use of the proportion prior to regularize the clustering. A Hidden Conditional Random Field (HCRF) model is presented to incorporate the proportion prior. In experiments on a public data set from real-world videos, we observe improvements in clustering performance over state-of-the-art methods.
Cross-Domain Collaborative Filtering with Review Text
Xin, Xin (Beijing Institute of Technology) | Liu, Zhirun (Beijing Institute of Technology) | Lin, Chin-Yew (Microsoft Research Asia) | Huang, Heyan (Beijing Institute of Technology) | Wei, Xiaochi (Beijing Institute of Technology) | Guo, Ping (Beijing Normal University)
Most existing cross-domain recommendation algorithms focus on modeling ratings, while ignoring review texts. The review text, however, contains rich information, which can be utilized to alleviate data sparsity limitations and to interpret transfer patterns. In this paper, we investigate how to utilize the review text to improve cross-domain collaborative filtering models. The challenge lies in the existence of non-linear properties in some transfer patterns. Given this, we extend previous transfer learning models in collaborative filtering from linear mapping functions to non-linear ones, and propose a cross-domain recommendation framework with the review text incorporated. Experiments demonstrate that, for new users with sparse feedback, utilizing the review text yields a 10% improvement in the AUC metric, and the non-linear method outperforms the linear ones by 4%.