A Comparison of Playlist Generation Strategies for Music Recommendation and a New Baseline Scheme

AAAI Conferences

The digitalization of music and the instant availability of millions of tracks on the Internet require new approaches to support the user in the exploration of these huge music collections. One possible approach to address this problem, which can also be found on popular online music platforms, is the use of user-created or automatically generated playlists (mixes). The automated generation of such playlists represents a particular type of the music recommendation problem with two special characteristics. First, the tracks of the list are usually consumed immediately at recommendation time; second, songs are mostly listened to in consecutive order, so the sequence of the recommended tracks can be relevant. In recent years, a number of different approaches for playlist generation have been proposed in the literature. In this paper, we review the existing core approaches to playlist generation, discuss aspects of appropriate offline evaluation designs, and report the results of a comparative evaluation based on different datasets. Based on the insights from these experiments, we propose a comparatively simple and computationally tractable new baseline algorithm for future comparisons, which is based on track popularity and artist information and is competitive with more sophisticated techniques in our evaluation settings.
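To make the idea of a baseline "based on track popularity and artist information" concrete, here is a minimal sketch. It is not the algorithm proposed in the paper, whose details the abstract does not give. The function name `recommend`, the `track_artist` mapping, and the ranking rule (tracks by artists already in the listening history first, then by popularity across known playlists) are all assumptions made for illustration.

```python
from collections import Counter


def recommend(history, playlists, track_artist, k=3):
    """Rank unseen tracks: same-artist tracks first, then by popularity.

    history      -- list of track ids already in the playlist (assumed input)
    playlists    -- list of playlists (lists of track ids) used to count popularity
    track_artist -- hypothetical mapping from track id to artist id
    """
    # Popularity = number of occurrences of a track across the known playlists.
    popularity = Counter(t for p in playlists for t in p)
    seen = set(history)
    history_artists = {track_artist[t] for t in history if t in track_artist}
    candidates = [t for t in popularity if t not in seen]
    # False sorts before True, so tracks by artists from the history come first;
    # ties are broken by descending popularity, then by track id.
    candidates.sort(
        key=lambda t: (track_artist.get(t) not in history_artists, -popularity[t], t)
    )
    return candidates[:k]
```

A scheme like this needs only occurrence counts and an artist lookup, which is the kind of simplicity a baseline is meant to have.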


Rotunde — A Smart Meeting Cinematography Initiative — Tools, Datasets, and Benchmarks for Cognitive Interpretation and Control

AAAI Conferences

The cognitive interpretation of perceptual data (e.g., from video, depth, motion sensors) requires the representational and inferential mediation of commonsense and qualitative abstractions of space, actions, events, change, and interaction. General methods and benchmarks for high-level cognitive interpretation are necessary, as are their seamless integration and access within large-scale projects concerned with cognitive vision, robotics, and hybrid-intelligent systems. We present the Rotunde initiative as a particular instance of a challenging smart meeting cinematography concept primarily concerning human activity interpretation. The Rotunde initiative aims to release general tools (e.g., for reasoning and control), methodological and performance benchmarks, and developmental aids (e.g., management and visualisation of complex spatio-temporal data) for the cognitive interpretation of interaction.


Personalized Text-Based Music Retrieval

AAAI Conferences

We consider the problem of personalized text-based music retrieval, where users' preference histories are taken into account in addition to their textual queries. Current retrieval methods mostly rely on song metadata. This limits the query vocabulary. Moreover, it is very costly to gather this information in large collections of music. Alternatively, we use music annotations retrieved from social tagging websites such as last.fm as textual descriptions of songs. Considering a user's profile and using preference patterns of music among all users, as in collaborative filtering approaches, can be useful in providing personalized and more satisfactory results. The main challenge is how to include both users' profiles and song metadata in the retrieval model. In this paper, we propose a hierarchical probabilistic model that takes into account the users' preference history as well as tag co-occurrences in songs. Our model is an extension of LDA where topics are formed as joint clusterings of songs and tags. These topics capture the tag associations and user preferences and correspond to different music tastes. Each user's profile is represented as a distribution over topics, which shows the user's interests in different types of music. We explain how our model can be used for contextual retrieval. Our experimental results show significant improvement in retrieval when user profiles are taken into account.


Enforcing Meter in Finite-Length Markov Sequences

AAAI Conferences

Markov processes are increasingly used to generate finite-length sequences that imitate a given style. However, Markov processes are notoriously difficult to control. Recently, Markov constraints have been introduced to give users some control over generated sequences. Markov constraints reformulate finite-length Markov sequence generation in the framework of constraint satisfaction (CSP). However, in practice, this approach is limited to local constraints, and its performance is low for global constraints, such as cardinality or arithmetic constraints. This limitation prevents generated sequences from following structural properties that are independent of the style but inherent to the domain, such as meter. In this article, we introduce meter, a constraint that ensures a sequence 1) is Markovian with regard to a given corpus and 2) follows metrical rules expressed as cumulative cost functions. Additionally, meter can simultaneously enforce cardinality constraints. We propose a domain consistency algorithm whose complexity is pseudo-polynomial. This result is obtained thanks to a theorem on the growth of sumsets by Khovanskii. We illustrate our constraint on meter-constrained music generation problems that were so far not solvable by any other technique.


A Hierarchical Aspect-Sentiment Model for Online Reviews

AAAI Conferences

To help users quickly understand the major opinions from massive online reviews, it is important to automatically reveal the latent structure of the aspects, sentiment polarities, and the association between them. However, there is little work available to do this effectively. In this paper, we propose a hierarchical aspect sentiment model (HASM) to discover a hierarchical structure of aspect-based sentiments from unlabeled online reviews. In HASM, the whole structure is a tree. Each node itself is a two-level tree, whose root represents an aspect and the children represent the sentiment polarities associated with it. Each aspect or sentiment polarity is modeled as a distribution of words. To automatically extract both the structure and parameters of the tree, we use a Bayesian nonparametric model, recursive Chinese Restaurant Process (rCRP), as the prior and jointly infer the aspect-sentiment tree from the review texts. Experiments on two real datasets show that our model is comparable to two other hierarchical topic models in terms of quantitative measures of topic trees. It is also shown that our model achieves better sentence-level classification accuracy than previously proposed aspect-sentiment joint models.


Don’t Be Spoiled by Your Friends: Spoiler Detection in TV Program Tweets

AAAI Conferences

Providing a convenient mechanism for accessing the Internet, smartphones have led to the rapid growth of Social Networking Services (SNSs) such as Twitter and have served as a major platform for SNSs. Nowadays, people can conveniently check the SNS messages posted by their friends and followers via their smartphones. As a consequence, people are exposed to spoilers of TV programs that they follow. So far, two previous works have explored the detection of spoilers in text, though not in SNS messages: (1) a keyword matching method and (2) a machine-learning method based on Latent Dirichlet Allocation (LDA). The keyword matching method evaluates most tweets as spoilers, hence its poor precision. The other method, based on LDA, although successful on large texts, works poorly on short segments of text such as those found on Twitter and evaluates most tweets as non-spoilers. This paper presents four features that are significant in the classification of spoiler tweets. Using those features, we classified spoiler tweets pertaining to a reality TV show (“Dancing with the Stars”). We experimentally compared our method with previous methods; our method achieved substantially higher precision than the keyword matching and LDA-based methods while maintaining comparable recall.


A Virtual Archive for the History of AI

AI Magazine

Publications that have influenced the growth of artificial intelligence are often difficult to obtain.  We first collected titles of several thousand publications from many well-known sources and then selected about 1850 titles considered to be especially influential.  We have identified, and in a few cases created, online versions of about half of these “classics in AI.”  Searchable text of the documents enables additional analysis of trends and influences.  Integration into the rest of the AITopics information portal contextualizes the classic publications.


Commonsense Reasoning and Large Network Analysis: A Computational Study of ConceptNet 4

arXiv.org Artificial Intelligence

Our aim is to compute the minimal data-set implied by the assertions of the English language, extract it from the database, and store it in files of our own format. To this end, we read the table of assertions (conceptnet_assertion) and keep the entries that have their language_id set to en. According to Table A.1 in Appendix A, every assertion is associated with entries from the database tables conceptnet_concept (Table A.2), conceptnet_relation (Table A.3), nl_frequency (Table A.4), conceptnet_frame (Table A.5), conceptnet_surfaceform (Table A.6), and conceptnet_rawassertion (Table A.7). Through conceptnet_rawassertion the assertions are also associated with the actual sentences, which are located in the table corpus_sentence (Table A.6). Moreover, we do not need any other table from the database, as the important entries are all contained among these tables. It turns out that reading the assertions once and then reading all the entries referenced from the English-language assertions is not enough to produce a minimal consistent data-set. Section 1.1 explains why, and gives a high-level overview of the process that we follow in order to compute the closure of the data-set implied by the assertions of the English language. However, before we describe these reasons, we mention which fields we are going to keep from each table of the original ConceptNet 4 database.
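The first pass described above (keep the English assertions, then collect the rows they reference) can be sketched as follows. This is an illustration over in-memory rows, not the authors' extraction code. The `Assertion` fields, the function name `first_pass_english`, and the dictionary-based concept and relation tables are hypothetical simplifications of the real schema. As the text notes, this single pass alone does not yield a minimal consistent data-set; computing the closure requires further steps that the excerpt does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    # Hypothetical, simplified row; the real table has more columns.
    id: int
    language_id: str
    concept1_id: int
    concept2_id: int
    relation_id: int


def first_pass_english(assertions, concepts, relations):
    """Keep English assertions and the concept/relation rows they reference.

    concepts and relations are assumed to be dicts keyed by row id.
    """
    kept = [a for a in assertions if a.language_id == "en"]
    # Collect the ids of every concept and relation mentioned by a kept assertion.
    concept_ids = {cid for a in kept for cid in (a.concept1_id, a.concept2_id)}
    relation_ids = {a.relation_id for a in kept}
    # Copy only the referenced rows; ids missing from the tables are skipped.
    kept_concepts = {cid: concepts[cid] for cid in concept_ids if cid in concepts}
    kept_relations = {rid: relations[rid] for rid in relation_ids if rid in relations}
    return kept, kept_concepts, kept_relations
```

Keying the tables by row id keeps each lookup constant-time, so the pass is linear in the number of assertions.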


Cognitive Interpretation of Everyday Activities: Toward Perceptual Narrative Based Visuo-Spatial Scene Interpretation

arXiv.org Artificial Intelligence

We position a narrative-centred computational model for high-level knowledge representation and reasoning in the context of a range of assistive technologies concerned with "visuo-spatial perception and cognition" tasks. Our proposed narrative model encompasses aspects such as space, events, actions, change, and interaction from the viewpoint of commonsense reasoning and learning in large-scale cognitive systems. The broad focus of this paper is on the domain of "human-activity interpretation" in smart environments, ambient intelligence, etc. In the backdrop of a "smart meeting cinematography" domain, we position the proposed narrative model, preliminary work on perceptual narrativisation, and the immediate outlook on constructing general-purpose open-source tools for perceptual narrativisation. ACM Classification: I.2 Artificial Intelligence: I.2.0 General -- Cognitive Simulation, I.2.4 Knowledge Representation Formalisms and Methods, I.2.10 Vision and Scene Understanding: Architecture and control structures, Motion, Perceptual reasoning, Shape, Video analysis. General keywords: cognitive systems; human-computer interaction; spatial cognition and computation; commonsense reasoning; spatial and temporal reasoning; assistive technologies.


Iterative Grassmannian Optimization for Robust Image Alignment

arXiv.org Machine Learning

Robust high-dimensional data processing has witnessed an exciting development in recent years, as theoretical results have shown that it is possible, using convex programming, to optimize data fit to a low-rank component plus a sparse outlier component. This problem is also known as Robust PCA, and it has found application in many areas of computer vision. In image and video processing and face recognition, the opportunity to process massive image databases is emerging as people upload photo and video data online in unprecedented volumes. However, data quality and consistency are not controlled in any way, and the massiveness of the data poses a serious computational challenge. In this paper we present t-GRASTA, or "Transformed GRASTA (Grassmannian Robust Adaptive Subspace Tracking Algorithm)". t-GRASTA iteratively performs incremental gradient descent constrained to the Grassmann manifold of subspaces in order to simultaneously estimate a decomposition of a collection of images into a low-rank subspace, a sparse part of occlusions and foreground objects, and a transformation such as rotation or translation of the image. We show that t-GRASTA is 4× faster than state-of-the-art algorithms, has half the memory requirement, and can achieve alignment for face images as well as jittered camera surveillance images.