AITopics | Media

Media

The SERA Ecosystem: Socially Expressive Robotics Architecture for Autonomous Human-Robot Interaction

Ribeiro, Tiago (Universidade de Lisboa) | Pereira, André (Yale University) | Tullio, Eugenio Di (Universidade de Lisboa) | Paiva, Ana (Universidade de Lisboa)

AAAI Conferences, Mar-16-2016

Based on the development of several different HRI scenarios using different robots, we have been establishing the SERA ecosystem. SERA is composed of both a model and tools for integrating an AI agent with a robotic embodiment in human-robot interaction scenarios. We present the model and several of the reusable tools that were developed, namely Thalamus, Skene and Nutty Tracks. Finally, we exemplify how such tools and model have been used and integrated in five different HRI scenarios using the NAO, Keepon and EMYS robots. Human-robot interaction (HRI) systems are spreading as a [...] (Figure 1: Our methodology as an intersection of CGI animation, IVA and robotics techniques.)

artificial intelligence, robot, scenario, (16 more...)

AAAI Conferences

2016 AAAI Spring Symposium Series

Country:

Europe > Portugal > Lisbon > Lisbon (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)

Industry:

Leisure & Entertainment > Games (1.00)
Media (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.61)

End-to-End Attention-based Large Vocabulary Speech Recognition

Bahdanau, Dzmitry, Chorowski, Jan, Serdyuk, Dmitriy, Brakel, Philemon, Bengio, Yoshua

arXiv.org Artificial Intelligence, Mar-14-2016

Abstract: Many of the current state-of-the-art Large Vocabulary Continuous Speech Recognition (LVCSR) systems are hybrids of neural networks and Hidden Markov Models (HMMs). Most of these systems contain separate components that deal with the acoustic modelling, language modelling and sequence decoding. We investigate a more direct approach in which the HMM is replaced with a Recurrent Neural Network (RNN) that performs sequence prediction directly at the character level. Alignment between the input features and the desired character sequence is learned automatically by an attention mechanism built into the RNN. For each predicted character, the attention mechanism scans the input sequence and chooses relevant frames. We propose two methods to speed up this operation: limiting the scan to a subset of most promising frames and pooling over time the information contained in neighboring frames, thereby reducing source sequence length. Index Terms: neural networks, LVCSR, attention, speech recognition, ASR. 1. Introduction: Deep neural networks have become popular acoustic models for state-of-the-art large vocabulary speech recognition systems (Hinton et al., 2012a). However, in these systems most of the other components, such as Hidden Markov Models (HMMs), Gaussian Mixture Models (GMMs) and n-gram language models, are the same as in their predecessors. These combinations of neural networks and statistical models are often referred to as hybrid systems.
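
The two speed-ups mentioned in the abstract (restricting the attention scan to a subset of frames, and pooling neighboring frames over time) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `windowed_attention` and `pool_frames`, the fixed window around a caller-supplied `center`, and the use of plain averaging for pooling are all assumptions made for illustration.

```python
import math


def windowed_attention(scores, center, width):
    """Softmax over attention scores, restricted to a window of frames.

    Assumption: the caller supplies `center` (for example, the frame the
    previous decoding step focused on). Frames outside
    [center - width, center + width] get weight 0.
    """
    lo = max(0, center - width)
    hi = min(len(scores), center + width + 1)
    window = scores[lo:hi]
    peak = max(window)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in window]
    total = sum(exps)
    weights = [0.0] * len(scores)
    for offset, e in enumerate(exps):
        weights[lo + offset] = e / total
    return weights


def pool_frames(frames, factor):
    """Average non-overlapping groups of `factor` neighboring frames.

    Each frame is a list of floats; the output sequence is roughly
    `factor` times shorter than the input.
    """
    pooled = []
    for start in range(0, len(frames), factor):
        group = frames[start:start + factor]
        dim = len(group[0])
        pooled.append([sum(f[d] for f in group) / len(group) for d in range(dim)])
    return pooled


# Toy usage: 10 frames with equal scores, attention limited to frames 4..6.
weights = windowed_attention([0.0] * 10, center=5, width=1)
shorter = pool_frames([[1.0], [3.0], [5.0]], factor=2)
```

In this toy case only three of the ten frames are scored, and pooling turns three frames into two, which is the sense in which both tricks reduce the work done per predicted character.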

artificial intelligence, machine learning, sequence, (16 more...)

arXiv.org Artificial Intelligence

1508.04395

Country:

North America > Canada (0.04)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.82)

Industry: Media > News (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Discriminative models for robust image classification

Srinivas, Umamahesh

arXiv.org Machine Learning, Mar-8-2016

A variety of real-world tasks involve the classification of images into pre-determined categories. Designing image classification algorithms that exhibit robustness to acquisition noise and image distortions, particularly when the available training data are insufficient to learn accurate models, is a significant challenge. This dissertation explores the development of discriminative models for robust image classification that exploit underlying signal structure, via probabilistic graphical models and sparse signal representations. Probabilistic graphical models are widely used in many applications to approximate high-dimensional data in a reduced complexity set-up. Learning graphical structures to approximate probability distributions is an area of active research. Recent work has focused on learning graphs in a discriminative manner with the goal of minimizing classification error. In the first part of the dissertation, we develop a discriminative learning framework that exploits the complementary yet correlated information offered by multiple representations (or projections) of a given signal/image. Specifically, we propose a discriminative tree-based scheme for feature fusion by explicitly learning the conditional correlations among such multiple projections in an iterative manner. Experiments reveal the robustness of the resulting graphical model classifier to training insufficiency.

artificial intelligence, classification, machine learning, (18 more...)

arXiv.org Machine Learning

1603.02736

Country:

Europe (1.00)
North America > United States > California (0.27)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Education (1.00)
Energy (0.67)
(5 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
(6 more...)

Communicating Semantics: Reference by Description

Guha, Ramanathan V, Gupta, Vineet

arXiv.org Artificial Intelligence, Mar-7-2016

Messages often refer to entities such as people, places and events. Correct identification of the intended reference is an essential part of communication. Lack of shared unique names often complicates entity reference. Shared knowledge can be used to construct uniquely identifying descriptive references for entities with ambiguous names. We introduce a mathematical model for 'Reference by Description', derive results on the conditions under which, with high probability, programs can construct unambiguous references to most entities in the domain of discourse, and provide empirical validation of these results.

data mining, machine learning, node, (18 more...)

arXiv.org Artificial Intelligence

1511.06341

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.92)
Media > Television (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.93)
(3 more...)

TribeFlow: Mining & Predicting User Trajectories

Figueiredo, Flavio, Ribeiro, Bruno, Almeida, Jussara, Faloutsos, Christos

arXiv.org Machine Learning, Feb-19-2016

Which song will Smith listen to next? Which restaurant will Alice go to tomorrow? Which product will John click next? These applications have in common the prediction of user trajectories that are in a constant state of flux over a hidden network (e.g. website links, geographic location). What users are doing now may be unrelated to what they will be doing an hour from now. Mindful of these challenges, we propose TribeFlow, a method designed to cope with the complex challenges of learning personalized predictive models of non-stationary, transient, and time-heterogeneous user trajectories. TribeFlow is a general method that can perform next product recommendation, next song recommendation, next location prediction, and general arbitrary-length user trajectory prediction without domain-specific knowledge. TribeFlow is more accurate and up to 413x faster than top competitors.

data mining, machine learning, natural language, (23 more...)

arXiv.org Machine Learning

1511.01032

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Information Technology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(5 more...)

An End-to-End Neural Network for Polyphonic Piano Music Transcription

Sigtia, Siddharth, Benetos, Emmanouil, Dixon, Simon

arXiv.org Machine Learning, Feb-11-2016

We present a supervised neural network model for polyphonic piano music transcription. The architecture of the proposed model is analogous to speech recognition systems and comprises an acoustic model and a music language model. The acoustic model is a neural network used for estimating the probabilities of pitches in a frame of audio. The language model is a recurrent neural network that models the correlations between pitch combinations over time. The proposed model is general and can be used to transcribe polyphonic music without imposing any constraints on the polyphony. The acoustic and language model predictions are combined using a probabilistic graphical model. Inference over the output variables is performed using the beam search algorithm. We perform two sets of experiments. We investigate various neural network architectures for the acoustic models and also investigate the effect of combining acoustic and music language model predictions using the proposed architecture. We compare the performance of the neural network based acoustic models with two popular unsupervised acoustic models. Results show that convolutional neural network acoustic models yield the best performance across all evaluation metrics. We also observe improved performance with the application of the music language models. Finally, we present an efficient variant of beam search that improves performance and reduces run-times by an order of magnitude, making the model suitable for real-time applications.
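
For readers unfamiliar with beam search, the inference step named in the abstract can be sketched generically as follows. This is a plain textbook beam search, not the efficient variant the authors propose; `beam_search`, `toy_score`, and the two-symbol alphabet are hypothetical names chosen for illustration, and the toy scoring function merely stands in for a combination of acoustic and language model scores.

```python
import math


def beam_search(symbols, score_fn, num_steps, beam_width):
    """Generic beam search over sequences of symbols.

    `score_fn(prefix, symbol)` returns the log-probability of emitting
    `symbol` after `prefix`. At every step only the `beam_width`
    highest-scoring partial sequences are kept.
    Returns a list of (sequence, total_log_prob), best first.
    """
    beams = [((), 0.0)]
    for _ in range(num_steps):
        candidates = []
        for prefix, total in beams:
            for sym in symbols:
                candidates.append((prefix + (sym,), total + score_fn(prefix, sym)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


def toy_score(prefix, symbol):
    """Hypothetical scorer: 'a' is likelier at the start; afterwards the
    previous symbol tends to repeat (a stand-in for a language model)."""
    if not prefix:
        return math.log(0.6 if symbol == "a" else 0.4)
    return math.log(0.9 if symbol == prefix[-1] else 0.1)


# Toy usage: two time steps, keeping the two best hypotheses at each step.
result = beam_search(["a", "b"], toy_score, num_steps=2, beam_width=2)
best_sequence, best_log_prob = result[0]
```

Because the scorer depends on the prefix, pruning to a fixed beam width is what keeps the search tractable as the number of steps grows; the trade-off is that a hypothesis pruned early cannot be recovered later.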

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Machine Learning

1508.01774

Genre: Research Report > New Finding (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Collaborative filtering via sparse Markov random fields

Tran, Truyen, Phung, Dinh, Venkatesh, Svetha

arXiv.org Machine Learning, Feb-8-2016

Recommender systems play a central role in providing individualized access to information and services. This paper focuses on collaborative filtering, an approach that exploits the shared structure among like-minded users and similar items. In particular, we focus on a formal probabilistic framework known as Markov random fields (MRF). We address the open problem of structure learning and introduce a sparsity-inducing algorithm to automatically estimate the interaction structures between users and between items. Item-item and user-user correlation networks are obtained as a by-product. Large-scale experiments on movie recommendation and date matching datasets demonstrate the power of the proposed method.

artificial intelligence, machine learning, parameterization, (18 more...)

arXiv.org Machine Learning

1602.02842

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Industry:

Media > Film (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)

Modeling User Exposure in Recommendation

Liang, Dawen, Charlin, Laurent, McInerney, James, Blei, David M.

arXiv.org Machine Learning, Feb-4-2016

Collaborative filtering analyzes user preferences for items (e.g., books, movies, restaurants, academic papers) by exploiting the similarity patterns across users. In implicit feedback settings, all the items, including the ones that a user did not consume, are taken into consideration. But this assumption does not accord with the common sense understanding that users have a limited scope and awareness of items. For example, a user might not have heard of a certain paper, or might live too far away from a restaurant to experience it. In the language of causal analysis, the assignment mechanism (i.e., the items that a user is exposed to) is a latent variable that may change for various user/item combinations. In this paper, we propose a new probabilistic approach that directly incorporates user exposure to items into collaborative filtering. The exposure is modeled as a latent variable and the model infers its value from data. In doing so, we recover one of the most successful state-of-the-art approaches as a special case of our model, and provide a plug-in method for conditioning exposure on various forms of exposure covariates (e.g., topics in text, venue locations). We show that our scalable inference algorithm outperforms existing benchmarks in four different domains both with and without exposure covariates.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

1510.07025

Country:

North America > Canada (0.46)
North America > United States > New York (0.15)

Genre: Research Report > Promising Solution (0.34)

Industry:

Media > Music (0.68)
Leisure & Entertainment (0.68)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

News Across Languages - Cross-Lingual Document Similarity and Event Tracking

Rupnik, Jan, Muhic, Andrej, Leban, Gregor, Skraba, Primoz, Fortuna, Blaz, Grobelnik, Marko

Journal of Artificial Intelligence Research, Jan-30-2016

In today's world, we follow news which is distributed globally. Significant events are reported by different sources and in different languages. In this work, we address the problem of tracking of events in a large multilingual stream. Within a recently developed system Event Registry we examine two aspects of this problem: how to compare articles in different languages and how to link collections of articles in different languages which refer to the same event. Taking a multilingual stream and clusters of articles from each language, we compare different cross-lingual document similarity measures based on Wikipedia. This allows us to compute the similarity of any two articles regardless of language. Building on previous work, we show there are methods which scale well and can compute a meaningful similarity between articles from languages with little or no direct overlap in the training data. Using this capability, we then propose an approach to link clusters of articles across languages which represent the same event. We provide an extensive evaluation of the system as a whole, as well as an evaluation of the quality and robustness of the similarity measure and the linking algorithm.

cross-lingual document similarity, matrix, similarity, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4780

AI Access Foundation

10981

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
Europe > Ukraine (0.04)
(14 more...)

Genre: Research Report > New Finding (0.46)

Industry: Media > News (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Early Predictions of Movie Success: the Who, What, and When of Profitability

Lash, Michael T., Zhao, Kang

arXiv.org Artificial Intelligence, Jan-29-2016

This paper proposes a decision support system to aid movie investment decisions at the early stage of movie productions. The system predicts the success of a movie based on its profitability by leveraging historical data from various sources. Using social network analysis and text mining techniques, the system automatically extracts several groups of features, including "who" are on the cast, "what" a movie is about, "when" a movie will be released, as well as "hybrid" features that match "who" with "what", and "when" with "what". Experiment results with movies during an 11-year period showed that the system outperforms benchmark methods by a large margin in predicting movie profitability. Novel features we proposed also made great contributions to the prediction. In addition to designing a decision support system with practical utilities, our analysis of key factors for movie profitability may also have implications for theoretical research on team performance and the success of creative work.

data mining, decision support system, machine learning, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.1080/07421222.2016.1243969

1506.05382

Country:

North America > United States > Iowa > Johnson County > Iowa City (0.04)
North America > Canada (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Data Science > Data Mining (1.00)
(4 more...)
