Overview
Kernels for Vector-Valued Functions: a Review
Alvarez, Mauricio A., Rosasco, Lorenzo, Lawrence, Neil D.
Kernel methods are among the most popular techniques in machine learning. From a frequentist/discriminative perspective, they play a central role in regularization theory, as they provide a natural choice for the hypothesis space and the regularization functional through the notion of reproducing kernel Hilbert spaces. From a Bayesian/generative perspective, they are key in the context of Gaussian processes, where the kernel function is also known as the covariance function. Traditionally, kernel methods have been used in supervised learning problems with scalar outputs, and indeed there has been a considerable amount of work devoted to designing and learning kernels. More recently, there has been increasing interest in methods that deal with multiple outputs, motivated partly by frameworks like multitask learning. In this paper, we review different methods to design or learn valid kernel functions for multiple outputs, paying particular attention to the connection between probabilistic and functional methods.
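As a rough illustration of what a valid kernel for multiple outputs can look like, the sketch below builds a separable kernel, K((x, d), (x', d')) = B[d, d'] k(x, x'), from a scalar kernel k on the inputs and a positive semidefinite matrix B over the outputs. The choice of a squared-exponential k, the toy data, and the names rbf_kernel, separable_multi_output_gram, A and B are assumptions made for this example, not details taken from the abstract.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    """Scalar squared-exponential kernel k(x, x') evaluated on all input pairs."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)

def separable_multi_output_gram(X, B, lengthscale=1.0):
    """Gram matrix of K((x, d), (x', d')) = B[d, d'] * k(x, x').

    Rows and columns are ordered output-major: all inputs for output 0,
    then all inputs for output 1, and so on.
    """
    return np.kron(B, rbf_kernel(X, X, lengthscale))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))   # 5 toy inputs in 2 dimensions
A = rng.normal(size=(3, 2))   # hypothetical mixing matrix for 3 outputs
B = A @ A.T                   # positive semidefinite by construction
K = separable_multi_output_gram(X, B)

# A valid kernel must give a symmetric, positive semidefinite Gram matrix.
print(K.shape)
print(np.linalg.eigvalsh(K).min() >= -1e-8)
```

Because the Kronecker product of two positive semidefinite matrices is itself positive semidefinite, any valid scalar kernel paired with any positive semidefinite B yields a valid multi-output kernel under this construction.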
Transforming Graph Representations for Statistical Relational Learning
Rossi, Ryan A., McDowell, Luke K., Aha, David W., Neville, Jennifer
Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation--for the nodes, links, and features--can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.
Proximity-Based Non-uniform Abstractions for Approximate Planning
Baum, J., Nicholson, A. E., Dix, T. I.
In a deterministic world, a planning agent can be certain of the consequences of its planned sequence of actions. Not so, however, in dynamic, stochastic domains where Markov decision processes are commonly used. Unfortunately these suffer from the 'curse of dimensionality': if the state space is a Cartesian product of many small sets ('dimensions'), planning is exponential in the number of those dimensions. Our new technique exploits the intuitive strategy of selectively ignoring various dimensions in different parts of the state space. The resulting non-uniformity has strong implications, since the approximation is no longer Markovian, requiring the use of a modified planner. We also use a spatial and temporal proximity measure, which responds to continued planning as well as movement of the agent through the state space, to dynamically adapt the abstraction as planning progresses. We present qualitative and quantitative results across a range of experimental domains showing that an agent exploiting this novel approximation method successfully finds solutions to the planning problem using much less than the full state space. We assess and analyse the features of domains which our method can exploit.
Pragmatic Analysis of Crowd-Based Knowledge Production Systems with iCAT Analytics: Visualizing Changes to the ICD-11 Ontology
Pöschko, Jan (Graz University of Technology) | Strohmaier, Markus (Graz University of Technology) | Tudorache, Tania (Stanford University) | Noy, Natalya F. (Stanford University) | Musen, Mark A. (Stanford University)
While in the past taxonomic and ontological knowledge was traditionally produced by small groups of co-located experts, today the production of such knowledge has a radically different shape and form. For example, potentially thousands of health professionals, scientists, and ontology experts will collaboratively construct, evaluate and maintain the most recent version of the International Classification of Diseases (ICD-11), a large ontology of diseases and causes of death managed by the World Health Organization. In this work, we present a novel web-based tool, iCAT Analytics, that allows systematic investigation of crowd-based processes in knowledge-production systems. To enable such investigation, the tool supports interactive exploration of pragmatic aspects of ontology engineering, such as how a given ontology evolved and the nature of the changes, discussions and interactions that took place during its production process. While iCAT Analytics was motivated by ICD-11, it could potentially be applied to any crowd-based ontology-engineering project. We give an introduction to the features of iCAT Analytics and present some insights specifically for ICD-11.
Autonomous Agents Research in Robotics: A Report from the Trenches
Kaminka, Gal A. (Bar Ilan University)
This paper surveys research in robotics in the AAMAS (Autonomous Agents and Multi-Agent Systems) community. It argues that the autonomous agents community can have, and has had, an impact on robotics. Moreover, it argues that agents researchers should proactively seek to impact the robotics community, to prevent independent re-discovery of known results, and to benefit autonomous agents science. To support these claims, I provide evidence from my own research into multi-robot teams, and from others’.
The Role of AI in Wisdom of the Crowds for the Social Construction of Knowledge on Sustainability
Maher, Mary Lou (University of Maryland)
One of the original applications of crowdsourcing the construction of knowledge is Wikipedia, which relies entirely on people to contribute, extend, and modify the representation of knowledge. This paper presents a case for combining AI and wisdom of the crowds for the social construction of knowledge. Our social-computational approach to collective intelligence combines the strengths of human cognitive diversity in producing content and the capabilities of an AI, through methods such as topic modeling, to link and synthesize across these human contributions. In addition to drawing from established domains such as Wikipedia for inspiration and guidance, we present the design of a system that incorporates AI into wisdom of the crowds to develop a knowledge base on sustainability. In this setting the AI plays the role of scholar, as might many of the other participants, drawing connections and synthesizing across contributions. We close with a general discussion, speculating on educational implications and other roles that an AI can play within an otherwise collective human intelligence.
Web Resources Recommendation based on Dynamic Prediction of User Consumption on the Social Web
Rojas-Potosi, Luis Antonio (Universidad del Cauca) | Suarez-Meza, Luis Javier (Universidad del Cauca) | Ordoñez-Ante, Leandro (Universidad del Cauca) | Corrales, Juan Carlos (Universidad del Cauca)
The Web is a giant repository of resources (services and content), where discovery and recommendation systems are used to deliver the best-ranked list of relevant web resources that meet user requirements. Nowadays, these systems are based on the simulation and automation of the user search criteria, considering the relation between consumption trends and the different kinds of users’ relationships with their virtual and physical environment, based on information from the Social Web and mobile device sensors, among others. These systems are executed once an explicit query from the user has been received; however, some resources are useful in specific situations in which they have a high probability of being consumed but, in the absence of a query, are not recommended to users. In this regard, the question is: how can a successful Web resource recommendation be made without a user query? To answer this question, this research proposal presents a novel approach to recommending Web resources based on dynamic prediction of user consumption on the Social Web, which emulates user behavior, resource dynamism, and context opportunities in real time, catching the best situations in which to make an asynchronous (unexpected by the user) recommendation of useful resources and so boost Web resource consumption.
Personalisation of Social Web Services in the Enterprise Using Spreading Activation for Multi-Source, Cross-Domain Recommendations
Heitmann, Benjamin (National University of Ireland, Galway) | Dabrowski, Maciej (National University of Ireland, Galway) | Passant, Alexandre (National University of Ireland, Galway) | Hayes, Conor (National University of Ireland, Galway) | Griffin, Keith (Cisco Systems)
Existing personalisation approaches, such as collaborative filtering or content-based recommendations, are highly dependent on the domain and/or the source of the data. Therefore, there is a need for more accurate means to capture and model the interests of the user across domains, and to interlink them in a semantically-enhanced interest graph. We propose a new approach for multi-source, cross-genre recommendations that can exploit the heterogeneous nature of user profile data, which has been aggregated from multiple personalised web services, such as blogs, wikis and microblogs. Our approach is based on the Spreading Activation model, which exploits intrinsic links between entities across a number of data sources. The proposed method is highly customizable and applicable both to generic and specific recommendation scenarios and use cases. With the growing number of Social Web applications in the enterprise (blogs, wikis, microblogging, etc.), it becomes difficult for knowledge workers to avoid content overload and to quickly identify relevant people, communities and information. We demonstrate the application of our approach in an industrial use case that involves recommendation of social semantic data across multiple services in a distributed collaborative environment.
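The general idea of spreading activation over an interest graph can be sketched as follows. This is a generic textbook-style variant, not the authors' configuration: the decay factor, the threshold, the step limit, the edge weights and the entity names in the toy graph are all hypothetical.

```python
from collections import defaultdict

def spreading_activation(graph, seeds, decay=0.5, threshold=0.01, max_steps=3):
    """Propagate activation from seed nodes along weighted edges.

    graph: dict mapping node -> list of (neighbour, weight) pairs
    seeds: dict mapping node -> initial activation (e.g. known user interests)
    Returns a dict mapping every activated node to its accumulated activation.
    """
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            for neighbour, weight in graph.get(node, []):
                passed = value * weight * decay
                # Drop contributions too weak to matter.
                if passed >= threshold:
                    next_frontier[neighbour] += passed
        if not next_frontier:
            break
        for node, value in next_frontier.items():
            activation[node] += value
        frontier = next_frontier
    return dict(activation)

# Toy graph linking entities from several (hypothetical) sources.
graph = {
    "blog:semantic-web": [("topic:linked-data", 1.0)],
    "topic:linked-data": [("wiki:rdf-guide", 0.8), ("microblog:sparql-tips", 0.4)],
}
seeds = {"blog:semantic-web": 1.0}

scores = spreading_activation(graph, seeds)
# Rank everything that was reached but is not already a known interest.
recommendations = sorted(
    (node for node in scores if node not in seeds),
    key=lambda node: scores[node],
    reverse=True,
)
print(recommendations)
```

Since activation flows along any link regardless of which service an entity came from, an interest expressed in one source can surface items from another, which is the cross-source behaviour the abstract describes.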
Visualizing Information Diffusion and Polarization with Key Statements
Salway, Andrew (Uni Research, Bergen) | Diakopoulos, Nicholas (University of Bergen) | Elgesem, Dag (University of Bergen)
This paper reports ongoing work in the “Networks of Texts and People” project, which is developing methods to visualize the social and epistemological contexts of information contained in blogs. Here, we propose an approach to visualize information diffusion and polarization in the blogosphere, with two novel characteristics. Firstly, we demonstrate how text content can be analyzed and visualized as key statements, rather than as keywords. Secondly, we sketch and discuss ideas for a visual analytic tool that integrates data about blog networks with data about the occurrence of related key statements in blog posts.
Coping with the Document Frequency Bias in Sentiment Classification
Rafrafi, Abdelhalim (University Pierre et Marie Curie) | Guigue, Vincent (University Pierre et Marie Curie) | Gallinari, Patrick (University Pierre et Marie Curie)
In this article, we study the polarity detection problem using linear supervised classifiers. We show the benefit of penalizing document frequencies in the regularization process to increase accuracy. We propose a systematic comparison of different loss and regularization functions on this particular task using the Amazon dataset. Then, we evaluate our models according to three criteria: accuracy, sparsity and subjectivity. Subjectivity is measured by projecting our dictionary and optimized weight vector onto the SentiWordNet lexicon. This original approach highlights a bias in the selection of relevant terms during the regularization procedure: frequent terms are overweighted compared to their intrinsic subjectivities. We show that this bias appears whatever the chosen loss or regularization and on all datasets: it is closely linked to the gradient descent technique. Penalizing the document frequency during the learning step enables us to significantly improve performance. Many sentiment markers appear rarely and are thus underappreciated by statistical learning algorithms. Explicitly boosting their influence increases accuracy in the sentiment classification task.
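One way to picture a document-frequency penalty inside the regularizer is sketched below. The objective used here, mean logistic loss plus lam * sum_j df_j * w_j^2, is an assumed form chosen for illustration and is not taken from the article; the function name train_df_penalized_logreg, the hyperparameters and the toy term-document matrix are likewise hypothetical.

```python
import numpy as np

def train_df_penalized_logreg(X, y, lam=0.1, lr=0.5, n_iters=500):
    """Logistic regression with an L2 penalty scaled per term by document frequency.

    Assumed objective (not the article's exact formulation):
        mean log-loss + lam * sum_j df_j * w_j**2
    where df_j is the fraction of documents that contain term j, so frequent
    terms are shrunk more strongly than rare ones.
    """
    n_docs, n_terms = X.shape
    df = (X > 0).mean(axis=0)
    w = np.zeros(n_terms)
    for _ in range(n_iters):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))               # predicted P(positive)
        grad = X.T @ (p - y) / n_docs + 2.0 * lam * df * w
        w -= lr * grad                                   # plain gradient descent
    return w, df

# Toy corpus: term 0 occurs in every document, term 1 only in positive
# documents, term 2 only in negative documents.
X = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
])
y = np.array([1.0, 0.0, 1.0, 0.0])

w, df = train_df_penalized_logreg(X, y)
print(df)
print(w)
```

With a uniform penalty every term would be shrunk equally; scaling the penalty by df_j is one simple way to stop frequent terms from dominating the rarer sentiment markers.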