Education
Efficient Protocols for Distributed Classification and Optimization
Daume, Hal III, Phillips, Jeff M., Saha, Avishek, Venkatasubramanian, Suresh
In distributed learning, the goal is to perform a learning task over data distributed across multiple nodes with minimal (expensive) communication. Prior work (Daume III et al., 2012) proposes a general model that bounds the communication required for learning classifiers while allowing for $\eps$ training error on linearly separable data adversarially distributed across nodes. In this work, we develop key improvements and extensions to this basic model. Our first result is a two-party multiplicative-weight-update based protocol that uses $O(d^2 \log{1/\eps})$ words of communication to classify distributed data in arbitrary dimension $d$, $\eps$-optimally. This readily extends to classification over $k$ nodes with $O(kd^2 \log{1/\eps})$ words of communication. Our proposed protocol is simple to implement and is considerably more efficient than baselines compared, as demonstrated by our empirical results. In addition, we illustrate general algorithm design paradigms for doing efficient learning over distributed data. We show how to solve fixed-dimensional and high dimensional linear programming efficiently in a distributed setting where constraints may be distributed across nodes. Since many learning problems can be viewed as convex optimization problems where constraints are generated by individual points, this models many typical distributed learning scenarios. Our techniques make use of a novel connection from multipass streaming, as well as adapting the multiplicative-weight-update framework more generally to a distributed setting. As a consequence, our methods extend to the wide range of problems solvable using these techniques.
Kernels for Vector-Valued Functions: a Review
Alvarez, Mauricio A., Rosasco, Lorenzo, Lawrence, Neil D.
Kernel methods are among the most popular techniques in machine learning. From a frequentist/discriminative perspective they play a central role in regularization theory as they provide a natural choice for the hypotheses space and the regularization functional through the notion of reproducing kernel Hilbert spaces. From a Bayesian/generative perspective they are the key in the context of Gaussian processes, where the kernel function is also known as the covariance function. Traditionally, kernel methods have been used in supervised learning problem with scalar outputs and indeed there has been a considerable amount of work devoted to designing and learning kernels. More recently there has been an increasing interest in methods that deal with multiple outputs, motivated partly by frameworks like multitask learning. In this paper, we review different methods to design or learn valid kernel functions for multiple outputs, paying particular attention to the connection between probabilistic and functional methods.
Applications of fuzzy logic to Case-Based Reasoning
Subbotin, Igor Ya., Voskoglou, Michael Gr.
Broadly construed Case-Based Reasoning (CBR) is the process of solving new problems based on the solution of past problems. The CBR systems' expertise is embodied in a collection (library) of past cases rather, than being encoded in classical rules. Each case typically contains a description of the problem plus a solution and/or the outcomes. When a problem is successfully solved, the experience is retained in order to solve similar problems in future. When an attempt to solve a problem fails, the reason for the failure is identified and remembered in order to avoid the same mistake in future. Thus CBR is a cyclic and integrated process of solving a problem, learning from this experience, solving a new problem, etc.
Relax and Localize: From Value to Algorithms
Rakhlin, Alexander, Shamir, Ohad, Sridharan, Karthik
We show a principled way of deriving online learning algorithms from a minimax analysis. Various upper bounds on the minimax value, previously thought to be non-constructive, are shown to yield algorithms. This allows us to seamlessly recover known methods and to derive new ones. Our framework also captures such "unorthodox" methods as Follow the Perturbed Leader and the R^2 forecaster. We emphasize that understanding the inherent complexity of the learning problem leads to the development of algorithms. We define local sequential Rademacher complexities and associated algorithms that allow us to obtain faster rates in online learning, similarly to statistical learning theory. Based on these localized complexities we build a general adaptive method that can take advantage of the suboptimality of the observed sequence. We present a number of new algorithms, including a family of randomized methods that use the idea of a "random playout". Several new versions of the Follow-the-Perturbed-Leader algorithms are presented, as well as methods based on the Littlestone's dimension, efficient methods for matrix completion with trace norm, and algorithms for the problems of transductive learning and prediction with static experts.
Learning from Humans as an I-POMDP
Woodward, Mark P., Wood, Robert J.
The interactive partially observable Markov decision process (I-POMDP) is a recently developed framework which extends the POMDP to the multi-agent setting by including agent models in the state space. This paper argues for formulating the problem of an agent learning interactively from a human teacher as an I-POMDP, where the agent \emph{programming} to be learned is captured by random variables in the agent's state space, all \emph{signals} from the human teacher are treated as observed random variables, and the human teacher, modeled as a distinct agent, is explicitly represented in the agent's state space. The main benefits of this approach are: i. a principled action selection mechanism, ii. a principled belief update mechanism, iii. support for the most common teacher \emph{signals}, and iv. the anticipated production of complex beneficial interactions. The proposed formulation, its benefits, and several open questions are presented.
Transforming Graph Representations for Statistical Relational Learning
Rossi, Ryan A., McDowell, Luke K., Aha, David W., Neville, Jennifer
Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation--for the nodes, links, and features--can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.
Sifu: Interactive Crowd-Assisted Language Learning
Chan, Cheng-wei (National Taiwan University) | Hsu, Jane Yung-jen ( National Taiwan University )
This paper introduces SIFU, a system that recruits in real time native speakers as online volunteer tutors to help answer questions from Chinese language learners in reading news articles. SIFU integrates the strengths of two effective online language learning methods: reading online news and communicating with online native speakers. SIFU recruits volunteers from an online social network rather than recruits workers from Amazon Mechanical Turk.Initial experiments showed that the proposed approach is able to effectively recruit online volunteer tutors, adequately answer the learners' questions, and efficiently obtain an answer for the learner. Our field deployment illustrates that SIFU is very useful in assisting Chinese learners in reading Chinese news articles and online volunteer tutors are willing to help Chinese learners when they are on social network service.
Design Probes into Nutrigenomics: From Data to User Experiences
Kera, Denisa (National University of Singapore)
Do quantified and origin) and molecular aspects of our bodies like DNA can tweeting, heavily monitored and selfreporting animals, converge. Consumer genomics websites, crowdsourcing of humans, environments and food create some new biodata but also social networking over genes, together uniformity, a dangerously homogenous, objectified and with services monitoring food flows and food authenticity standardized collective or these data offer some new can create new models of research in nutrigenomics and opportunity for interaction? Are we creating new symbiotic projects related to dieting, health and relations over these data that can lead to a new sense of lifestyle choices. How to connect various scales from community or we are witnessing some depersonalization molecules to institutions and what will be the function of and objectification? How to make meaning out of large these interactions and interfaces? How to create quantities of data and how to bring user experience to data meaningful interaction across scales and large datasets?
Integration of Online Learning into HTN Planning for Robotic Tasks
Magnenat, Stéphane (ETH Zurich) | Chappelier, Jean-Cédric (EPFL) | Mondada, Francesco (EPFL)
This paper extends hierarchical task network (HTN) planning with lightweight learning, considering that in robotics, actions have a non-zero probability of failing. Our work applies to A*-based HTN planners with lifting. We prove that the planner finds the plan of maximal expected utility, while retaining its lifting capability and efficient heuristic-based search. We show how to learn the probabilities online, which allows a robot to adapt by replanning on execution failures. The idea behind this work is to use the HTN domain to constrain the space of possibilities, and then to learn on the constrained space in a way requiring few training samples, rendering the method applicable to autonomous mobile robots.
The Role of AI in Wisdom of the Crowds for the Social Construction of Knowledge on Sustainability
Maher, Mary Lou (University of Maryland)
One of the original applications of crowdsourcing the construction of knowledge is Wikipedia, which relies entirely on people to contribute, extend, and modify the representation of knowledge. This paper presents a case for combining AI and wisdom of the crowds for the social construction of knowledge. Our social-computational approach to collective intelligence combines the strengths of human cognitive diversity in producing content and the capabilities of an AI, through methods such as topic modeling, to link and synthesize across these human contributions. In addition to drawing from established domains such as Wikipedia for inspiration and guidance, we present the design of a system that incorporates AI into wisdom of the crowds to develop a knowledge base on sustainability. In this setting the AI plays the role of scholar, as might many of the other participants, drawing connections and synthesizing across contributions. We close with a general discussion, speculating on educational implications and other roles that an AI can play within an otherwise collective human intelligence.