Accuracy
Imitation learning of motor primitives and language bootstrapping in robots
Cederborg, Thomas, Oudeyer, Pierre-Yves
Imitation learning in robots, also called programming by demonstration, has made important advances in recent years, allowing humans to teach context-dependent motor skills/tasks to robots. We propose to extend the usual contexts investigated to also include acoustic linguistic expressions that might denote a given motor skill, and thus we target joint learning of the motor skills and their potential acoustic linguistic name. In addition to this, a modification of a class of existing algorithms within the imitation learning framework is made so that they can handle the unlabeled demonstration of several tasks/motor primitives without having to inform the imitator of what task is being demonstrated or what the number of tasks is, which is a necessity for language learning, i.e., if one wants to teach naturally an open number of new motor skills together with their acoustic names. Finally, a mechanism for detecting whether or not linguistic input is relevant to the task is also proposed, and our architecture also allows the robot to find the right framing for a given identified motor primitive. With these additions it becomes possible to build an imitator that bridges the gap between imitation learning and language learning by being able to learn linguistic expressions using methods from the imitation learning community. In this sense the imitator can learn a word by guessing whether a certain speech pattern present in the context means that a specific task is to be executed. The imitator is, however, not assumed to know that speech is relevant and has to figure this out on its own by looking at the demonstrations: indeed, the architecture allows the robot to transparently also learn tasks which should not be triggered by an acoustic word, but for example by the color or position of an object or a gesture made by someone in the environment. To demonstrate this ability to find the ...
Towards Analyzing Micro-Blogs for Detection and Classification of Real-Time Intentions
Banerjee, Nilanjan (IBM Research - India) | Chakraborty, Dipanjan (IBM Research - India) | Joshi, Anupam (IBM Research - India) | Mittal, Sumit (IBM Research - India, New Delhi) | Rai, Angshu (IBM Research - India) | Ravindran, Balaraman (Indian Institute of Technology, Madras)
Micro-blog forums, such as Twitter, constitute a powerful medium today that people use to express their thoughts and intentions on a daily, and in many cases, hourly, basis. Extracting ‘Real-Time Intention’ (RTI) of a user from such short text updates is a huge opportunity towards web personalization and social networking around dynamic user context. In this paper, we explore the novel problem of detecting and classifying RTIs from micro-blogs. We find that employing a heuristic-based ensemble approach on a reduced dimension of the feature space, based on a wide spectrum of linguistic and statistical features of RTI expressions, achieves significant improvement in detecting RTIs compared to word-level features used in many social media classification tasks today. Our solution approach takes into account various salient characteristics of micro-blogs towards such classification: high dimensionality, sparseness of data, limited context, grammatical incorrectness, etc.
Network Sampling Designs for Relational Classification
Ahmed, Nesreen K. (Purdue University) | Neville, Jennifer (Purdue University) | Kompella, Ramana (Purdue University)
Relational classification has been extensively studied recently due to its applications in social, biological, technological, and information networks. Much of the work in relational learning has focused on analyzing input data that comprise a single network. Although machine learning researchers have considered the issue of how to sample training and test sets from the input network (for evaluation), the mechanisms which are used to construct the input networks have largely been ignored. In most cases, the input network has itself been sampled from a larger target network (e.g., Facebook) and often the researcher is unaware of how the input network was constructed or what impact that may have on evaluation of the relational models. Since the goal in evaluating relational classification algorithms is to accurately assess their performance on the larger target network, it is critical to understand what impact the initial sampling method may have on our estimates of classification accuracy. In this paper, we present different sampling methods and systematically study their impact on evaluation of relational classification. Our results indicate that the choice of sampling method can impact classification performance, and thus affect the accuracy of evaluation.
Unsupervised Real-Time Company Name Disambiguation in Twitter
Muñoz, Agustín D. Delgado (UNED University) | Unanue, Raquel Martínez (UNED University) | García-Plaza, Alberto Pérez (UNED University) | Fresno, Víctor (UNED University)
This paper presents a new approach to disambiguating company names in the Twitter social network. We have focused on lightening the processing required to compare company profiles with tweets in order to obtain a competitive real-time system. With this aim, we use only the home page of each company as the information source to create a unique profile. We then compute the similarity of a tweet to a profile by comparing the content of the tweet with the profile. Neither step uses any other external information source, and the whole process is unsupervised. We have tested our application on the WePS-3 CLEF ORM test corpus, obtaining encouraging results.
A Supervised Approach to Predict Company Acquisition with Factual and Topic Features Using Profiles and News Articles on TechCrunch
Xiang, Guang (Carnegie Mellon University) | Zheng, Zeyu (Carnegie Mellon University) | Wen, Miaomiao (Carnegie Mellon University) | Hong, Jason (Carnegie Mellon University) | Rose, Carolyn (Carnegie Mellon University) | Liu, Chao (Microsoft Research)
Merger and Acquisition (M&A) prediction has been an interesting and challenging research topic in the past few decades. However, past work has adopted only numerical features in building models, and the valuable textual information from the great variety of social media sites has not been touched at all. To fully explore this information, we used the profiles and news articles for companies and people on TechCrunch, the leading and largest public database for the tech world, which anybody can edit. Specifically, we explored topic features via topic modeling techniques, as well as a set of other novel features of our design within a machine learning framework. We conducted experiments of the largest scale in the literature, and achieved a high true positive rate (TP) between 60% and 79.8% with a false positive rate (FP) mostly between 0% and 8.3% over company categories with a small number of missing attributes in the CrunchBase profiles.
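As a reminder of how the two rates quoted above are defined, here is a minimal Python sketch that derives them from confusion-matrix counts. The function name `rates` and the example counts are hypothetical illustrations and are not taken from the paper.

```python
def rates(tp, fp, tn, fn):
    """Return (true positive rate, false positive rate) from confusion counts."""
    # TPR: share of actual positives (e.g. acquired companies) predicted positive.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    # FPR: share of actual negatives wrongly predicted positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical counts for illustration only.
tpr, fpr = rates(tp=30, fp=5, tn=95, fn=20)
print(tpr, fpr)
```

Reporting both rates matters when positives are rare, since a classifier can reach a low false positive rate simply by predicting very few positives.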
Identifying Microblogs for Targeted Contextual Advertising
Dave, Kushal Shailesh (International Institute of Information Technology, Hyderabad) | Varma, Vasudeva (International Institute of Information Technology, Hyderabad)
Micro-blogging sites such as Facebook, Twitter, and Google+ present a good opportunity for targeting advertisements that are contextually related to the microblog content. The sparse and noisy text makes identifying the microblogs suitable for advertising a very hard problem. In this work, we approach the problem of identifying the microblogs that could be targeted for advertisements as a two-step classification problem. In the first pass, microblogs suitable for advertising are identified. Next, in the second pass, we build a model to find the sentiment of the advertisable microblog. The systems use features derived from part-of-speech tags and the tweet content, and use external resources such as query logs and n-gram dictionaries from previously labeled data. This work aims at providing a thorough insight into the problem and analyzing various features to assess which features contribute the most towards identifying the tweets that can be targeted for advertisements.
The YouTube Social Network
Wattenhofer, Mirjam (Google Zurich) | Wattenhofer, Roger (ETH Zurich) | Zhu, Zack (ETH Zurich)
Today, YouTube is the largest user-driven video content provider in the world; it has become a major platform for disseminating multimedia information. A major contribution to its success comes from the user-to-user social experience that differentiates it from traditional content broadcasters. This work examines the social network aspect of YouTube by measuring the full-scale YouTube subscription graph, comment graph, and video content corpus. We find YouTube to deviate significantly from network characteristics that mark traditional online social networks, such as homophily, reciprocative linking, and assortativity. However, compared to reported characteristics of another content-driven online social network, Twitter, YouTube is remarkably similar. Examining the social and content facets of user popularity, we find a stronger correlation between a user's social popularity and his/her most popular content as opposed to typical content popularity. Finally, we demonstrate an application of our measurements for classifying YouTube Partners, who are selected users that share YouTube's advertisement revenue. Results are motivating despite the highly imbalanced nature of the classification problem.
Active Diagnosis via AUC Maximization: An Efficient Approach for Multiple Fault Identification in Large Scale, Noisy Networks
Bellala, Gowtham, Stanley, Jason, Scott, Clayton, Bhavnani, Suresh K.
The problem of active diagnosis arises in several applications such as disease diagnosis, and fault diagnosis in computer networks, where the goal is to rapidly identify the binary states of a set of objects (e.g., faulty or working) by sequentially selecting, and observing, (noisy) responses to binary valued queries. Current algorithms in this area rely on loopy belief propagation for active query selection. These algorithms have an exponential time complexity, making them slow and even intractable in large networks. We propose a rank-based greedy algorithm that sequentially chooses queries such that the area under the ROC curve of the rank-based output is maximized. The AUC criterion allows us to make a simplifying assumption that significantly reduces the complexity of active query selection (from exponential to near quadratic), with little or no compromise on the performance quality.
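The criterion named above, the area under the ROC curve of a ranked output, can be sketched in a few lines of Python. This shows only how the AUC of a ranking is computed, as the fraction of (positive, negative) pairs ordered correctly; it is not the authors' query-selection algorithm, and the function name and example scores are assumptions made for illustration.

```python
def auc(scores, labels):
    """AUC of a ranking: fraction of (positive, negative) pairs in which the
    positive item scores higher; tied pairs count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative item")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical fault scores for four objects; label 1 means "faulty".
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(example_auc)
```

This pairwise form costs time proportional to the number of positive-negative pairs; a sort-based rank statistic gives the same value more cheaply on large inputs.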
Generalized Fisher Score for Feature Selection
Gu, Quanquan, Li, Zhenhui, Han, Jiawei
Fisher score is one of the most widely used supervised feature selection methods. However, it selects each feature independently according to its score under the Fisher criterion, which leads to a suboptimal subset of features. In this paper, we present a generalized Fisher score to jointly select features. It aims at finding a subset of features that maximizes the lower bound of the traditional Fisher score. The resulting feature selection problem is a mixed integer program, which can be reformulated as a quadratically constrained linear program (QCLP). It is solved by a cutting plane algorithm, in each iteration of which a multiple kernel learning problem is solved alternately by multivariate ridge regression and projected gradient descent. Experiments on benchmark data sets indicate that the proposed method outperforms Fisher score as well as many other state-of-the-art feature selection methods.
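For context, the traditional per-feature Fisher score that the abstract takes as its baseline can be sketched as follows: for each feature, the class-size-weighted squared distance of the class means from the overall mean is divided by the class-size-weighted within-class variance. This is a minimal illustration of the classic independent score only, not the generalized joint method proposed in the paper; the function name and toy data are assumptions.

```python
from collections import defaultdict


def fisher_scores(X, y):
    """Classic Fisher score of each feature, computed independently.

    X is a list of equal-length numeric rows; y is a list of class labels.
    """
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        mu = sum(column) / len(column)
        between = 0.0  # sum_k n_k * (mu_k - mu)^2
        within = 0.0   # sum_k n_k * var_k
        for rows in groups.values():
            vals = [r[j] for r in rows]
            n_k = len(vals)
            mu_k = sum(vals) / n_k
            var_k = sum((v - mu_k) ** 2 for v in vals) / n_k
            between += n_k * (mu_k - mu) ** 2
            within += n_k * var_k
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: constant feature scores 0,
            # perfectly separating feature scores infinity.
            scores.append(0.0 if between == 0 else float("inf"))
    return scores


# Toy data: feature 0 separates the classes well, feature 1 does not.
X = [[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 6.0]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
print(scores)
```

Because each feature is scored on its own, two highly correlated features can both rank near the top, which is the redundancy that joint selection is meant to avoid.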
Smoothing Multivariate Performance Measures
Zhang, Xinhua, Saha, Ankan, Vishwanathan, S. V. N.
A Support Vector Method for multivariate performance measures was recently introduced by Joachims (2005). The underlying optimization problem is currently solved using cutting plane methods such as SVM-Perf and BMRM. One can show that these algorithms converge to an $\epsilon$-accurate solution in $O(1/(\lambda\epsilon))$ iterations, where $\lambda$ is the trade-off parameter between the regularizer and the loss function. We present a smoothing strategy for multivariate performance scores, in particular precision/recall break-even point and ROCArea. When combined with Nesterov's accelerated gradient algorithm our smoothing strategy yields an optimization algorithm which converges to an $\epsilon$-accurate solution in $O(\min\{1/\epsilon, 1/\sqrt{\lambda\epsilon}\})$ iterations. Furthermore, the cost per iteration of our scheme is the same as that of SVM-Perf and BMRM. Empirical evaluation on a number of publicly available datasets shows that our method converges significantly faster than cutting plane methods without sacrificing generalization ability.