Supervised Learning
Predictive modelling, how to build ground-truth and extract features for action prediction? • /r/MachineLearning
I have a dataset of users; each user has daily information about their activities (numerical values representing measurements of their physical activity). In addition, each user has, for each day, a boolean value indicating whether they took a particular action. The dataset is not fixed: new activity measurements and action values are added for each user every day. The goal is to build a model that predicts which users are likely to take the action in the near future (e.g. in any of the next 7 days). My approach is to build feature vectors representing the activity values for each user over a period of time, and to use the action column as a source of ground truth.
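The windowing described above can be sketched as follows. This is a minimal illustration, not the poster's actual pipeline: the array names, window length, and prediction horizon are all assumptions, and the data here is random.

```python
import numpy as np

# Hypothetical data layout: activity[u, d] holds the daily activity
# measurements for user u on day d, and action[u, d] is True if the
# user took the action that day. All sizes are illustrative.
rng = np.random.default_rng(0)
n_users, n_days, n_measurements = 50, 60, 4
activity = rng.random((n_users, n_days, n_measurements))
action = rng.random((n_users, n_days)) < 0.1

WINDOW = 14   # days of activity history used as features
HORIZON = 7   # label: did the action occur in the next 7 days?

X, y = [], []
for u in range(n_users):
    for d in range(WINDOW, n_days - HORIZON):
        # Flatten the trailing activity window into one feature vector;
        # summary statistics (means, trends) could be used instead.
        X.append(activity[u, d - WINDOW:d].ravel())
        # Ground truth: any action in the following HORIZON days.
        y.append(action[u, d:d + HORIZON].any())

X = np.array(X)
y = np.array(y, dtype=int)
```

The `(X, y)` pairs can then be fed to any standard classifier; because rows from the same user overlap in time, splitting train/test by user or by date (rather than randomly) avoids leakage.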
Resource Constrained Structured Prediction
Bolukbasi, Tolga, Chang, Kai-Wei, Wang, Joseph, Saligrama, Venkatesh
We study the problem of structured prediction under test-time budget constraints. We propose a novel approach applicable to a wide range of structured prediction problems in computer vision and natural language processing. Our approach seeks to adaptively generate computationally costly features during test-time in order to reduce the computational cost of prediction while maintaining prediction performance. We show that training the adaptive feature generation system can be reduced to a series of structured learning problems, resulting in efficient training using existing structured learning algorithms. This framework provides theoretical justification for several existing heuristic approaches found in literature. We evaluate our proposed adaptive system on two structured prediction tasks, optical character recognition (OCR) and dependency parsing and show strong performance in reduction of the feature costs without degrading accuracy.
Linearly Independent Sets in Vector Spaces induced by Kernels • /r/MachineLearning
I hope this post is okay (if not, let me know). I'm attaching a pdf which rigorously defines my question. Briefly, what I'm wondering is this: for a set of data points {x1, ..., xp} in a vector space (say, Rn), under what conditions is the set {k(x1, ·), ..., k(xp, ·)} (where k(·, ·) is a kernel function) linearly independent? What conditions must the set {x1, ..., xp} and the kernel function satisfy to ensure independence? If there isn't an immediate answer to this question, I'll happily take recommendations for mathematical reading towards answering it.
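One standard fact relevant here: independence of {k(x1, ·), ..., k(xp, ·)} is equivalent to the Gram matrix K with K[i, j] = k(xi, xj) being nonsingular, and for a strictly positive-definite kernel (e.g. the Gaussian RBF) this holds whenever the points are pairwise distinct. A quick numerical illustration of both sides, with made-up points:

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    # Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

# Four distinct points in R^2. The Gaussian RBF kernel is strictly
# positive definite, so its Gram matrix has full rank, which is
# equivalent to {k(x_i, .)} being linearly independent.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
K = rbf_gram(X)
print(np.linalg.matrix_rank(K))      # 4: full rank

# By contrast, the linear kernel k(x, z) = <x, z> is only positive
# semi-definite: with p > n points in R^n its Gram matrix is
# rank-deficient, so the corresponding set is linearly dependent.
K_lin = X @ X.T
print(np.linalg.matrix_rank(K_lin))  # 2: at most dim(R^2)
```

For reading, the theory behind this lives under "reproducing kernel Hilbert spaces" and strictly positive-definite kernels.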
Fearless Frenchman breaks hoverboard record, sets sights on the clouds
A fearless Frenchman, Franky Zapata, thinks one day people will be able to ride his hoverboard to pick up bread in the morning (it's a French thing). The jet ski champion on Saturday set a new Guinness World Record for the farthest hoverboard flight – yes, just like in the movies – off the coast of Sausset-les-Pins in the south of France. Mr. Zapata rode the 1,000 horsepower drone, standing on top of it, for 7,388 feet, or more than a mile. He hovered 165 feet above the surface of the water, "trailed by a fleet of boats and jet skis," as Guinness reports. His feat shattered the previous hoverboard travel record of 905 feet and 2 inches, set last year by Canadian inventor Catalin Alexandru Duru.
How To Extract Feature Vectors From Deep Neural Networks In Python Caffe
Convolutional Neural Networks are great at identifying all the information that makes an image distinct. When we train a deep neural network in Caffe to classify images, we specify a multilayered neural network with different types of layers like convolution, rectified linear unit, softmax loss, and so on. The last layer is the output layer that gives us the output tag with the corresponding confidence value. But sometimes it's useful to extract the feature vectors from intermediate layers and use them for other purposes. Let's see how to do it in Python Caffe, shall we?
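In Caffe's Python interface, intermediate activations are exposed through `net.blobs` after a forward pass. Since running real Caffe requires a trained model and a deploy prototxt, the sketch below uses a tiny NumPy network to show the same idea: run a forward pass, record the penultimate layer's activations, and reuse them as a feature vector. The layer names (`fc7`, `fc8`) and sizes are placeholders borrowed from common Caffe model definitions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network standing in for a trained Caffe model.
# The real Caffe equivalent would be roughly:
#   net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)
#   net.forward(data=batch)
#   features = net.blobs['fc7'].data.copy()   # penultimate layer
W1 = rng.standard_normal((256, 64))   # hidden ("fc7") weights
W2 = rng.standard_normal((64, 10))    # output ("fc8") weights

def forward(x, record):
    h = np.maximum(x @ W1, 0.0)       # ReLU hidden layer ("fc7")
    record['fc7'] = h                 # analogous to net.blobs['fc7'].data
    record['fc8'] = h @ W2            # classifier logits ("fc8")
    return record['fc8']

blobs = {}
x = rng.standard_normal((1, 256))     # one flattened input image
forward(x, blobs)
features = blobs['fc7']               # reusable feature vector
print(features.shape)                 # (1, 64)
```

With real Caffe, copying the blob (`.data.copy()`) matters because the buffer is overwritten by the next forward pass.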
Aggregating Inter-Sentence Information to Enhance Relation Extraction
Zheng, Hao (Beihang University) | Li, Zhoujun (Beihang University) | Wang, Senzhang (Beihang University) | Yan, Zhao (Beihang University) | Zhou, Jianshe (Capital Normal University)
Previous work for relation extraction from free text is mainly based on intra-sentence information. As relations might be mentioned across sentences, inter-sentence information can be leveraged to improve distantly supervised relation extraction. To effectively exploit inter-sentence information, we propose a ranking based approach, which first learns a scoring function based on a listwise learning-to-rank model and then uses it for multi-label relation extraction. Experimental results verify the effectiveness of our method for aggregating information across sentences. Additionally, to further improve the ranking of high-quality extractions, we propose an effective method to rank relations from different entity pairs. This method can be easily integrated into our overall relation extraction framework, and boosts the precision significantly.
A Generative Model of Words and Relationships from Multiple Sources
Hyland, Stephanie L. (Weill Cornell Graduate School of Medical Sciences/Memorial Sloan Kettering Cancer Center) | Karaletsos, Theofanis (Memorial Sloan Kettering Cancer Center) | Rätsch, Gunnar (Memorial Sloan Kettering Cancer Center)
Neural language models are a powerful tool to embed words into semantic vector spaces. However, learning such models generally relies on the availability of abundant and diverse training examples. In highly specialised domains this requirement may not be met due to difficulties in obtaining a large corpus, or the limited range of expression in average use. Such domains may encode prior knowledge about entities in a knowledge base or ontology. We propose a generative model which integrates evidence from diverse data sources, enabling the sharing of semantic information. We achieve this by generalising the concept of co-occurrence from distributional semantics to include other relationships between entities or words, which we model as affine transformations on the embedding space. We demonstrate the effectiveness of this approach by outperforming recent models on a link prediction task and demonstrating its ability to profit from partially or fully unobserved training labels. We further demonstrate the usefulness of learning from different data sources with overlapping vocabularies.
Direct Discriminative Bag Mapping for Multi-Instance Learning
Wu, Jia (University of Technology Sydney) | Pan, Shirui (University of Technology Sydney) | Zhang, Peng (University of Technology Sydney) | Zhu, Xingquan (Florida Atlantic University)
Multi-instance learning (MIL) is useful for tackling labeling ambiguity in learning tasks, by allowing a bag of instances to share one label. Recently, bag mapping methods, which transform a bag to a single instance in a new space via instance selection, have drawn significant attention. To date, most existing works are developed based on the original space, i.e., utilizing all instances for bag mapping, and instance selection is indirectly tied to the MIL objective. As a result, it is hard to guarantee the discriminative capacity of the selected instances in the new bag mapping space for MIL. In this paper, we propose a direct discriminative mapping approach for multi-instance learning (MILDM), which identifies instances to directly distinguish bags in the new mapping space. Experiments and comparisons on real-world learning tasks demonstrate the algorithm's performance.
Creating Images by Learning Image Semantics Using Vector Space Models
Heath, Derrall (Brigham Young University) | Ventura, Dan (Brigham Young University)
When dealing with images and semantics, most computational systems attempt to automatically extract meaning from images. Here we attempt to go the other direction and autonomously create images that communicate concepts. We present an enhanced semantic model that is used to generate novel images that convey meaning. We employ a vector space model and a large corpus to learn vector representations of words and then train the semantic model to predict word vectors that could describe a given image. Once trained, the model autonomously guides the process of rendering images that convey particular concepts. A significant contribution is that, because of the semantic associations encoded in these word vectors, we can also render images that convey concepts on which the model was not explicitly trained. We evaluate the semantic model with an image clustering technique and demonstrate that the model is successful in creating images that communicate semantic relationships.
Who are alike? Use BigObject feature vector to find similarities
Cluster analysis is a common technique for grouping a set of objects so that objects in the same group share certain attributes. It's commonly used in marketing and sales planning to define market segmentations. Here at BigObject we adopt a simple approach to exploring the similarities between objects. We simply calculate a "feature vector" from the given attributes and use the resulting score to determine which objects are "alike." This is a simple example to show how to use BigObject to extract product features and then find similar products in your retail data.
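The post doesn't show BigObject's own query syntax, but the underlying idea (score pairs of feature vectors and rank by similarity) can be sketched generically. Everything here is illustrative: the product names, the attribute values, and the choice of cosine similarity as the score.

```python
import numpy as np

# Hypothetical per-product feature vectors, e.g. aggregated retail
# attributes (units sold, average price, return rate). Made up.
products = {
    'shirt':  np.array([120.0, 3.5, 0.8]),
    'jacket': np.array([115.0, 3.6, 0.7]),
    'snack':  np.array([5.0, 0.2, 900.0]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 means identical direction in feature space.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = 'shirt'
scores = {name: cosine(products[query], vec)
          for name, vec in products.items() if name != query}
best = max(scores, key=scores.get)
print(best)  # 'jacket': closest to 'shirt' in feature space
```

With raw attributes on very different scales, normalising each attribute before scoring usually gives more meaningful rankings.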