This paper addresses the problem of unsupervised feature learning for text data. Our method is grounded in the principle of minimum description length and uses a dictionary-based compression scheme to extract a succinct feature set. Specifically, our method finds a set of word $k$-grams that minimizes the cost of reconstructing the text losslessly. We formulate document compression as a binary optimization task and show how to solve it approximately via a sequence of reweighted linear programs that are efficient to solve and parallelizable. As our method is unsupervised, features may be extracted once and subsequently used in a variety of tasks.
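As a rough illustration of the dictionary-compression idea, the sketch below selects word $k$-grams whose reuse saves description length under a toy cost model. This is a greedy stand-in, not the paper's binary optimization or its reweighted-LP relaxation; the cost model (`pointer_cost`, per-word units) is an assumption for illustration only.

```python
from collections import Counter

def extract_kgram_features(docs, k=2, pointer_cost=1.0):
    """Greedily pick word k-grams whose repeated use saves description
    length (illustrative stand-in for the paper's reweighted-LP scheme)."""
    counts = Counter()
    for doc in docs:
        words = doc.split()
        for i in range(len(words) - k + 1):
            counts[tuple(words[i:i + k])] += 1
    dictionary = []
    for gram, freq in counts.items():
        # Storing the gram once costs k units; each of its freq uses is
        # replaced by a single pointer, saving (k - pointer_cost) per use.
        saving = freq * (k - pointer_cost) - k
        if saving > 0:
            dictionary.append(gram)
    # Unsupervised feature vector per document: counts of selected k-grams.
    features = []
    for doc in docs:
        words = doc.split()
        grams = Counter(tuple(words[i:i + k])
                        for i in range(len(words) - k + 1))
        features.append([grams[g] for g in dictionary])
    return dictionary, features
```

Once extracted, the same feature vectors can be fed to any downstream task, mirroring the extract-once, use-many-times property of the method.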
For the purpose of learning on graphs, we hunt for a graph feature representation that exhibits certain uniqueness, stability, and sparsity properties while also being amenable to fast computation. This leads to the discovery of a family of graph spectral distances (denoted FGSD) and the graph feature representations based on them, which we prove to possess most of these desired properties. To both evaluate the quality of the graph features produced by FGSD and demonstrate their utility, we apply them to the graph classification problem. Through extensive experiments, we show that a simple SVM-based classification algorithm, driven by our powerful FGSD-based graph features, significantly outperforms all the more sophisticated state-of-the-art algorithms on the unlabeled node datasets in terms of both accuracy and speed; it also yields very competitive results on the labeled datasets, despite the fact that it does not utilize any node label information. Papers published at the Neural Information Processing Systems Conference.
Learning with streaming data has attracted much attention during the past few years. Though most studies consider data streams with fixed features, in real practice the features may be evolvable. For example, features of data gathered by limited-lifespan sensors will change when these sensors are substituted by new ones. In this paper, we propose a novel learning paradigm: Feature Evolvable Streaming Learning, where old features vanish and new features occur. Rather than relying only on the current features, we attempt to recover the vanished features and exploit them to improve performance. Specifically, we learn two models from the recovered features and the current features, respectively.
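One simple way to realize the recovery step is to fit a linear map from the new feature space to the old one during the period when both sets of sensors overlap, then apply that map after the old sensors vanish. The sketch below assumes such a linear relationship and a convex-combination ensemble; the paper's actual recovery and weighting schemes may differ.

```python
import numpy as np

def recover_vanished(X_old_overlap, X_new_overlap, X_new):
    """Fit a least-squares map new -> old on the overlap period, then use
    it to reconstruct the vanished features (minimal illustrative sketch)."""
    M, *_ = np.linalg.lstsq(X_new_overlap, X_old_overlap, rcond=None)
    return X_new @ M

def ensemble_predict(w_old, w_new, X_old_recovered, X_new, alpha=0.5):
    """Combine the model learned on recovered features with the model on
    current features via a convex combination (alpha is a free weight)."""
    return alpha * (X_old_recovered @ w_old) + (1 - alpha) * (X_new @ w_new)
```

When the overlap period is too short for a reliable fit, the combination weight can be shrunk toward the current-feature model.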
Feature Engineering is one of those terms that, on the surface, seems to mean exactly what it says: you want to refactor or create something from the data that you have. Okay, fine…but what does that actually mean in real life when you're sitting in front of your data set and wondering what to do? The term encompasses a variety of methods that each have a variety of sub-methods associated with them. I'm just going to cover some of the main ones to give you an idea of the sort of thing Feature Engineering contains, with some indication of widely used methods. Encoding -- I think this is one of the simplest and most commonly used aspects of Feature Engineering.
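To make the encoding idea concrete, here's a small one-hot encoder in plain Python (libraries like pandas and scikit-learn offer ready-made versions; this is just to show what's happening under the hood):

```python
def one_hot_encode(values):
    """One-hot encode a list of categorical values: each distinct category
    becomes its own 0/1 column -- a classic Feature Engineering step."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows

# e.g. one_hot_encode(["red", "blue", "red"])
# -> (["blue", "red"], [[0, 1], [1, 0], [0, 1]])
```

The point is that a single text column becomes several numeric columns a model can actually consume.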
I was browsing twitter yesterday (follow me!) when I came across this tweet by Data Science Renee linking to this Medium article called "Top 6 Errors Novice Machine Learning Engineers Make" by Christopher Dossman. This drew my attention because I'm somewhat new to the field (and even if I weren't, it's always worth reviewing the fundamentals).