How to Get the Most From Your Machine Learning Data


The data that you use, and how you use it, will likely define the success of your predictive modeling problem. Data and the framing of your problem may be the point of biggest leverage on your project. Choosing the wrong data or the wrong framing for your problem may lead to a model with poor performance or, at worst, a model that cannot converge. It is not possible to analytically calculate what data to use or how to use it, but it is possible to use a trial-and-error process to discover how best to use the data that you have. In this post, you will discover how to get the most from your data on your machine learning project.

New Moto X phones reportedly feature modular accessories


Here's what the site reports: The phones' backplates have a line of 16 dots near the bottom, and these are not speaker ports -- they're connection pins. Motorola has designed six "Amps" (modules) that add new features to the phone, including stereo speakers, a battery pack, a camera grip with flash and optical zoom, a pico projector and a rugged cover with wide-angle lens attachment. The cameras on these two new phones jut out a fair bit, but they should lie flush once the modules are attached. LG's G5 smartphone recently launched with modular capabilities, but it requires removing the actual battery every time you want to add a new attachment. The Vertex and Vector Thin apparently circumvent this problem by attaching modules directly to the back of the phones, rather than inserting new tools into the phone's base.

Using Branch-and-Bound with Constraint Satisfaction in Optimization Problems

AAAI Conferences

This work integrates three related AI search techniques - constraint satisfaction, branch-and-bound and solution synthesis - and applies the result to constraint satisfaction problems for which optimal answers are required. This method has already been shown to work well in natural language semantic analysis (Beale et al., 1996); here we extend the domain to optimizing graph coloring problems, which are abstractions of many common scheduling problems of interest. We demonstrate that the methods used here allow us to determine optimal answers to many types of problems without resorting to heuristic search, and, furthermore, can be combined with heuristic search methods for problems with excessive complexity.
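
To make the combination of constraint checking and bounding concrete, here is a minimal sketch of plain branch-and-bound applied to minimum graph coloring. It is not the method of the paper (which also uses solution synthesis); the function name `min_coloring` and the vertex ordering are assumptions chosen for illustration only.

```python
def min_coloring(n, edges):
    """Return (k, colors): the smallest number of colors found and one
    matching assignment for an undirected graph on vertices 0..n-1.

    Illustrative sketch only: plain depth-first branch-and-bound with a
    constraint check at every assignment.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    best = {"k": n + 1, "colors": None}  # n colors always suffice
    colors = [-1] * n                    # -1 marks an uncolored vertex

    def branch(v, used):
        # Bound: a partial assignment that already uses as many colors
        # as the best complete solution cannot improve on it.
        if used >= best["k"]:
            return
        if v == n:
            best["k"], best["colors"] = used, colors[:]
            return
        # Branch: try every color in use, plus exactly one new color
        # (trying more than one new color would only repeat symmetric cases).
        for c in range(used + 1):
            # Constraint: adjacent vertices must not share a color.
            if all(colors[w] != c for w in adj[v]):
                colors[v] = c
                branch(v + 1, max(used, c + 1))
                colors[v] = -1

    branch(0, 0)
    return best["k"], best["colors"]


if __name__ == "__main__":
    # A 5-cycle: an odd cycle, so two colors are not enough.
    print(min_coloring(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]))
```

The search is exact, so its worst-case cost grows exponentially with the number of vertices; the bound only prunes branches that provably cannot beat the best coloring found so far.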

Deep Graphs

Machine Learning

We propose an algorithm for deep learning on networks and graphs. It relies on the notion that many graph algorithms, such as PageRank, Weisfeiler-Lehman, or Message Passing can be expressed as iterative vertex updates. Unlike previous methods which rely on the ingenuity of the designer, Deep Graphs are adaptive to the estimation problem. Training and deployment are both efficient, since the cost is $O(|E| + |V|)$, where $E$ and $V$ are the sets of edges and vertices respectively. In short, we learn the recurrent update functions rather than positing their specific functional form. This yields an algorithm that achieves excellent accuracy on both graph labeling and regression tasks.
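
As an illustration of what "iterative vertex updates" means, the sketch below writes PageRank as a repeated per-vertex update over an edge list. The update function here is fixed by hand, which is exactly what the abstract says Deep Graphs avoid by learning it; the function name `pagerank`, the damping value and the iteration count are assumptions for this example, not details from the paper.

```python
def pagerank(n, edges, damping=0.85, iterations=50):
    """PageRank on a directed graph with vertices 0..n-1.

    Each iteration touches every vertex once and every edge once,
    so one sweep costs O(|E| + |V|).
    """
    out_degree = [0] * n
    for u, _ in edges:
        out_degree[u] += 1

    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Vertices without outgoing edges spread their rank uniformly.
        dangling = sum(rank[v] for v in range(n) if out_degree[v] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = [base] * n
        # Vertex update: each vertex collects rank from its in-neighbors.
        for u, v in edges:
            new_rank[v] += damping * rank[u] / out_degree[u]
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Three vertices all pointing at vertex 0.
    print(pagerank(4, [(1, 0), (2, 0), (3, 0)]))
```

A learned variant would replace the hand-written accumulation inside the edge loop with a parameterized function of the neighboring vertex states, keeping the same sweep structure.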

A Unifying View of Explicit and Implicit Feature Maps for Structured Data: Systematic Studies of Graph Kernels

Machine Learning

Non-linear kernel methods can be approximated by fast linear ones using suitable explicit feature maps, allowing their application to large-scale problems. To this end, explicit feature maps of kernels for vectorial data have been extensively studied. Because much real-world data is structured, various kernels for complex data like graphs have been proposed. Indeed, many of them directly compute feature maps. However, the kernel trick is employed when the number of features is very large or the individual vertices of graphs are annotated by real-valued attributes. Can we still compute explicit feature maps efficiently under these circumstances? Triggered by this question, we investigate how general convolution kernels are composed from base kernels and construct corresponding feature maps. We apply our results to widely used graph kernels and analyze for which kernels and graph properties computation by explicit feature maps is feasible and actually more efficient. In particular, we derive feature maps for random walk and subgraph matching kernels and apply them to real-world graphs with discrete labels. Our theoretical results are confirmed experimentally by observing a phase transition when comparing running time with respect to label diversity, walk lengths and subgraph size, respectively. Moreover, we derive approximate explicit feature maps for state-of-the-art kernels supporting real-valued attributes, including the GraphHopper and Graph Invariant kernels. In extensive experiments we show that our approaches often achieve a classification accuracy close to the exact methods based on the kernel trick, but require only a fraction of their running time.
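
The difference between implicit and explicit computation can be seen on a toy convolution kernel: a vertex label kernel that counts pairs of vertices with equal labels. This is a deliberately simple stand-in, not one of the random walk or subgraph matching kernels studied in the paper, and the function names are hypothetical. The implicit form compares all vertex pairs; the explicit form maps each graph to a label histogram once and then takes a dot product.

```python
from collections import Counter


def kernel_implicit(labels_g, labels_h):
    """Sum a Dirac base kernel over all vertex pairs: O(|V_G| * |V_H|)."""
    return sum(1 for a in labels_g for b in labels_h if a == b)


def feature_map(labels):
    """Explicit feature map: one count per distinct vertex label."""
    return Counter(labels)


def kernel_explicit(phi_g, phi_h):
    """Dot product of two sparse label histograms."""
    # Counter returns 0 for labels that do not occur in phi_h.
    return sum(count * phi_h[label] for label, count in phi_g.items())


if __name__ == "__main__":
    g = ["C", "C", "O"]       # vertex labels of a first graph
    h = ["C", "O", "O", "N"]  # vertex labels of a second graph
    print(kernel_implicit(g, h))
    print(kernel_explicit(feature_map(g), feature_map(h)))
```

Both calls return the same value. The explicit route pays off when many kernel evaluations reuse the same feature vectors and the number of distinct labels stays small, which mirrors the dependence on label diversity described in the abstract.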