A Guide to TensorFlow (Part 1)


TensorFlow is an open-source software library for numerical computation using data flow graphs. It is an extremely popular symbolic math library and is widely used for machine learning applications such as neural networks. This blog is part of "A Guide To TensorFlow", where we will explore the TensorFlow API and use it to build multiple machine learning models for real-life examples. In this blog we shall uncover the TensorFlow graph, understand the concept of tensors, and explore TensorFlow data types. At the heart of a TensorFlow program is the computation graph described in code.
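To make the idea of a computation graph described in code more concrete, here is a minimal pure-Python sketch. It does not use TensorFlow, and the names `Node`, `constant`, `add`, `multiply`, and `run` are hypothetical, chosen only for illustration. It shows the general pattern of first building a graph of operations and only evaluating it afterwards.

```python
# Toy data flow graph: nodes are described first and evaluated later.
# This illustrates the concept only; it is not TensorFlow's API.

class Node:
    def __init__(self, op, inputs=(), value=None):
        self.op = op                  # operation name: "const", "add" or "mul"
        self.inputs = tuple(inputs)   # upstream nodes feeding this node
        self.value = value            # payload, used only by constants


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", (a, b))


def multiply(a, b):
    return Node("mul", (a, b))


def run(node):
    """Evaluate a node by recursively evaluating its inputs."""
    if node.op == "const":
        return node.value
    args = [run(n) for n in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError(f"unknown op: {node.op}")


a = constant(3.0)
b = constant(4.0)
c = multiply(add(a, b), b)   # describes (a + b) * b; nothing is computed yet
result = run(c)              # evaluation happens only here
print(result)                # 28.0
```

Separating the description of the computation from its execution is what lets a graph-based library inspect, optimize, or distribute the work before running it.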

What Will Shape the Future of Machine Learning in 2018?


No new technology is successful until it is embraced and used to its potential. Machine learning is no exception to this rule, and its success, or rather its ability, can be gauged by current trends. Machine learning is already a hot technology at the moment, and it seems to have a promising future. At present, it appears to be in an evolutionary phase in which remarkable developments are expected. This brings us to the question of what the machine learning of the future will be like.

Attentional Multilabel Learning over Graphs - A message passing approach Machine Learning

We address a largely open problem of multilabel classification over graphs. Unlike traditional vector input, a graph has rich variable-size structures, which suggest complex relationships between labels and subgraphs. Uncovering these relations might hold the key to classification performance and explainability. To this end, we design GAML (Graph Attentional Multi-Label learning), a graph neural network that models the relations present in the input graph, in the label set, and across graph-labels by leveraging the message passing algorithm and attention mechanism. Representation of labels and input nodes is refined iteratively through multiple steps, during which interesting subgraph-label patterns emerge. In addition, GAML is highly flexible by allowing explicit label dependencies to be incorporated easily. It also scales linearly with the number of labels and graph size thanks to our proposed hierarchical attention. These properties open a wide range of applications seen in the real world. We evaluate GAML on an extensive set of experiments with both graph inputs (for predicting drug-protein binding, and drug-cancer response), and classical unstructured inputs. The results are significantly better than well-known multilabel learning techniques.



Seglearn is a Python package for machine learning time series or sequences using a sliding window segmentation. It provides an integrated pipeline for segmentation, feature extraction, feature processing, and a final estimator. Seglearn provides a flexible approach to multivariate time series and contextual data for classification, regression, and forecasting problems. It is compatible with scikit-learn. Installation instructions, API documentation, and examples can be found in the documentation.
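As a rough illustration of sliding window segmentation followed by feature extraction, here is a minimal pure-Python sketch. It is not Seglearn's API: the helper names `sliding_windows` and `mean_feature`, and the `width` and `step` parameters, are assumptions made for this example.

```python
# Sliding window segmentation followed by a simple per-window feature.
# Illustrative only; the function names are hypothetical, not Seglearn's.

def sliding_windows(series, width, step):
    """Split a sequence into fixed-width windows, advancing by `step` samples."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    last_start = len(series) - width
    return [series[i:i + width] for i in range(0, last_start + 1, step)]


def mean_feature(window):
    """One scalar feature per window: the arithmetic mean."""
    return sum(window) / len(window)


series = [1, 2, 3, 4, 5, 6, 7, 8]
windows = sliding_windows(series, width=4, step=2)
features = [mean_feature(w) for w in windows]

print(windows)   # [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]]
print(features)  # [2.5, 4.5, 6.5]
```

Each window yields one feature vector (here a single mean), which could then be passed to any estimator that accepts fixed-length inputs.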



ZhuSuan is a Python probabilistic programming library for Bayesian deep learning, which conjoins the complementary advantages of Bayesian methods and deep learning. Unlike existing deep learning libraries, which are mainly designed for deterministic neural networks and supervised tasks, ZhuSuan provides deep learning style primitives and algorithms for building probabilistic models and applying Bayesian inference. Variational inference with programmable variational posteriors, various objectives and advanced gradient estimators (SGVB, REINFORCE, VIMCO, etc.). ZhuSuan is still under development. Before the first stable release (1.0), please clone the repository and run This will install ZhuSuan and its dependencies automatically.

Discovering Relationships and their Structures Across Disparate Data Modalities Machine Learning

Determining how certain properties are related to other properties is fundamental to scientific discovery. As data collection rates accelerate, it is becoming increasingly difficult yet ever more important to determine whether one property of data (e.g., cloud density) is related to another (e.g., grass wetness). Only if two properties are related are further investigations into the geometry of the relationship warranted. While existing approaches can test whether two properties are related, they may require unfeasibly large sample sizes in real data scenarios, and do not address how they are related. Our key insight is that one can adaptively restrict the analysis to the "jointly local" observations---that is, one can estimate the scales with the most informative neighbors for determining the existence and geometry of a relationship. "Multiscale Graph Correlation" (MGC) is a framework that extends global procedures to be multiscale; consequently, MGC tests typically require far fewer samples than existing methods for a wide variety of dependence structures and dimensionalities, while maintaining computational efficiency. Moreover, MGC provides a simple and elegant multiscale characterization of the potentially complex latent geometry underlying the relationship. In several real data applications, MGC uniquely detects the presence and reveals the geometry of the relationships.

Machine Learning-Driven Bundling. The Future of JavaScript Tooling. · Minko Gechev's blog


Although saying "mathematical foundation" may sound a bit intimidating, the covered topics are essential and it's very likely you're already familiar with them. We're going to mention a few algorithms from graph theory and one popular machine learning model. Right after that, we're going to define a few concepts in order to make sure we speak the same language. Finally, we'll discuss in detail how everything from @mlx works together. Disclaimer: the packages that we're going to cover are at a very early stage of their development. It's very likely that they are incompatible with your projects. Keep in mind that their APIs are not finalized. Over time their implementation will mature and become more robust.

Copula Index for Detecting Dependence and Monotonicity between Stochastic Signals Machine Learning

This paper introduces a nonparametric copula-based index for detecting the strength and monotonicity structure of linear and nonlinear statistical dependence between pairs of random variables or stochastic signals. Our index, termed Copula Index for Detecting Dependence and Monotonicity (CIM), satisfies several desirable properties of measures of association, including Rényi's properties, the data processing inequality (DPI), and consequently self-equitability. Synthetic data simulations reveal that the statistical power of CIM compares favorably to other state-of-the-art measures of association that are proven to satisfy the DPI. Simulation results with real-world data reveal the CIM's unique ability to detect the monotonicity structure among stochastic signals to find interesting dependencies in large datasets. Additionally, simulations show that the CIM shows favorable performance to estimators of mutual information when discovering Markov network structure.

Dangers of digital dependency Letters

The Guardian

I found Moya Sarner's article on digital addiction and her story of Lady Geek's reverse ferret from digital guru to prophet of doom absorbing, timely, and somehow familiar (Is it time to fight the digital dictators?, 15 March). She also quotes Professor Mark Griffiths, director of the International Gaming Research Unit at Nottingham Trent University, as having invented the term "technological addiction" in 1995. In 1971 I started a degree in maths, electronics and physics at Chelsea College, University of London, which involved a certain amount of programming on the college's Elliott 803 mainframe. I remember clearly our lecturer warning us very sternly about the dangers of getting over-involved in programming, quoting the case of an earlier student who had spent so many nights in the computer room, addicted to getting his programs just-so, that he neglected all his other studies and eventually failed to make progress in anything. Remember that this was back in the days when our programs were written in Fortran on decks of hand-punched 80-column cards.

A GPU enabled AMI for Deep Learning – empiricalci


TLDR: Use AMI ami-b1e2c4a6, which provides the NVIDIA drivers, Docker, and nvidia-docker. Using Docker to package your projects allows them to be easily ported. Here are a couple of Docker images to get started. I recently read this post on r/MachineLearning about an AMI pre-built with GPU support and several popular software dependencies for deep learning, such as OpenCV, Caffe, Keras, Theano, and TensorFlow. It's definitely very useful to have an environment with everything set up and ready to go. One of the biggest sources of friction when trying a new project is having to set up the environment.