Fast and Scalable Machine Learning in R and Python with H2O


The focus of this talk is scalable machine learning using the H2O R and Python packages. H2O is an open source, distributed machine learning platform designed for big data, with the added benefit that it's easy to use on a laptop (in addition to a multi-node Hadoop or Spark cluster). The core machine learning algorithms of H2O are implemented in high-performance Java; however, fully featured APIs are available in R, Python, Scala, and REST/JSON, as well as through a web interface. Because H2O's algorithm implementations are distributed, the software can scale to very large datasets that may not fit into RAM on a single machine. H2O currently features distributed implementations of generalized linear models, gradient boosting machines, random forest, deep neural nets, dimensionality reduction methods (PCA, GLRM), clustering algorithms (K-means), and anomaly detection methods, among others.
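To make the scaling claim concrete, here is a minimal sketch of why a distributed algorithm does not need the whole dataset in one machine's RAM. It is not H2O code and does not use the H2O API; it is a plain-Python illustration that assumes a simple least-squares fit of y = a + b*x, where each chunk of data is reduced to a few running sums that can be merged. The names `Stats`, `map_chunk`, and `fit` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Stats:
    """Sufficient statistics for a simple least-squares fit y = a + b*x."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def merge(self, other: "Stats") -> "Stats":
        # Merging is plain addition, so chunks can be processed
        # on different workers and combined in any order.
        return Stats(
            self.n + other.n,
            self.sx + other.sx,
            self.sy + other.sy,
            self.sxx + other.sxx,
            self.sxy + other.sxy,
        )


def map_chunk(rows: Iterable[Tuple[float, float]]) -> Stats:
    """Reduce one chunk of (x, y) rows to its sufficient statistics."""
    s = Stats()
    for x, y in rows:
        s.n += 1
        s.sx += x
        s.sy += y
        s.sxx += x * x
        s.sxy += x * y
    return s


def fit(chunks: Iterable[List[Tuple[float, float]]]) -> Tuple[float, float]:
    """Return (intercept, slope) computed from per-chunk statistics only."""
    total = Stats()
    for chunk in chunks:
        total = total.merge(map_chunk(chunk))
    denom = total.n * total.sxx - total.sx ** 2
    slope = (total.n * total.sxy - total.sx * total.sy) / denom
    intercept = (total.sy - slope * total.sx) / total.n
    return intercept, slope


# Toy data following y = 1 + 2x, split into two chunks that stand in
# for partitions held on separate machines (an assumption of this sketch).
chunks = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
]
intercept, slope = fit(chunks)
print(intercept, slope)
```

Only the five running sums ever need to be held together in memory, which is the general idea behind fitting models on data that is partitioned across nodes. Real platforms handle far more than this sketch does, including iterative algorithms, fault tolerance, and data movement.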

Women Who Code Silicon Valley


H2O is an open source, distributed machine learning platform designed for big data, with the added benefit that it's easy to use on a laptop (in addition to a multi-node Hadoop or Spark cluster). Erin LeDell is a Statistician and Machine Learning Scientist at the company that produces the open source machine learning platform H2O. She is the author of a handful of machine learning related software packages, including the h2oEnsemble R package for ensemble learning with H2O. Her research focuses on ensemble machine learning, learning from imbalanced binary-outcome data, influence curve based variance estimation, and statistical computing.

New York Institute of Finance and Google Cloud launch a Machine Learning for Trading Specialisation on Coursera


The New York Institute of Finance (NYIF) and Google Cloud have launched a new Machine Learning for Trading Specialisation available exclusively on the Coursera platform. The Specialisation helps learners leverage the latest AI and machine learning techniques for financial trading. Amid the Fourth Industrial Revolution, nearly 80 per cent of financial institutions cite machine learning as a core component of business strategy and 75 per cent of financial services firms report investing significantly in machine learning. The Machine Learning for Trading Specialisation equips professionals with key technical skills increasingly needed in the financial industry today. Composed of three courses in financial trading, machine learning, and artificial intelligence, the Specialisation features a blend of theoretical and applied learning.

Theoretical Foundations of Data Science: Should I Care or Simply Focus on Hands-on Skills?


Data science is a very hands-on and practical field, but it also requires a solid foundation in mathematics and programming. As a data scientist, it is essential that you understand the theoretical and mathematical foundations of data science in order to build reliable models with real-world applications. In data science and machine learning, mathematical skills are as important as programming skills. There are many good packages that can be used for building predictive models.

Google, Amazon, Microsoft: How do their free machine-learning courses compare?


Machine-learning engineer was the fastest growing job category in the five years to 2017, according to LinkedIn. But tech's hottest role isn't a simple field to break into, requiring at least high school math and some programming knowledge, even to get started. Luckily, there are an increasing number of options for those wanting to get a grounding in the field, with Amazon Web Services (AWS) being the latest tech giant to release a set of machine-learning courses for free. That's in addition to the existing well-regarded material available online from the likes of Andrew Ng and Coursera. If you're interested in these courses, it's worth noting that you'll benefit more if you have a basic knowledge of Python and high school linear algebra, statistics, and calculus.