Instructional Material
Course Introduction - Introduction to the Principles and Practice of Amazon Machine Learning
Statistical Learning
The active course run for Statistical Learning has ended, but the course is now available in a self-paced mode. You are welcome to join the course and work through the material and exercises at your own pace. When you have completed the exercises with a score of 50% or higher, you can generate your Statement of Accomplishment from within the course. The course will remain available for an extended period of time. We anticipate the content will be available until at least August 2, 2017.
Deal: Master AI and achieve the impossible – 94% off - AndroidPIT
Learning artificial intelligence programming is an excellent way to stand out in the workforce, and many people build entire careers on it. AI programmers are among the most sought-after professionals across many industries all over the world. Now you can learn AI programming online with the complete machine learning course bundle. You'll learn valuable skills such as quant trading, Hadoop, object-oriented Java, NLP in Python, Twitter sentiment analysis, and many more.
ledell/useR-machine-learning-tutorial
Instructions for installing the necessary software for this tutorial are available here. Data for the tutorial can be downloaded by running ./data/get-data.sh (requires wget). Certain algorithms don't scale well when there are millions of features. For example, decision trees require computing some sort of metric (to determine the splits) on all the feature values (or some fraction of the values, as in Random Forest and Stochastic GBM). Therefore, computation time is linear in the number of features. Algorithms can deal with data sparsity (where many of the feature values are zero) in different ways.
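The linear dependence on the number of features can be seen in a minimal sketch of an exhaustive split search. The function names `gini` and `best_split`, the choice of Gini impurity as the split metric, and the toy data are assumptions made for illustration; they are not taken from the tutorial.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Exhaustive split search (hypothetical helper).

    Every feature column is scanned once, so the work grows
    linearly with the number of features.
    """
    n_features = len(rows[0])
    best = (None, None, float("inf"))  # (feature index, threshold, impurity)
    features_scanned = 0
    for j in range(n_features):
        features_scanned += 1
        # Try each distinct value of feature j as a threshold.
        for threshold in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            # Weighted impurity of the two child nodes.
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best[2]:
                best = (j, threshold, score)
    return best, features_scanned


# Toy data: feature 0 separates the two classes perfectly.
rows = [[0.0, 1.0, 5.0], [0.0, 2.0, 3.0], [1.0, 1.0, 4.0], [1.0, 2.0, 6.0]]
labels = [0, 0, 1, 1]
(split_feature, split_threshold, impurity), scanned = best_split(rows, labels)
print(split_feature, split_threshold, impurity, scanned)
```

Random Forest style methods would replace `range(n_features)` with a random subset of the columns, which reduces the constant but keeps the same linear shape.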
Practical Graph Analytics with Apache Giraph: Roman Shaposhnik, Claudio Martella, Dionysios Logothetis: 9781484212523: Amazon.com: Books
If you have used (or attempted to use) Giraph, you will know that it's not the easiest system to pick up and get your hands dirty with. However, it is an extremely powerful tool that can be used to solve many interesting problems at scale. This book does a great job of taking you from the very beginning with simple, straightforward algorithms to more complex real-world applications and finally more advanced topics (e.g., data partitioning, out-of-core algorithms, running on the cloud, etc.). With this book you can start without knowing anything about Giraph or distributed graph processing systems in general (or even about graphs!) and learn how to solve many graph problems at scale. There are several things that I find appealing about this book. The main ones are: (1) the _numerous_ examples with step-by-step illustrations that help you understand exactly what is happening during an algorithm, (2) the fact that the example applications are motivated by real-world problems (e.g., recommendation systems, PageRank, etc.) whose solutions can then serve as building blocks for your own applications, and (3) the example code that will allow you to quickly get your hands dirty and start playing with the tool.
A Model Explanation System: Latest Updates and Extensions
We propose a general model explanation system (MES) for "explaining" the output of black box classifiers. This paper describes extensions to Turner (2015), which is referred to frequently in the text. We use the motivating example of a classifier trained to detect fraud in a credit card transaction history. The key aspect is that we provide explanations applicable to a single prediction, rather than provide an interpretable set of parameters. We focus on explaining positive predictions (alerts). However, the presented methodology is symmetrically applicable to negative predictions.
KPMG will soon be using artificial intelligence for audits in Australia
KPMG plans to use IBM's Watson cognitive computing technology for its professional services in Australia. The artificial intelligence deal with IBM includes a focus on audit and assurance services. IBM's Watson has been doing everything from diagnosing cancer and recommending treatment to analysing the Harry Potter books and running online university courses. "Already, data and analytics techniques are transforming audit by allowing analysis of much bigger populations of data than traditional sampling from which to draw conclusions," says Duncan McLennan, KPMG's national managing partner of audit. "Cognitive technology will allow us to build on these data and analytics advances. They will be a game changer in how the value of audit is perceived by the marketplace."
Recurrent Neural Networks Tutorial, Part 1 – Introduction to RNNs
Recurrent Neural Networks (RNNs) are popular models that have shown great promise in many NLP tasks. But despite their recent popularity, I've only found a limited number of resources that thoroughly explain how RNNs work and how to implement them. That's what this tutorial is about. It's a multi-part series. As part of the tutorial we will implement a recurrent neural network based language model. A language model has two applications. First, it allows us to score arbitrary sentences based on how likely they are to occur in the real world.
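To make the idea of scoring a sentence concrete, here is a minimal sketch that uses a bigram count model with add-one smoothing as a stand-in for the RNN. This is not the tutorial's model; the toy corpus, the `log_prob` function, and the `<s>`/`</s>` boundary tokens are all assumptions made for illustration. An RNN language model would play the same role, assigning a probability to each next word given the words before it.

```python
import math
from collections import Counter

# Toy training corpus (hypothetical data).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat chased the dog",
]

unigrams = Counter()  # how often each word appears as a context
bigrams = Counter()   # how often each (previous word, word) pair appears
for sentence in corpus:
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    unigrams.update(tokens[:-1])
    bigrams.update(zip(tokens, tokens[1:]))

vocab = {w for s in corpus for w in s.split()} | {"<s>", "</s>"}


def log_prob(sentence):
    """Log-probability of a sentence under the smoothed bigram model."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen word pairs from having zero probability.
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))
        total += math.log(p)
    return total


likely = log_prob("the cat sat on the rug")
unlikely = log_prob("rug the on sat cat the")
print(likely > unlikely)
```

The two test sentences contain the same words, so the difference in score comes only from word order, which is what a language model is meant to capture.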
Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data, Second Edition, Bruce Ratner - Amazon.com
Dr. Ratner has written a unique book that distinguishes between statistical and machine-learning data mining. The book includes 14 statistical data mining and 17 machine-learning data mining techniques. All techniques are quite practical, making this volume a handbook for every statistician, data miner, and machine-learner. Let me describe a few chapters that present approaches and techniques that I really favored. Chapter 3 introduces a new data mining method: a smoother scatterplot based on CHAID.
Intro to Machine Learning with Apache Spark and Apache Zeppelin - Hortonworks
In this tutorial, we will give you a taste of the powerful Machine Learning libraries in Apache Spark via a hands-on lab. We will also introduce the necessary steps to get you up and running with Apache Zeppelin on a Hortonworks Data Platform (HDP) Sandbox. This tutorial is part of a series of hands-on tutorials to get you started with HDP using the Hortonworks Sandbox. Please ensure you complete the prerequisites before proceeding with this tutorial. Note: if you are attending a Meetup/Crash Course, your speaker/instructor may have additional instructions regarding the Sandbox VM image.