After Facebook announced its own tool to detect bias in an algorithm earlier this month, a new report suggests that Microsoft is also building a tool to automate the identification of bias in a range of different Artificial Intelligence (AI) algorithms. The Microsoft tool has the potential to help businesses make use of AI without inadvertently discriminating against certain groups of people, MIT Technology Review reported on Friday. According to the MIT Technology Review report, UC Berkeley professor Bin Yu says the tools from Facebook and Microsoft seem like a step in the right direction, but may not be enough. She suggests that big companies should have outside experts audit their algorithms in order to prove they are not biased.
This is the May edition of the AI/ML learning resources newsletter. I compiled a list of learning resources, most of which were recently announced or are coming soon. A few of these may have been around for a while, but I only recently discovered them. We are living in very exciting times when so many AI/ML learning resources are being launched at such a rapid pace. Many companies and individuals are working hard to bring AI/ML and deep learning to everyone, and I'm joining that effort by sharing. Google I/O 2018 (5/8–5/10) has tons of sessions on AI/ML.
In many applications of machine learning, such as machine learning for medical diagnosis, we would like to have machine learning algorithms that do not memorize sensitive information about the training set, such as the specific medical histories of individual patients. Differential privacy is a framework for measuring the privacy guarantees provided by an algorithm. Through the lens of differential privacy, we can design machine learning algorithms that responsibly train models on private data. Our work (with Martín Abadi, Úlfar Erlingsson, Ilya Mironov, Ananth Raghunathan, Shuang Song and Kunal Talwar) on differential privacy for machine learning has made it very easy for machine learning researchers to contribute to privacy research, even without being experts on the mathematics of differential privacy. In this blog post, we'll show you how to do it. The key is a family of algorithms called Private Aggregation of Teacher Ensembles (PATE). One of the great things about the PATE framework, besides its name, is that anyone who knows how to train a supervised ML model (such as a neural net) can now contribute to research on differential privacy for machine learning.
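The blurb above names PATE without describing its mechanism. As a rough sketch only, and based on the PATE papers rather than on the text here: an ensemble of "teacher" models is trained on disjoint partitions of the private data, and a "student" model only ever sees labels produced by a noisy vote over the teachers' predictions. The snippet below illustrates just that noisy-vote step. The function names, the `noise_scale` parameter, and the example votes are all hypothetical, and no privacy accounting is done.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two i.i.d. exponential variables with rate 1/scale
    follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_max_label(teacher_votes, num_classes, noise_scale, rng):
    """Aggregate teacher predictions for ONE query with a noisy plurality vote.

    teacher_votes: list of predicted class indices, one per teacher.
    noise_scale:   Laplace scale (assumed parameter); larger values mean
                   more noise and a less accurate aggregated label.
    """
    counts = [0] * num_classes
    for vote in teacher_votes:
        counts[vote] += 1
    # Perturb every class count independently, then take the arg max.
    noisy_counts = [c + laplace_noise(noise_scale, rng) for c in counts]
    return max(range(num_classes), key=lambda k: noisy_counts[k])


# Hypothetical example: 10 teachers vote on a single unlabeled input.
rng = random.Random(0)
votes = [1, 1, 1, 0, 1, 2, 1, 1, 0, 1]
student_label = noisy_max_label(votes, num_classes=3, noise_scale=1.0, rng=rng)
print("label passed to the student:", student_label)
```

When the teachers agree strongly, the noise rarely changes the winning class; when they are split, the answer becomes close to random, which is what limits how much any single training record can influence the student.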
Python is the preferred choice of developers, engineers, data scientists, and hobbyists everywhere. It is a great scripting language that can power your applications and provide great speed, safety, and scalability. By exposing Python as a series of simple recipes, you can gain insight into specific language features in a particular context. Having a tangible context helps make the language or standard library features easier to understand. This video comes with over 100 recipes on the latest version of Python.
How good are you at picking out fibbers from a crowd? According to a new paper from researchers at the University of Rochester, New York, there are particular (involuntary) quirks and facial movements that give us away – whether we like it or not. The team used a combination of big data, machine learning technology, and automated facial feature analysis software to identify differences in facial and verbal cues between people who are lying and people who are telling the truth. Volunteers were recruited from Amazon Mechanical Turk and split into pairs. In total, there were 151 pairs and 1.3 million frames of expressions for the team to analyze.
In this course, you will learn what hyperparameters are, what a genetic algorithm is, and what hyperparameter optimization is. You will apply a genetic algorithm to optimize the performance of Support Vector Machines (SVMs) and Multilayer Perceptron (MLP) neural networks. Hyperparameter optimization will be done on two datasets: a regression dataset for predicting the cooling and heating loads of buildings, and a classification dataset for classifying emails as spam or non-spam. The SVM and MLP will first be applied to the datasets without optimization, and those results will then be compared with the results after optimization. By the end of this course, you will have learnt how to code a genetic algorithm in Python and how to optimize your machine learning algorithms for maximal performance.
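To make the idea concrete, here is a minimal sketch of a genetic algorithm searching over two hyperparameters. It is not the course's code. The `fitness` function is a hypothetical stand-in for a real validation score (for example, the cross-validated accuracy of an SVM), and the bounds, population size, mutation rate, and operator choices (tournament selection, uniform crossover, Gaussian mutation, elitism) are all assumptions picked for illustration.

```python
import random

# Assumed search ranges for two hyperparameters, e.g. an SVM's C and gamma.
BOUNDS = [(0.1, 100.0), (0.001, 1.0)]


def fitness(params):
    """Hypothetical stand-in for a validation score (higher is better).

    In practice this would train a model with `params` and return its
    validation metric. The toy surface here peaks at C=10, gamma=0.1.
    """
    c, gamma = params
    return -((c - 10.0) ** 2) / 100.0 - 100.0 * (gamma - 0.1) ** 2


def random_individual(rng):
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]


def tournament(pop, scores, rng, k=3):
    """Pick the best of k randomly chosen individuals."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(ind, rng, rate=0.2):
    """Add Gaussian noise to some genes, then clip to the bounds."""
    child = []
    for value, (lo, hi) in zip(ind, BOUNDS):
        if rng.random() < rate:
            value += rng.gauss(0.0, 0.1 * (hi - lo))
        child.append(min(hi, max(lo, value)))
    return child


def genetic_search(fitness_fn, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scores = [fitness_fn(ind) for ind in pop]
        for ind, score in zip(pop, scores):
            if score > best_score:
                best, best_score = ind, score
        children = [best]  # elitism: carry the best individual forward
        while len(children) < pop_size:
            parent_a = tournament(pop, scores, rng)
            parent_b = tournament(pop, scores, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        pop = children
    return best, best_score


best_params, best_score = genetic_search(fitness)
print("best hyperparameters found:", best_params, "score:", best_score)
```

To use this on a real model, only `fitness` and `BOUNDS` would need to change. Because each fitness call would then involve training a model, the population size and number of generations directly control the compute cost.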
Note: Parts of this post are based on my ACL 2018 paper Strong Baselines for Neural Semi-supervised Learning under Domain Shift with Barbara Plank. Unsupervised learning constitutes one of the main challenges for current machine learning models and one of the key elements that is missing for general artificial intelligence. While unsupervised learning on its own is still elusive, researchers have made a lot of progress in combining unsupervised learning with supervised learning. This branch of machine learning research is called semi-supervised learning. Semi-supervised learning has a long history. For a (slightly outdated) overview, refer to Zhu (2005) and Chapelle et al. (2006).
Finally, a comprehensive hands-on machine learning course with a specific focus on classification-based models for the investment community and passionate investors. In the past few years, there has been massive adoption of, and growth in, the use of data science, artificial intelligence and machine learning to find alpha. However, information on applying machine learning to investment is scarce. This course has been designed to address that. It is meant to spark your creative juices and get you started in this space.