Collaborating Authors

Decision Tree Learning

Do Decision Trees need Feature Scaling?


Machine learning algorithms have been evolving since the field's inception. Today the domain has come a long way from simple mathematical modelling to ensemble modelling and beyond. This evolution has produced more robust, state-of-the-art models that are steadily narrowing the gap between human and AI capabilities. Ensemble modelling has given us one of those state-of-the-art models, XGBoost. Recently I happened to participate in a machine learning hiring challenge where the problem statement was a classification problem.
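On the question the title raises: decision trees split on per-feature thresholds, so a strictly monotonic rescaling of a feature moves the thresholds but not the splits. A minimal sketch (synthetic data, scikit-learn assumed installed):

```python
# Sketch: decision tree splits are threshold-based, so monotonic feature
# scaling does not change the learned tree's predictions.
from sklearn.datasets import make_classification
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_scaled = MinMaxScaler().fit_transform(X)

raw = DecisionTreeClassifier(random_state=0).fit(X, y)
scaled = DecisionTreeClassifier(random_state=0).fit(X_scaled, y)

# Predictions agree: the tree only compares each feature against a
# threshold, and min-max scaling shifts the thresholds along with it.
print((raw.predict(X) == scaled.predict(X_scaled)).all())
```

This is why scaling, mandatory for distance- or gradient-based learners, is generally unnecessary for trees and tree ensembles such as XGBoost.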

DriveML: Self-Drive Machine Learning Projects


DriveML implements several pillars of an automated machine learning pipeline: (i) automated data preparation, (ii) feature engineering, (iii) model building in a classification context, covering techniques such as (a) regularised regression [1], (b) logistic regression [2], (c) random forest [3], (d) decision tree [4] and (e) extreme gradient boosting (xgboost) [5], and finally, (iv) model explanation (using lift charts and partial dependency plots). The package also provides additional features such as generating missing-at-random (MAR) variables and automated exploratory data analysis. Moreover, the main function exports the model results, with the required plots, in an HTML vignette report format that follows the best practices of industry and academia.
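DriveML itself is an R package, but pillars (i) and (iii) can be sketched in Python with scikit-learn. This is an illustrative analogue, not DriveML's API; GradientBoostingClassifier stands in for xgboost to keep the sketch self-contained:

```python
# Sketch: prepare data (impute), then fit several of the listed
# classifier families and compare held-out accuracy.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "logistic regression (L2-regularised)": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(random_state=0),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "gradient boosting (xgboost stand-in)": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    # (i) automated data preparation step: median imputation, then fit
    pipe = make_pipeline(SimpleImputer(strategy="median"), model)
    scores[name] = pipe.fit(X_train, y_train).score(X_test, y_test)

for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```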

Ensemble Machine Learning in Python: Random Forest, AdaBoost


In recent years, we've seen a resurgence in AI, or artificial intelligence, and machine learning. Machine learning has led to some amazing results, like being able to analyze medical images and predict diseases on par with human experts. Google's AlphaGo program was able to beat a world champion in the strategy game Go using deep reinforcement learning. Machine learning is even being used to program self-driving cars, which is going to change the automotive industry forever. Imagine a world with drastically reduced car accidents, simply by removing the element of human error.

Volkswagen will start delivering ID.3 EVs to Europe in September


Volkswagen's ID.3 EV finally has an official release date. Consumers in most European countries will be able to put a deposit on the EV starting on June 17th. Those who do will have two delivery options. They'll be able to get the EV either as soon as possible or later in the year. The earliest deliveries will take place in September, with the latter ones scheduled for Q4 2020.

Random Forests (and Extremely) in Python with scikit-learn


In this guest post, you will learn by example how to use two popular machine learning techniques: random forests and extremely random forests. In fact, this post is an excerpt (adapted to the blog format) from the forthcoming Artificial Intelligence with Python – Second Edition: Your Complete Guide to Building Intelligent Apps using Python 3.x and TensorFlow 2. Before you learn how to carry out random forests in Python with scikit-learn, here is some brief information about the book. The new edition, which guides you through artificial intelligence with Python, is updated to Python 3.x and TensorFlow 2. It also has new chapters that, besides random forests, cover recurrent neural networks, artificial intelligence and Big Data, fundamental use cases, chatbots, and more. Finally, Artificial Intelligence with Python – Second Edition is written by two experts in the field of artificial intelligence, Alberto Artasanchez and Prateek Joshi (more information about the authors can be found towards the end of the post). In the next section of this post, you will learn what random forests and extremely random forests are.
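As a preview, the two techniques differ in one line of scikit-learn: extremely random forests (ExtraTrees) draw split thresholds at random rather than searching for the best one. A minimal sketch on synthetic data:

```python
# Sketch comparing a random forest with extremely randomized trees
# (ExtraTrees) in scikit-learn.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, random_state=1
)

rf = RandomForestClassifier(n_estimators=100, random_state=1)
# ExtraTrees picks candidate split thresholds at random instead of
# optimizing them, adding extra randomness and speeding up training.
et = ExtraTreesClassifier(n_estimators=100, random_state=1)

rf_acc = cross_val_score(rf, X, y, cv=5).mean()
et_acc = cross_val_score(et, X, y, cv=5).mean()
print("random forest:", round(rf_acc, 3))
print("extra trees  :", round(et_acc, 3))
```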

Alexander Jung


This lecture discusses how decision trees can be used to represent predictor functions. Variations of the basic decision tree model provide some of the most powerful machine learning methods currently available. Alexander Jung also uploaded a video on classification methods (46 minutes), whose focus is on linear regression methods that can be extended by feature construction, along with a guest lecture by Prof. Minna Huotilainen on learning processes in human brains. A further video explains how network Lasso can be used to learn localized linear models that allow "personalized" predictions for individual data points within a network.

Fine-Tuning ML Hyperparameters


"Just as electricity transformed almost every industry 100 years ago, today I actually have hard time thinking of an industry that I don't think AI (Artificial Intelligence) will transform in the next several years" -- Andrew NG I have long been fascinated with these algorithms, capable of something that we can as humans barely begin to comprehend. However, even with all these resources one of the biggest setbacks any ML practitioner has ever faced would be tuning the model's hyperparameters. A hyperparameter is a parameter whose value is used to control the learning process. By contrast, the values of other parameters (typically node weights) are learned. The same kind of machine learning model can be trained on different constraints, learning rates or kernels and other such parameters to generalize to different datasets, and hence these instructions have to be tuned so that the model can optimally solve the machine learning problem.

Adversarial Robustness Toolbox v1.2 releases: crafting and analysis of attacks and defense methods for machine learning models • Penetration Testing


Adversarial Robustness 360 Toolbox (ART) is a Python library supporting developers and researchers in defending Machine Learning models (Deep Neural Networks, Gradient Boosted Decision Trees, Support Vector Machines, Random Forests, Logistic Regression, Gaussian Processes, Decision Trees, Scikit-learn Pipelines, etc.) against adversarial threats, and helps make AI systems more secure and trustworthy. Machine Learning models are vulnerable to adversarial examples: inputs (images, texts, tabular data, etc.) deliberately modified to produce a desired response from the model. ART provides the tools to build and deploy defenses and to test them with adversarial attacks. Defending a Machine Learning model involves certifying and verifying its robustness and hardening it with approaches such as pre-processing inputs, augmenting training data with adversarial samples, and leveraging runtime detection methods to flag inputs that might have been modified by an adversary. The attacks implemented in ART make it possible to evaluate these defenses against state-of-the-art threat models.
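To see what an adversarial example is, here is a hand-rolled, FGSM-style attack on a logistic regression model. This is a sketch of the concept ART automates, not ART's API: for a linear model the loss gradient with respect to the input points along the weight vector, so stepping against the predicted class flips the decision.

```python
# Sketch: craft an adversarial perturbation against logistic regression.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

x = X[0:1]
w = clf.coef_[0]
pred = clf.predict(x)[0]

# Step against the predicted class along sign(w) (the gradient sign for
# a linear model), with epsilon just past the decision boundary.
direction = -np.sign(w) if pred == 1 else np.sign(w)
margin = abs(clf.decision_function(x)[0])
eps = 1.1 * margin / np.abs(w).sum()
x_adv = x + eps * direction

print("original prediction   :", pred)
print("adversarial prediction:", clf.predict(x_adv)[0])
```

A small, targeted change to the input is enough to flip the output, which is exactly the threat model ART's attacks, defenses and detectors address.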

Why White-Box Models in Enterprise Data Science Work More Efficiently


Data science is the current powerhouse for organizations, turning mountains of data into actionable business insights that impact every part of the business, including customer experience, revenue, operations, risk management and other functions. Data science has the potential to dramatically accelerate digital transformation initiatives, delivering greater performance and advantages over the competition. However, not all data science platforms and methodologies are created equal. The ability to use data science to make predictions and decisions that optimize business outcomes requires transparency and accountability. There are several underlying factors, such as trust, confidence in the prediction and understanding how the technology works, but fundamentally it comes down to whether the platform uses a black-box or a white-box model approach.
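A shallow decision tree is a common illustration of the white-box idea: its learned rules can be printed and audited directly. A minimal sketch on synthetic data, with illustrative feature names:

```python
# Sketch of a "white-box" model: every prediction traces to an explicit,
# human-readable threshold rule.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# export_text prints the tree's decision rules as nested if/else text,
# the kind of transparency a black-box model cannot offer.
rules = export_text(tree, feature_names=[f"f{i}" for i in range(4)])
print(rules)
```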