AITopics

Industry: Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

#artificialintelligence Nov-23-2022, 08:40:25 GMT

Learning Data Science: Predictive Maintenance with Decision Trees

Predictive Maintenance is one of the big revolutions happening across all major industries right now. Instead of replacing parts on a fixed schedule, or only after they have failed, it uses Machine Learning methods to predict when a part is going to fail. If you want an introduction to this fascinating, developing area, read on! Predictive maintenance techniques are designed to help determine the condition of in-service equipment in order to estimate when maintenance should be performed. This approach promises cost savings over routine or time-based preventive maintenance, because tasks are performed only when warranted.

learning data science, maintenance, predictive maintenance, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.49)

Khan, Md Sakib Nizam, Reje, Niklas, Buchegger, Sonja

Utility Assessment of Synthetic Data Generation Methods

arXiv.org Artificial Intelligence Nov-23-2022

Big data analysis poses the dual problem of privacy preservation and utility, i.e., how accurate data analyses remain after transforming original data in order to protect the privacy of the individuals that the data is about - and whether they are accurate enough to be meaningful. In this paper, we thus investigate across several datasets whether different methods of generating fully synthetic data vary in their utility a priori (when the specific analyses to be performed on the data are not known yet), how closely their results conform to analyses on original data a posteriori, and whether these two effects are correlated. We find some methods (decision-tree based) to perform better than others across the board, sizeable effects of some choices of imputation parameters (notably the number of released datasets), no correlation between broad utility metrics and analysis accuracy, and varying correlations for narrow metrics. We did get promising findings for classification tasks when using synthetic data for training machine learning models, which we consider worth exploring further also in terms of mitigating privacy attacks against ML models such as membership inference and model inversion.

artificial intelligence, decision tree learning, machine learning, (15 more...)

2211.14428

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Poland (0.04)
Oceania > New Zealand (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.66)

Iosipoi, Leonid, Vakhrushev, Anton

SketchBoost: Fast Gradient Boosted Decision Tree for Multioutput Problems

arXiv.org Artificial Intelligence Nov-23-2022

Gradient Boosted Decision Tree (GBDT) is a widely-used machine learning algorithm that has been shown to achieve state-of-the-art results on many standard data science problems. We are interested in its application to multioutput problems when the output is highly multidimensional. Although there are highly effective GBDT implementations, their scalability to such problems is still unsatisfactory. In this paper, we propose novel methods aiming to accelerate the training process of GBDT in the multioutput scenario. The idea behind these methods lies in the approximate computation of a scoring function used to find the best split of decision trees. These methods are implemented in SketchBoost, which itself is integrated into our easily customizable Python-based GPU implementation of GBDT called Py-Boost. Our numerical study demonstrates that SketchBoost speeds up the training process of GBDT by up to over 40 times while achieving comparable or even better performance.
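The abstract does not say how the scoring function is approximated. As a rough illustration only, the sketch below assumes a squared-loss-style split score (unit Hessians) and shrinks the gradient matrix with a Gaussian random projection before scoring. The function names, the regularizer `lam`, and the choice of random projection are assumptions made for this example, not the SketchBoost or Py-Boost API.

```python
import random

def split_score(grads, left_mask, lam=1.0):
    """Split score for multioutput gradients (assumes unit Hessians).

    For each child, sums the squared per-output gradient totals and
    divides by (number of rows in the child + lam).
    """
    d = len(grads[0])
    score = 0.0
    for side in (True, False):
        rows = [g for g, m in zip(grads, left_mask) if m == side]
        for j in range(d):
            s = sum(r[j] for r in rows)
            score += s * s / (len(rows) + lam)
    return score

def random_projection(grads, k, seed=0):
    """Project n x d gradients down to n x k with a Gaussian matrix.

    Entries are N(0, 1) scaled by 1/sqrt(k), so squared norms are
    preserved in expectation (hypothetical choice for this sketch).
    """
    rng = random.Random(seed)
    d = len(grads[0])
    proj = [[rng.gauss(0.0, 1.0) / k ** 0.5 for _ in range(k)] for _ in range(d)]
    return [[sum(g[j] * proj[j][c] for j in range(d)) for c in range(k)]
            for g in grads]

# Toy gradients: 2 rows, 2 outputs; the first row goes to the left child.
grads = [[1.0, 2.0], [3.0, 4.0]]
left_mask = [True, False]
exact = split_score(grads, left_mask)

# Sketch a wider gradient matrix (4 rows, 6 outputs) down to 3 columns.
wide = [[float(i + j) for j in range(6)] for i in range(4)]
sketch = random_projection(wide, k=3)
approx = split_score(sketch, [True, True, False, False])
```

The point of the design is that the projection is computed once per boosting iteration, after which every candidate split is scored over `k` columns instead of the full output dimension.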

artificial intelligence, machine learning, sketchboost, (17 more...)

2211.12858

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)
Europe > Switzerland (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Pachebat, Jean, Ivanov, Sergei

High-Order Optimization of Gradient Boosted Decision Trees

arXiv.org Artificial Intelligence Nov-21-2022

Gradient Boosted Decision Trees (GBDTs) are dominant machine learning algorithms for modeling discrete or tabular data. Unlike neural networks with millions of trainable parameters, GBDTs optimize a loss function in an additive manner and have a single trainable parameter per leaf, which makes it easy to apply high-order optimization of the loss function. In this paper, we introduce high-order optimization for GBDTs based on numerical optimization theory, which allows us to construct trees based on high-order derivatives of a given loss function. In the experiments, we show that high-order optimization has faster per-iteration convergence, which leads to reduced running time. Our solution can be easily parallelized and run on GPUs with little code overhead. Finally, we discuss potential future improvements such as automatic differentiation of arbitrary loss functions and the combination of GBDTs with neural networks.
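For context, the familiar second-order (Newton) case shows why one parameter per leaf keeps the optimization simple. The notation below is an assumption made for this illustration and is not taken from the paper: $I$ is the set of samples falling into a leaf, $w$ the leaf value, $\ell$ the per-sample loss, and $g_i$, $h_i$ its first and second derivatives with respect to the current prediction $\hat{y}_i$.

```latex
\mathcal{L}(w) = \sum_{i \in I} \ell\bigl(y_i, \hat{y}_i + w\bigr)
\approx \sum_{i \in I} \Bigl[ \ell(y_i, \hat{y}_i) + g_i\, w + \tfrac{1}{2} h_i\, w^2 \Bigr],
\qquad
w^{*} = -\frac{\sum_{i \in I} g_i}{\sum_{i \in I} h_i}
```

Because the problem is one-dimensional per leaf, a higher-order method would keep further terms of the same Taylor expansion and solve for $w$ without ever forming a Hessian matrix.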

artificial intelligence, loss function, machine learning, (14 more...)

2211.11367

Country: Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.86)

Potyka, Nico, Yin, Xiang, Toni, Francesca

Explaining Random Forests using Bipolar Argumentation and Markov Networks (Technical Report)

arXiv.org Artificial Intelligence Nov-21-2022

Random forests (RFs) [Bre01] are machine learning models with various applications in areas like e-commerce, finance and medicine. They consist of multiple decision trees that use different subsets of the available features. Given an input, every tree makes an individual decision and the output of the random forest is obtained by a majority vote. They have a low risk of overfitting, support both classification and regression tasks, and come equipped with some feature importance measures [Bre01]. However, feature importance measures can be too simplistic as they can represent neither joint effects of features (e.g., multi-drug interactions) nor non-monotonicity (e.g., increasing the weight may be healthy for an underweight person, but not for an overweight person). In recent years, a variety of other explanation methods have been proposed. Model-agnostic feature importance measures like LIME [RSG16], SHAP [LL17] and MAPLE [PMT18] have limitations similar to those of the feature importance measures defined for random forests.
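The majority-vote step described above can be sketched in a few lines. This is a minimal illustration, not the paper's code: the three "trees" are hypothetical one-split stumps, and the feature names `f0`, `f1`, `f2` and their thresholds are invented for the example.

```python
from collections import Counter

def majority_vote(trees, x):
    """Return the label predicted by most trees for input x.

    With an even number of trees a tie is possible; Counter then
    returns the tied label that was seen first.
    """
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical stumps, each looking at a different feature.
trees = [
    lambda x: "positive" if x["f0"] > 50 else "negative",
    lambda x: "positive" if x["f1"] > 30 else "negative",
    lambda x: "positive" if x["f2"] > 125 else "negative",
]

sample = {"f0": 62, "f1": 27, "f2": 140}
prediction = majority_vote(trees, sample)  # votes: positive, negative, positive
```

A real forest would train each tree on a bootstrap sample and a random feature subset; only the aggregation step is shown here.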

argument, machine learning, natural language, (18 more...)

2211.11699

Genre: Research Report (0.40)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(2 more...)

#artificialintelligence Nov-20-2022, 04:12:16 GMT

Diabetes Prediction using Machine Learning, Java, and GridDB

This article covers diabetes, a health care concern that shapes the lifestyle of many people worldwide, and the use of machine learning models to create a predictive system. The model uses a random forest to predict whether patients have diabetes. The article outlines the requirements needed to set up our database, GridDB. Following that, we briefly describe our dataset and model.

database, dataset, diabetes prediction, (12 more...)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.91)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.50)

#artificialintelligence Nov-19-2022, 15:35:40 GMT

A Gentle Introduction to Random Forests, Ensembles, and Performance Metrics in a Commercial System

This is the first in a series of posts that illustrate what our data team is up to, experimenting with, and building 'under the hood' at CitizenNet. He has been involved in web-scale machine learning and information retrieval for over 10 years. One of the first posts we published spoke at a high level of the technical problem CitizenNet is trying to solve. In essence, we are trying to predict what combinations of demographic and interest targets will be interested in some piece of content. On the CitizenNet platform, a user would create a project that would define (broadly) the target audience, the pieces of Facebook content they are looking to promote, and other campaign and financial information. Behind the scenes, a robust prediction system builds the targets for the project.

classifier, high ctr, precision, (15 more...)

Genre:

Research Report > New Finding (0.31)
Research Report > Experimental Study (0.30)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.48)

Mutahar, Gayda, Miller, Tim

Concept-based Explanations using Non-negative Concept Activation Vectors and Decision Tree for CNN Models

arXiv.org Artificial Intelligence Nov-19-2022

This paper evaluates whether training a decision tree based on concepts extracted from a concept-based explainer can increase interpretability for Convolutional Neural Network (CNN) models and boost the fidelity and performance of the explainer used. CNNs for computer vision have shown exceptional performance in critical industries. However, their complexity and lack of interpretability are a significant barrier to deploying them. Recent studies to explain computer vision models have shifted from extracting low-level features (pixel-based explanations) to mid- or high-level features (concept-based explanations). The current research direction tends to use extracted features in developing approximation algorithms such as linear or decision tree models to interpret an original model. In this work, we modify one of the state-of-the-art concept-based explanations and propose an alternative framework named TreeICE. We design a systematic evaluation based on the requirements of fidelity (approximate models to original model's labels), performance (approximate models to ground-truth labels), and interpretability (meaningfulness of approximate models to humans). We conduct computational evaluation (for fidelity and performance) and human subject experiments (for interpretability). We find that TreeICE outperforms the baseline in interpretability and generates more human-readable explanations in the form of a semantic tree structure. This work highlights how important it is to have more understandable explanations when interpretability is crucial.

explanation, machine learning, natural language, (19 more...)

2211.10807

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
(19 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Promising Solution (0.88)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

#artificialintelligence Nov-18-2022, 07:35:17 GMT

Decision Trees (the upside-down trees)

Taking SpongeBob SquarePants' mood as an example, based on historical data, there are two factors that affect whether he is happy or upset. The whole mood table of Mr. SquarePants can be visualized as a scatter plot. Decision trees are used in classification problems to find one or more lines that separate the data as perfectly as possible. The separation process is done by measuring the homogeneity or similarity of the data. To separate the data, a vertical line is drawn between happy SpongeBob and upset SpongeBob, taking into consideration one feature only: the number of jellyfish that were hunted. A vertical line at the number 10 on the x-axis can work as a separator, so if the number of jellyfish hunted is less than 10, SpongeBob is upset; otherwise, he is happy.
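The single-feature split described above can be sketched as follows. The excerpt does not name the homogeneity measure, so Gini impurity is assumed here, and the jellyfish counts and moods are made-up illustration data, not the article's table.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of labels; 0.0 means perfectly homogeneous."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Find the threshold on one feature with the lowest weighted impurity."""
    best_t, best_score = None, float("inf")
    xs = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x < t]
        right = [y for x, y in zip(values, labels) if x >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Hypothetical data: jellyfish hunted per day and the resulting mood.
jellyfish = [2, 4, 7, 9, 11, 13, 15, 18]
mood = ["upset"] * 4 + ["happy"] * 4

threshold, impurity = best_threshold(jellyfish, mood)

def predict(n_jellyfish):
    """Apply the learned vertical line as a one-level decision tree."""
    return "upset" if n_jellyfish < threshold else "happy"
```

On this toy data the chosen threshold sits halfway between 9 and 11, matching the vertical line at 10 in the text. A full decision tree would repeat the same search recursively on each side, and on the second feature as well.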

classification, decision tree, upside-down tree, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.65)