AITopics

2010.1555

Country:

Asia > China > Beijing > Beijing (0.05)
Oceania > Australia > South Australia > Adelaide (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

#artificialintelligence Oct-8-2020, 21:15:37 GMT

Machine Learning Applied to Registry Data

Craniosynostosis is the premature fusion of one or more cranial sutures and often requires surgical intervention. Surgery may involve extensive osteotomies, which can lead to substantial blood loss. Currently, there are no consensus recommendations for guiding blood conservation or transfusion in this patient population. The aim of this study is to develop a machine-learning model to predict blood product transfusion requirements for individual pediatric patients undergoing craniofacial surgery. Using data from 2143 patients in the Pediatric Craniofacial Surgery Perioperative Registry, we assessed 6 machine-learning classification and regression models based on random forest, adaptive boosting (AdaBoost), neural network, gradient boosting machine (GBM), support vector machine, and elastic net methods with inputs from 22 demographic and preoperative features.

artificial intelligence, machine learning applied, registry data, (3 more...)

Genre: Research Report > Experimental Study (0.77)

Industry: Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.60)

Fauvel, Kevin, Fromont, Élisa, Masson, Véronique, Faverdin, Philippe, Termier, Alexandre

Local Cascade Ensemble for Multivariate Data Classification

arXiv.org Machine Learning Oct-8-2020

We present LCE, a Local Cascade Ensemble for traditional (tabular) multivariate data classification, and its extension LCEM for Multivariate Time Series (MTS) classification. LCE is a new hybrid ensemble method that combines an explicit boosting-bagging approach to handle the bias-variance tradeoff faced by machine learning models and an implicit divide-and-conquer approach to individualize classifier errors on different parts of the training data.

There are three main reasons that justify the use of ensembles over single classifiers [Dietterich, 2000]: statistical (reduce the risk of choosing the wrong classifier by averaging when the amount of training data available is too small compared to the size of the hypothesis space), computational (local search from many different starting points may provide a better approximation to the true unknown function than any of the individual classifiers), and representational (expansion of the space of representable functions).

artificial intelligence, decision tree learning, machine learning, (19 more...)

2005.03645

Country:

Europe > Portugal > Coimbra > Coimbra (0.04)
Europe > France (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Donick, Delilah, Lera, Sandro Claudio

Uncovering Feature Interdependencies in Complex Systems with Non-Greedy Random Forests

arXiv.org Machine Learning Oct-5-2020

A "non-greedy" variation of the random forest algorithm is presented to better uncover feature interdependencies inherent in complex systems. Conventionally, random forests are built from "greedy" decision trees, each of which considers only one split at a time during its construction. In contrast, the decision trees included in this random forest algorithm each consider three split nodes simultaneously in tiers of depth two. It is demonstrated on synthetic data and bitcoin price time series that the non-greedy version significantly outperforms the greedy one if certain non-linear relationships between feature pairs are present. In particular, both a greedy and a non-greedy random forest are trained to predict the signs of daily bitcoin returns and are used to backtest a long-short trading strategy. The better performance of the non-greedy algorithm is explained by the presence of "XOR-like" relationships between long-term and short-term technical indicators. When no such relationships exist, performance is similar. Given its enhanced ability to capture the feature interdependencies present in complex systems, this non-greedy extension should become a standard method in the toolkit of data scientists.
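The point about "XOR-like" relationships in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: it only compares the Gini impurity reduction of a single greedy split against a jointly scored tier of three split nodes (one root, two children) on a four-point XOR dataset. The function names `greedy_gain` and `two_tier_gain` are hypothetical and chosen for illustration.

```python
from itertools import product

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def weighted_gini(groups):
    """Size-weighted average impurity over a partition of the data."""
    total = sum(len(g) for g in groups)
    return sum(len(g) / total * gini(g) for g in groups)

# XOR data: the label is 1 exactly when the two binary features differ.
data = [((a, b), a ^ b) for a, b in product([0, 1], repeat=2)]
labels = [y for _, y in data]

def greedy_gain(feature):
    """Impurity reduction from one split on a single feature."""
    left = [y for x, y in data if x[feature] == 0]
    right = [y for x, y in data if x[feature] == 1]
    return gini(labels) - weighted_gini([left, right])

def two_tier_gain(root, child):
    """Impurity reduction when the root split and both child splits are scored together."""
    leaves = {}
    for x, y in data:
        leaves.setdefault((x[root], x[child]), []).append(y)
    return gini(labels) - weighted_gini(list(leaves.values()))

gain_f0 = greedy_gain(0)           # no single feature reduces impurity
gain_f1 = greedy_gain(1)
gain_joint = two_tier_gain(0, 1)   # the depth-two tier separates the classes
```

On this toy data each single-feature split leaves the impurity unchanged, so a greedy builder has no reason to prefer either feature, while the jointly scored tier removes all impurity.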

classification, node, prediction, (16 more...)

2009.14572

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.82)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

#artificialintelligence Oct-3-2020, 11:50:57 GMT

Random Forest Algorithm in Machine Learning

Random forests, or random decision forests, are an ensemble learning method for classification, regression and other tasks. They operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the individual trees' predictions (classification) or the mean of their predictions (regression). Random forest is a supervised learning algorithm. The "forest" it builds is an ensemble of decision trees, usually trained with the "bagging" method. The general idea of bagging is that combining several learning models improves the overall result.
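The bagging idea described above can be sketched in a few lines. This is a simplified illustration, not a full random forest: it uses one-dimensional decision stumps instead of full trees and omits the per-split feature subsampling that random forests add on top of bagging. The names `fit_stump`, `fit_bagged_stumps` and `predict` are hypothetical.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Pick the threshold t (among the sample's x values) that minimizes
    the training errors of the rule 'predict 1 when x >= t'."""
    best_t, best_err = None, None
    for t in sorted({x for x, _ in sample}):
        err = sum((x >= t) != bool(y) for x, y in sample)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def fit_bagged_stumps(data, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]
        stumps.append(fit_stump(sample))
    return stumps

def predict(stumps, x):
    """Classification: return the mode of the individual stump votes."""
    votes = Counter(int(x >= t) for t in stumps)
    return votes.most_common(1)[0][0]

# Toy data: the label is 1 when x is at least 10.
data = [(x, int(x >= 10)) for x in range(20)]
forest = fit_bagged_stumps(data)
low_prediction = predict(forest, 0)
high_prediction = predict(forest, 19)
```

Each stump sees a slightly different resampling of the data, so individual thresholds vary, and the majority vote smooths out that variation.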

artificial intelligence, machine learning, random forest algorithm, (8 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

arXiv.org Machine Learning Oct-2-2020

Attention augmented differentiable forest for tabular data

Chen, Yingshi

A differentiable forest is an ensemble of decision trees with full differentiability. Its simple tree structure is easy to use and explain. Because it is fully differentiable, it can be trained end to end with gradient-based optimization. In this paper, we propose the tree attention block (TAB) in the framework of differentiable forests. TAB has two operations, squeeze and regulate. The squeeze operation extracts the characteristic of each tree. The regulate operation learns nonlinear relations between these trees. TAB thus learns the importance of each tree and adjusts its weight to improve accuracy. Our experiments on large tabular datasets show that the attention-augmented differentiable forest achieves accuracy comparable to gradient boosted decision trees (GBDT), the state-of-the-art algorithm for tabular datasets. On some datasets, our model has higher accuracy than the best GBDT libraries (LightGBM, CatBoost, and XGBoost). The differentiable forest model supports batch training, and the batch size is much smaller than the size of the training set, so on larger datasets its memory usage is much lower than that of a GBDT model. The source code is available at https://github.com/closest-git/QuantumForest.

artificial intelligence, decision tree learning, machine learning, (14 more...)

2010.02921

Country: Asia > China > Fujian Province > Xiamen (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

#artificialintelligence Sep-29-2020, 15:01:23 GMT

XGBoost vs LightGBM on a High Dimensional Dataset

I recently completed a multi-class classification problem given as a take-home assignment for a data scientist position. It was a good opportunity to compare two state-of-the-art implementations of gradient-boosted decision trees, XGBoost and LightGBM. Both are consistently among the best-performing machine learning models. The dataset contains over 60 thousand observations and 103 numerical features. The target variable contains 9 different classes.

artificial intelligence, lightgbm, machine learning, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)

arXiv.org Machine Learning Sep-29-2020

Selective Cascade of Residual ExtraTrees

Liu, Qimin, Liu, Fang

We propose a novel tree-based ensemble method named Selective Cascade of Residual ExtraTrees (SCORE). SCORE draws inspiration from representation learning, incorporates regularized regression with variable selection features, and utilizes boosting to improve prediction and reduce generalization errors. We also develop a variable importance measure to increase the explainability of SCORE. Our computer experiments show that SCORE provides comparable or superior performance in prediction against ExtraTrees, random forest, gradient boosting machine, and neural networks, and that the proposed variable importance measure for SCORE is comparable to studied benchmark methods. Finally, the predictive performance of SCORE remains stable across hyperparameter values, suggesting potential robustness to hyperparameter specification.

artificial intelligence, extratree, machine learning, (16 more...)

2009.14138

Country:

North America > United States > Tennessee > Davidson County > Nashville (0.04)
North America > United States > New York (0.04)
North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.87)

#artificialintelligence Sep-28-2020, 12:40:18 GMT

CHIRPS: Explaining random forest classification

Modern machine learning methods typically produce "black box" models that are opaque to interpretation. Yet demand for them has been increasing in Human-in-the-Loop processes, that is, processes that require a human agent to verify, approve or reason about the automated decisions before they can be applied. To facilitate this interpretation, we propose Collection of High Importance Random Path Snippets (CHIRPS), a novel algorithm for explaining random forest classification per data instance. CHIRPS extracts a decision path from each tree in the forest that contributes to the majority classification, and then uses frequent pattern mining to identify the most commonly occurring split conditions. Then a simple, conjunctive form rule is constructed where the antecedent terms are derived from the attributes that had the most influence on the classification.
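The pipeline described in the abstract above (collect paths from the trees that voted with the majority, find the most common split conditions, build a conjunctive rule) can be sketched roughly as follows. This is a heavily simplified illustration and not the CHIRPS algorithm itself: it counts individual conditions instead of mining frequent patterns, and the path representation, the `explain_instance` function and the example conditions are all hypothetical.

```python
from collections import Counter

def explain_instance(paths, top_k=2):
    """paths: one (predicted_class, [split condition, ...]) pair per tree,
    describing the decision path a single instance followed in that tree."""
    # 1. Find the forest's majority classification for this instance.
    majority = Counter(cls for cls, _ in paths).most_common(1)[0][0]

    # 2. Count split conditions, but only over trees that voted with the majority.
    counts = Counter()
    for cls, conditions in paths:
        if cls == majority:
            counts.update(set(conditions))

    # 3. Keep the most frequent conditions as the rule's antecedent
    #    (ties broken alphabetically so the result is deterministic).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    antecedent = [cond for cond, _ in ranked[:top_k]]
    return majority, antecedent

# Made-up paths for one instance through a four-tree forest.
paths = [
    ("approve", ["income > 50k", "age > 30"]),
    ("approve", ["income > 50k", "debt <= 10k"]),
    ("approve", ["debt <= 10k", "income > 50k"]),
    ("reject", ["age <= 30", "income > 50k"]),
]
label, rule = explain_instance(paths)
rule_text = "IF " + " AND ".join(rule) + " THEN " + label
# rule_text == "IF income > 50k AND debt <= 10k THEN approve"
```

The dissenting tree's path is ignored, which mirrors the abstract's restriction to trees that contribute to the majority classification.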

classification, machine learning, pattern recognition, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.62)

Guillame-Bert, Mathieu, Bruch, Sebastian, Mitrichev, Petr, Mikheev, Petr, Pfeifer, Jan

Modeling Text with Decision Forests using Categorical-Set Splits

arXiv.org Machine Learning Sep-28-2020

Decision forest algorithms model data by learning a binary tree structure recursively where every node splits the feature space into two regions, sending examples into the left or right branches. This "decision" is the result of the evaluation of a condition. For example, a node may split input data by applying a threshold to a numerical feature value. Such decisions are learned using (often greedy) algorithms that attempt to optimize a local loss function. Crucially, whether an algorithm exists to find and evaluate splits for a feature type (e.g., text) determines whether a decision forest algorithm can model that feature type at all. In this work, we set out to devise such an algorithm for textual features, thereby equipping decision forests with the ability to directly model text without the need for feature transformation. Our algorithm is efficient during training and the resulting splits are fast to evaluate with our extension of the QuickScorer inference algorithm. Experiments on benchmark text classification datasets demonstrate the utility and effectiveness of our proposal.
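The notion of a node "decision" as the evaluation of a condition can be sketched as below. The numerical threshold condition follows the example in the abstract. The set-valued condition is an assumption made for illustration (it tests whether an example's token set intersects a fixed mask set) and should not be read as the paper's exact split definition. All class and function names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class Leaf:
    label: str

@dataclass
class Node:
    condition: Callable[[dict], bool]
    left: Union["Node", Leaf]    # branch taken when the condition is False
    right: Union["Node", Leaf]   # branch taken when the condition is True

def threshold(feature, value):
    """Condition on a numerical feature: example[feature] >= value."""
    return lambda example: example[feature] >= value

def set_intersects(feature, mask):
    """Condition on a set-valued feature (assumed form):
    true when the example's set shares at least one item with the mask."""
    mask = frozenset(mask)
    return lambda example: not mask.isdisjoint(example[feature])

def predict(node, example):
    """Route an example down the binary tree until a leaf is reached."""
    while isinstance(node, Node):
        node = node.right if node.condition(example) else node.left
    return node.label

tree = Node(
    condition=set_intersects("tokens", {"refund", "broken"}),
    right=Leaf("complaint"),
    left=Node(
        condition=threshold("length", 50),
        left=Leaf("short note"),
        right=Leaf("long note"),
    ),
)

first = predict(tree, {"tokens": {"my", "item", "broken"}, "length": 3})
second = predict(tree, {"tokens": {"hello"}, "length": 80})
third = predict(tree, {"tokens": {"hello"}, "length": 10})
```

Because the tree only needs each condition to return true or false, supporting a new feature type amounts to supplying a condition for it and a way to learn that condition during training, which is the part this sketch leaves out.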

artificial intelligence, machine learning, natural language, (16 more...)

2009.09991

Country:

North America > United States (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.91)