AITopics | regression forest

regression forest

Assessing Surrogate Heterogeneity in Real World Data Using Meta-Learners

Knowlton, Rebecca, Parast, Layla

arXiv.org Machine Learning, Apr-21-2025

Surrogate markers are most commonly studied within the context of randomized clinical trials. However, the need for alternative outcomes extends beyond these settings and may be more pronounced in real-world public health and social science research, where randomized trials are often impractical. Research on identifying surrogates in real-world non-randomized data is scarce, as available statistical approaches for evaluating surrogate markers tend to rely on the assumption that treatment is randomized. While the few methods that allow for non-randomized treatment/exposure appropriately handle confounding individual characteristics, they do not offer a way to examine surrogate heterogeneity with respect to patient characteristics. In this paper, we propose a framework to assess surrogate heterogeneity in real-world, i.e., non-randomized, data and implement this framework using various meta-learners. Our approach allows us to quantify heterogeneity in surrogate strength with respect to patient characteristics while accommodating confounders through the use of flexible, off-the-shelf machine learning methods. In addition, we use our framework to identify individuals for whom the surrogate is a valid replacement of the primary outcome. We examine the performance of our methods via a simulation study and application to examine heterogeneity in the surrogacy of hemoglobin A1c as a surrogate for fasting plasma glucose.

artificial intelligence, learner, machine learning, (18 more...)

arXiv:2504.15386

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Public Health (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Uncertainty estimation in satellite precipitation spatial prediction by combining distributional regression algorithms

Papacharalampous, Georgia, Tyralis, Hristos, Doulamis, Nikolaos, Doulamis, Anastasios

arXiv.org Machine Learning, Jun-29-2024

To facilitate effective decision-making, gridded satellite precipitation products should include uncertainty estimates. Machine learning has been proposed for issuing such estimates. However, most existing algorithms for this purpose rely on quantile regression. Distributional regression offers distinct advantages over quantile regression, including the ability to model intermittency as well as a stronger ability to extrapolate beyond the training data, which is critical for predicting extreme precipitation. In this work, we introduce the concept of distributional regression for the engineering task of creating precipitation datasets through data merging. Building upon this concept, we propose new ensemble learning methods that can be valuable not only for spatial prediction but also for prediction problems in general. These methods exploit conditional zero-adjusted probability distributions estimated with generalized additive models for location, scale, and shape (GAMLSS), spline-based GAMLSS and distributional regression forests as well as their ensembles (stacking based on quantile regression, and equal-weight averaging). To identify the most effective methods for our specific problem, we compared them to benchmarks using a large, multi-source precipitation dataset. Stacking emerged as the most successful strategy. Three specific stacking methods achieved the best performance based on the quantile scoring rule, although the ranking of these methods varied across quantile levels. This suggests that a task-specific combination of multiple algorithms could yield significant benefits.
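The quantile scoring rule and the equal-weight averaging mentioned in this abstract can be illustrated with a minimal sketch. The function names and the toy numbers below are assumptions made for illustration; the paper's stacking approach (based on quantile regression) is not reproduced here.

```python
def pinball_loss(y_true, q_pred, tau):
    """Quantile (pinball) score of one predicted quantile at level tau."""
    diff = y_true - q_pred
    # Under-prediction is penalised by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1) * diff


def mean_pinball_loss(ys, q_preds, tau):
    """Average quantile score over a set of observations."""
    return sum(pinball_loss(y, q, tau) for y, q in zip(ys, q_preds)) / len(ys)


def equal_weight_average(*model_preds):
    """Combine the quantile predictions of several models by simple averaging."""
    return [sum(vals) / len(vals) for vals in zip(*model_preds)]


# Hypothetical observations and 0.9-quantile predictions from two base models.
observed = [0.0, 2.0, 10.0]
model_a = [1.0, 3.0, 6.0]
model_b = [0.0, 5.0, 12.0]

combined = equal_weight_average(model_a, model_b)
for name, preds in [("model_a", model_a), ("model_b", model_b), ("combined", combined)]:
    print(name, round(mean_pinball_loss(observed, preds, tau=0.9), 4))
```

Lower scores are better, and the score is computed separately for each quantile level, which is why a ranking of methods can differ from one level to the next.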

algorithm, prediction, quantile, (13 more...)

arXiv:2407.01623

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Austria > Vienna (0.14)
(6 more...)

Genre: Research Report (0.40)

Industry:

Energy > Renewable (0.69)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.52)

Stacking for Probabilistic Short-term Load Forecasting

Dudek, Grzegorz

arXiv.org Artificial Intelligence, Jun-15-2024

In this study, we delve into the realm of meta-learning to combine point base forecasts for probabilistic short-term electricity demand forecasting. Our approach encompasses the utilization of quantile linear regression, quantile regression forest, and post-processing techniques involving residual simulation to generate quantile forecasts. Furthermore, we introduce both global and local variants of meta-learning. In the local-learning mode, the meta-model is trained using patterns most similar to the query pattern. Through extensive experimental studies across 35 forecasting scenarios and employing 16 base forecasting models, our findings underscored the superiority of quantile regression forest over its competitors.
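The local-learning mode can be pictured as a nearest-neighbour selection step that runs before the meta-model is fitted. The sketch below is only an illustration: the Euclidean distance, the function name select_local_patterns, and the toy base forecasts are assumptions, not details taken from the paper.

```python
import math


def select_local_patterns(query, patterns, k):
    """Return the indices of the k training patterns closest to the query.

    Euclidean distance is an assumption made for this sketch.
    """
    order = sorted(range(len(patterns)), key=lambda i: math.dist(query, patterns[i]))
    return order[:k]


# Hypothetical training patterns: each row holds point forecasts from two base models.
train_patterns = [[100.0, 102.0], [150.0, 149.0], [101.0, 99.0], [200.0, 205.0]]
query_pattern = [100.0, 100.0]

# The meta-model would then be trained only on the selected rows.
local_idx = select_local_patterns(query_pattern, train_patterns, k=2)
print(local_idx)
```

A global variant would skip the selection step and train the meta-model on every available pattern.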

forecast, forecasting, probabilistic forecast, (15 more...)

arXiv:2406.10718

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Poland (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Efficient Normalized Conformal Prediction and Uncertainty Quantification for Anti-Cancer Drug Sensitivity Prediction with Deep Regression Forests

Nolte, Daniel, Ghosh, Souparno, Pal, Ranadip

arXiv.org Machine Learning, Feb-21-2024

Deep learning models are being adopted and applied to various critical decision-making tasks, yet they are trained to provide point predictions without providing degrees of confidence. The trustworthiness of deep learning models can be increased if paired with uncertainty estimations. Conformal Prediction has emerged as a promising method to pair machine learning models with prediction intervals, allowing for a view of the model's uncertainty. However, popular uncertainty estimation methods for conformal prediction fail to provide heteroskedastic intervals that are equally accurate for all samples. In this paper, we propose a method to estimate the uncertainty of each sample by calculating the variance obtained from a Deep Regression Forest. We show that the deep regression forest variance improves the efficiency and coverage of normalized inductive conformal prediction on a drug response prediction task.
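Normalized inductive conformal prediction can be sketched in a few lines. This is a generic textbook-style version, not the authors' implementation: the per-sample uncertainty values (sigma) are hypothetical placeholders for whatever estimate is available (the paper uses the variance from a Deep Regression Forest), and the function names are assumptions.

```python
import math


def normalized_icp_quantile(y_cal, pred_cal, sigma_cal, epsilon):
    """Calibrate the scaling factor for normalized inductive conformal prediction.

    Nonconformity score: |y - prediction| / sigma, computed on a held-out
    calibration set. Returns the score at rank ceil((n + 1) * (1 - epsilon)).
    """
    scores = sorted(abs(y - p) / s for y, p, s in zip(y_cal, pred_cal, sigma_cal))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - epsilon))
    if k > n:
        return math.inf  # too few calibration points for this error level
    return scores[k - 1]


def prediction_interval(pred, sigma, q):
    """Interval whose width scales with the per-sample uncertainty sigma."""
    return pred - q * sigma, pred + q * sigma


# Hypothetical calibration data: targets, point predictions, uncertainty estimates.
y_cal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
pred_cal = [1.5, 2.0, 2.0, 5.0, 5.5, 4.0, 7.5]
sigma_cal = [0.5, 1.0, 2.0, 0.5, 1.0, 1.0, 2.0]

q = normalized_icp_quantile(y_cal, pred_cal, sigma_cal, epsilon=0.25)
confident = prediction_interval(10.0, 0.5, q)   # low estimated uncertainty
uncertain = prediction_interval(10.0, 1.5, q)   # high estimated uncertainty
print(q, confident, uncertain)
```

Because the interval half-width is q times sigma, samples with a larger uncertainty estimate receive wider intervals, which is what makes the intervals heteroskedastic.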

icp, prediction, variance, (13 more...)

arXiv:2402.1408

Country:

North America > United States > Texas (0.04)
North America > United States > Nebraska > Lancaster County > Lincoln (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Example-Based Explanations of Random Forest Predictions

Boström, Henrik

arXiv.org Artificial Intelligence, Nov-24-2023

A random forest prediction can be computed by the scalar product of the labels of the training examples and a set of weights that are determined by the leaves of the forest into which the test object falls; each prediction can hence be explained exactly by the set of training examples for which the weights are non-zero. The number of examples used in such explanations is shown to vary with the dimensionality of the training set and hyperparameters of the random forest algorithm. This means that the number of examples involved in each prediction can to some extent be controlled by varying these parameters. However, for settings that lead to a required predictive performance, the number of examples involved in each prediction may be unreasonably large, preventing the user from grasping the explanations. In order to provide more useful explanations, a modified prediction procedure is proposed, which includes only the top-weighted examples. An investigation on regression and classification tasks shows that the number of examples used in each explanation can be substantially reduced while maintaining, or even improving, predictive performance compared to the standard prediction procedure.
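The weight view of a regression forest prediction can be made concrete with a small sketch. It assumes that every tree is trained on the full training set (no bootstrap sampling) and that leaf memberships are already known; the leaf assignments, the function names, and the renormalisation used in the top-k variant are illustrative assumptions rather than the paper's exact procedure.

```python
def forest_weights(train_leaves, test_leaves):
    """Weight of each training example in the prediction for one test object.

    train_leaves[t][i] is the leaf of training example i in tree t;
    test_leaves[t] is the leaf the test object falls into in tree t.
    """
    n_trees = len(train_leaves)
    weights = [0.0] * len(train_leaves[0])
    for leaves, test_leaf in zip(train_leaves, test_leaves):
        members = [i for i, leaf in enumerate(leaves) if leaf == test_leaf]
        for i in members:
            # Each tree spreads a total weight of 1 / n_trees over its leaf members.
            weights[i] += 1.0 / (n_trees * len(members))
    return weights


def predict(weights, labels):
    """Prediction as the scalar product of weights and training labels."""
    return sum(w * y for w, y in zip(weights, labels))


def top_k_prediction(weights, labels, k):
    """Predict from the k highest-weighted examples only (weights renormalised)."""
    top = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]
    total = sum(weights[i] for i in top)
    return top, sum(weights[i] * labels[i] for i in top) / total


# Hypothetical forest with two trees and four training examples.
labels = [1.0, 2.0, 3.0, 4.0]
train_leaves = [[0, 0, 1, 1], [0, 1, 1, 1]]
test_leaves = [0, 1]

weights = forest_weights(train_leaves, test_leaves)
full_prediction = predict(weights, labels)
explanation, reduced_prediction = top_k_prediction(weights, labels, k=2)
print(weights, full_prediction, explanation, reduced_prediction)
```

The full prediction equals the average of the per-tree leaf means, and the examples with non-zero weight are exactly those that share a leaf with the test object in at least one tree.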

prediction, predictive performance, training example, (12 more...)

arXiv:2311.14581

Country: Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.85)

Statistical post-processing of wind speed forecasts using convolutional neural networks

Veldkamp, Simon, Whan, Kirien, Dirksen, Sjoerd, Schmeits, Maurice

arXiv.org Machine Learning, Jul-8-2020

Current statistical post-processing methods for probabilistic weather forecasting are not capable of using full spatial patterns from the numerical weather prediction (NWP) model. In this paper we incorporate spatial wind speed information by using convolutional neural networks (CNNs) and obtain probabilistic wind speed forecasts in the Netherlands for 48 hours ahead, based on KNMI's Harmonie-Arome NWP model. The CNNs are shown to have higher Brier skill scores for medium to higher wind speeds, as well as a better continuous ranked probability score (CRPS), than fully connected neural networks and quantile regression forests.
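For readers unfamiliar with the Brier skill score, a minimal sketch follows. It uses the standard definition (one minus the ratio of the forecast's Brier score to that of a reference forecast); the threshold event, the probabilities, and the constant reference forecast are hypothetical values chosen for illustration.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, ref_probs, outcomes):
    """Skill relative to a reference forecast: 1 is perfect, 0 matches the reference."""
    return 1.0 - brier_score(probs, outcomes) / brier_score(ref_probs, outcomes)


# Hypothetical event: wind speed exceeds a chosen threshold (1) or not (0).
outcomes = [1, 0, 0, 1]
forecast = [0.8, 0.2, 0.1, 0.6]
reference = [0.5, 0.5, 0.5, 0.5]  # assumed constant baseline forecast

bs = brier_score(forecast, outcomes)
bss = brier_skill_score(forecast, reference, outcomes)
print(bs, bss)
```

A positive score means the forecast beats the reference for that threshold; scores are usually reported separately for each wind speed threshold.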

artificial intelligence, machine learning, neural network, (18 more...)

arXiv:2007.04005

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California (0.04)
Europe > Netherlands > South Holland > Rotterdam (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Sufficient Representations for Categorical Variables

Johannemann, Jonathan, Hadad, Vitor, Athey, Susan, Wager, Stefan

arXiv.org Machine Learning, Aug-26-2019

Many learning algorithms require categorical data to be transformed into real vectors before it can be used as input. Often, categorical variables are encoded as one-hot (or dummy) vectors. However, this mode of representation can be wasteful since it adds many low-signal regressors, especially when the number of unique categories is large. In this paper, we investigate simple alternative solutions for universally consistent estimators that rely on lower-dimensional real-valued representations of categorical variables that are "sufficient" in the sense that no predictive information is lost. We then compare preexisting and proposed methods on simulated and observational datasets.
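The dimensionality argument can be illustrated with a small sketch that contrasts one-hot vectors with a lower-dimensional real-valued encoding. The group-mean encoding shown here is a simple stand-in chosen for illustration; it is an assumption and not necessarily one of the representations studied in the paper, nor is it guaranteed to be "sufficient" in the paper's sense.

```python
from collections import defaultdict


def one_hot(categories):
    """One indicator column per distinct category."""
    levels = sorted(set(categories))
    return [[1.0 if c == level else 0.0 for level in levels] for c in categories]


def group_mean_encoding(categories, covariates):
    """Replace each category with the mean covariate vector of its group."""
    groups = defaultdict(list)
    for c, x in zip(categories, covariates):
        groups[c].append(x)
    means = {c: [sum(col) / len(col) for col in zip(*rows)] for c, rows in groups.items()}
    return [means[c] for c in categories]


# Hypothetical data: a categorical variable and two numeric covariates.
categories = ["a", "b", "a", "c"]
covariates = [[1.0, 10.0], [5.0, 50.0], [3.0, 30.0], [4.0, 40.0]]

dummy = one_hot(categories)
encoded = group_mean_encoding(categories, covariates)
print(len(dummy[0]), len(encoded[0]))
```

The one-hot width grows with the number of distinct categories, while the width of the group-mean encoding stays fixed at the number of covariates.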

artificial intelligence, machine learning, natural language, (16 more...)

arXiv:1908.09874

Country: North America > United States (0.93)

Genre: Research Report (0.65)

Industry: Education (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
(3 more...)

XGBoostLSS -- An extension of XGBoost to probabilistic forecasting

März, Alexander

arXiv.org Artificial Intelligence, Jul-11-2019

We propose a new framework of XGBoost that predicts the entire conditional distribution of a univariate response variable. In particular, XGBoostLSS models all moments of a parametric distribution, i.e., mean, location, scale and shape (LSS), instead of the conditional mean only. Choosing from a wide range of continuous, discrete and mixed discrete-continuous distributions, modelling and predicting the entire conditional distribution greatly enhances the flexibility of XGBoost, as it allows one to gain additional insight into the data generating process, as well as to create probabilistic forecasts from which prediction intervals and quantiles of interest can be derived. We present both a simulation study and real-world examples that demonstrate the benefits of our approach.
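Once location and scale parameters have been predicted for an observation, prediction intervals and quantiles follow directly from the fitted distribution. The sketch below assumes a Gaussian response and uses only the Python standard library; it does not call the XGBoostLSS package, and the predicted parameter values are hypothetical.

```python
from statistics import NormalDist


def prediction_interval(loc, scale, coverage=0.9):
    """Central prediction interval of a Gaussian with the given location and scale."""
    dist = NormalDist(mu=loc, sigma=scale)
    lower_level = (1 - coverage) / 2
    return dist.inv_cdf(lower_level), dist.inv_cdf(1 - lower_level)


# Hypothetical per-observation predictions of (location, scale).
predicted_params = [(10.0, 2.0), (10.0, 0.5)]

intervals = [prediction_interval(loc, scale, coverage=0.95) for loc, scale in predicted_params]
for (loc, scale), (low, high) in zip(predicted_params, intervals):
    print(f"loc={loc}, scale={scale}: [{low:.2f}, {high:.2f}]")
```

Two observations with the same predicted mean can therefore receive very different intervals, which a model of the conditional mean alone cannot express.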

artificial intelligence, machine learning, xgboostlss, (17 more...)

arXiv:1907.03178

Country: Europe > Germany (0.29)

Genre: Research Report > Experimental Study (0.46)

Quantitative Error Prediction of Medical Image Registration using Regression Forests

Sokooti, Hessam, Saygili, Gorkem, Glocker, Ben, Lelieveldt, Boudewijn P. F., Staring, Marius

arXiv.org Machine Learning, May-18-2019

Predicting registration error can be useful for evaluation of registration procedures, which is important for the adoption of registration techniques in the clinic. In addition, quantitative error prediction can be helpful in improving the registration quality. The task of predicting registration error is demanding due to the lack of a ground truth in medical images. This paper proposes a new automatic method to predict the registration error in a quantitative manner, and is applied to chest CT scans. A random regression forest is utilized to predict the registration error locally. The forest is built with features related to the transformation model and features related to the dissimilarity after registration. The forest is trained and tested using manually annotated corresponding points between pairs of chest CT scans in two experiments: SPREAD (trained and tested on SPREAD) and inter-database (including three databases SPREAD, DIR-Lab-4DCT and DIR-Lab-COPDgene). The results show that the mean absolute errors of regression are 1.07 ± 1.86 and 1.76 ± 2.59 mm for the SPREAD and inter-database experiment, respectively. The overall accuracy of classification in three classes (correct, poor and wrong registration) is 90.7% and 75.4%, for SPREAD and inter-database respectively. The good performance of the proposed method enables important applications such as automatic quality control in large-scale image analysis.

machine learning, pattern recognition, registration, (18 more...)

doi: 10.1016/j.media.2019.05.005

arXiv:1905.07624

Country: Europe > Netherlands (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.43)

Deep Distribution Regression

Li, Rui, Bondell, Howard D., Reich, Brian J.

arXiv.org Machine Learning, Mar-14-2019

In recent years, a variety of machine learning methods, such as random forest, gradient boosting trees and neural networks, have gained popularity and been widely adopted. These methods are often flexible enough to uncover complex relationships in high-dimensional data without strong assumptions on the underlying data structure. Off-the-shelf software is available to put these algorithms into production [Pedregosa et al. (2011), Abadi et al. (2016) and Paszke et al. (2017)]. However, in regression and forecasting tasks, many of the machine learning methods only provide a point estimate, without any additional information regarding the uncertainty of the target quantity. Understanding uncertainty is often crucial in fields such as financial markets and risk analysis [Diebold et al. (1997), Timmermann (2000)], population and demographic studies [Wilson and Bell (2007)], transportation and traffic analysis [Zhu and Laptev (2017), Rodrigues and Pereira (2018)] and energy forecasting [Hong et al. (2016)].

artificial intelligence, estimator, machine learning, (16 more...)

arXiv:1903.06023

Genre: Research Report > New Finding (0.47)

Industry:

Energy > Power Industry (1.00)
Energy > Renewable > Solar (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)