Obster, Fabian
Introducing sgboost: A Practical Guide and Implementation of sparse-group boosting in R
Obster, Fabian, Heumann, Christian
This paper introduces the sgboost package in R, which implements sparse-group boosting for modeling high-dimensional data with natural groupings in covariates. Sparse-group boosting offers a flexible approach for both group and individual variable selection, reducing overfitting and enhancing model interpretability. The package uses regularization techniques based on the degrees of freedom of individual and group base-learners, and is designed to be used in conjunction with the mboost package. Through comparisons with existing methods and demonstration of its unique functionalities, this paper provides a practical guide on utilizing sparse-group boosting in R, accompanied by code examples to facilitate its application in various research domains. Overall, this paper serves as a valuable resource for researchers and practitioners seeking to use sparse-group boosting for efficient and interpretable high-dimensional data analysis.
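To make the idea of balancing individual and group base-learners through their degrees of freedom concrete, the following is a minimal Python sketch and not the sgboost or mboost API. It rests on several assumptions that the abstract does not state: a squared-error loss, degrees of freedom measured as the trace of the ridge hat matrix, and a mixing parameter `alpha` (with 0 < alpha < 1) that gives each individual base-learner `alpha` degrees of freedom and each group base-learner `1 - alpha`. The function names `ridge_lambda_for_df` and `sparse_group_boost` are hypothetical.

```python
import numpy as np


def ridge_lambda_for_df(Z, df, tol=1e-10):
    """Ridge penalty whose hat matrix Z (Z'Z + lam I)^-1 Z' has trace `df`."""
    d2 = np.linalg.svd(Z, compute_uv=False) ** 2  # squared singular values
    if not 0 < df < np.sum(d2 > 1e-12):
        raise ValueError("df must lie strictly between 0 and the rank of Z")
    lo, hi = 0.0, 1.0
    while np.sum(d2 / (d2 + hi)) > df:  # df shrinks as the penalty grows
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):  # bisection on the penalty
        mid = 0.5 * (lo + hi)
        if np.sum(d2 / (d2 + mid)) > df:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sparse_group_boost(X, y, groups, alpha=0.5, nu=0.1, steps=100):
    """Component-wise L2 boosting over single-variable and whole-group
    ridge base-learners. Returned coefficients refer to centred columns."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    X = X - X.mean(axis=0)  # centre columns so the offset acts as intercept
    offset = y.mean()

    # Candidate base-learners: every single column and every whole group.
    learners = [("var", [j]) for j in range(X.shape[1])]
    learners += [("group", list(np.flatnonzero(groups == g)))
                 for g in np.unique(groups)]

    fits = []
    for kind, cols in learners:
        Z = X[:, cols]
        lam = ridge_lambda_for_df(Z, alpha if kind == "var" else 1.0 - alpha)
        # Linear map from a residual vector to the ridge coefficients.
        solver = np.linalg.solve(Z.T @ Z + lam * np.eye(len(cols)), Z.T)
        fits.append((cols, Z, solver))

    coef = np.zeros(X.shape[1])
    resid = y - offset
    selected = []
    for _ in range(steps):
        best = None
        for idx, (cols, Z, solver) in enumerate(fits):
            beta = solver @ resid
            rss = np.sum((resid - Z @ beta) ** 2)
            if best is None or rss < best[0]:
                best = (rss, idx, beta)
        _, idx, beta = best
        cols, Z, _ = fits[idx]
        coef[cols] += nu * beta             # shrunken update of the winner only
        resid = resid - nu * (Z @ beta)
        selected.append(learners[idx])
    return offset, coef, selected
```

Under these assumptions, a value of `alpha` close to 1 lets individual base-learners fit the residuals more flexibly and so favours within-group sparsity, while a value close to 0 favours selecting whole groups. Predictions for the training data are `offset + (X - X.mean(axis=0)) @ coef`.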
Using interpretable boosting algorithms for modeling environmental and agricultural data
Obster, Fabian, Heumann, Christian, Bohle, Heidi, Pechan, Paul
We describe how interpretable boosting algorithms based on ridge-regularized generalized linear models can be used to analyze high-dimensional environmental data. We illustrate this by using environmental, social, human and biophysical data to predict the financial vulnerability of farmers in Chile and Tunisia to climate hazards. We show how group structures can be taken into account and how interactions can be found in high-dimensional datasets using a novel two-step boosting approach. The advantages and efficacy of the proposed method are shown and discussed. Results indicate that interaction effects improve predictive power only when they are included through two-step boosting. The most important variable in predicting all types of vulnerability is natural assets. Other important variables are the type of irrigation, economic assets and the presence of crop damage on nearby farms.
Factors other than climate change are currently more important in predicting how well fruit farms are doing financially
Obster, Fabian, Bohle, Heidi, Pechan, Paul M.
Machine learning and statistical modeling methods were used to analyze the impact of climate change on the financial wellbeing of fruit farmers in Tunisia and Chile. The analysis was based on face-to-face interviews with 801 farmers. Three research questions were investigated. First, whether climate change impacts had an effect on how well the farm was doing financially. Second, if climate change was not influential, which factors were important for predicting the financial wellbeing of the farm. Third, whether the observed effects on the financial wellbeing of the farm resulted from interactions between predictor variables. This is the first report directly comparing climate change with other factors potentially impacting the financial wellbeing of farms. Certain climate change factors, namely increases in temperature and reductions in precipitation, can regionally impact the self-perceived financial wellbeing of fruit farmers. Specifically, increases in temperature and reductions in precipitation can have a measurable negative impact on the financial wellbeing of farms in Chile. This effect is less pronounced in Tunisia. Differences in climate impact were observed within Chile but not within Tunisia. However, climate change is only of minor importance for predicting farm financial wellbeing, especially for farms that are already doing well financially. Factors that are more important, mainly in Tunisia, include trust in information sources and prior farm ownership. Other important factors include farm size, the water management systems used and the diversity of fruit crops grown. Moreover, some of the important factors identified differed between farms that were doing well financially and those that were not. Interactions between factors may improve or worsen farm financial wellbeing.