Regression
Forecast combinations: an over 50-year review
Wang, Xiaoqian, Hyndman, Rob J, Li, Feng, Kang, Yanfei
Forecast combinations have flourished remarkably in the forecasting community and, in recent years, have become part of the mainstream of forecasting research and activities. Combining multiple forecasts produced from single (target) series is now widely used to improve accuracy through the integration of information gleaned from different sources, thereby mitigating the risk of identifying a single "best" forecast. Combination schemes have evolved from simple combination methods without estimation, to sophisticated methods involving time-varying weights, nonlinear combinations, correlations among components, and cross-learning. They include combining point forecasts and combining probabilistic forecasts. This paper provides an up-to-date review of the extensive literature on forecast combinations, together with reference to available open-source software implementations. We discuss the potential and limitations of various methods and highlight how these ideas have developed over time. Some important issues concerning the utility of forecast combinations are also surveyed. Finally, we conclude with current research gaps and potential insights for future research.
Feature selection in stratification estimators of causal effects: lessons from potential outcomes, causal diagrams, and structural equations
Hahn, P. Richard, Herren, Andrew
What is the ideal regression (if any) for estimating average causal effects? We study this question in the setting of discrete covariates, deriving expressions for the finite-sample variance of various stratification estimators. This approach clarifies the fundamental statistical phenomena underlying many widely-cited results. Our exposition combines insights from three distinct methodological traditions for studying causal effect estimation: potential outcomes, causal diagrams, and structural models with additive errors.
From Weakly Supervised Learning to Active Learning
Applied mathematics and machine computations have raised a lot of hope since the recent success of supervised learning. Many practitioners in industries have been trying to switch from their old paradigms to machine learning. Interestingly, those data scientists spend more time scrapping, annotating and cleaning data than fine-tuning models. This thesis is motivated by the following question: can we derive a more generic framework than the one of supervised learning in order to learn from clutter data? This question is approached through the lens of weakly supervised learning, assuming that the bottleneck of data collection lies in annotation. We model weak supervision as giving, rather than a unique target, a set of target candidates. We argue that one should look for an ``optimistic'' function that matches most of the observations. This allows us to derive a principle to disambiguate partial labels. We also discuss the advantage to incorporate unsupervised learning techniques into our framework, in particular manifold regularization approached through diffusion techniques, for which we derived a new algorithm that scales better with input dimension then the baseline method. Finally, we switch from passive to active weakly supervised learning, introducing the ``active labeling'' framework, in which a practitioner can query weak information about chosen data. Among others, we leverage the fact that one does not need full information to access stochastic gradients and perform stochastic gradient descent.
Tensor-Based Multi-Modality Feature Selection and Regression for Alzheimer's Disease Diagnosis
Yu, Jun, Kong, Zhaoming, Zhan, Liang, Shen, Li, He, Lifang
The assessment of Alzheimer's Disease (AD) and Mild Cognitive Impairment (MCI) associated with brain changes remains a challenging task. Recent studies have demonstrated that combination of multi-modality imaging techniques can better reflect pathological characteristics and contribute to more accurate diagnosis of AD and MCI. In this paper, we propose a novel tensor-based multi-modality feature selection and regression method for diagnosis and biomarker identification of AD and MCI from normal controls. Specifically, we leverage the tensor structure to exploit high-level correlation information inherent in the multi-modality data, and investigate tensor-level sparsity in the multilinear regression model. We present the practical advantages of our method for the analysis of ADNI data using three imaging modalities (VBM- MRI, FDG-PET and AV45-PET) with clinical parameters of disease severity and cognitive scores. The experimental results demonstrate the superior performance of our proposed method against the state-of-the-art for the disease diagnosis and the identification of disease-specific regions and modality-related differences. The code for this work is publicly available at https://github.com/junfish/BIOS22.
Optimization with Constraint Learning: A Framework and Survey
Fajemisin, Adejuyigbe, Maragno, Donato, Hertog, Dick den
Many real-life optimization problems frequently contain one or more constraints or objectives for which there are no explicit formulas. If data is however available, these data can be used to learn the constraints. The benefits of this approach are clearly seen, however there is a need for this process to be carried out in a structured manner. This paper therefore provides a framework for Optimization with Constraint Learning (OCL) which we believe will help to formalize and direct the process of learning constraints from data. This framework includes the following steps: (i) setup of the conceptual optimization model, (ii) data gathering and preprocessing, (iii) selection and training of predictive models, (iv) resolution of the optimization model, and (v) verification and improvement of the optimization model. We then review the recent OCL literature in light of this framework, and highlight current trends, as well as areas for future research.
Beyond Voxel Prediction Uncertainty: Identifying brain lesions you can trust
Lambert, Benjamin, Forbes, Florence, Doyle, Senan, Tucholka, Alan, Dojat, Michel
Deep neural networks have become the gold-standard approach for the automated segmentation of 3D medical images. Their full acceptance by clinicians remains however hampered by the lack of intelligible uncertainty assessment of the provided results. Most approaches to quantify their uncertainty, such as the popular Monte Carlo dropout, restrict to some measure of uncertainty in prediction at the voxel level. In addition not to be clearly related to genuine medical uncertainty, this is not clinically satisfying as most objects of interest (e.g. brain lesions) are made of groups of voxels whose overall relevance may not simply reduce to the sum or mean of their individual uncertainties. In this work, we propose to go beyond voxel-wise assessment using an innovative Graph Neural Network approach, trained from the outputs of a Monte Carlo dropout model. This network allows the fusion of three estimators of voxel uncertainty: entropy, variance, and model's confidence; and can be applied to any lesion, regardless of its shape or size. We demonstrate the superiority of our approach for uncertainty estimate on a task of Multiple Sclerosis lesions segmentation.
Intermediate Machine Learning: Principal Component Analysis (PCA) - PythonAlgos
Welcome to the third module in our Machine Learning series. So far we've covered Linear Regression and Logistic Regression. Just to recap, Linear Regression is the simplest implementation of continuous prediction (i.e. Now let's get into something a little more complex – Principal Component Analysis (PCA) in Python. PCA is a dimensionality reduction technique.
Leak Detection in Natural Gas Pipeline Using Machine Learning Models
Leak detection in gas pipelines is an important and persistent problem in the Oil and Gas industry. This is particularly important as pipelines are the most common way of transporting natural gas. This research aims to study the ability of data-driven intelligent models to detect small leaks for a natural gas pipeline using basic operational parameters and then compare the intelligent models among themselves using existing performance metrics. This project applies the observer design technique to detect leaks in natural gas pipelines using a regressoclassification hierarchical model where an intelligent model acts as a regressor and a modified logistic regression model acts as a classifier. Five intelligent models (gradient boosting, decision trees, random forest, support vector machine and artificial neural network) are studied in this project using a pipeline data stream of four weeks. The results shows that while support vector machine and artificial neural networks are better regressors than the others, they do not provide the best results in leak detection due to their internal complexities and the volume of data used. The random forest and decision tree models are the most sensitive as they can detect a leak of 0.1% of nominal flow in about 2 hours. All the intelligent models had high reliability with zero false alarm rate in testing phase. The average time to leak detection for all the intelligent models was compared to a real time transient model in literature. The results show that intelligent models perform relatively well in the problem of leak detection. This result suggests that intelligent models could be used alongside a real time transient model to significantly improve leak detection results.
Artificial Intelligence and Innovation to Reduce the Impact of Extreme Weather Events on Sustainable Production
Effah, Derrick, Bai, Chunguang, Quayson, Matthew
Frequent occurrences of extreme weather events substantially impact the lives of the less privileged in our societies, particularly in agriculture-inclined economies. The unpredictability of extreme fires, floods, drought, cyclones, and others endangers sustainable production and life on land (SDG goal 15), which translates into food insecurity and poorer populations. Fortunately, modern technologies such as Artificial Intelligent (AI), the Internet of Things (IoT), blockchain, 3D printing, and virtual and augmented reality (VR and AR) are promising to reduce the risk and impact of extreme weather in our societies. However, research directions on how these technologies could help reduce the impact of extreme weather are unclear. This makes it challenging to emploring digital technologies within the spheres of extreme weather. In this paper, we employed the Delphi Best Worst method and Machine learning approaches to identify and assess the push factors of technology. The BWM evaluation revealed that predictive nature was AI's most important criterion and role, while the mass-market potential was the less important criterion. Based on this outcome, we tested the predictive ability of machine elarning on a publilcly available dataset to affrm the predictive rols of AI. We presented the managerial and methodological implications of the study, which are crucial for research and practice. The methodology utilized in this study could aid decision-makers in devising strategies and interventions to safeguard sustainable production. This will also facilitate allocating scarce resources and investment in improving AI techniques to reduce the adverse impacts of extreme events. Correspondingly, we put forward the limitations of this, which necessitate future research.
Interpretable Selective Learning in Credit Risk
Chen, Dangxing, Ye, Weicheng, Ye, Jiahui
The forecasting of the credit default risk has been an important research field for several decades. Traditionally, logistic regression has been widely recognized as a solution due to its accuracy and interpretability. As a recent trend, researchers tend to use more complex and advanced machine learning methods to improve the accuracy of the prediction. Although certain non-linear machine learning methods have better predictive power, they are often considered to lack interpretability by financial regulators. Thus, they have not been widely applied in credit risk assessment. We introduce a neural network with the selective option to increase interpretability by distinguishing whether the datasets can be explained by the linear models or not. We find that, for most of the datasets, logistic regression will be sufficient, with reasonable accuracy; meanwhile, for some specific data portions, a shallow neural network model leads to much better accuracy without significantly sacrificing the interpretability.