 Regression


Digital Twin and Artificial Intelligence Incorporated With Surrogate Modeling for Hybrid and Sustainable Energy Systems

arXiv.org Artificial Intelligence

Surrogate modeling has brought about a revolution in computation across science and engineering. Backed by Artificial Intelligence, a surrogate model can present highly accurate results in significantly less computation time than a computer simulation of the actual model. Surrogate modeling techniques have found their use in numerous branches of science and engineering, energy system modeling being one of them. Since the idea of hybrid and sustainable energy systems is spreading rapidly in the modern world as part of the smart energy shift, researchers are exploring the future application of artificial intelligence-based surrogate modeling in analyzing and optimizing hybrid energy systems. One of the promising technologies for assessing applicability for the energy system is the digital twin, which can leverage surrogate modeling. This work presents a comprehensive framework/review on Artificial Intelligence-driven surrogate modeling and its applications with a focus on the digital twin framework and energy systems. The role of machine learning and artificial intelligence in constructing an effective surrogate model is explained. After that, different surrogate models developed for different sustainable energy sources are presented. Finally, digital twin surrogate models and associated uncertainties are described.


Machine Unlearning Method Based On Projection Residual

arXiv.org Artificial Intelligence

Machine learning models (mainly neural networks) are used more and more in real life. Users feed their data to the model for training, but this process is often one-way: once trained, the model remembers the data. Even when data is removed from the dataset, its effects persist in the model. With more and more laws and regulations around the world protecting data privacy, it becomes even more important to make models forget this data completely through machine unlearning. This paper adopts a projection residual method based on the Newton iteration method. The main purpose is to implement machine unlearning tasks in the context of linear regression models and neural network models. The method mainly uses an iterative weighting method to completely forget the data and its corresponding influence, and its computational cost is linear in the feature dimension of the data. This method can improve the current machine learning method. At the same time, it is independent of the size of the training set. Results were evaluated by feature injection testing (FIT). Experiments show that this method is more thorough in deleting data and comes close to model retraining.
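To make the idea of unlearning in a linear regression model concrete, here is a minimal sketch. It is not the paper's projection residual method: it is a generic Newton-step illustration, assuming a plain least-squares loss and that the statistics X^T X and X^T y were cached at training time. The function names (`solve`, `normal_equations`, `unlearn`) and the toy data are hypothetical. For a quadratic loss, a single Newton step on the retained-data objective lands exactly on the weights that retraining would give.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def normal_equations(xs, ys):
    """Return (X^T X, X^T y) for rows xs and targets ys."""
    d = len(xs[0])
    a = [[sum(x[i] * x[j] for x in xs) for j in range(d)] for i in range(d)]
    b = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(d)]
    return a, b


def unlearn(w, a, b, forget_x, forget_y):
    """Forget some samples with one Newton step, using only cached statistics.

    a, b are the cached X^T X and X^T y of the full training set (an assumption
    of this sketch); only the forgotten samples are touched here.
    """
    d = len(w)
    # Remove the forgotten samples' contribution from the cached statistics.
    a_rem = [[a[i][j] - sum(x[i] * x[j] for x in forget_x) for j in range(d)]
             for i in range(d)]
    b_rem = [b[i] - sum(x[i] * y for x, y in zip(forget_x, forget_y))
             for i in range(d)]
    # Gradient of the retained-data loss at the current weights; a_rem is its Hessian.
    grad = [sum(a_rem[i][j] * w[j] for j in range(d)) - b_rem[i] for i in range(d)]
    step = solve(a_rem, grad)
    return [wi - si for wi, si in zip(w, step)]


# Hypothetical data: y = 1 + 2x, plus one outlier that should be forgotten.
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 10.0]]
ys = [1.0, 3.0, 5.0, 7.0, 100.0]
a, b = normal_equations(xs, ys)
w_full = solve(a, b)
w_unlearned = unlearn(w_full, a, b, xs[-1:], ys[-1:])
w_retrained = solve(*normal_equations(xs[:-1], ys[:-1]))
```

In this toy case `w_unlearned` and `w_retrained` agree up to floating-point error. For neural networks the loss is not quadratic, so a single Newton step would only approximate retraining.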


Shuffled linear regression through graduated convex relaxation

arXiv.org Artificial Intelligence

The shuffled linear regression problem aims to recover linear relationships in datasets where the correspondence between input and output is unknown. This problem arises in a wide range of applications including survey data, in which one needs to decide whether the anonymity of the responses can be preserved while uncovering significant statistical connections. In this work, we propose a novel optimization algorithm for shuffled linear regression based on a posterior-maximizing objective function assuming Gaussian noise prior. We compare and contrast our approach with existing methods on synthetic and real data. We show that our approach performs competitively while achieving empirical running-time improvements. Furthermore, we demonstrate that our algorithm is able to utilize the side information in the form of seeds, which recently came to prominence in related problems.
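The problem setup can be illustrated with a small sketch. This is not the graduated convex relaxation proposed in the paper: it is a simple alternating-minimization baseline, restricted here to a single feature without intercept, and the function name `fit_shuffled_1d` is hypothetical. For a fixed slope, the squared error over all permutations is minimized by pairing the k-th smallest prediction with the k-th smallest response (rearrangement inequality); the slope is then refit by least squares.

```python
def fit_shuffled_1d(xs, ys, w0=1.0, iters=20):
    """Alternate between matching responses to inputs and refitting the slope.

    xs : inputs; ys : responses in unknown order.
    Returns (slope, responses re-ordered to align with xs).
    """
    w = w0
    y_sorted = sorted(ys)
    matched = list(ys)
    for _ in range(iters):
        # Permutation step: sort inputs by current prediction, pair with sorted responses.
        order = sorted(range(len(xs)), key=lambda i: w * xs[i])
        matched = [0.0] * len(xs)
        for rank, i in enumerate(order):
            matched[i] = y_sorted[rank]
        # Regression step: closed-form least squares for a single slope.
        w_new = sum(x * y for x, y in zip(xs, matched)) / sum(x * x for x in xs)
        if w_new == w:  # assignment and slope no longer change
            break
        w = w_new
    return w, matched


# Hypothetical noise-free data: y = 2x with the responses shuffled.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [6.0, 2.0, 8.0, 4.0]
slope, aligned = fit_shuffled_1d(xs, ys)
```

With several features the permutation step becomes an assignment problem and alternating schemes like this one can stall in poor local optima, which is part of what motivates more careful optimization approaches.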


Physically Meaningful Uncertainty Quantification in Probabilistic Wind Turbine Power Curve Models as a Damage Sensitive Feature

arXiv.org Artificial Intelligence

A wind turbine's power curve is easily accessible damage-sensitive data, and as such is a key part of structural health monitoring in wind turbines. Power curve models can be constructed in a number of ways, but the authors argue that probabilistic methods carry inherent benefits in this use case, such as uncertainty quantification and allowing uncertainty propagation analysis. Many probabilistic power curve models have a key limitation in that they are not physically meaningful: they return mean and uncertainty predictions outside of what is physically possible (the maximum and minimum power outputs of the wind turbine). This paper investigates the use of two bounded Gaussian Processes in order to produce physically meaningful probabilistic power curve models. The first model investigated was a warped heteroscedastic Gaussian process, and was found to be ineffective due to specific shortcomings of the Gaussian Process in relation to the warping function. The second model, an approximated Gaussian Process with a Beta likelihood, was highly successful and demonstrated that a working bounded probabilistic model results in better predictive uncertainty than a corresponding unbounded one without meaningful loss in predictive accuracy. Such a bounded model thus offers increased accuracy for performance monitoring and increased operator confidence in the model due to guaranteed physical plausibility.


A Multiple Criteria Decision Analysis based Approach to Remove Uncertainty in SMP Models

arXiv.org Artificial Intelligence

Advanced AI technologies are serving humankind in a number of ways, from healthcare to manufacturing. Advanced automated machines are quite expensive, but the end output is supposed to be of the highest possible quality. Depending on the agility of requirements, these automation technologies can change dramatically. The likelihood of making changes to automation software is extremely high, so it must be updated regularly. If maintainability is not taken into account, it will have an impact on the entire system and increase maintenance costs. Many companies use different programming paradigms in developing advanced automated machines based on client requirements. Therefore, it is essential to estimate the maintainability of heterogeneous software. Because there is no widespread consensus on software maintainability prediction (SMP) methodologies, individuals and businesses are left perplexed when it comes to determining the appropriate model for estimating the maintainability of software, which serves as the inspiration for this research. A structured methodology was designed, the datasets were preprocessed, and the maintainability index (MI) range was found for all the datasets except UIMS and QUES, for which the metric CHANGE is used. To remove the uncertainty among the aforementioned techniques, a popular multiple criteria decision-making model, namely the technique for order preference by similarity to ideal solution (TOPSIS), is used in this work. TOPSIS revealed that GARF outperforms the other considered techniques in predicting the maintainability of heterogeneous automated software.
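For readers unfamiliar with TOPSIS, the following is a minimal sketch of the standard procedure: normalize the decision matrix, weight it, locate the ideal and anti-ideal points, and score each alternative by its relative closeness to the ideal. The criteria, weights, and numbers are hypothetical and do not come from the paper; equal weights and vector normalization are assumptions of this sketch.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives (rows) over criteria (columns) with TOPSIS.

    matrix  : list of rows, one per alternative
    weights : one weight per criterion (assumed to sum to 1)
    benefit : True where larger is better, False where smaller is better
    Returns one closeness score per alternative, in [0, 1]; higher is better.
    """
    n_crit = len(weights)
    # 1. Vector-normalize each column and apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # 2. Ideal and anti-ideal values per criterion.
    cols = list(zip(*v))
    ideal = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    # 3. Relative closeness: distance to anti-ideal over total distance.
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical comparison of three prediction models on two criteria:
# column 0 is an error measure (lower is better), column 1 is a fit measure (higher is better).
matrix = [
    [0.10, 0.90],  # model_a
    [0.20, 0.80],  # model_b
    [0.30, 0.70],  # model_c
]
scores = topsis(matrix, weights=[0.5, 0.5], benefit=[False, True])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Here `model_a` is best on both criteria, so it coincides with the ideal point and ranks first. The sketch does not guard against degenerate input such as an all-zero column or identical alternatives.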


Using Knowledge Distillation to improve interpretable models in a retail banking context

arXiv.org Artificial Intelligence

Although the banking sector holds massive troves of data regarding its customers, products and transactions, and is no stranger to using quantitative tools to inform its decisions, two constraints usually weigh on the development of predictive models. The first one lies in the regulatory obligation to use interpretable models for a wide range of issues, with the management function being able to explain both the way a model was trained and why specific decisions have been made. Indeed, the European Banking Authority (2020) urges banking institutions to "understand the models used, and their methodology, input data, assumptions, limitations and outputs". The second has to do with the production environments available to deploy the models on. Due to the persistence of legacy systems, cost constraints or execution time limits -- think real-time e-commerce fraud detection -- models may be limited to simple operations and conditions, i.e. a set of rules rather than a random forest, light computations in place of a fully fledged neural network. Modeling for retail banking use cases means dealing with both these strong customer protections -- enforced through regular audits -- and the high data volume, which at times shortens the time allocated to each sample. These shackles help explain why modeling practices in retail banking departments are centered around simple and interpretable models such as the logistic regression or the (shallow) decision tree.


Higher-order Neural Additive Models: An Interpretable Machine Learning Model with Feature Interactions

arXiv.org Artificial Intelligence

Black-box models, such as deep neural networks, exhibit superior predictive performances, but understanding their behavior is notoriously difficult. Many explainable artificial intelligence methods have been proposed to reveal the decision-making processes of black box models. However, their applications in high-stakes domains remain limited. Recently proposed neural additive models (NAM) have achieved state-of-the-art interpretable machine learning. NAM can provide straightforward interpretations with slight performance sacrifices compared with multi-layer perceptron. However, NAM can only model first-order feature interactions; thus, it cannot capture the co-relationships between input features. To overcome this problem, we propose a novel interpretable machine learning method called higher-order neural additive models (HONAM) and a feature interaction method for high interpretability. HONAM can model arbitrary orders of feature interactions. Therefore, it can provide the high predictive performance and interpretability that high-stakes domains need. In addition, we propose a novel hidden unit to effectively learn sharp-shape functions. We conducted experiments using various real-world datasets to examine the effectiveness of HONAM. Furthermore, we demonstrate that HONAM can achieve fair AI with a slight performance sacrifice. The source code for HONAM is publicly available.


Causal Inference via Nonlinear Variable Decorrelation for Healthcare Applications

arXiv.org Artificial Intelligence

Features and explanations, by dataset

Heart Disease
age middle: Patients between the ages of 40 and 60
#major vessels0: The number of major vessels (0-3) colored by fluoroscopy is 0
fixed defect: Thallium stress test result is fixed defect
pressure normal: Blood pressure within the normal range
ST-T wave abnormality: Resting electrocardiography result is ST-T wave abnormality
cholesterol edge: Serum cholesterol is in range (200, 220] mg/dl
lower than 120mg/ml: Fasting blood sugar is lower than 120mg/ml
non-anginal pain: Chest pain type is non-angina
cholesterol high: Serum cholesterol is higher than 220 mg/dl
no exercise induced angina: No exercise-induced angina
downsloping: Slope of peak exercise ST segment is downsloping
heart disease: It refers to the presence of heart disease in the patient

Esophageal Cancer
Modified Ryan Score 2.0: (near complete response): single cells or rare small groups of cancer cells
Esophagectomy Procedure 4: Complete MIS/Robotic McKeown (Three-Hole) esophagectomy
tobacco use: Use tobacco
Alcohol Use: Use Alcohol
Neoadjuvant Radiation: Patient underwent neoadjuvant radiation
Histological Grade 2: How differentiated the tumor is: Moderately Differentiated
Final Histology 1: History: Adenocarcinoma
Histological Grade 3: How differentiated the tumor is: Poorly Differentiated
clinical m Stage 1: Details any spread (metastasis) to other sites of the body: M0
esoph tumor location 4: Lower Thoracic, including GE junction
Esophagectomy Procedure 5: Hybrid (Laparoscopy + Thoracotomy) McKeown (Three-Hole) esophagectomy
recurrence: Details whether the patient experienced recurrence of their cancer

Cauda Equina Syndrome
elixsum


Masked Multi-Step Multivariate Time Series Forecasting with Future Information

arXiv.org Artificial Intelligence

In this paper, we introduce Masked Multi-Step Multivariate Forecasting (MMMF), a novel and general self-supervised learning framework for time series forecasting with known future information. In many real-world forecasting scenarios, some future information is known, e.g., the weather information when making a short-to-mid-term electricity demand forecast, or the oil price forecasts when making an airplane departure forecast. Existing machine learning forecasting frameworks can be categorized into (1) sample-based approaches where each forecast is made independently, and (2) time series regression approaches where the future information is not fully incorporated. To overcome the limitations of existing approaches, we propose MMMF, a framework to train any neural network model capable of generating a sequence of outputs, that combines both the temporal information from the past and the known information about the future to make better predictions. Experiments are performed on two real-world datasets for (1) mid-term electricity demand forecasting, and (2) two-month ahead flight departures forecasting. They show that the proposed MMMF framework outperforms not only sample-based methods but also existing time series forecasting models with the exact same base models. Furthermore, once a neural network model is trained with MMMF, its inference speed is similar to that of the same model trained with traditional regression formulations, thus making MMMF a better alternative to existing regression-trained time series forecasting models if there is some available future information.


NAAP-440 Dataset and Baseline for Neural Architecture Accuracy Prediction

arXiv.org Artificial Intelligence

Neural architecture search (NAS) has become a common approach to developing and discovering new neural architectures for different target platforms and purposes. However, scanning the search space involves long training processes for many candidate architectures, which is costly in terms of computational resources and time. Regression algorithms are a common tool for predicting a candidate architecture's accuracy, which can dramatically accelerate the search procedure. We aim to propose a new baseline that will support the development of regression algorithms that can predict an architecture's accuracy just from its scheme, or by only training it for a minimal number of epochs. Therefore, we introduce the NAAP-440 dataset of 440 neural architectures, which were trained on CIFAR10 using a fixed recipe. Our experiments indicate that by using off-the-shelf regression algorithms and running up to 10% of the training process, not only is it possible to predict an architecture's accuracy rather precisely, but the values predicted for the architectures also maintain their accuracy order with a minimal number of monotonicity violations. This approach may serve as a powerful tool for accelerating NAS-based studies and thus dramatically increase their efficiency. The dataset and code used in the study have been made public.
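The notion of "monotonicity violations" can be sketched as follows. The abstract does not give the exact definition used in the study, so this assumes the common pairwise reading: a violation is a pair of architectures whose predicted accuracies are ordered opposite to their true accuracies. The function name and the sample values are hypothetical.

```python
from itertools import combinations


def monotonicity_violations(true_acc, pred_acc):
    """Count pairs whose predicted order contradicts their true order.

    Ties in either list are not counted as violations (an assumption of this sketch).
    """
    violations = 0
    for i, j in combinations(range(len(true_acc)), 2):
        # A negative product means the two differences have opposite signs.
        if (true_acc[i] - true_acc[j]) * (pred_acc[i] - pred_acc[j]) < 0:
            violations += 1
    return violations


# Hypothetical accuracies for three architectures.
true_acc = [0.90, 0.92, 0.95]
pred_acc = [0.91, 0.90, 0.96]
n_violations = monotonicity_violations(true_acc, pred_acc)
```

In this example only the first two architectures are ranked in the wrong order, so one of the three pairs is a violation. A search procedure that only needs the ranking of candidates can tolerate imprecise accuracy predictions as long as this count stays low.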