 structural error


A Fuzzy Approach to the Specification, Verification and Validation of Risk-Based Ethical Decision Making Models

arXiv.org Artificial Intelligence

The ontological and epistemic complexities inherent in the moral domain make it challenging to establish clear standards for evaluating the performance of a moral machine. In this paper, we present a formal method to describe Ethical Decision Making models based on ethical risk assessment. Then, we show how these models that are specified as fuzzy rules can be verified and validated using fuzzy Petri nets. A case study from the medical field is considered to illustrate the proposed approach.


Towards Understanding the Optimization Mechanisms in Deep Learning

arXiv.org Artificial Intelligence

Key insights from the studies Arjevani and Field (2022); Chizat, Oyallon, and Bach (2018); Du, Zhai, Póczos, and Singh (2018); Yun, Sra, and Jadbabaie (2018) emphasize the pivotal role of over-parameterization in finding the global optimum and enhancing the generalization ability of deep neural networks (DNNs). Recent work has shown that the evolution of the trainable parameters in continuous-width DNNs during training can be captured by the neural tangent kernel (NTK) (Arora, Du, Hu, Li, & Wang, 2019; Du, Lee, Li, Wang, & Zhai, 2018; Jacot, Gabriel, & Hongler, 2018; Mohamadi, Bae, & Sutherland, 2023; Wang, Li, & Sun, 2023; Zou, Cao, Zhou, & Gu, 2018). An alternative research direction attempts to examine the infinite-width neural network from a mean-field perspective (Chizat & Bach, 2018; Mei, Montanari, & Nguyen, 2018; Nguyen & Pham, 2023; Sirignano & Spiliopoulos, 2018). However, in practical applications, neural networks are of finite width, and under this condition, it remains unclear whether NTK theory and mean-field theory can adequately characterize the convergence properties of neural networks (Seleznova & Kutyniok, 2021). Therefore, the mechanisms of non-convex optimization in deep learning, and the impact of over-parameterization on model training, remain incompletely resolved.


General Distribution Learning: A theoretical framework for Deep Learning

arXiv.org Machine Learning

There remain numerous unanswered research questions on deep learning (DL) within the classical learning theory framework. These include the remarkable generalization capabilities of overparametrized neural networks (NNs), the efficient optimization performance despite non-convexity of objectives, the mechanism of flat minima for generalization, and the exceptional performance of deep architectures in solving physical problems. This paper introduces General Distribution Learning (GD Learning), a novel theoretical learning framework designed to address a comprehensive range of machine learning and statistical tasks, including classification, regression and parameter estimation. Departing from traditional statistical machine learning, GD Learning focuses on the true underlying distribution. In GD Learning, the learning error, corresponding to the expected error in the classical statistical learning framework, is divided into fitting errors due to models and algorithms and sampling errors introduced by limited sampling data. The framework explicitly incorporates prior knowledge, especially in scenarios characterized by data scarcity, thereby enhancing performance. Within the GD Learning framework, we demonstrate that the global optimal solutions in non-convex optimization can be approached by minimizing the gradient norm and the non-uniformity of the eigenvalues of the model's Jacobian matrix. This insight leads to the development of the gradient structure control algorithm. GD Learning also offers fresh insights into open questions on deep learning, including overparameterization and non-convex optimization, the bias-variance trade-off, and the mechanism of flat minima.


Learning About Structural Errors in Models of Complex Dynamical Systems

arXiv.org Artificial Intelligence

Complex dynamical systems are notoriously difficult to model because some degrees of freedom (e.g., small scales) may be computationally unresolvable or are incompletely understood, yet they are dynamically important. For example, the small scales of cloud dynamics and droplet formation are crucial for controlling climate, yet are unresolvable in global climate models. Semi-empirical closure models for the effects of unresolved degrees of freedom often exist and encode important domain-specific knowledge. Building on such closure models and correcting them through learning the structural errors can be an effective way of fusing data with domain knowledge. Here we describe a general approach, principles, and algorithms for learning about structural errors. Key to our approach is to include structural error models inside the models of complex systems, for example, in closure models for unresolved scales. The structural errors then map, usually nonlinearly, to observable data. As a result, however, mismatches between model output and data are only indirectly informative about structural errors, due to a lack of labeled pairs of inputs and outputs of structural error models. Additionally, derivatives of the model may not exist or be readily available. We discuss how structural error models can be learned from indirect data with derivative-free Kalman inversion algorithms and variants, how sparsity constraints enforce a "do no harm" principle, and various ways of modeling structural errors. We also discuss the merits of using non-local and/or stochastic error models. In addition, we demonstrate how data assimilation techniques can assist the learning about structural errors in non-ergodic systems. The concepts and algorithms are illustrated in two numerical examples based on the Lorenz-96 system and a human glucose-insulin model.
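As a rough illustration of the derivative-free idea in the abstract above, the following is a minimal sketch of one basic ensemble Kalman inversion update, which is only one member of the family of "Kalman inversion algorithms and variants" and not necessarily the variant the authors use. The names eki_step, toy_forward and run_demo are hypothetical, and the linear toy_forward is an assumed stand-in for running a full model with an embedded structural error term. The forward map is only ever evaluated, never differentiated.

```python
import numpy as np


def eki_step(thetas, forward, y, noise_cov):
    """One deterministic ensemble Kalman inversion update (illustrative sketch).

    thetas:    (J, p) ensemble of parameter vectors
    forward:   black-box map from a (p,) parameter vector to a (d,) output
    y:         (d,) observed data
    noise_cov: (d, d) observation noise covariance
    """
    # Evaluate the black-box model once per ensemble member; no gradients needed.
    outputs = np.array([forward(t) for t in thetas])        # (J, d)
    d_theta = thetas - thetas.mean(axis=0)
    d_out = outputs - outputs.mean(axis=0)
    n = thetas.shape[0]
    c_tg = d_theta.T @ d_out / (n - 1)                      # parameter-output covariance (p, d)
    c_gg = d_out.T @ d_out / (n - 1)                        # output covariance (d, d)
    innovations = (y - outputs).T                           # data mismatch per member (d, J)
    increments = c_tg @ np.linalg.solve(c_gg + noise_cov, innovations)
    return thetas + increments.T


def toy_forward(theta):
    # Hypothetical stand-in for "run the model with error-model parameters theta
    # and return observable statistics"; chosen to be linear for simplicity.
    a = np.array([[1.0, 1.0], [1.0, -1.0]])
    return a @ theta


def run_demo(n_members=50, n_steps=10, seed=0):
    rng = np.random.default_rng(seed)
    theta_true = np.array([1.0, -0.5])          # assumed "true" parameters
    y = toy_forward(theta_true)                 # noise-free synthetic data
    noise_cov = 0.01 * np.eye(2)
    thetas = rng.normal(size=(n_members, 2))    # initial ensemble drawn from a prior
    err_before = np.linalg.norm(thetas.mean(axis=0) - theta_true)
    for _ in range(n_steps):
        thetas = eki_step(thetas, toy_forward, y, noise_cov)
    err_after = np.linalg.norm(thetas.mean(axis=0) - theta_true)
    return err_before, err_after
```

Calling run_demo() returns the ensemble-mean error before and after the updates. In this toy setting the error shrinks because the ensemble covariances play the role that derivatives would play in a gradient-based method.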


Why Data Cleansing is Must for Predictive Modeling? - DataScienceCentral.com

#artificialintelligence

Wouldn't it be nice to have a sneak peek into the future of your business to make informed decisions and eliminate guesswork? With the help of predictive modeling, this is possible. Predictive modeling enables businesses to reliably forecast trends and behaviors using past and current data. But to ensure the effectiveness of a predictive model, the data must meet exceptionally high standards. For this reason, data scientists spend 80% of their time preparing and organizing data.


Combining Parametric Land Surface Models with Machine Learning

arXiv.org Machine Learning

A hybrid machine learning and process-based-modeling (PBM) approach is proposed and evaluated at a handful of AmeriFlux sites to simulate the top-layer soil moisture state. The Hybrid-PBM (HPBM) employed here uses the Noah land-surface model integrated with Gaussian Processes. It is designed to correct the model only in climatological situations similar to the training data; otherwise, it reverts to the PBM. In this way, our approach avoids bad predictions in scenarios where similar training data is not available and incorporates our physical understanding of the system. Here we assume an autoregressive model and obtain out-of-sample results with upwards of a 3-fold reduction in the RMSE using a one-year leave-one-out cross-validation at each of the selected sites. A path is outlined for using hybrid modeling to build global land-surface models with the potential to significantly outperform the current state-of-the-art.
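The "correct near the training data, otherwise revert to the PBM" behaviour described above can be sketched with a zero-mean Gaussian process fitted to the residuals of a process-based model. This is a minimal sketch under stated assumptions, not the authors' HPBM: pbm and truth are hypothetical one-dimensional stand-ins (not the Noah model or AmeriFlux data), and the squared-exponential kernel, length scale and noise level are arbitrary choices. With this kernel the learned correction decays to zero far from the training inputs, so the hybrid prediction falls back to the PBM there.

```python
import numpy as np


def rbf(a, b, length=1.0):
    # Squared-exponential kernel between two 1-D input arrays.
    sq = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq / length**2)


def fit_residual_gp(x_train, residuals, length=1.0, noise=1e-2):
    # Zero-mean GP regression on the PBM residuals; returns the posterior mean.
    k = rbf(x_train, x_train, length) + noise * np.eye(len(x_train))
    alpha = np.linalg.solve(k, residuals)
    return lambda x_new: rbf(x_new, x_train, length) @ alpha


def pbm(x):
    # Hypothetical process-based model (deliberately imperfect).
    return 0.5 * x


def truth(x):
    # Hypothetical "observed" system: the PBM plus an unmodeled term.
    return 0.5 * x + np.sin(x)


x_train = np.linspace(0.0, 3.0, 20)
correction = fit_residual_gp(x_train, truth(x_train) - pbm(x_train))


def hybrid(x):
    # PBM prediction plus the learned residual correction.
    return pbm(x) + correction(x)
```

Evaluating hybrid at a point inside the training range, such as np.array([1.5]), gives a prediction closer to truth than pbm alone, while at a distant point such as np.array([50.0]) the correction vanishes and hybrid matches pbm.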