 Regression


Evaluating All Possible Combinations of Hyperparameters -Grid Search-

#artificialintelligence

The model and the preprocessing are specific to each project. Hyperparameters are tuned to the dataset, so reusing the same hyperparameters across projects compromises the accuracy of the results. For example, the Logistic Regression algorithm has hyperparameters such as 'solver', 'C', and 'penalty', and different combinations of these give different results. Similarly, Support Vector Machines have adjustable parameters such as the gamma value and the C value, and different combinations of them also give different results. The hyperparameters of each algorithm are listed on the sklearn website.
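As a minimal sketch of the idea, grid search enumerates every combination of candidate values and keeps the best-scoring one. The grid values below and the `toy_score` function are assumptions made for illustration: `toy_score` is a hypothetical stand-in for a real cross-validated accuracy and does not train any model. In practice, scikit-learn's `GridSearchCV` performs this enumeration with cross-validation.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Score every combination in param_grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    # product() yields one tuple of values per combination of hyperparameters.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical candidate values for a logistic regression model.
param_grid = {
    "solver": ["liblinear", "saga"],
    "penalty": ["l1", "l2"],
    "C": [0.01, 0.1, 1.0, 10.0],
}


def toy_score(params):
    """Hypothetical placeholder for a cross-validated accuracy."""
    score = 0.80
    if params["penalty"] == "l2":
        score += 0.05
    if params["solver"] == "saga":
        score += 0.03
    score -= 0.02 * abs(params["C"] - 1.0)
    return score


# Number of combinations the search has to evaluate: 2 * 2 * 4.
n_combinations = len(list(product(*param_grid.values())))
best_params, best_score = grid_search(param_grid, toy_score)
print(n_combinations, best_params, round(best_score, 2))
```

The cost grows multiplicatively with each added hyperparameter, which is why the grid is usually kept to a few candidate values per parameter.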


Uncertainty quantification for industrial design using dictionaries of reduced order models

arXiv.org Machine Learning

We consider the dictionary-based ROM-net (Reduced Order Model) framework [T. Daniel, F. Casenave, N. Akkari, D. Ryckelynck, Model order reduction assisted by deep neural networks (ROM-net), Advanced modeling and Simulation in Engineering Sciences 7 (16), 2020] and summarize the underlying methodologies and their recent improvements. The main contribution of this work is the application of the complete workflow to a real-life industrial model of an elastoviscoplastic high-pressure turbine blade subjected to thermal, centrifugal and pressure loadings, for the quantification of the uncertainty on dual quantities (such as the accumulated plastic strain and the stress tensor), generated by the uncertainty on the temperature loading field. The dictionary-based ROM-net computes predictions of dual quantities of interest for 1008 Monte Carlo draws of the temperature loading field in 2 hours and 48 minutes, which corresponds to a speedup greater than 600 with respect to a reference parallel solver using domain decomposition, with a relative error in the order of 2%. Another contribution of this work consists in the derivation of a meta-model to reconstruct the dual quantities of interest over the complete mesh from their values on the reduced integration points.


Kinematics clustering enables head impact subtyping for better traumatic brain injury prediction

arXiv.org Artificial Intelligence

Traumatic brain injury can be caused by various types of head impacts. However, due to different kinematic characteristics, many brain injury risk estimation models are not generalizable across the variety of impacts that humans may sustain. The current definitions of head impact subtypes are based on impact sources (e.g., football, traffic accident), which may not reflect the intrinsic kinematic similarities of impacts across the impact sources. To investigate the potential new definitions of impact subtypes based on kinematics, 3,161 head impacts from various sources including simulation, college football, mixed martial arts, and car racing were collected. We applied the K-means clustering to cluster the impacts on 16 standardized temporal features from head rotation kinematics. Then, we developed subtype-specific ridge regression models for cumulative strain damage (using the threshold of 15%), which significantly improved the estimation accuracy compared with the baseline method which mixed impacts from different sources and developed one model (R^2 from 0.7 to 0.9). To investigate the effect of kinematic features, we presented the top three critical features (maximum resultant angular acceleration, maximum angular acceleration along the z-axis, maximum linear acceleration along the y-axis) based on regression accuracy and used logistic regression to find the critical points for each feature that partitioned the subtypes. This study enables researchers to define head impact subtypes in a data-driven manner, which leads to more generalizable brain injury risk estimation.


Interpretable Summaries of Black Box Incident Triaging with Subgroup Discovery

arXiv.org Artificial Intelligence

The need for predictive maintenance comes with an increasing number of incidents reported by monitoring systems and equipment/software users. On the front line, on-call engineers (OCEs) have to quickly assess the severity of an incident and decide which service to contact for corrective actions. To automate these decisions, several predictive models have been proposed, but the most efficient models are opaque (i.e., black box), which strongly limits their adoption. In this paper, we propose an efficient black box model based on 170K incidents reported to our company over the last 7 years and emphasize the need to automate triage when incidents are massively reported on thousands of servers running our product, an ERP. Recent developments in eXplainable Artificial Intelligence (XAI) help in providing global explanations of the model and, most importantly, local explanations for each model prediction/outcome. Unfortunately, providing a human with an explanation for each outcome is not feasible when dealing with a large number of daily predictions. To address this problem, we propose an original data-mining method rooted in Subgroup Discovery, a pattern mining technique with the natural ability to group objects that share similar explanations of their black box predictions and to provide a description for each group. We evaluate this approach and present preliminary results that are encouraging for effective adoption by OCEs. We believe that this approach provides a new way to address the problem of model-agnostic outcome explanation.


Recommending Insurance products by using Users' Sentiments

arXiv.org Artificial Intelligence

In today's tech-savvy world, every industry is trying to devise methods for recommending products by combining several techniques and algorithms into models that make better predictions. Along these lines, our paper focuses on the application of sentiment analysis to recommendation in the insurance domain. We built the following Machine Learning models, namely Logistic Regression, Multinomial Naive Bayes, and Random Forest, to analyze the polarity of a feedback line given by a customer. We then used this polarity, along with other attributes such as Age, Gender, Locality, Income, and the list of other products already purchased by our existing customers, as input for our recommendation model. Next, we matched the polarity score with the users' profiles and generated the list of insurance products to be recommended, in descending order. Despite our model's simplicity and the lack of key data sets, the results seemed logical and realistic. By developing the model with more advanced methods and with access to better, real data gathered from the insurance industry, the sector may well benefit from combining sentiment analysis with recommendation.


Interpolation can hurt robust generalization even when there is no noise

arXiv.org Machine Learning

Conventional statistical wisdom cautions the user that trains a model by minimizing a loss L(θ): if a global minimizer achieves zero or near-zero training loss (i.e., it interpolates), we run the risk of overfitting (i.e., high variance) and thus sub-optimal prediction performance. Instead, regularization is commonly used to reduce the effect of noise and to obtain an estimator with better generalization. Specifically, regularization limits model complexity and induces worse data fit, for example via an explicit penalty term R(θ). The resulting penalized loss L(θ) + λR(θ) explicitly imposes certain structural properties on the minimizer. This classical rationale, however, does seemingly not apply to overparameterized models: in practice, large neural networks, for example, exhibit good generalization performance on i.i.d.
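To make the penalized loss L(θ) + λR(θ) concrete, the sketch below uses an illustrative special case that is an assumption of this example, not the setting of the paper: a one-parameter linear model with squared-error loss and the ridge penalty R(θ) = θ². With a single training point, the unpenalized minimizer interpolates, while λ > 0 shrinks θ and gives a worse data fit.

```python
def ridge_1d(xs, ys, lam):
    """Minimizer of sum((y - theta*x)^2) + lam * theta^2 in closed form."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def data_fit(theta, xs, ys):
    """Unpenalized training loss L(theta): sum of squared residuals."""
    return sum((y - theta * x) ** 2 for x, y in zip(xs, ys))


def penalized_loss(theta, xs, ys, lam):
    """L(theta) + lam * R(theta) with the ridge penalty R(theta) = theta^2."""
    return data_fit(theta, xs, ys) + lam * theta ** 2


# Toy data chosen for illustration: one point, so lam = 0 interpolates.
xs, ys = [1.0], [2.0]

theta_interp = ridge_1d(xs, ys, lam=0.0)  # fits the point exactly
theta_reg = ridge_1d(xs, ys, lam=1.0)     # shrunk towards zero

print(theta_interp, data_fit(theta_interp, xs, ys))
print(theta_reg, data_fit(theta_reg, xs, ys))
```

Here the regularized estimate trades a nonzero training loss for a smaller parameter, which is the classical trade-off the passage describes.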


Mixture of Linear Models Co-supervised by Deep Neural Networks

arXiv.org Machine Learning

Deep neural network (DNN) models have achieved phenomenal success for applications in many domains, ranging from academic research in science and engineering to industry and business. The modeling power of DNNs is believed to come from the complexity and over-parameterization of the model, which on the other hand has been criticized for its lack of interpretability. Although certainly not true for every application, in some applications, especially in economics, social science, the healthcare industry, and administrative decision making, scientists or practitioners are reluctant to use predictions made by a black-box system for multiple reasons. One reason is that a major purpose of a study can be to make discoveries based upon the prediction function, e.g., to reveal the relationships between measurements. Another reason can be that the training dataset is not large enough to make researchers feel completely sure about a purely data-driven result. Being able to examine and interpret the prediction function will enable researchers to connect the result with existing knowledge or gain insights about new directions to explore. Although classic statistical models are much more explainable, their accuracy often falls considerably below that of DNNs. In this paper, we propose an approach to fill the gap between relatively simple explainable models and DNNs such that we can more flexibly tune the trade-off between interpretability and accuracy. Our main idea is a mixture of discriminative models that is trained with guidance from a DNN. Although mixtures of discriminative models have been studied before, our way of generating the mixture is quite different.


Statistical Analysis of Wasserstein Distributionally Robust Estimators

arXiv.org Machine Learning

We consider statistical methods which invoke a min-max distributionally robust formulation to extract good out-of-sample performance in data-driven optimization and learning problems. Acknowledging the distributional uncertainty in learning from limited samples, the min-max formulations introduce an adversarial inner player to explore unseen covariate data. The resulting Distributionally Robust Optimization (DRO) formulations, which include Wasserstein DRO formulations (our main focus), are specified using optimal transportation phenomena. Upon describing how these infinite-dimensional min-max problems can be approached via a finite-dimensional dual reformulation, the tutorial moves into its main component, namely, explaining a generic recipe for optimally selecting the size of the adversary's budget. This is achieved by studying the limit behavior of an optimal transport projection formulation arising from an inquiry on the smallest confidence region that includes the unknown population risk minimizer. Incidentally, this systematic prescription coincides with those in specific examples in high-dimensional statistics and results in error bounds that are free from the curse of dimensions. Equipped with this prescription, we present a central limit theorem for the DRO estimator and provide a recipe for constructing compatible confidence regions that are useful for uncertainty quantification. The rest of the tutorial is devoted to insights into the nature of the optimizers selected by the min-max formulations and additional applications of optimal transport projections.


Learning Causal Relationships from Conditional Moment Conditions by Importance Weighting

arXiv.org Machine Learning

We consider learning causal relationships under conditional moment conditions. Unlike causal inference under unconditional moment conditions, conditional moment conditions pose serious challenges for causal inference, especially in complex, high-dimensional settings. To address this issue, we propose a method that transforms conditional moment conditions to unconditional moment conditions through importance weighting using the conditional density ratio. Then, using this transformation, we propose a method that successfully approximates conditional moment conditions. Our proposed approach allows us to employ methods for estimating causal parameters from unconditional moment conditions, such as generalized method of moments, adequately in a straightforward manner. In experiments, we confirm that our proposed method performs well compared to existing methods.


Introducing a Family of Synthetic Datasets for Research on Bias in Machine Learning

arXiv.org Machine Learning

A significant impediment to progress in research on bias in machine learning (ML) is the availability of relevant datasets. This situation is unlikely to change much given the sensitivity of such data. For this reason, there is a role for synthetic data in this research. In this short paper, we present one such family of synthetic data sets. We provide an overview of the data, describe how the level of bias can be varied, and present a simple example of an experiment on the data.