AITopics

#artificialintelligenceApr-8-2021

Machine Learning - Regression and Classification (math Inc.)

Machine learning is a branch of artificial intelligence (AI) focused on building applications that learn from data and improve their accuracy over time without being programmed to do so. In data science, an algorithm is a sequence of statistical processing steps. In machine learning, algorithms are'trained' to find patterns and features in massive amounts of data in order to make decisions and predictions based on new data. The better the algorithm, the more accurate the decisions and predictions will become as it processes more data. Machine learning has led to some amazing results, like being able to analyze medical images and predict diseases on-par with human experts.

algorithm, machine learning, regression and classification, (4 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

Kortmann, Karl-Philipp, Fehsenfeld, Moritz, Wielitzka, Mark

Autoencoder-based Representation Learning from Heterogeneous Multivariate Time Series Data of Mechatronic Systems

arXiv.org Artificial IntelligenceApr-8-2021

Sensor and control data of modern mechatronic systems are often available as heterogeneous time series with different sampling rates and value ranges. Suitable classification and regression methods from the field of supervised machine learning already exist for predictive tasks, for example in the context of condition monitoring, but their performance scales strongly with the number of labeled training data. Their provision is often associated with high effort in the form of person-hours or additional sensors. In this paper, we present a method for unsupervised feature extraction using autoencoder networks that specifically addresses the heterogeneous nature of the database and reduces the amount of labeled training data required compared to existing methods. Three public datasets of mechatronic systems from different application domains are used to validate the results.

classification, time sery, training data, (10 more...)

2104.02784

Country: Europe > Germany (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Bertsimas, Dimitris, Delarue, Arthur, Pauphilet, Jean

Prediction with Missing Data

arXiv.org Machine LearningApr-7-2021

Missing information is inevitable in real-world data sets. While imputation is well-suited and theoretically sound for statistical inference, its relevance and practical implementation for out-of-sample prediction remains unsettled. We provide a theoretical analysis of widely used data imputation methods and highlight their key deficiencies in making accurate predictions. Alternatively, we propose adaptive linear regression, a new class of models that can be directly trained and evaluated on partially observed data, adapting to the set of available features. In particular, we show that certain adaptive regression models are equivalent to impute-then-regress methods where the imputation and the regression models are learned simultaneously instead of sequentially. We validate our theoretical findings and adaptive regression approach with numerical results with real-world data sets.

imputation, pauphilet, prediction, (15 more...)

2104.03158

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Experimental Study (0.94)
Research Report > New Finding (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Artelt, André, Hinder, Fabian, Vaquet, Valerie, Feldhans, Robert, Hammer, Barbara

Contrastive Explanations for Explaining Model Adaptations

arXiv.org Artificial IntelligenceApr-7-2021

Many decision making systems deployed in the real world are not static - a phenomenon known as model adaptation takes place over time. The need for transparency and interpretability of AI-based decision models is widely accepted and thus have been worked on extensively. Usually, explanation methods assume a static system that has to be explained. Explaining non-static systems is still an open research question, which poses the challenge how to explain model adaptations. In this contribution, we propose and (empirically) evaluate a framework for explaining model adaptations by contrastive explanations. We also propose a method for automatically finding regions in data space that are affected by a given model adaptation and thus should be explained.

counterfactual explanation, explanation, model adaptation, (12 more...)

2104.02459

Country:

Europe > Germany (0.14)
North America > United States > New York > New York County > New York City (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.48)

Industry:

Banking & Finance (1.00)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Data Science > Data Mining (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

arXiv.org Artificial IntelligenceApr-6-2021

Towards a Rigorous Evaluation of Explainability for Multivariate Time Series

Saluja, Rohit, Malhi, Avleen, Knapič, Samanta, Främling, Kary, Cavdar, Cicek

Machine learning-based systems are rapidly gaining popularity and in-line with that there has been a huge research surge in the field of explainability to ensure that machine learning models are reliable, fair, and can be held liable for their decision-making process. Explainable Artificial Intelligence (XAI) methods are typically deployed to debug black-box machine learning models but in comparison to tabular, text, and image data, explainability in time series is still relatively unexplored. The aim of this study was to achieve and evaluate model agnostic explainability in a time series forecasting problem. This work focused on proving a solution for a digital consultancy company aiming to find a data-driven approach in order to understand the effect of their sales related activities on the sales deals closed. The solution involved framing the problem as a time series forecasting problem to predict the sales deals and the explainability was achieved using two novel model agnostic explainability techniques, Local explainable model-agnostic explanations (LIME) and Shapley additive explanations (SHAP) which were evaluated using human evaluation of explainability. The results clearly indicate that the explanations produced by LIME and SHAP greatly helped lay humans in understanding the predictions made by the machine learning model. The presented work can easily be extended to any time

explanation, participant, prediction, (15 more...)

2104.04075

Country:

Europe > Finland > Uusimaa > Helsinki (0.05)
Europe > Sweden > Västerbotten County > Umeå (0.04)
Europe > United Kingdom > England > Dorset > Bournemouth (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.66)

Ahfock, Daniel, McLachlan, Geoffrey J.

Harmless label noise and informative soft-labels in supervised classification

arXiv.org Machine LearningApr-6-2021

Manual labelling of training examples is common practice in supervised learning. When the labelling task is of non-trivial difficulty, the supplied labels may not be equal to the ground-truth labels, and label noise is introduced into the training dataset. If the manual annotation is carried out by multiple experts, the same training example can be given different class assignments by different experts, which is indicative of label noise. In the framework of model-based classification, a simple, but key observation is that when the manual labels are sampled using the posterior probabilities of class membership, the noisy labels are as valuable as the ground-truth labels in terms of statistical information. A relaxation of this process is a random effects model for imperfect labelling by a group that uses approximate posterior probabilities of class membership. The relative efficiency of logistic regression using the noisy labels compared to logistic regression using the ground-truth labels can then be derived. The main finding is that logistic regression can be robust to label noise when label noise and classification difficulty are positively correlated. In particular, when classification difficulty is the only source of label errors, multiple sets of noisy labels can supply more information for the estimation of a classification rule compared to the single set of ground-truth labels.

dataset, ground-truth label, probability, (13 more...)

2104.02872

Country:

North America > United States > Wisconsin (0.04)
Asia > Middle East > Jordan (0.04)
Oceania > Australia > Queensland (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (0.77)
Research Report > New Finding (0.76)

Industry: Health & Medicine (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Hu, Liangyuan, Lin, Jung-Yi Joyce, Ji, Jiayi

Variable selection with missing data in both covariates and outcomes: Imputation and machine learning

arXiv.org Machine LearningApr-6-2021

The missing data issue is ubiquitous in health studies. Variable selection in the presence of both missing covariates and outcomes is an important statistical research topic but has been less studied. Existing literature focuses on parametric regression techniques that provide direct parameter estimates of the regression model. In practice, parametric regression models are often sub-optimal for variable selection because they are susceptible to misspecification. Machine learning methods considerably weaken the parametric assumptions and increase modeling flexibility, but do not provide as naturally defined variable importance measure as the covariate effect native to parametric models. We investigate a general variable selection approach when both the covariates and outcomes can be missing at random and have general missing data patterns. This approach exploits the flexibility of machine learning modeling techniques and bootstrap imputation, which is amenable to nonparametric methods in which the covariate effects are not directly available. We conduct expansive simulations investigating the practical operating characteristics of the proposed variable selection approach, when combined with four tree-based machine learning methods, XGBoost, Random Forests, Bayesian Additive Regression Trees (BART) and Conditional Random Forests, and two commonly used parametric methods, lasso and backward stepwise selection. Numeric results show XGBoost and BART have the overall best performance across various settings. Guidance for choosing methods appropriate to the structure of the analysis data at hand are discussed. We further demonstrate the methods via a case study of risk factors for 3-year incidence of metabolic syndrome with data from the Study of Women's Health Across the Nation.

imputation, predictor, selection, (12 more...)

2104.02769

Country:

North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Rügamer, David, Shen, Ruolin, Bukas, Christina, Sousa, Lisa Barros de Andrade e, Thalmeier, Dominik, Klein, Nadja, Kolb, Chris, Pfisterer, Florian, Kopper, Philipp, Bischl, Bernd, Müller, Christian L.

deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression

arXiv.org Machine LearningApr-6-2021

This paper describes the implementation of semi-structured deep distributional regression, a flexible framework to learn distributions based on a combination of additive regression models and deep neural networks. deepregression is implemented in both R and Python, using the deep learning libraries TensorFlow and PyTorch, respectively. The implementation consists of (1) a modular neural network building system for the combination of various statistical and deep learning approaches, (2) an orthogonalization cell to allow for an interpretable combination of different subnetworks as well as (3) pre-processing steps necessary to initialize such models. The software package allows to define models in a user-friendly manner using distribution definitions via a formula environment that is inspired by classical statistical model frameworks such as mgcv. The packages' modular design and functionality provides a unique resource for rapid and reproducible prototyping of complex statistical and deep learning models while simultaneously retaining the indispensable interpretability of classical statistical models.

deepregression, formula, predictor, (14 more...)

2104.02705

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.06)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Merabet, Ghezlane Halhoul, Essaaidi, Mohamed, Haddou, Mohamed Ben, Qolomany, Basheer, Qadir, Junaid, Anan, Muhammad, Al-Fuqaha, Ala, Abid, Mohamed Riduan, Benhaddou, Driss

Intelligent Building Control Systems for Thermal Comfort and Energy-Efficiency: A Systematic Review of Artificial Intelligence-Assisted Techniques

arXiv.org Artificial IntelligenceApr-5-2021

Building operations represent a significant percentage of the total primary energy consumed in most countries due to the proliferation of Heating, Ventilation and Air-Conditioning (HVAC) installations in response to the growing demand for improved thermal comfort. Reducing the associated energy consumption while maintaining comfortable conditions in buildings are conflicting objectives and represent a typical optimization problem that requires intelligent system design. Over the last decade, different methodologies based on the Artificial Intelligence (AI) techniques have been deployed to find the sweet spot between energy use in HVAC systems and suitable indoor comfort levels to the occupants. This paper performs a comprehensive and an in-depth systematic review of AI-based techniques used for building control systems by assessing the outputs of these techniques, and their implementations in the reviewed works, as well as investigating their abilities to improve the energy-efficiency, while maintaining thermal comfort conditions. This enables a holistic view of (1) the complexities of delivering thermal comfort to users inside buildings in an energy-efficient way, and (2) the associated bibliographic material to assist researchers and experts in the field in tackling such a challenge. Among the 20 AI tools developed for both energy consumption and comfort control, functions such as identification and recognition patterns, optimization, predictive control. Based on the findings of this work, the application of AI technology in building control is a promising area of research and still an ongoing, i.e., the performance of AI-based control is not yet completely satisfactory. This is mainly due in part to the fact that these algorithms usually need a large amount of high-quality real-world data, which is lacking in the building or, more precisely, the energy sector.

comfort parameter, renewable energy, upstream oil & gas, (28 more...)

2104.02214

Country:

Asia > China (0.45)
North America > United States > Texas (0.27)
North America > United States > Michigan (0.27)
(15 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.67)

Industry:

Construction & Engineering > HVAC (1.00)
Energy > Oil & Gas > Upstream (0.68)
Energy > Renewable > Solar (0.45)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Systems & Languages (1.00)
(12 more...)