 Regression


Machine Learning for Data Analysis: Regression & Forecasting

#artificialintelligence

You'll see how regression analysis can be used to estimate property prices, forecast seasonal trends, predict sales for a new product launch, and even measure ... This course makes data science approachable to everyday people, and is designed to demystify powerful Machine Learning tools & techniques without trying to teach you a coding language at the same time. Instead, we'll use familiar, user-friendly tools like Microsoft Excel to break down complex topics and help you understand exactly HOW and WHY machine learning works before you dive into programming languages like Python or R. Unlike most Data Science and Machine Learning courses, you won't write a SINGLE LINE of code. In this Part 3 course, we'll start by introducing core building blocks like linear relationships and least squared error, then show you how these concepts can be applied to univariate, multivariate, and non-linear regression models. From there we'll review common diagnostic metrics like R-squared, mean error, F-significance, and P-Values, along with important concepts like homoscedasticity and multicollinearity. Last but not least, we'll dive into time-series forecasting, and explore powerful techniques for identifying seasonality, predicting nonlinear trends, and measuring the impact of key business decisions using intervention analysis. Throughout the course we'll introduce hands-on case studies to solidify key concepts and tie them back to real-world scenarios.


Fast Newton method solving KLR based on Multilevel Circulant Matrix with log-linear complexity

arXiv.org Artificial Intelligence

Kernel logistic regression (KLR) is a conventional nonlinear classifier in machine learning. With the explosive growth of data size, the storage and computation of large dense kernel matrices is a major challenge in scaling KLR. Even when the Nyström approximation is applied to solve KLR, it still incurs a time complexity of $O(nc^2)$ and a space complexity of $O(nc)$, where $n$ is the number of training instances and $c$ is the sampling size. In this paper, we propose a fast Newton method that efficiently solves large-scale KLR problems by exploiting the storage and computing advantages of the multilevel circulant matrix (MCM). Specifically, by approximating the kernel matrix with an MCM, the storage space is reduced to $O(n)$, and by further approximating the coefficient matrix of the Newton equation as an MCM, the computational complexity of each Newton iteration is reduced to $O(n \log n)$. The proposed method runs in log-linear time per iteration because the multiplication of an MCM (or its inverse) with a vector can be implemented with the multidimensional fast Fourier transform (mFFT). Experimental results on some large-scale binary and multi-class classification problems show that the proposed method enables KLR to scale to large problems with less memory consumption and less training time without sacrificing test accuracy.
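The log-linear cost quoted above comes from the fact that a circulant matrix is diagonalized by the Fourier transform. The sketch below illustrates this for the simplest case, a one-level circulant matrix, rather than the multilevel matrices and mFFT used in the paper. The function names and the small radix-2 FFT are illustrative assumptions, not the authors' implementation, and the FFT assumes the length is a power of two.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def ifft(a):
    """Inverse FFT: run the transform with the opposite sign, then divide by n."""
    n = len(a)
    return [v / n for v in fft(a, invert=True)]

def circulant_matvec_fft(first_col, x):
    """Compute C @ x in O(n log n), where C[i][j] = first_col[(i - j) % n].

    Only the first column is stored, so memory is O(n) instead of O(n^2).
    """
    fc = fft([complex(v) for v in first_col])
    fx = fft([complex(v) for v in x])
    y = ifft([p * q for p, q in zip(fc, fx)])
    return [v.real for v in y]

def circulant_matvec_direct(first_col, x):
    """Reference O(n^2) product, used only to check the FFT version."""
    n = len(first_col)
    return [sum(first_col[(i - j) % n] * x[j] for j in range(n))
            for i in range(n)]

if __name__ == "__main__":
    c = [4.0, 1.0, 0.0, 1.0]
    x = [1.0, 2.0, 3.0, 4.0]
    print(circulant_matvec_fft(c, x))
    print(circulant_matvec_direct(c, x))
```

Solving a circulant system works the same way, dividing by the transformed first column instead of multiplying, provided none of its entries is zero.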


9 Best Data Analyst with R Online Courses

#artificialintelligence

Do you want to learn data analytics with R? If so, that is a good decision, because R offers a wide range of statistical and graphical capabilities and a huge variety of libraries for statistical analysis. Some of the most powerful visualization packages in R are ggplot2, ggvis, googleVis, and rCharts. So, if you are looking for data analyst with R online courses, this article will help you.


Data-Driven Sample Average Approximation with Covariate Information

arXiv.org Machine Learning

We study optimization for data-driven decision-making when we have observations of the uncertain parameters within the optimization model together with concurrent observations of covariates. Given a new covariate observation, the goal is to choose a decision that minimizes the expected cost conditioned on this observation. We investigate three data-driven frameworks that integrate a machine learning prediction model within a stochastic programming sample average approximation (SAA) for approximating the solution to this problem. Two of the SAA frameworks are new and use out-of-sample residuals of leave-one-out prediction models for scenario generation. The frameworks we investigate are flexible and accommodate parametric, nonparametric, and semiparametric regression techniques. We derive conditions on the data generation process, the prediction model, and the stochastic program under which solutions of these data-driven SAAs are consistent and asymptotically optimal, and also derive convergence rates and finite sample guarantees. Computational experiments validate our theoretical results, demonstrate the potential advantages of our data-driven formulations over existing approaches (even when the prediction model is misspecified), and illustrate the benefits of our new data-driven formulations in the limited data regime.
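A minimal sketch of the residual-based idea described above: fit a prediction model, compute leave-one-out residuals, add them to the prediction at the new covariate to form scenarios, and solve the SAA over those scenarios. The newsvendor cost, the single-covariate linear model, and all function names are assumptions chosen for illustration; the abstract does not specify them, and this is not the authors' code.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single covariate."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def loo_residuals(xs, ys):
    """Out-of-sample residuals: point i is predicted by a model fit without it."""
    res = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        res.append(ys[i] - (a + b * xs[i]))
    return res

def newsvendor_saa(scenarios, underage, overage):
    """Minimize the sample mean of underage*(d - q)+ + overage*(q - d)+.

    For this (assumed) cost, an optimal q is the empirical quantile of the
    scenarios at the critical fractile underage / (underage + overage).
    """
    s = sorted(scenarios)
    frac = underage / (underage + overage)
    k = max(1, math.ceil(frac * len(s)))
    return s[k - 1]

def residual_saa_decision(xs, ys, x_new, underage, overage):
    """Scenarios are the point prediction at x_new shifted by each residual."""
    a, b = fit_line(xs, ys)
    scenarios = [a + b * x_new + r for r in loo_residuals(xs, ys)]
    return newsvendor_saa(scenarios, underage, overage)

if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [5.2, 7.9, 11.1, 13.8, 17.3]   # made-up demand observations
    print(residual_saa_decision(xs, ys, x_new=6.0, underage=3.0, overage=1.0))
```

Replacing `loo_residuals` with in-sample residuals gives the simpler variant in which the same fitted model produces both the prediction and the scenarios.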


[100%OFF] Linear Regression And Logistic Regression Using R Studio

#artificialintelligence

You're looking for a complete Linear Regression and Logistic Regression course that teaches you everything you need to create a Linear or Logistic Regression model in R Studio, right? You've found the right Linear Regression course! A Verifiable Certificate of Completion is presented to all students who undertake this Machine Learning basics course. How will this course help you? Why should you choose this course?


Perception-Aware Attack: Creating Adversarial Music via Reverse-Engineering Human Perception

arXiv.org Artificial Intelligence

Recently, adversarial machine learning attacks have posed serious security threats against practical audio signal classification systems, including speech recognition, speaker recognition, and music copyright detection. Previous studies have mainly focused on ensuring the effectiveness of attacking an audio signal classifier by creating a small noise-like perturbation on the original signal. It is still unclear whether an attacker can create audio signal perturbations that are well perceived by human beings while retaining their attack effectiveness. This is particularly important for music signals as they are carefully crafted with human-enjoyable audio characteristics. In this work, we formulate the adversarial attack against music signals as a new perception-aware attack framework, which integrates human study into adversarial attack design. Specifically, we conduct a human study to quantify human perception with respect to a change in a music signal. We invite human participants to rate their perceived deviation based on pairs of original and perturbed music signals, and reverse-engineer the human perception process by regression analysis to predict the human-perceived deviation given a perturbed signal. The perception-aware attack is then formulated as an optimization problem that finds an optimal perturbation signal to minimize the prediction of perceived deviation from the regressed human perception model. We use the perception-aware framework to design a realistic adversarial music attack against YouTube's copyright detector. Experiments show that the perception-aware attack produces adversarial music with significantly better perceptual quality than prior work.


Internet of Things (IoT) based ECG System for Rural Health Care

arXiv.org Artificial Intelligence

Nearly 30% of the people in the rural areas of Bangladesh are below the poverty level. Moreover, due to the unavailability of modernized healthcare-related technology, nursing and diagnosis facilities are limited for rural people. Therefore, rural people are deprived of proper healthcare. From this perspective, modern technology can be used to mitigate their health problems. ECG sensing tools are interfaced with the human chest, and the requisite cardiovascular data are collected through an IoT device. These data are stored in the cloud, which is integrated with MQTT and HTTP servers. An innovative IoT-based method for ECG monitoring of cardiovascular or heart patients is suggested in this study. The ECG signal parameters P, Q, R, S, T are collected, pre-processed, and predicted to monitor cardiovascular conditions for further health management. A machine learning algorithm is used to determine the significance of the ECG signal parameters and the error rate. The logistic regression model showed better agreement between the training and test data. Prediction has been performed to determine the variation of PQRST quality and its suitability for the ECG monitoring system. Considering the values of the quality parameters, satisfactory results are obtained. The proposed IoT-based ECG system may reduce the health care cost and complexity of cardiovascular diseases in the future.


An Urban Population Health Observatory for Disease Causal Pathway Analysis and Decision Support: Underlying Explainable Artificial Intelligence Model

arXiv.org Artificial Intelligence

This study sought to (1) expand our existing Urban Population Health Observatory (UPHO) system by incorporating a semantics layer; (2) cohesively employ machine learning and semantic/logical inference to provide measurable evidence and detect pathways leading to undesirable health outcomes; (3) provide clinical use case scenarios and design case studies to identify socioenvironmental determinants of health associated with the prevalence of obesity; and (4) design a dashboard that demonstrates the use of UPHO in the context of obesity surveillance using the provided scenarios. The system design includes a knowledge graph generation component that provides contextual knowledge from relevant domains of interest. This system leverages semantics using concepts, properties, and axioms from existing ontologies. In addition, we used the publicly available US Centers for Disease Control and Prevention 500 Cities data set to perform multivariate analysis. A cohesive approach that employs machine learning and semantic/logical inference reveals pathways leading to diseases. In this study, we present 2 clinical case scenarios and a proof-of-concept prototype design of a dashboard that provides warnings, recommendations, and explanations and demonstrates the use of UPHO in the context of obesity surveillance, treatment, and prevention. While exploring the case scenarios using a support vector regression machine learning model, we found that poverty, lack of physical activity, education, and unemployment were the most important predictive variables that contribute to obesity in Memphis, TN. The application of UPHO could help reduce health disparities and improve urban population health. The expanded UPHO feature incorporates an additional level of interpretable knowledge to enhance physicians', researchers', and health officials' informed decision-making at both patient and community levels.


Generalized Linear Model

#artificialintelligence

Topics covered: what a Generalized Linear Model is; why GLMs are used; assumptions of GLMs; components of a GLM; different Generalized Linear Models; the difference between a Generalized Linear Model and a General Linear Model; and whether Generalized Linear Models can handle correlated data. The Generalized Linear Model (GLiM, or GLM) is an advanced statistical modelling technique formulated by John Nelder and Robert Wedderburn in 1972. It is an umbrella term for a family of models that allow the response variable y to have an error distribution other than the normal distribution. The family includes Linear Regression, Logistic Regression, and Poisson Regression.
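The three models named above share the same linear predictor and differ in how it is mapped to the mean of the response. The sketch below shows that mapping with each model's canonical link (identity, logit, log). The function and dictionary names are illustrative assumptions, and coefficient fitting is deliberately left out.

```python
import math

def identity(eta):
    """Linear regression: the mean equals the linear predictor."""
    return eta

def inverse_logit(eta):
    """Logistic regression: maps the linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def inverse_log(eta):
    """Poisson regression: maps the linear predictor to a positive expected count."""
    return math.exp(eta)

# Inverse link function for each family (hypothetical lookup table).
INVERSE_LINKS = {
    "gaussian": identity,
    "binomial": inverse_logit,
    "poisson": inverse_log,
}

def glm_mean(family, coefs, x):
    """Mean response E[y | x] for coefficients [intercept, b1, b2, ...]."""
    eta = coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))
    return INVERSE_LINKS[family](eta)

if __name__ == "__main__":
    coefs = [-1.0, 0.5]   # made-up intercept and slope
    for family in INVERSE_LINKS:
        print(family, glm_mean(family, coefs, [2.0]))
```

With these made-up coefficients the linear predictor at x = 2 is zero, so the three families return 0, 0.5, and 1 respectively, which makes the effect of each link easy to see.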


Machine Learning to Predict the Antimicrobial Activity of Cold Atmospheric Plasma-Activated Liquids

arXiv.org Artificial Intelligence

Plasma is defined as the fourth state of matter, and non-thermal plasma can be produced at atmospheric pressure under a high electrical field. The strong and broad-spectrum antimicrobial effect of plasma-activated liquids (PALs) is now well known. The proven applicability of machine learning (ML) in the medical field is encouraging for its application in the field of plasma medicine as well. Thus, ML applications on PALs could present a new perspective to better understand the influences of various parameters on their antimicrobial effects. In this paper, comparative supervised ML models are presented by using previously obtained data to qualitatively predict the in vitro antimicrobial activity of PALs. A literature search was performed and data were collected from 33 relevant articles. After the required preprocessing steps, two supervised ML methods, namely classification and regression, are applied to the data to obtain microbial inactivation (MI) predictions. For classification, MI is labeled in four categories, and for regression, MI is used as a continuous variable. Two different robust cross-validation strategies are used to evaluate the classification and regression models: repeated stratified k-fold cross-validation and k-fold cross-validation, respectively. We also investigate the effect of different features on the models. The results demonstrated that the hyperparameter-optimized Random Forest Classifier (oRFC) and Random Forest Regressor (oRFR) provided better results than other models for classification and regression, respectively. Finally, the best test accuracy of 82.68% for the oRFC and an $R^2$ of 0.75 for the oRFR are obtained. ML techniques could contribute to a better understanding of plasma parameters that have a dominant role in the desired antimicrobial effect. Furthermore, such findings may contribute to the definition of a plasma dose in the future.
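The two validation strategies mentioned above differ in how samples are assigned to folds. The sketch below, with hypothetical function names, builds plain k-fold splits by position and stratified splits that keep each class spread evenly across folds. It omits shuffling and repetition, so it is a simplified illustration and not the evaluation code used in the paper.

```python
from collections import defaultdict

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds whose sizes differ by at most one."""
    folds = []
    start = 0
    for f in range(k):
        size = n // k + (1 if f < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def stratified_k_fold_indices(labels, k):
    """Deal the indices of each class round-robin into k folds.

    Every fold then holds roughly the same share of each class as the full data set.
    """
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)
    return folds

def train_test_splits(folds):
    """Yield (train, test) index lists, using each fold once as the test set."""
    for t, test in enumerate(folds):
        train = [i for f, fold in enumerate(folds) if f != t for i in fold]
        yield train, test

if __name__ == "__main__":
    labels = ["low"] * 6 + ["high"] * 3   # made-up class labels
    for train, test in train_test_splits(stratified_k_fold_indices(labels, 3)):
        print(sorted(test), [labels[i] for i in sorted(test)])
```

A repeated variant would shuffle the indices within each class with a different random seed before each round of splitting and average the scores over all rounds.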