AITopics

2203.00554

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

#artificialintelligence Feb-28-2022, 08:50:09 GMT

Project 8 Part 1: Logistic Regression - Python

Welcome! Hi again, hi again! If you've been catching up with my blog, thanks for your continuous support. If you're new here, thank you for giving my blog a chance. Since I started learning R, I've thought about making code comparisons between Python and R. Coincidentally, I've also started learning machine learning, so I thought... why not try and compare machine learning code between Python and R! So far, I've learned how to build logistic regression models using Python and R. Project 8 is divided into parts 1 and 2, which describe the code in Python and R respectively. I will be using the Iris dataset to demonstrate how the code works. If you're someone who requires assistive software to read, I suggest downloading the PDF documents to read the code. Python - Jupyter Notebook: For this project, I built a logistic regression model using sklearn. For starters, the packages I used were Pandas, Numpy, Scipy, Sklearn, and matplotlib.
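As a rough illustration of the kind of model the post describes, here is a minimal sketch that fits a logistic regression to the Iris dataset with scikit-learn. The split ratio, `random_state`, and `max_iter` values are assumptions chosen for this example and are not taken from the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the Iris features (150 rows, 4 columns) and the three-class target.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data for evaluation (assumed split, not from the post).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# max_iter is raised so the default solver converges on unscaled features.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Stratifying the split keeps the three species equally represented in the training and test sets, which matters for a dataset this small.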

dataset, logistic regression model, pdf version, (13 more...)

Genre:

Research Report > New Finding (0.86)
Research Report > Experimental Study (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligence Feb-28-2022, 08:20:18 GMT

R Programming: Selection of variables

The all-possible-regressions procedure considers all possible subsets of the pool of potential explanatory variables X_i (with i = 1, 2, …, m). It then identifies a small group of regression models which are "good" according to a specified criterion. A detailed examination of these models can lead to the selection of the final model. If there are m candidate explanatory variables, there are 2^m regressions for all possible subsets (e.g. if m = 10, then there are 1024 possible regression models). The function leaps() (from package leaps) performs an exhaustive search for the best subsets of the explanatory variables for predicting the response variable in linear regression. This gave us a little idea, but we are still not sure how many parameters should be used.
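The 2^m count can be checked by enumerating the subsets directly. The sketch below is in Python rather than R and only lists the candidate subsets; it does not fit or score any model. The variable names X1 to X10 are hypothetical placeholders.

```python
from itertools import combinations

def all_subsets(variables):
    """Yield every subset of the candidate variables, including the empty one."""
    for size in range(len(variables) + 1):
        yield from combinations(variables, size)

# Hypothetical variable names X1..X10, used only for illustration.
candidates = [f"X{i}" for i in range(1, 11)]
subsets = list(all_subsets(candidates))

print(len(subsets))                          # 1024
print(len(subsets) == 2 ** len(candidates))  # True
```

The empty subset corresponds to the intercept-only model, which is why the total is 2^m and not 2^m - 1.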

criterion, regression model, selection criterion, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.81)

#artificialintelligence Feb-28-2022, 08:20:17 GMT

Hyper Parameter Tuning with Uninformed and Informed Search

Hyperparameters are the parameters of machine learning algorithms that control the learning process. Hyperparameter tuning is the process of finding the hyperparameter values that help us build more accurate machine learning models. Note: There is a difference between model parameters and hyperparameters. Model parameters are learned from data, e.g. the slope and intercept in linear regression models, whereas hyperparameters are those we set ourselves, such as the choice of L1 or L2 regularization in a regression model.
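To make the idea of searching over hyperparameters concrete, here is a small sketch of two uninformed strategies, grid search and random search, using only the standard library. The `validation_loss` function is a hypothetical stand-in for a cross-validated score, and the hyperparameter names `alpha` and `l1_ratio` are assumptions for illustration.

```python
import itertools
import random

def validation_loss(alpha, l1_ratio):
    """Hypothetical stand-in for a cross-validated loss; minimum at (0.1, 0.5)."""
    return (alpha - 0.1) ** 2 + (l1_ratio - 0.5) ** 2

# Grid search: evaluate every combination on a fixed grid.
alphas = [0.001, 0.01, 0.1, 1.0]
l1_ratios = [0.0, 0.25, 0.5, 0.75, 1.0]
grid_best = min(itertools.product(alphas, l1_ratios),
                key=lambda p: validation_loss(*p))

# Random search: sample the same number of points (20) from the search space.
rng = random.Random(0)
samples = [(10 ** rng.uniform(-3, 0), rng.uniform(0, 1)) for _ in range(20)]
random_best = min(samples, key=lambda p: validation_loss(*p))

print("grid search best:  ", grid_best)
print("random search best:", random_best)
```

Both strategies are uninformed in the sense that no evaluation influences where the next point is tried; an informed method would use earlier results to choose later candidates.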

best hyperparameter, hyperparameter, search space, (10 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.95)

Pham, Nhat Thien, Chamroukhi, Faicel

Functional mixture-of-experts for classification

arXiv.org Machine Learning Feb-28-2022

We develop a mixtures-of-experts (ME) approach to the multiclass classification where the predictors are univariate functions. It consists of a ME model in which both the gating network and the experts network are constructed upon multinomial logistic activation functions with functional inputs. We perform a regularized maximum likelihood estimation in which the coefficient functions enjoy interpretable sparsity constraints on targeted derivatives. We develop an EM-Lasso like algorithm to compute the regularized MLE and evaluate the proposed approach on simulated and real data.

algorithm, classification, coefficient function, (14 more...)

2202.13934

Country:

Europe > France (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

#artificialintelligence Feb-26-2022, 13:55:14 GMT

Beginner Machine Learning: 2) Multiple Linear Regression in Python

A regression model is a statistical model that estimates the relationship between one dependent variable and one or more independent variables using a line (or a plane in the case of two or more independent variables). Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable. Multiple regression is an extension of simple linear (OLS) regression, which uses just one explanatory variable. Let's try to make predictions about startups using Multiple Linear Regression in Python. We will be using the Scikit-learn library to import the necessary functions required for this exercise. We will be using Pandas and Numpy for data exploration.
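The following sketch shows what fitting a multiple linear regression amounts to. It uses NumPy's least-squares solver instead of Scikit-learn, and the data are synthetic (three explanatory variables with made-up coefficients), since the startup dataset itself is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 50 observations, 3 explanatory variables (assumed for illustration).
X = rng.normal(size=(50, 3))
true_coef = np.array([2.0, -1.0, 0.5])
true_intercept = 4.0
y = true_intercept + X @ true_coef  # noise-free, so the fit recovers the coefficients

# Add a column of ones so the intercept is estimated along with the slopes.
X_design = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)

intercept, coef = beta[0], beta[1:]
print("intercept:", round(float(intercept), 3))
print("coefficients:", np.round(coef, 3))
```

With real data the response would include noise, and the estimated coefficients would only approximate the underlying relationship.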

beginner machine learning, explanatory variable, multiple linear regression, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligence Feb-25-2022, 20:40:32 GMT

Bayesian Statistics Overview and your first Bayesian Linear Regression Model

Frequentist and Bayesian are two different approaches to statistics. The frequentist approach is the more classical one and, as the name suggests, relies on the long-run frequency of events (data points) to calculate the variable of interest. The Bayesian approach, on the other hand, can also work without a large number of events (in fact, it can work even with one data point!). The cardinal difference between the two is that the frequentist approach will give you a point estimate, whereas the Bayesian approach will give you a distribution. Having a point estimate means that -- "we are certain that this is the output for this variable of interest". Whereas, having a distribution can be interpreted as -- "we have some belief that the mean of the distribution is a good estimate for this variable of interest, but there is uncertainty too, in the form of standard deviation".
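The contrast between a point estimate and a distribution can be sketched with a single observation. The example below estimates an unknown mean (a simpler problem than linear regression) using a Normal prior and a known noise standard deviation; the prior values, the noise level, and the helper `posterior_normal` are assumptions made for illustration.

```python
import math

def posterior_normal(prior_mean, prior_sd, data, noise_sd):
    """Posterior over an unknown mean with a Normal prior and known noise sd."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = len(data) / noise_sd ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + sum(data) / noise_sd ** 2)
    return post_mean, math.sqrt(post_var)

data = [2.0]  # a single observation

# Frequentist-style point estimate: the sample mean.
point_estimate = sum(data) / len(data)

# Bayesian estimate: a full distribution, summarized by its mean and sd.
post_mean, post_sd = posterior_normal(prior_mean=0.0, prior_sd=1.0,
                                      data=data, noise_sd=1.0)
print(point_estimate)      # 2.0
print(post_mean, post_sd)  # 1.0 and roughly 0.707
```

The posterior mean sits between the prior mean and the observation, and the posterior standard deviation expresses the remaining uncertainty that a single point estimate does not report.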

bayesian, linear regression, probability, (15 more...)

Country: North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)

Hatt, Tobias, Berrevoets, Jeroen, Curth, Alicia, Feuerriegel, Stefan, van der Schaar, Mihaela

Combining Observational and Randomized Data for Estimating Heterogeneous Treatment Effects

arXiv.org Machine Learning Feb-25-2022

Estimating heterogeneous treatment effects is an important problem across many domains. In order to accurately estimate such treatment effects, one typically relies on data from observational studies or randomized experiments. Currently, most existing works rely exclusively on observational data, which is often confounded and, hence, yields biased estimates. While observational data is confounded, randomized data is unconfounded, but its sample size is usually too small to learn heterogeneous treatment effects. In this paper, we propose to estimate heterogeneous treatment effects by combining large amounts of observational data and small amounts of randomized data via representation learning. In particular, we introduce a two-step framework: first, we use observational data to learn a shared structure (in form of a representation); and then, we use randomized data to learn the data-specific structures. We analyze the finite sample properties of our framework and compare them to several natural baselines. As such, we derive conditions for when combining observational and randomized data is beneficial, and for when it is not. Based on this, we introduce a sample-efficient algorithm, called CorNet. We use extensive simulation studies to verify the theoretical properties of CorNet and multiple real-world datasets to demonstrate our method's superiority compared to existing methods.

conf, observational data, unc, (16 more...)

2202.12891

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Tennessee (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(2 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Information Management (0.92)
(4 more...)

arXiv.org Machine Learning Feb-25-2022

Trying to Outrun Causality with Machine Learning: Limitations of Model Explainability Techniques for Identifying Predictive Variables

Vowels, Matthew J.

Machine Learning explainability techniques have been proposed as a means of 'explaining' or interrogating a model in order to understand why a particular decision or prediction has been made. Such an ability is especially important at a time when machine learning is being used to automate decision processes which concern sensitive factors and legal outcomes. Indeed, it is even a requirement according to EU law. Furthermore, researchers concerned with imposing overly restrictive functional form (e.g., as would be the case in a linear regression) may be motivated to use machine learning algorithms in conjunction with explainability techniques, as part of exploratory research, with the goal of identifying important variables which are associated with an outcome of interest. For example, epidemiologists might be interested in identifying 'risk factors' - i.e. factors which affect recovery from disease - by using random forests and assessing variable relevance using importance measures. However, and as we demonstrate, machine learning algorithms are not as flexible as they might seem, and are instead incredibly sensitive to the underlying causal structure in the data. The consequences of this are that predictors which are, in fact, critical to a causal system and highly correlated with the outcome, may nonetheless be deemed by explainability techniques to be unrelated/unimportant/unpredictive of the outcome. Rather than this being a limitation of explainability techniques per se, we show that it is rather a consequence of the mathematical implications of regression, and the interaction of these implications with the associated conditional independencies of the underlying causal structure. We provide some alternative recommendations for researchers wanting to explore the data for important variables.

algorithm, graph, random forest, (14 more...)

2202.09875

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Government > Regional Government > Europe Government (0.48)
Health & Medicine > Epidemiology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

#artificialintelligence Feb-24-2022, 21:35:23 GMT

Starting With Linear Regression in Python – Real Python

This is just the beginning. Data science and machine learning are driving image recognition, autonomous vehicle development, decisions in the financial and energy sectors, advances in medicine, the rise of social networks, and more. Linear regression is an important part of this. Linear regression is one of the fundamental statistical and machine learning techniques. Whether you want to do statistics, machine learning, or scientific computing, there's a good chance that you'll need it.

linear regression, real python

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.98)