AITopics | Regression

Regression

The GTEx Consortium atlas of genetic regulatory effects across human tissues

Science Sep-10-2020, 17:41:01 GMT

The Genotype-Tissue Expression (GTEx) project was established to characterize genetic effects on the transcriptome across human tissues and to link these regulatory mechanisms to trait and disease associations. Here, we present analyses of the version 8 data, examining 15,201 RNA-sequencing samples from 49 tissues of 838 postmortem donors. We comprehensively characterize genetic associations for gene expression and splicing in cis and trans, showing that regulatory associations are found for almost all genes, and describe the underlying molecular mechanisms and their contribution to allelic heterogeneity and pleiotropy of complex traits. Leveraging the large diversity of tissues, we provide insights into the tissue specificity of genetic effects and show that cell type composition is a key factor in understanding gene regulatory mechanisms in human tissues.

artificial intelligence, machine learning, variant, (18 more...)

Science

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Washington > King County > Seattle (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.14)
(36 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.68)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

A First Step Towards Distribution Invariant Regression Metrics

Krell, Mario Michael, Wehbe, Bilal

arXiv.org Machine Learning Sep-10-2020

Regression evaluation has been performed for decades. Some metrics have been identified as robust against shifting and scaling of the data, but accounting for different data distributions (the imbalance problem) is much more difficult, even though it largely impacts the comparability between evaluations on different datasets. In classification, it has been stated repeatedly that performance metrics like the F-Measure and Accuracy are highly dependent on the class distribution and that comparisons between different datasets with different distributions are impossible. We show that the same problem exists in regression. The distribution of odometry parameters in robotic applications can, for example, vary considerably between different recording sessions. Here, we need regression algorithms that either perform equally well for all function values, or that focus on certain boundary regions like high speed. This has to be reflected in the evaluation metric. We propose the modification of established regression metrics by weighting with the inverse distribution of function values $Y$ or the samples $X$ using an automatically tuned Gaussian kernel density estimator. We show on synthetic and robotic data in reproducible experiments that classical metrics behave wrongly, whereas our new metrics are less sensitive to changing distributions, especially when correcting by the marginal distribution in $X$. Our new evaluation concept enables the comparison of results between different datasets with different distributions. Furthermore, it can reveal overfitting of a regression algorithm to overrepresented target values. As an outcome, non-overfitting regression algorithms are more likely to be chosen when our corrected metrics are used.
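The weighting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `gaussian_kde` and `weighted_mae`, the normal-reference bandwidth rule (standing in for the paper's automatic tuning), and the toy data are all assumptions made for illustration. The sketch weights each absolute error by the inverse of the estimated density of its target value.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the density of `samples` with Gaussian kernels."""
    n = len(samples)
    if bandwidth is None:
        # Assumption: normal-reference rule of thumb, used here in place of
        # the automatic tuning mentioned in the abstract.
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(y):
        return norm * sum(math.exp(-0.5 * ((y - s) / bandwidth) ** 2) for s in samples)

    return density


def weighted_mae(y_true, y_pred, density):
    """Mean absolute error with each sample weighted by 1 / density(y_true)."""
    weights = [1.0 / density(y) for y in y_true]
    total = sum(weights)
    return sum(w * abs(t - p) for w, t, p in zip(weights, y_true, y_pred)) / total


# Toy data (invented): many targets near 0, few near 10.
# The hypothetical model is only wrong on the rare, large targets.
y_true = [0.0, 0.1, -0.1, 0.2, -0.2, 0.05, -0.05, 0.15, 10.0, 10.2]
y_pred = [0.0, 0.1, -0.1, 0.2, -0.2, 0.05, -0.05, 0.15, 8.0, 8.2]

plain_mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
corrected_mae = weighted_mae(y_true, y_pred, gaussian_kde(y_true))
```

On this toy data the plain MAE is dominated by the eight well-predicted common targets, while the density-weighted variant gives the two rare targets a larger share of the total weight, so errors on them are harder to hide.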

artificial intelligence, machine learning, regression, (16 more...)

arXiv.org Machine Learning

2009.05176

Country:

Europe > Germany > Bremen > Bremen (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > District of Columbia > Washington (0.05)
(3 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Generalized Multi-Output Gaussian Process Censored Regression

Gammelli, Daniele, Rolsted, Kasper Pryds, Pacino, Dario, Rodrigues, Filipe

arXiv.org Machine Learning Sep-10-2020

When modelling censored observations, a typical approach in current regression methods is to use a censored-Gaussian (i.e. Tobit) model to describe the conditional output distribution. In this paper, as in the case of missing data, we argue that exploiting correlations between multiple outputs can enable models to better address the bias introduced by censored data. To do so, we introduce a heteroscedastic multi-output Gaussian process model which combines the non-parametric flexibility of GPs with the ability to leverage information from correlated outputs under input-dependent noise conditions. To address the resulting inference intractability, we further devise a variational bound to the marginal log-likelihood suitable for stochastic optimization. We empirically evaluate our model against other generative models for censored data on both synthetic and real-world tasks and further show how it can be generalized to deal with arbitrary likelihood functions. Results show how the added flexibility allows our model to better estimate the underlying non-censored (i.e. true) process under potentially complex censoring dynamics.
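For readers unfamiliar with the censored-Gaussian (Tobit) likelihood mentioned here, the following is a minimal sketch of the per-observation log-likelihood. It assumes right-censoring (the recorded value is only a lower bound on the latent value), which the abstract does not specify, and the function name `tobit_loglik` is hypothetical. It does not reproduce the paper's multi-output Gaussian process model.

```python
import math


def normal_logpdf(z):
    """Log-density of the standard normal at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_logsf(z):
    """Log of P(Z >= z) for a standard normal Z, via the complementary error function."""
    return math.log(0.5 * math.erfc(z / math.sqrt(2.0)))


def tobit_loglik(y, mean, sigma, censored):
    """Log-likelihood of one observation under a right-censored Gaussian.

    Assumption: if `censored` is True, `y` is only a lower bound on the
    latent value, so the term is log P(latent >= y) instead of a log-density.
    """
    z = (y - mean) / sigma
    if censored:
        return normal_logsf(z)
    return normal_logpdf(z) - math.log(sigma)


# An uncensored observation at the mean, and a censored one at the same point.
ll_observed = tobit_loglik(0.0, mean=0.0, sigma=1.0, censored=False)
ll_censored = tobit_loglik(0.0, mean=0.0, sigma=1.0, censored=True)
```

The censored term uses the survival probability because the observation only says that the latent value was at least `y`; treating it as an exact observation is the source of the bias the abstract refers to.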

artificial intelligence, likelihood function, machine learning, (17 more...)

arXiv.org Machine Learning

2009.04822

Country:

Europe > Denmark > Capital Region > Copenhagen (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Law > Civil Rights & Constitutional Law (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)

Simulating normalising constants with referenced thermodynamic integration: application to COVID-19 model selection

Hawryluk, Iwona, Mishra, Swapnil, Flaxman, Seth, Bhatt, Samir, Mellan, Thomas A.

arXiv.org Machine Learning Sep-10-2020

Model selection is a fundamental part of Bayesian statistical inference, a widely used tool in the field of epidemiology. Simple methods such as the Akaike Information Criterion are commonly used, but they do not incorporate the uncertainty of the model's parameters, which can give misleading choices when comparing models with similar fit to the data. A more rigorous approach to model selection, which uses the full posterior distributions of the models, is to compute the ratio of the normalising constants (or model evidence), known as the Bayes factor. These normalising constants integrate the posterior distribution over all parameters and balance over- and underfitting. However, normalising constants often come in the form of intractable, high-dimensional integrals, so special probabilistic techniques need to be applied to correctly estimate the Bayes factors. One such method is thermodynamic integration (TI), which can be used to estimate the ratio of two models' evidence by integrating over a continuous path between the two un-normalised densities. In this paper we introduce a variation of the TI method, here referred to as referenced TI, which computes a single model's evidence in an efficient way by using a reference density, such as a multivariate normal, whose normalising constant is known. We show that referenced TI, an asymptotically exact Monte Carlo method of calculating the normalising constant of a single model, in practice converges to the correct result much faster than other competing approaches such as the method of power posteriors. We illustrate the implementation of the algorithm on informative 1- and 2-dimensional examples, apply it to a popular linear regression problem, and use it to select parameters for a model of the COVID-19 epidemic in South Korea.
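The thermodynamic integration identity described here can be checked on a one-dimensional toy problem. The sketch below is an illustration only, not the paper's algorithm: it replaces Monte Carlo sampling with deterministic quadrature on a grid, and the two densities, the grid limits, and the function names are all assumptions. It uses the geometric path q_lambda = q1^lambda * q0^(1 - lambda), for which log(z1 / z0) equals the integral over lambda from 0 to 1 of the expectation of log q1 - log q0 under q_lambda.

```python
import math


def log_q0(x):
    # Reference density: unnormalised standard normal, z0 = sqrt(2*pi) is known.
    return -0.5 * x * x


def log_q1(x):
    # Target density: unnormalised normal with standard deviation 2.
    # Its constant z1 = 2*sqrt(2*pi) is treated as unknown by the estimator.
    return -x * x / 8.0


def expected_log_ratio(lam, grid):
    """Expectation of log q1 - log q0 under q_lam, proportional to q1^lam * q0^(1-lam).

    The grid spacing cancels in the ratio, so a plain sum over grid points suffices.
    """
    num = 0.0
    den = 0.0
    for x in grid:
        w = math.exp(lam * log_q1(x) + (1.0 - lam) * log_q0(x))
        num += w * (log_q1(x) - log_q0(x))
        den += w
    return num / den


def ti_log_ratio(n_lambda=100, half_width=16.0, n_x=1600):
    """Estimate log(z1 / z0) with the trapezoidal rule over the path parameter."""
    step = 2.0 * half_width / n_x
    grid = [-half_width + i * step for i in range(n_x + 1)]
    values = [expected_log_ratio(i / n_lambda, grid) for i in range(n_lambda + 1)]
    h = 1.0 / n_lambda
    return h * (0.5 * values[0] + sum(values[1:-1]) + 0.5 * values[-1])


log_ratio = ti_log_ratio()
# Add the known log-constant of the reference to recover the target's log-evidence.
log_z1 = 0.5 * math.log(2.0 * math.pi) + log_ratio
```

For these two densities the exact answer is log(z1 / z0) = log 2, which gives a simple check on the estimate. In realistic models the inner expectation has no grid-based shortcut and is estimated from samples at each value of lambda.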

approximation, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2009.03851

Country:

Asia > South Korea (0.25)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

A Gentle Introduction to Self-Training and Semi-Supervised Learning

#artificialintelligence Sep-9-2020, 20:30:11 GMT

When it comes to machine learning classification tasks, the more data available to train algorithms, the better. In supervised learning, this data must be labeled with respect to the target class -- otherwise, these algorithms wouldn't be able to learn the relationships between the independent and target variables. So, what if we only have enough time and money to label some of a large data set, and choose to leave the rest unlabeled? Can this unlabeled data somehow be used in a classification algorithm? This is where semi-supervised learning comes in.

The First Step in Bayesian Time Series-- Linear Regression

#artificialintelligence Sep-9-2020, 08:34:02 GMT

Today time series forecasting is ubiquitous, and decision-making processes in companies depend heavily on their ability to predict the future. Through a short series of articles I will present a possible approach to this kind of problem, combining state-space models with Bayesian statistics. In the initial articles, I will take some of the examples from the book An Introduction to State Space Time Series Analysis by Jacques J.F. Commandeur and Siem Jan Koopman [1]. It is a well-known introduction to the subject of state-space modeling applied to the time series domain. In classical regression analysis, a linear relationship is assumed between a dependent variable y and a predictor variable x.
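As a point of reference for the classical setting mentioned in the last sentence, here is a minimal sketch of fitting y = intercept + slope * x by ordinary least squares. It is not taken from the article, which goes on to a Bayesian treatment; the function name `fit_line` and the toy data are assumptions.

```python
def fit_line(x, y):
    """Closed-form ordinary least squares for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the sample covariance of x and y divided by the sample variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data (invented) lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(x, y)
```

Because the toy data are noise-free, the fit recovers the intercept 1 and slope 2 of the generating line.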

artificial intelligence, autocorrelation, machine learning, (13 more...)

#artificialintelligence

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Working out the mystery of ectasia risk with artificial intelligence

#artificialintelligence Sep-9-2020, 06:45:21 GMT

This article was reviewed by Renato Ambrósio, Jr, MD, PhD. Ectasia is an intriguing and mysterious complication of laser-vision-correction (LVC) procedures. This potentially devastating problem underscores the importance of determining the susceptibility of the cornea to developing progressive ectasia, and of going beyond detecting just mild or subclinical keratoconus. The corneal structure as well as the potential impact of LVC should be considered to predict ectasia risk in every patient. "The LVC procedure and eye rubbing are the primary environmental culprits in the development of ectasia in any cornea," said Renato Ambrósio, Jr, MD, PhD. "So, a basic factor for avoiding ectasia is educating the patient not to rub the eye."

artificial intelligence, ectasia, machine learning, (14 more...)

#artificialintelligence

Country:

South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.06)
Europe > Germany (0.05)

Genre:

Research Report > New Finding (0.32)
Research Report > Experimental Study (0.31)

Industry: Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Algorithmic Trading Using Logistic Regression - Hands-Off Investing

#artificialintelligence Sep-9-2020, 03:20:24 GMT

With the increasing popularity of machine learning, many traders are looking for ways in which they can "teach" a computer to trade for them. This process is called algorithmic trading (sometimes called algo-trading). Algorithmic trading is a hands-off strategy for buying and selling stocks that leverages technical indicators instead of human intuition. To implement an algorithmic trading strategy, though, you first have to narrow down a list of stocks that you want to analyze. This walk-through provides an automated process (using Python and logistic regression) for determining the best stocks to algo-trade.

logistic regression, machine learning, technical indicator, (9 more...)

#artificialintelligence

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.66)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.78)

Build an IoT hub for streaming, storing, and analyzing sensor data in the cloud: Connect an Android device to the IBM Cloud, build a Node-RED dashboard, and build an AI classifier

#artificialintelligence Sep-8-2020, 08:15:48 GMT

In this tutorial, we present the high-level steps that are involved in connecting an Android device to the cloud and developing analytics models to analyze sensor data. By the end of this tutorial you should be able to set up your own IoT hub for streaming, storing and processing device data. The following figure shows the architecture of our sample app. This tutorial requires an Android device (smartphone), an internet connection, and an IBM Cloud account. In Step 1 you will create an account on IBM Cloud and install an application on your Android phone.

android device, cloud computing, machine learning, (17 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Information Technology > Software (0.88)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Empirical Strategy for Stretching Probability Distribution in Neural-network-based Regression

Koo, Eunho, Kim, Hyungjun

arXiv.org Artificial Intelligence Sep-8-2020

In regression analysis under artificial neural networks, the prediction performance depends on determining the appropriate weights between layers. As randomly initialized weights are updated during back-propagation using the gradient descent procedure under a given loss function, the loss function structure can affect the performance significantly. In this study, we considered the distribution error, i.e., the inconsistency of two distributions (those of the predicted values and label), as the prediction error, and proposed weighted empirical stretching (WES) as a novel loss function to increase the overlap area of the two distributions. The function depends on the distribution of a given label, so it is applicable to any distribution shape. Moreover, it contains a scaling hyperparameter such that the appropriate parameter value maximizes the common section of the two distributions. To test the function capability, we generated ideal distributed curves (unimodal, skewed unimodal, bimodal, and skewed bimodal) as the labels, and used the Fourier-extracted input data from the curves under a feedforward neural network. In general, WES outperformed loss functions in wide use, and the performance was robust to the various noise levels. The improved results in RMSE for the extreme domain (i.e., both tail regions of the distribution) are expected to be utilized for prediction of abnormal events in non-linear complex systems such as natural disasters and financial crises.

artificial intelligence, loss function, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2009.03534

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)
North America > United States (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > East Asia (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Banking & Finance (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
