Chapter 2 Error control

#artificialintelligence

If you perform a study and plan to make a claim based on the statistical test you plan to perform, the long-run probability of making a correct or an erroneous claim is determined by three factors: the Type 1 error rate, the Type 2 error rate, and the probability that the null hypothesis is true. There are four possible outcomes of a statistical test, depending on whether the result is statistically significant or not, and on whether the null hypothesis is true or not. False Positive (FP): concluding there is a true effect when there is no true effect (\(H_0\) is true). This is also referred to as a Type 1 error, indicated by \(\alpha\). False Negative (FN): concluding there is no true effect when there is a true effect (\(H_1\) is true). This is also referred to as a Type 2 error, indicated by \(\beta\).
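These long-run rates are easy to see in simulation. Here is a minimal sketch (the sample size, effect size, and \(\alpha\) are arbitrary illustrative choices, not taken from the chapter): it repeatedly runs a two-sample t-test under \(H_0\) and under \(H_1\) and counts how often each kind of error occurs.

```python
# Hypothetical simulation of long-run Type 1 and Type 2 error rates for a
# two-sample t-test; n, alpha, and the effect size are illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, alpha, true_effect, n_sims = 50, 0.05, 0.5, 10_000

false_positives = 0  # significant results when H0 is true
false_negatives = 0  # non-significant results when H1 is true
for _ in range(n_sims):
    # H0 true: both groups drawn from the same distribution.
    a, b = rng.normal(0, 1, n), rng.normal(0, 1, n)
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1
    # H1 true: group means differ by `true_effect` (Cohen's d = 0.5).
    a, b = rng.normal(0, 1, n), rng.normal(true_effect, 1, n)
    if stats.ttest_ind(a, b).pvalue >= alpha:
        false_negatives += 1

print(f"empirical Type 1 error rate: {false_positives / n_sims:.3f}")  # ~ alpha
print(f"empirical Type 2 error rate: {false_negatives / n_sims:.3f}")  # ~ beta
```

The first rate converges to \(\alpha\) by construction; the second is \(\beta\), which depends on the sample size and the true effect size.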


Inferring Heterogeneous Causal Effects in Presence of Spatial Confounding

Osama, Muhammad, Zachariah, Dave, Schön, Thomas

arXiv.org Machine Learning

We address the problem of inferring the causal effect of an exposure on an outcome across space, using observational data. The data is possibly subject to unmeasured confounding variables which, in a standard approach, must be adjusted for by estimating a nuisance function. Here we develop a method that eliminates the nuisance function, while mitigating the resulting errors-in-variables. The result is a robust and accurate inference method for spatially varying heterogeneous causal effects. The properties of the method are demonstrated on synthetic as well as real data from Germany and the US.
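The paper's estimator is not reproduced here, but the problem it targets is easy to demonstrate. The toy sketch below (invented data-generating process, not the authors' method) shows how an unmeasured, spatially smooth confounder biases a naive effect estimate, and how adjusting for a crude spatial basis, standing in for the nuisance function the paper eliminates, reduces that bias.

```python
# Toy illustration of spatial confounding (not the paper's estimator):
# an unmeasured, spatially smooth confounder u(s) drives both the
# exposure and the outcome, biasing the naive effect estimate.
import numpy as np

rng = np.random.default_rng(1)
n, true_effect = 2000, 1.0
s = rng.uniform(0, 10, size=(n, 2))            # spatial locations
u = 0.1 * (s[:, 0] - 5) ** 2 - 0.5 * s[:, 1]   # smooth spatial confounder
x = u + rng.normal(0, 1, n)                    # exposure depends on space
y = true_effect * x + 2.0 * u + rng.normal(0, 1, n)

# Naive OLS of y on x is biased because u is omitted.
X = np.column_stack([np.ones(n), x])
beta_naive = np.linalg.lstsq(X, y, rcond=None)[0][1]

# Crude adjustment: low-order polynomial terms in the coordinates stand in
# for the nuisance function (u is deliberately low-order here so this
# basis can absorb it; the paper avoids estimating it at all).
Z = np.column_stack([np.ones(n), x, s, s ** 2, (s[:, 0] * s[:, 1])[:, None]])
beta_adj = np.linalg.lstsq(Z, y, rcond=None)[0][1]

print(f"naive:    {beta_naive:.2f}")   # noticeably above 1.0
print(f"adjusted: {beta_adj:.2f}")     # close to 1.0
```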


Estimating causal effects of time-dependent exposures on a binary endpoint in a high-dimensional setting

Asvatourian, Vahé, Coutzac, Clélia, Chaput, Nathalie, Robert, Caroline, Michiels, Stefan, Lanoy, Emilie

arXiv.org Machine Learning

Recently, the intervention calculus when the DAG is absent (IDA) method was developed to estimate lower bounds of causal effects from observational high-dimensional data. Originally it was introduced to assess the effect of baseline biomarkers that do not vary over time. However, in many clinical settings, measurements of biomarkers are repeated at fixed time points during treatment exposure, and the method therefore needs to be extended. The purpose of this paper is to extend the first step of the IDA, the Peter-Clark (PC) algorithm, to a time-dependent exposure in the context of a binary outcome. We generalised the PC-algorithm to take into account the chronological order of repeated measurements of the exposure, and propose to apply the IDA with our new version, the chronologically ordered PC-algorithm (COPC-algorithm). A simulation study was performed before applying the method to estimate the causal effects of time-dependent immunological biomarkers on toxicity, death and progression in patients with metastatic melanoma. The simulation study showed that the completed partially directed acyclic graphs (CPDAGs) obtained using the COPC-algorithm were structurally closer to the true CPDAG than those obtained using the PC-algorithm, and causal effects were more accurate when estimated from the COPC-algorithm's CPDAGs. Moreover, CPDAGs obtained by the COPC-algorithm allowed the removal of non-chronological arrows, i.e. arrows from a variable measured at time t to a variable measured at an earlier time t' (t' < t). Bidirected edges were less frequent in CPDAGs obtained with the COPC-algorithm, consistent with there being less variability in the causal effects estimated from these CPDAGs. The COPC-algorithm thus provides CPDAGs that preserve the chronological structure present in the data, allowing lower bounds of the causal effect of time-dependent biomarkers to be estimated.
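The full COPC-algorithm is not reproduced here; the sketch below only illustrates its core chronological constraint: any arrow pointing from a later measurement to an earlier one is disallowed, so a directed backwards edge is removed and an undirected edge is oriented forwards in time. The function name and adjacency-matrix encoding are hypothetical.

```python
# Hypothetical sketch of the chronological constraint behind the
# COPC-algorithm. Encoding: adj[i][j] == 1 means an arrow i -> j;
# an undirected edge has a 1 in both directions.
from typing import List

def enforce_chronology(adj: List[List[int]], time_of: List[int]) -> List[List[int]]:
    """Remove or re-orient edges that point backwards in time."""
    p = len(adj)
    for i in range(p):
        for j in range(p):
            if adj[i][j] and time_of[i] > time_of[j]:
                # i is measured after j, so the arrow i -> j is
                # non-chronological: drop it. If the edge was undirected,
                # the remaining j -> i entry orients it forwards in time.
                adj[i][j] = 0
    return adj

# Example: x0 measured at t=0, x1 at t=1, joined by an undirected edge.
adj = [[0, 1], [1, 0]]
print(enforce_chronology(adj, time_of=[0, 1]))  # [[0, 1], [0, 0]], i.e. x0 -> x1
```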


Should we worry about rigged priors? A long discussion.

#artificialintelligence

Today's discussion starts with Stuart Buck, who came across a post by John Cook linking to my post, "Bayesian statistics: What's it all about?". Cook wrote about the benefit of prior distributions in making assumptions explicit. Buck shared Cook's post with Jon Baron, who wrote: My concern is that if researchers are systematically too optimistic (or even self-deluded) about the prior evidence--which I think is usually the case--then using prior distributions as the basis for their new study can lead to too much statistical confidence in the study's results, and so could compound the problem. My response to Jon is that I think all aspects of a model should be justified.
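Baron's worry can be made concrete with a conjugate normal-normal update; the numbers below are invented for illustration. An optimistic prior centered on a large effect pulls the posterior toward that effect and shrinks the apparent uncertainty, even when the data themselves are weak and noisy.

```python
# Toy normal-normal conjugate update (invented numbers): how an
# optimistic prior inflates confidence in a weak result.
import math

def posterior(prior_mean, prior_sd, data_mean, data_se):
    """Posterior mean and sd for a normal likelihood with known se."""
    w_prior, w_data = 1 / prior_sd**2, 1 / data_se**2
    var = 1 / (w_prior + w_data)
    mean = var * (w_prior * prior_mean + w_data * data_mean)
    return mean, math.sqrt(var)

data_mean, data_se = 0.10, 0.20   # a weak, noisy effect estimate

# Optimistic prior: the researcher "knows" the effect is around 0.5.
print(posterior(0.5, 0.1, data_mean, data_se))  # ~ (0.42, 0.09): large and confident

# Skeptical prior: centered at zero, wide.
print(posterior(0.0, 0.5, data_mean, data_se))  # ~ (0.09, 0.19): honest uncertainty
```

The same data yield a confident posterior under the optimistic prior and an appropriately uncertain one under the skeptical prior, which is exactly why the prior, like every other part of the model, has to be justified.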