Resolution of Simpson's paradox via the common cause principle
Hovhannisyan, A., Allahverdyan, A. E.
–arXiv.org Artificial Intelligence
Simpson's paradox is an obstacle to establishing a probabilistic association between two events $a_1$ and $a_2$, given the third (lurking) random variable $B$. We focus on scenarios when the random variables $A$ (which combines $a_1$, $a_2$, and their complements) and $B$ have a common cause $C$ that need not be observed. Alternatively, we can assume that $C$ screens out $A$ from $B$. For such cases, the correct association between $a_1$ and $a_2$ is to be defined via conditioning over $C$. This set-up generalizes the original Simpson's paradox. Now its two contradicting options simply refer to two particular and different causes $C$. We show that if $B$ and $C$ are binary and $A$ is quaternary (the minimal and the most widespread situation for valid Simpson's paradox), the conditioning over any binary common cause $C$ establishes the same direction of the association between $a_1$ and $a_2$ as the conditioning over $B$ in the original formulation of the paradox. Thus, for the minimal common cause, one should choose the option of Simpson's paradox that assumes conditioning over $B$ and not its marginalization. For tertiary (unobserved) common causes $C$ all three options of Simpson's paradox become possible (i.e. marginalized, conditional, and none of them), and one needs prior information on $C$ to choose the right option.
arXiv.org Artificial Intelligence
Mar-1-2024
- Country:
- Asia
- Europe
- Italy (0.05)
- Netherlands > North Holland
- Amsterdam (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- North America > United States
- California (0.04)
- New York (0.04)
- Genre:
- Research Report (0.40)
- Industry:
- Health & Medicine (1.00)
- Leisure & Entertainment > Sports (0.46)
- Technology: