A primer on optimal transport for causal inference with observational data