Causal inference

Causality

The fundamental problem of causal inference is that only one potential outcome can ever be observed for a given unit, unless one can go back in time and change the treatment, or find identical subjects (as is sometimes possible in physics or chemistry).
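In the usual potential-outcomes notation (an assumption here; the note itself does not fix a notation), with $T_i \in \{0, 1\}$ the treatment indicator for unit $i$ and $Y_i(1)$, $Y_i(0)$ the two potential outcomes, the problem can be written roughly as:

$$
\tau_i = Y_i(1) - Y_i(0), \qquad
Y_i^{\text{obs}} = T_i \, Y_i(1) + (1 - T_i) \, Y_i(0), \qquad
\text{ATE} = \mathbb{E}\left[ Y(1) - Y(0) \right]
$$

Only $Y_i^{\text{obs}}$ is ever seen, so the individual effect $\tau_i$ is not directly observable. Averages such as the ATE can still be estimated, given assumptions about how treatment was assigned.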

A randomized controlled trial (RCT) is a nice way to overcome this limitation statistically: because people are assigned to treatment at random, the control and treatment groups are statistically equivalent on average, and chance differences between them vanish as $N \rightarrow \infty$. But RCTs are difficult to run in many cases and hard to scale up.
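A minimal simulation sketch of why randomization works. Everything here is hypothetical and chosen for illustration: the function name `simulate_rct`, the `trait` variable, and the constant true effect of 2.0. Because assignment is a coin flip, it is independent of the background trait, and the simple difference in group means approaches the true effect as the sample grows.

```python
import random
import statistics


def simulate_rct(n, true_effect=2.0, seed=0):
    """Simulate a randomized experiment with a known, constant treatment effect.

    Returns the difference in mean outcomes between treated and control units.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        # Background trait that influences the outcome (illustrative only).
        trait = rng.gauss(0.0, 1.0)
        y0 = trait + rng.gauss(0.0, 1.0)  # potential outcome without treatment
        y1 = y0 + true_effect             # potential outcome with treatment
        # Coin-flip assignment: independent of the trait by construction.
        if rng.random() < 0.5:
            treated.append(y1)  # only y1 is observed for treated units
        else:
            control.append(y0)  # only y0 is observed for control units
    return statistics.mean(treated) - statistics.mean(control)


estimate_small = simulate_rct(n=50)
estimate_large = simulate_rct(n=100_000)
print(f"n=50:      {estimate_small:.3f}")
print(f"n=100000:  {estimate_large:.3f}")
```

With a small sample the estimate can be noticeably off because the groups may differ by chance; with a large sample it should sit close to the simulated effect of 2.0.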

On the other hand, causal inference with observational data can potentially leverage huge, sometimes population-level datasets. At the same time, it is difficult to establish strong causal relationships because of unobserved confounding factors. See Homophily and influence, particularly Cosma Shalizi’s paper Shalizi2011homophily.
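A small sketch of how a confounder biases a naive comparison in observational data. The data-generating process is invented for illustration (the binary confounder `u`, the treatment probabilities 0.8 and 0.2, and the effect sizes are all assumptions). Note that the adjustment below only works because `u` is recorded in the simulated data; with a truly unobserved confounder this stratification is not available, which is the difficulty described above.

```python
import random
import statistics


def simulate_observational(n=200_000, true_effect=2.0, seed=1):
    """Generate (u, t, y) rows where u affects both treatment and outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = 1 if rng.random() < 0.5 else 0       # binary confounder
        p_treat = 0.8 if u == 1 else 0.2         # u makes treatment more likely
        t = 1 if rng.random() < p_treat else 0
        y = true_effect * t + 3.0 * u + rng.gauss(0.0, 1.0)  # u also raises y
        rows.append((u, t, y))
    return rows


def diff_in_means(rows):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return statistics.mean(treated) - statistics.mean(control)


rows = simulate_observational()

# Naive comparison: ignores u, so it mixes the treatment effect with u's effect.
naive = diff_in_means(rows)

# Stratified adjustment: compare within each level of u, then average the
# within-stratum differences weighted by each stratum's share of the data.
adjusted = 0.0
for level in (0, 1):
    stratum = [row for row in rows if row[0] == level]
    adjusted += len(stratum) / len(rows) * diff_in_means(stratum)

print(f"naive:    {naive:.3f}")
print(f"adjusted: {adjusted:.3f}")
```

In this setup the naive estimate should overstate the simulated effect of 2.0 by a wide margin, while the stratified estimate should land near it.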

Big data can be a potential solution. Rich data may provide more opportunities to measure otherwise hidden confounders (e.g., see Keith2020text on using text data). However, big data does not automatically address the fundamental challenges (Prosperi2020causal). Causal machine learning may be able to address some of them.

Learning materials

Tutorials

Examples

This is my favorite teaching example for showing the importance of #CausalInference: @Google conducts an annual pay equity analysis in which they use fairly advanced statistical techniques. In 2019 they found that they were actually underpaying MEN?!

Talks

Courses

Gary King’s course is on YouTube
The lectures accompanying The Effect: https://www.youtube.com/playlist?list=PLcTBLulJV_AK1hKtnO0-kYrU0D09K-kj8
Mastering Mostly Harmless Econometrics

Methods

Here is a decent summary of the advantages and disadvantages of many tools in causal inference

Libraries

DoWhy

Books

Articles

Arnold Foundation and Vera Institute argue about a study of the effectiveness of college education programs in prison

Other refs

https://publish.obsidian.md/mrd-brain/Knowledge+Base/Causal+Inference/00+-+Causal+Inference - Matthew DeVerna’s resources