The fundamental problem of causal inference is that only one potential outcome can ever be observed for a given unit, unless one can go back in time and change the treatment, or can find identical subjects (e.g., in physics or chemistry).
A randomized controlled trial (RCT) is a nice way to overcome this limitation statistically: people are assigned to treatment at random, so the control and treatment groups should, on average, be statistically equivalent. But RCTs are difficult to run in many cases and hard to scale up.
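A minimal simulation sketch of why randomization helps. Everything here is an illustrative assumption (the sample size, the normal distribution of untreated outcomes, and a constant individual effect of 2.0), not something taken from the sources cited in this note. Each unit has two potential outcomes, only one is ever observed, and yet the simple difference in group means lands close to the true average treatment effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical potential outcomes for every unit:
# y0 = outcome without treatment, y1 = outcome with treatment.
y0 = rng.normal(loc=0.0, scale=1.0, size=n)
y1 = y0 + 2.0  # assumed constant individual effect of 2.0

# In a simulation we can see both outcomes, so the true ATE is known.
true_ate = np.mean(y1 - y0)

# Random assignment: treatment is independent of the potential outcomes.
treated = rng.random(n) < 0.5

# The fundamental problem: only one potential outcome per unit is observed.
y_obs = np.where(treated, y1, y0)

# Difference in means between the treated and control groups.
ate_hat = y_obs[treated].mean() - y_obs[~treated].mean()

print(f"true ATE = {true_ate:.3f}, difference in means = {ate_hat:.3f}")
```

The estimate is close to the truth only because assignment ignores the potential outcomes. If `treated` depended on `y0`, the same difference in means would no longer be trustworthy.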
On the other hand, causal inference with observational data can potentially leverage huge, sometimes population-level, datasets. At the same time, it is difficult to establish strong causal relationships because of unobserved confounding factors. See Homophily and influence, particularly Cosma Shalizi's paper Shalizi2011homophily.
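A small sketch of how a confounder can distort an observational estimate. The data-generating process is entirely hypothetical: a single confounder `u` raises both the probability of treatment and the outcome, and the true treatment effect is set to 1.0. The naive comparison of treated and untreated units is badly biased, while a linear regression that includes `u` recovers the effect. The adjustment works here only because `u` is measured and the linear model happens to be correctly specified. With an unobserved confounder this fix is not available.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
true_effect = 1.0  # assumed treatment effect for this illustration

# Hypothetical confounder that drives both treatment and outcome.
u = rng.normal(size=n)

# Units with larger u are more likely to be treated (logistic model).
p_treat = 1.0 / (1.0 + np.exp(-2.0 * u))
t = (rng.random(n) < p_treat).astype(float)

# The outcome depends on the treatment and, strongly, on the confounder.
y = true_effect * t + 3.0 * u + rng.normal(size=n)

# Naive estimate: compare group means and ignore u.
naive = y[t == 1].mean() - y[t == 0].mean()

# Adjusted estimate: ordinary least squares of y on [1, t, u].
X = np.column_stack([np.ones(n), t, u])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
adjusted = beta[1]

print(f"naive = {naive:.3f}, adjusted = {adjusted:.3f}, truth = {true_effect}")
```

Dropping the `u` column from `X` reproduces the naive estimate, which is one way to see what leaving a confounder unmeasured costs.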
Big data can be a potential solution. Rich data may provide more opportunities to measure otherwise hidden confounders (e.g., see Keith2020text on using text data). However, big data does not automatically address the fundamental challenges (Prosperi2020causal). Causal machine learning may help address some of these challenges.
- An introduction to causal inference by George Berry
- Regression, Fire, and Dangerous Things
- An Introduction to Causal Inference by Judea Pearl
- This is my favorite teaching example for showing the importance of #CausalInference: @Google conducts an annual pay equity analysis in which they use fairly advanced statistical techniques. In 2019 they found that they were actually underpaying MEN?!
- Web Science Meets Network Science Workshop by Sinan Aral
- Causal inference for observational studies by Uri Shalit and David Sontag
- Gary King's course is on YouTube
- The lectures accompanying The Effect: https://www.youtube.com/playlist?list=PLcTBLulJV_AK1hKtnO0-kYrU0D09K-kj8
- Mastering Mostly Harmless Econometrics
Here is a decent summary of the advantages and disadvantages of many tools in causal inference
- Difference in differences
- Matching (statistics)
- Regression discontinuity design
- Proximal causal learning
- Mostly Harmless Econometrics
- Causal Inference The Mixtape
- Hernán MA, Robins JM (2019). Causal Inference
- The Effect
- Causal Inference for The Brave and True
- Causal Analysis