Machine learning the first stage in 2SLS: Practical guidance from bias decomposition and simulation

Connor Lennon, Edward Rubin, Glen Waddell

May-20-2025–arXiv.org Machine Learning

Machine learning (ML) primarily evolved to solve "prediction problems." The first stage of two-stage least squares (2SLS) is a prediction problem, suggesting potential gains from ML first-stage assistance. However, little guidance exists on when ML helps 2SLS or when it hurts. We investigate the implications of inserting ML into 2SLS, decomposing the bias into three informative components. Mechanically, ML-in-2SLS procedures face issues common to prediction and causal-inference settings, as well as issues arising from their interaction. Through simulation, we show linear ML methods (e.g., post-Lasso) work well, while nonlinear methods (e.g., random forests, neural nets) generate substantial bias in second-stage estimates, potentially exceeding the bias of endogenous OLS.
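To make the setup concrete, the following is a minimal sketch of 2SLS with one instrument and one endogenous regressor, written in plain Python. The data-generating process, the function names (`ols_fit`, `simulate`, `second_stage`), and all parameter values are illustrative assumptions, not the paper's simulation design. It compares endogenous OLS, 2SLS with a linear first stage, and a limiting hypothetical case in which the first stage memorizes the regressor exactly, so the second stage collapses back to OLS.

```python
import random


def ols_fit(x, y):
    """Return (intercept, slope) from a simple least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(n=5000, beta=1.0, seed=0):
    """Assumed toy DGP: u is an unobserved confounder, z a valid instrument."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    # x depends on the instrument z and on the confounder u (endogeneity).
    x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    # y depends on x (true effect beta) and on the same confounder u.
    y = [beta * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    return z, x, y


def second_stage(x_hat, y):
    """Second stage: slope from regressing y on first-stage predictions."""
    return ols_fit(x_hat, y)[1]


z, x, y = simulate()

# Endogenous OLS: regress y directly on x.
b_ols = ols_fit(x, y)[1]

# Linear first stage: predict x from z, then use the fitted values.
a, g = ols_fit(z, x)
x_hat_linear = [a + g * zi for zi in z]
b_2sls = second_stage(x_hat_linear, y)

# Hypothetical first stage that memorizes x: predictions equal x itself.
b_memorized = second_stage(list(x), y)

print(f"OLS: {b_ols:.3f}  2SLS: {b_2sls:.3f}  memorized: {b_memorized:.3f}")
```

Under this assumed design the true coefficient is 1, and the OLS slope converges to roughly 4/3 because `x` and the error share the confounder `u`. The memorizing first stage is only a limiting illustration of how a first stage that fits `x` too well can carry endogenous variation into the second stage; it does not reproduce the paper's three-component bias decomposition.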

artificial intelligence, instrument, machine learning

Country:
- North America > United States
  - California (0.04)
  - Oregon > Lane County
    - Eugene (0.04)
  - New Jersey > Mercer County
    - Princeton (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
- Europe > United Kingdom
  - England > Oxfordshire > Oxford (0.04)
- Asia
  - China (0.04)
  - Philippines (0.04)

Genre:
- Research Report (1.00)
- Instructional Material > Training Manual (0.40)

Industry:
- Banking & Finance > Economy (0.67)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning (1.00)
  - Neural Networks (1.00)
