AITopics | chernozhukov

Estimation in high-dimensional linear regression: Post-Double-Autometrics as an alternative to Post-Double-Lasso

Hué, Sullivan, Laurent, Sébastien, Aiounou, Ulrich, Flachaire, Emmanuel

arXiv.org Machine Learning, Nov-27-2025

Post-Double-Lasso is becoming the most popular method for estimating linear regression models with many covariates when the purpose is to obtain an accurate estimate of a parameter of interest, such as an average treatment effect. However, this method can suffer from substantial omitted variable bias in finite samples. We propose a new method called Post-Double-Autometrics, which is based on Autometrics, and show that this method outperforms Post-Double-Lasso.

covariate, post-double-autometric, post-double-lasso, (14 more...)

2511.21257

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.05)
South America > Brazil (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Government (0.68)
Education > Educational Setting > K-12 Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.68)

Beyond the Average: Distributional Causal Inference under Imperfect Compliance

Byambadalai, Undral, Hirata, Tomu, Oka, Tatsushi, Yasui, Shota

arXiv.org Machine Learning, Sep-22-2025

We study the estimation of distributional treatment effects in randomized experiments with imperfect compliance. When participants do not adhere to their assigned treatments, we leverage treatment assignment as an instrumental variable to identify the local distributional treatment effect, that is, the difference in outcome distributions between treatment and control groups for the subpopulation of compliers. We propose a regression-adjusted estimator based on a distribution regression framework with Neyman-orthogonal moment conditions, enabling robustness and flexibility with high-dimensional covariates. Our approach accommodates continuous, discrete, and mixed discrete-continuous outcomes, and applies under a broad class of covariate-adaptive randomization schemes, including stratified block designs and simple random sampling. We derive the estimator's asymptotic distribution and show that it achieves the semiparametric efficiency bound. Simulation results demonstrate favorable finite-sample performance, and we demonstrate the method's practical relevance in an application to the Oregon Health Insurance Experiment.

estimator, pre-randomization number, treatment effect, (14 more...)

2509.15594

Country:

North America > United States > Oregon (0.25)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
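
The identification idea in the abstract above (treatment assignment used as an instrument for treatment receipt) can be illustrated with a plain Wald-type ratio applied to the indicator 1{Y <= y}. This is only a minimal sketch: it is the unadjusted ratio, not the regression-adjusted, Neyman-orthogonal estimator the paper proposes, and the function names `ecdf_at` and `local_dte` as well as the toy data are hypothetical.

```python
def ecdf_at(values, y):
    """Empirical CDF of `values` evaluated at the point y."""
    return sum(1 for v in values if v <= y) / len(values)


def local_dte(outcome, treated, assigned, y):
    """Unadjusted Wald-type estimate of the complier difference in CDFs at y.

    outcome  : observed outcomes Y
    treated  : 0/1 treatment actually received D
    assigned : 0/1 randomized assignment Z (the instrument)
    """
    y_z1 = [o for o, z in zip(outcome, assigned) if z == 1]
    y_z0 = [o for o, z in zip(outcome, assigned) if z == 0]
    d_z1 = [t for t, z in zip(treated, assigned) if z == 1]
    d_z0 = [t for t, z in zip(treated, assigned) if z == 0]

    # Intention-to-treat effect of assignment on the event {Y <= y}.
    itt = ecdf_at(y_z1, y) - ecdf_at(y_z0, y)
    # First stage: effect of assignment on treatment take-up.
    first_stage = sum(d_z1) / len(d_z1) - sum(d_z0) / len(d_z0)
    if first_stage == 0:
        raise ValueError("assignment does not shift take-up; ratio undefined")
    return itt / first_stage


# Hypothetical toy data: half of the assigned units take up the treatment,
# nobody in the control arm does (one-sided noncompliance).
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
treated = [1, 1, 0, 0, 0, 0, 0, 0]
outcome = [5, 6, 2, 3, 1, 2, 3, 4]

effect_at_3 = local_dte(outcome, treated, assigned, y=3)
print(effect_at_3)
```

Evaluating `local_dte` over a grid of `y` values would trace out the whole difference in complier outcome distributions; covariate adjustment and inference are left out of this sketch.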

Adventures in Demand Analysis Using AI

Bach, Philipp, Chernozhukov, Victor, Klaassen, Sven, Spindler, Martin, Teichert-Kluge, Jan, Vijaykumar, Suhas

arXiv.org Machine Learning, Dec-31-2024

This paper advances empirical demand analysis by integrating multimodal product representations derived from artificial intelligence (AI). Using a detailed dataset of toy cars on Amazon.com, we combine text descriptions, images, and tabular covariates to represent each product using transformer-based embedding models. These embeddings capture nuanced attributes, such as quality, branding, and visual characteristics, that traditional methods often struggle to summarize. Moreover, we fine-tune these embeddings for causal inference tasks. We show that the resulting embeddings substantially improve the predictive accuracy of sales ranks and prices and that they lead to more credible causal estimates of price elasticity. Notably, we uncover strong heterogeneity in price elasticity driven by these product-specific features. Our findings illustrate that AI-driven representations can enrich and modernize empirical demand analysis. The insights generated may also prove valuable for applied causal inference more broadly.

elasticity, price sensitivity, representation, (15 more...)

2501.00382

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Virginia > Alexandria County > Alexandria (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Semiparametric inference for impulse response functions using double/debiased machine learning

Ballinari, Daniele, Wehrli, Alexander

arXiv.org Machine Learning, Nov-15-2024

We introduce a double/debiased machine learning (DML) estimator for the impulse response function (IRF) in settings where a time series of interest is subjected to multiple discrete treatments, assigned over time, which can have a causal effect on future outcomes. The proposed estimator can rely on fully nonparametric relations between treatment and outcome variables, opening up the possibility to use flexible machine learning approaches to estimate IRFs. To this end, we extend the theory of DML from an i.i.d. to a time series setting and show that the proposed DML estimator for the IRF is consistent and asymptotically normally distributed at the parametric rate, allowing for semiparametric inference for dynamic effects in a time series setting. The properties of the estimator are validated numerically in finite samples by applying it to learn the IRF in the presence of serial dependence in both the confounder and observation innovation processes. We also illustrate the methodology empirically by applying it to the estimation of the effects of macroeconomic shocks.

bias std, estimator, nuisance function, (14 more...)

2411.10009

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(4 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Banking & Finance > Economy (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.68)

Choice Models and Permutation Invariance

Singh, Amandeep, Liu, Ye, Yoganarasimhan, Hema

arXiv.org Artificial Intelligence, Jul-13-2023

Choice modeling is at the core of many economics, operations, and marketing problems. In this paper, we propose a fundamental characterization of choice functions that encompasses a wide variety of extant choice models. We demonstrate how nonparametric estimators like neural nets can easily approximate such functionals and overcome the curse of dimensionality that is inherent in the nonparametric estimation of choice functions. We demonstrate through extensive simulations that our proposed functionals can flexibly capture underlying consumer behavior in a completely data-driven fashion and outperform traditional parametric models. As demand settings often exhibit endogenous features, we extend our framework to incorporate estimation under endogenous features. Further, we also describe a formal inference procedure to construct valid confidence intervals on objects of interest like price elasticity. Finally, to assess the practical applicability of our estimator, we utilize a real-world dataset from Berry, Levinsohn, and Pakes (1995). Our empirical analysis confirms that the estimator generates realistic and comparable own- and cross-price elasticities that are consistent with the observations reported in the existing literature.

artificial intelligence, estimator, machine learning, (19 more...)

2307.0709

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Banking & Finance (1.00)
Automobiles & Trucks > Manufacturer (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Causally Learning an Optimal Rework Policy

Schacht, Oliver, Klaassen, Sven, Schwarz, Philipp, Spindler, Martin, Grünbaum, Daniel, Imhof, Sebastian

arXiv.org Artificial Intelligence, Jun-7-2023

In manufacturing, rework refers to an optional step of a production process which aims to eliminate errors or remedy products that do not meet the desired quality standards. Reworking a production lot involves repeating a previous production stage with adjustments to ensure that the final product meets the required specifications. While offering the chance to improve the yield and thus increase the revenue of a production lot, a rework step also incurs additional costs. Additionally, the rework of parts that already meet the target specifications may damage them and decrease the yield. In this paper, we apply double/debiased machine learning (DML) to estimate the conditional treatment effect of a rework step during the color conversion process in opto-electronic semiconductor manufacturing on the final product yield. We utilize the implementation DoubleML to develop policies for the rework of components and estimate their value empirically. From our causal machine learning analysis we derive implications for the coating of monochromatic LEDs with conversion layers.

artificial intelligence, machine learning, treatment effect, (17 more...)

2306.04223

Country:

Europe > Germany > Bavaria > Regensburg (0.04)
Europe > Germany > Hamburg (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Research Report (0.50)
Workflow (0.48)

Industry: Semiconductors & Electronics (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
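
The double/debiased machine learning (DML) idea that the abstract above relies on can be sketched for the simplest case, a partially linear model Y = theta*D + g(X) + e. The sketch below uses cross-fitting and the partialling-out score with plain least-squares lines standing in for the machine learners. It does not use the DoubleML package mentioned in the abstract, it estimates a single constant effect rather than the conditional effects studied in the paper, and the helper names (`ols`, `dml_plr`, `simulate`) and the simulated data are assumptions made for illustration only.

```python
import random


def ols(x, y):
    """Intercept and slope of the least-squares line y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def dml_plr(y, d, x, n_folds=5):
    """Cross-fitted partialling-out estimate of theta in Y = theta*D + g(X) + e."""
    n = len(y)
    res_y = [0.0] * n  # Y minus the out-of-fold prediction of E[Y | X]
    res_d = [0.0] * n  # D minus the out-of-fold prediction of E[D | X]
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        test = [i for i in range(n) if i % n_folds == k]
        # Nuisance fits use only the training folds (cross-fitting).
        a_y, b_y = ols([x[i] for i in train], [y[i] for i in train])
        a_d, b_d = ols([x[i] for i in train], [d[i] for i in train])
        for i in test:
            res_y[i] = y[i] - (a_y + b_y * x[i])
            res_d[i] = d[i] - (a_d + b_d * x[i])
    # Residual-on-residual regression without intercept.
    return sum(u * v for u, v in zip(res_y, res_d)) / sum(v * v for v in res_d)


def simulate(n=2000, theta=2.0, seed=0):
    """Hypothetical data in which X confounds both D and Y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
    y = [theta * di + 1.5 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
    return y, d, x


y, d, x = simulate()
theta_hat = dml_plr(y, d, x)   # should land near the true value 2.0
_, theta_naive = ols(d, y)     # ignores X and is biased upward in this design
print(theta_hat, theta_naive)
```

In practice the two `ols` nuisance fits would be replaced by flexible learners such as random forests or neural networks; the cross-fitting loop and the final residual-on-residual step stay the same.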

Heterogeneous Treatment Effect Bounds under Sample Selection with an Application to the Effects of Social Media on Political Polarization

Heiler, Phillip

arXiv.org Machine Learning, May-16-2023

We propose a method for estimation and inference for bounds for heterogeneous causal effect parameters in general sample selection models where the treatment can affect whether an outcome is observed and no exclusion restrictions are available. The method provides conditional effect bounds as functions of policy-relevant pre-treatment variables. It allows for conducting valid statistical inference on the unidentified conditional effects. We use a flexible debiased/double machine learning approach that can accommodate non-linear functional forms and high-dimensional confounders. Easily verifiable high-level conditions for estimation, misspecification-robust confidence intervals, and uniform confidence bands are provided as well. Re-analyzing data from a large-scale field experiment on Facebook, we find significant depolarization effects of counter-attitudinal news subscription nudges. The effect bounds are highly heterogeneous and suggest strong depolarization effects for moderates, conservatives, and younger users.

artificial intelligence, machine learning, social media, (20 more...)

2209.04329

Country:

North America > United States > New York (0.04)
Europe > Denmark (0.04)
Asia > India (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media > News (0.92)
Government > Voting & Elections (0.92)
Information Technology > Services (0.67)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

A Machine Learning Approach to Measuring Climate Adaptation

Vilgalys, Max

arXiv.org Machine Learning, Feb-2-2023

I measure adaptation to climate change by comparing elasticities from short-run and long-run changes in damaging weather. I propose a debiased machine learning approach to flexibly measure these elasticities in panel settings. In a simulation exercise, I show that debiased machine learning has considerable benefits relative to standard machine learning or ordinary least squares, particularly in high-dimensional settings. I then measure adaptation to damaging heat exposure in United States corn and soy production. Using rich sets of temperature and precipitation variation, I find evidence that short-run impacts from damaging heat are significantly offset in the long run. I show that this is because the impacts of long-run changes in heat exposure do not follow the same functional form as short-run shocks to heat exposure.

artificial intelligence, exposure, machine learning, (15 more...)

2302.01236

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Africa (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Food & Agriculture > Agriculture (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Orthogonal Series Estimation for the Ratio of Conditional Expectation Functions

Shinoda, Kazuhiko, Hoshino, Takahiro

arXiv.org Machine Learning, Dec-26-2022

In various fields of data science, researchers are often interested in estimating the ratio of conditional expectation functions (CEFR). Specifically in causal inference problems, it is sometimes natural to consider ratio-based treatment effects, such as odds ratios and hazard ratios, and even difference-based treatment effects are identified as CEFR in some empirically relevant settings. This chapter develops the general framework for estimation and inference on CEFR, which allows the use of flexible machine learning for infinite-dimensional nuisance parameters. In the first stage of the framework, the orthogonal signals are constructed using debiased machine learning techniques to mitigate the negative impacts of the regularization bias in the nuisance estimates on the target estimates. The signals are then combined with a novel series estimator tailored for CEFR. We derive the pointwise and uniform asymptotic results for estimation and inference on CEFR, including the validity of the Gaussian bootstrap, and provide low-level sufficient conditions to apply the proposed framework to some specific examples. We demonstrate the finite-sample performance of the series estimator constructed under the proposed framework by numerical simulations. Finally, we apply the proposed method to estimate the causal effect of the 401(k) program on household assets.

artificial intelligence, estimator, machine learning, (15 more...)

2212.13145

Country: North America > United States (0.27)

Genre: Research Report > Experimental Study (0.92)

Industry:

Health & Medicine (0.67)
Banking & Finance (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Omitted Variable Bias in Machine Learned Causal Models

Chernozhukov, Victor, Cinelli, Carlos, Newey, Whitney, Sharma, Amit, Syrgkanis, Vasilis

arXiv.org Machine Learning, Dec-29-2021

We derive general, yet simple, sharp bounds on the size of the omitted variable bias for a broad class of causal parameters that can be identified as linear functionals of the conditional expectation function of the outcome. Such functionals encompass many of the traditional targets of investigation in causal inference studies, such as (weighted) averages of potential outcomes, average treatment effects (including subgroup effects, such as the effect on the treated), (weighted) average derivatives, and policy effects from shifts in covariate distribution, all for general, nonparametric causal models. Our construction relies on the Riesz-Fréchet representation of the target functional. Specifically, we show how the bound on the bias depends only on the additional variation that the latent variables create both in the outcome and in the Riesz representer for the parameter of interest. Moreover, in many important cases (e.g., average treatment effects in partially linear models, or in nonseparable models with a binary treatment) the bound is shown to depend on two easily interpretable quantities: the nonparametric partial $R^2$ (Pearson's "correlation ratio") of the unobserved variables with the treatment and with the outcome. Therefore, simple plausibility judgments on the maximum explanatory power of omitted variables (in explaining treatment and outcome variation) are sufficient to place overall bounds on the size of the bias. Finally, leveraging debiased machine learning, we provide flexible and efficient statistical inference methods to estimate the components of the bounds that are identifiable from the observed distribution.

confounder, omitted variable bias, regression, (16 more...)

2112.13398

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.62)
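
As a point of reference for the abstract above, the textbook linear case of omitted variable bias can be written out explicitly. This is the classical formula for linear projections, not the general nonparametric bound derived in the paper; the symbols $\theta$, $\theta_s$, $\gamma$, $\delta$ and the latent variable $A$ are notation assumed here for illustration.

```latex
% Long regression (includes the latent variable A):
%   Y = \theta D + X'\beta + \gamma A + \varepsilon
% Short regression (A omitted):
%   Y = \theta_s D + X'\beta_s + \varepsilon_s
% Auxiliary projection of the omitted variable on the included regressors:
%   A = \delta D + X'\pi + u
\begin{align*}
  \theta_s - \theta &= \gamma\,\delta ,
\end{align*}
% i.e. the bias is the product of the omitted variable's effect on the
% outcome (\gamma) and its partial association with the treatment (\delta).
```

The bias vanishes when the omitted variable either has no partial effect on the outcome ($\gamma = 0$) or is partially unrelated to the treatment ($\delta = 0$), which mirrors the two explanatory-power quantities the abstract refers to.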
