AITopics

Country: North America > Canada (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.68)

Industry: Health & Medicine > Public Health (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems, Feb-12-2026, 22:22:45 GMT

8fb5f8be2aa9d6c64a04e3ab9f63feee-AuthorFeedback.pdf

consistency, hyperparameter, propensity score, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.41)

Neural Information Processing Systems, Oct-3-2025, 05:18:26 GMT

Adapting Neural Networks for the Estimation of Treatment Effects

Claudia Shi, David Blei, Victor Veitch

We propose two adaptations based on insights from the statistical literature on the estimation of treatment effects.

dragonnet, estimation, regularization, (14 more...)

Country: North America > Canada (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.68)

Industry: Health & Medicine > Public Health (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems, Oct-3-2025, 05:18:11 GMT

Reviewer 1: [The] methodology combines multiple different ideas in causal inference (multi-headed deep learning

The baselines in their evaluations are not completely clear. In addition [...] We have clarified this. It seems weird that Equation 2.2 has no hyperparameter ... We have clarified this. Indeed, there is a hyperparameter. We used an arbitrary fixed value (1.0) to avoid unfairly advantaging our method via hyperparameter search.

artificial intelligence, machine learning, methodology combine multiple different idea, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.41)

arXiv.org Artificial Intelligence, Aug-4-2025

MOSIC: Model-Agnostic Optimal Subgroup Identification with Multi-Constraint for Improved Reliability

Chen, Wenxin, Pan, Weishen, Gan, Kyra, Wang, Fei

Current subgroup identification methods typically follow a two-step approach: first estimate conditional average treatment effects and then apply thresholding or rule-based procedures to define subgroups. While intuitive, this decoupled approach fails to incorporate key constraints essential for real-world clinical decision-making, such as subgroup size and propensity overlap. These constraints operate on fundamentally different axes than CATE estimation and are not naturally accommodated within existing frameworks, thereby limiting the practical applicability of these methods. We propose a unified optimization framework that directly solves the primal constrained optimization problem to identify optimal subgroups. Our key innovation is a reformulation of the constrained primal problem as an unconstrained differentiable min-max objective, solved via a gradient descent-ascent algorithm. We theoretically establish that our solution converges to a feasible and locally optimal solution. Unlike threshold-based CATE methods that apply constraints as post-hoc filters, our approach enforces them directly during optimization. The framework is model-agnostic, compatible with a wide range of CATE estimators, and extensible to additional constraints like cost limits or fairness criteria. Extensive experiments on synthetic and real-world datasets demonstrate its effectiveness in identifying high-benefit subgroups while maintaining better satisfaction of constraints.
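The min-max reformulation mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the MOSIC objective: it applies plain gradient descent-ascent to the Lagrangian of a one-dimensional problem chosen purely for illustration (minimize (x - 2)^2 subject to x <= 1). The function name, step size, and iteration count are all assumptions.

```python
def gradient_descent_ascent(steps=5000, lr=0.05):
    """Toy constrained problem: minimize (x - 2)^2 subject to x <= 1.

    Lagrangian: L(x, lam) = (x - 2)^2 + lam * (x - 1), with lam >= 0.
    We descend in x and ascend in lam, projecting lam back onto lam >= 0.
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        grad_x = 2.0 * (x - 2.0) + lam   # dL/dx
        grad_lam = x - 1.0               # dL/dlam (the constraint violation)
        # Simultaneous update: both gradients use the old (x, lam).
        x, lam = x - lr * grad_x, max(0.0, lam + lr * grad_lam)
    return x, lam


x_star, lam_star = gradient_descent_ascent()
print(x_star, lam_star)
```

On this toy problem the iterates should approach the constrained optimum x = 1 with multiplier 2, which is the point where the stationarity condition 2(x - 2) + lam = 0 holds on the constraint boundary. The multiplier grows only while the constraint is violated, which is how the constraint is enforced during optimization rather than as a post-hoc filter.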

artificial intelligence, constraint, machine learning, (14 more...)

2504.20908

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Hines, Christian L., Hines, Oliver J.

Automatic debiasing of neural networks via moment-constrained learning

arXiv.org Machine Learning, Sep-29-2024

Causal and nonparametric estimands in economics and biostatistics can often be viewed as the mean of a linear functional applied to an unknown outcome regression function. Naively learning the regression function and taking a sample mean of the target functional results in biased estimators, and a rich debiasing literature has developed where one additionally learns the so-called Riesz representer (RR) of the target estimand (targeted learning, double ML, automatic debiasing etc.). Learning the RR via its derived functional form can be challenging, e.g. due to extreme inverse probability weights or the need to learn conditional density functions. Such challenges have motivated recent advances in automatic debiasing (AD), where the RR is learned directly via minimization of a bespoke loss. We propose moment-constrained learning as a new RR learning approach that addresses some shortcomings in AD, constraining the predicted moments and improving the robustness of RR estimates to optimization hyperparameters. Though our approach is not tied to a particular class of learner, we illustrate it using neural networks, and evaluate on the problems of average treatment/derivative effect estimation using semi-synthetic data. Our numerical experiments show improved performance versus state-of-the-art benchmarks.

estimation, estimator, multiheaded 0, (15 more...)

2409.19777

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Lin, Victoria, Morency, Louis-Philippe, Dimitriadis, Dimitrios, Sharma, Srinagesh

Counterfactual Augmentation for Multimodal Learning Under Presentation Bias

arXiv.org Artificial Intelligence, Oct-30-2023

In real-world machine learning systems, labels are often derived from user behaviors that the system wishes to encourage. Over time, new models must be trained as new training examples and features become available. However, feedback loops between users and models can bias future user behavior, inducing a presentation bias in the labels that compromises the ability to train new models. In this paper, we propose counterfactual augmentation, a novel causal method for correcting presentation bias using generated counterfactual labels. Our empirical evaluations demonstrate that counterfactual augmentation yields better downstream performance compared to both uncorrected models and existing bias-correction methods. Model analyses further indicate that the generated counterfactuals align closely with true counterfactuals in an oracle setting.

counterfactual, counterfactual augmentation, presentation bias, (15 more...)

2305.14083

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Agarwal, Aayush, Bassi, Saksham

Learning high-dimensional causal effect

arXiv.org Artificial Intelligence, Mar-1-2023

The scarcity of high-dimensional causal inference datasets restricts the exploration of complex deep models. In this work, we propose a method to generate a synthetic causal dataset that is high-dimensional. The synthetic data simulates a causal effect using the MNIST dataset with Bernoulli treatment values. This provides an opportunity to study a variety of models for causal effect estimation. We experiment on this dataset using the Dragonnet architecture (Shi et al., 2019) and modified architectures. We use the modified architectures to explore different types of initial neural network layers and observe that the modified architectures produce better estimates. We observe that residual and transformer models estimate the treatment effect very closely without the need for targeted regularization, introduced by Shi et al. (2019).

artificial intelligence, deep learning, machine learning, (16 more...)

2303.00821

Country: North America > United States > New York (0.05)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Kiriakidou, Niki, Diou, Christos

An evaluation framework for comparing causal inference models

arXiv.org Artificial Intelligence, Aug-31-2022

Estimation of causal effects is the core objective of many scientific disciplines. However, it remains a challenging task, especially when the effects are estimated from observational data. Recently, several promising machine learning models have been proposed for causal effect estimation. The evaluation of these models has been based on the mean values of the error of the Average Treatment Effect (ATE) as well as of the Precision in Estimation of Heterogeneous Effect (PEHE). In this paper, we propose to complement the evaluation of causal inference models using concrete statistical evidence, including the performance profiles of Dolan and Moré, as well as non-parametric and post-hoc statistical tests. The main motivation behind this approach is the elimination of the influence of a small number of instances or simulations on the benchmarking process, which in some cases dominate the results. We use the proposed evaluation methodology to compare several state-of-the-art causal effect estimation models.
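The two metrics named in the abstract can be sketched as follows, assuming the commonly used definitions: ATE error as the absolute difference between estimated and true average effects, and PEHE as the mean squared error of per-unit effects, reported here as its square root. Papers differ on whether PEHE is reported squared or rooted, so treat that choice as an assumption. The function names and the toy data are illustrative only.

```python
import math


def ate_error(tau_true, tau_hat):
    """Absolute error between the estimated and true average treatment effect."""
    n = len(tau_true)
    return abs(sum(tau_hat) / n - sum(tau_true) / n)


def root_pehe(tau_true, tau_hat):
    """Square root of the mean squared error of the per-unit effect estimates."""
    n = len(tau_true)
    squared = sum((h - t) ** 2 for t, h in zip(tau_true, tau_hat))
    return math.sqrt(squared / n)


# Toy data: the estimates get the average right but miss the heterogeneity.
tau_true = [1.0, 2.0, 3.0]
tau_hat = [2.0, 2.0, 2.0]

print(ate_error(tau_true, tau_hat))
print(root_pehe(tau_true, tau_hat))
```

In this toy case the ATE error is zero while the root PEHE is not, which shows why the two metrics are reported together: an estimator can recover the average effect while still misestimating individual effects.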

causal inference model, evaluation framework, treatment effect, (12 more...)

2209.00115

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Shi, Claudia, Blei, David M., Veitch, Victor

Adapting Neural Networks for the Estimation of Treatment Effects

arXiv.org Machine Learning, Jun-5-2019

This paper addresses the use of neural networks for the estimation of treatment effects from observational data. Generally, estimation proceeds in two stages. First, we fit models for the expected outcome and the probability of treatment (propensity score) for each unit. Second, we plug these fitted models into a downstream estimator of the effect. Neural networks are a natural choice for the models in the first step. The question we address is: how can we adapt the design and training of the neural networks used in the first step in order to improve the quality of the final estimate of the treatment effect? We propose two adaptations based on insights from the statistical literature on the estimation of treatment effects. The first is a new architecture, the Dragonnet, that exploits the sufficiency of the propensity score for estimation adjustment. The second is a regularization procedure, targeted regularization, that induces a bias towards models that have non-parametrically optimal asymptotic properties "out of the box". Studies on benchmark datasets for causal inference show these adaptations outperform existing methods. Code is available at github.com/claudiashi57/dragonnet
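The second stage described in the abstract, plugging fitted models into a downstream estimator, can be sketched as below. This is not the Dragonnet code: the fitted outcome predictions `q0`, `q1` and propensity scores `g` are hard-coded stand-ins for the outputs of a first-stage model, and the two estimators shown (a simple plug-in average and the standard augmented inverse-probability-weighted estimator) are common choices assumed here for illustration.

```python
def plug_in_ate(q0, q1):
    """Average difference between predicted treated and untreated outcomes."""
    return sum(b - a for a, b in zip(q0, q1)) / len(q0)


def aipw_ate(y, t, q0, q1, g):
    """Augmented IPW estimate: plug-in term plus a propensity-weighted residual."""
    total = 0.0
    for yi, ti, q0i, q1i, gi in zip(y, t, q0, q1, g):
        q_obs = q1i if ti == 1 else q0i            # prediction for the observed arm
        weight = ti / gi - (1 - ti) / (1 - gi)     # signed inverse-propensity weight
        total += (q1i - q0i) + weight * (yi - q_obs)
    return total / len(y)


# Hypothetical first-stage outputs for four units (not real data).
y = [1.0, 3.5, 2.0, 4.0]      # observed outcomes
t = [0, 1, 0, 1]              # treatment indicators
q0 = [1.0, 1.0, 2.0, 2.0]     # predicted outcome under control
q1 = [3.0, 3.0, 4.0, 4.0]     # predicted outcome under treatment
g = [0.5, 0.5, 0.5, 0.5]      # predicted propensity scores

print(plug_in_ate(q0, q1))
print(aipw_ate(y, t, q0, q1, g))
```

The two estimates differ only through the residual of the second unit, whose observed outcome is not matched by the outcome model. The correction term relies on propensity scores being bounded away from 0 and 1, since the weights divide by `g` and `1 - g`.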

artificial intelligence, machine learning, regularization, (17 more...)

1906.0212

Genre: Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Public Health (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)