AITopics | experimental sample

experimental sample

917d55788726131e3bb21bf39d477f58-Paper-Conference.pdf

Neural Information Processing Systems | Feb-15-2026, 21:49:26 GMT

causal effect, denote, estimator, (15 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Iowa (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)
Research Report > Strength High (0.45)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Education > Educational Setting > K-12 Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Cross-Validated Causal Inference: a Modern Method to Combine Experimental and Observational Data

Yang, Xuelin, Lin, Licong, Athey, Susan, Jordan, Michael I., Imbens, Guido W.

arXiv.org Machine Learning | Nov-4-2025

We develop new methods to integrate experimental and observational data in causal inference. While randomized controlled trials offer strong internal validity, they are often costly and therefore limited in sample size. Observational data, though cheaper and often with larger sample sizes, are prone to biases due to unmeasured confounders. To harness their complementary strengths, we propose a systematic framework that formulates causal estimation as an empirical risk minimization (ERM) problem. A full model containing the causal parameter is obtained by minimizing a weighted combination of experimental and observational losses--capturing the causal parameter's validity and the full model's fit, respectively. The weight is chosen through cross-validation on the causal parameter across experimental folds. Our experiments on real and synthetic data show the efficacy and reliability of our method. We also provide theoretical non-asymptotic error bounds.
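The weighting idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scalar treatment effect, squared-error losses centered on difference-in-means estimates, a hypothetical grid of weights, and simulated data. All names (`diff_in_means`, `combine`, `choose_weight`, `simulate`) are illustrative.

```python
import random
from statistics import mean

def diff_in_means(sample):
    """Difference in mean outcome between treated (t=1) and control (t=0) rows."""
    treated = [y for t, y in sample if t == 1]
    control = [y for t, y in sample if t == 0]
    return mean(treated) - mean(control)

def combine(tau_exp, tau_obs, lam):
    """Minimizer of (1 - lam) * (tau - tau_exp)**2 + lam * (tau - tau_obs)**2."""
    return (1 - lam) * tau_exp + lam * tau_obs

def choose_weight(exp_sample, obs_sample, grid, n_folds=5):
    """Pick the weight whose combined estimate best predicts held-out experimental folds."""
    tau_obs = diff_in_means(obs_sample)
    folds = [exp_sample[i::n_folds] for i in range(n_folds)]
    best_lam, best_err = None, float("inf")
    for lam in grid:
        err = 0.0
        for k in range(n_folds):
            train = [row for j, fold in enumerate(folds) if j != k for row in fold]
            tau_hat = combine(diff_in_means(train), tau_obs, lam)
            err += (tau_hat - diff_in_means(folds[k])) ** 2
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam

def simulate(n, effect, bias, rng):
    """Hypothetical data: alternating treatment, unit-variance noise, optional confounding bias."""
    return [(i % 2, (effect + bias) * (i % 2) + rng.gauss(0, 1)) for i in range(n)]

rng = random.Random(0)
exp_sample = simulate(200, effect=1.0, bias=0.0, rng=rng)    # small, unbiased
obs_sample = simulate(5000, effect=1.0, bias=0.5, rng=rng)   # large, biased
grid = [i / 10 for i in range(11)]

lam = choose_weight(exp_sample, obs_sample, grid)
tau_exp, tau_obs = diff_in_means(exp_sample), diff_in_means(obs_sample)
tau = combine(tau_exp, tau_obs, lam)
print(f"weight={lam}, experimental={tau_exp:.3f}, observational={tau_obs:.3f}, combined={tau:.3f}")
```

A weight of 0 reproduces the purely experimental estimate and a weight of 1 the purely observational one; the cross-validation loop only ever scores candidates against experimental folds, mirroring the abstract's description of validating on the causal parameter.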

artificial intelligence, exp, machine learning, (17 more...)

arXiv:2511.00727

Country:

Asia > Middle East > Jordan (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.49)

Estimating Causal Effects Identifiable from a Combination of Observations and Experiments

Yonghan Jung, Iván Díaz

Neural Information Processing Systems | Oct-9-2025, 01:30:04 GMT

Learning cause and effect relations is arguably one of the central challenges found throughout the data sciences. Formally, determining whether a collection of observational and interventional distributions can be combined to learn a target causal relation is known as the problem of generalized identification (or g-identification) [Lee et al., 2019]. Although g-identification has been well understood and solved in theory, it turns out to be challenging to apply these results in practice, in particular when considering the estimation of the target distribution from finite samples. In this paper, we develop a new, general estimator that exhibits multiple robustness properties for g-identifiable causal functionals. Specifically, we show that any g-identifiable causal effect can be expressed as a function of generalized multi-outcome sequential back-door adjustments that are amenable to estimation. We then construct a corresponding estimator for the g-identification expression that exhibits robustness properties to bias. We analyze the asymptotic convergence properties of the estimator. Finally, we illustrate the use of the proposed estimator in experimental studies. Simulation results corroborate the theory.
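For orientation, the classical single-stage back-door adjustment, which the "generalized multi-outcome sequential back-door adjustments" named in the abstract extend, can be written as follows. This is the textbook formula, valid under the assumption that a covariate set Z satisfies the back-door criterion relative to (X, Y); it is not the paper's generalized expression, and the symbols X, Y, Z are introduced here only for illustration.

```latex
P\bigl(y \mid do(x)\bigr) \;=\; \sum_{z} P(y \mid x, z)\, P(z)
```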

artificial intelligence, estimator, machine learning, (17 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Iowa (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre:

Research Report > New Finding (0.87)
Research Report > Experimental Study (0.87)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Education > Educational Setting > K-12 Education (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Program Evaluation with Remotely Sensed Outcomes

Rambachan, Ashesh, Singh, Rahul, Viviano, Davide

arXiv.org Machine Learning | Nov-16-2024

While traditional program evaluations typically rely on surveys to measure outcomes, certain economic outcomes such as living standards or environmental quality may be infeasible or costly to collect. As a result, recent empirical work estimates treatment effects using remotely sensed variables (RSVs), such as mobile phone activity or satellite images, instead of ground-truth outcome measurements. Common practice predicts the economic outcome from the RSV, using an auxiliary sample of labeled RSVs, and then uses such predictions as the outcome in the experiment. We prove that this approach leads to biased estimates of treatment effects when the RSV is a post-outcome variable. We nonparametrically identify the treatment effect, using an assumption that reflects the logic of recent empirical research: the conditional distribution of the RSV remains stable across both samples, given the outcome and treatment. Our results do not require researchers to know or consistently estimate the relationship between the RSV, outcome, and treatment, which is typically mis-specified with unstructured data. We form a representation of the RSV for downstream causal inference by predicting the outcome and predicting the treatment, with better predictions leading to more precise causal estimates. We re-evaluate the efficacy of a large-scale public program in India, showing that the program's measured effects on local consumption and poverty can be replicated using satellite

artificial intelligence, experimental sample, machine learning, (17 more...)

arXiv:2411.10959

Country:

Asia > India > Andhra Pradesh (0.04)
Africa > Uganda (0.04)
Africa > Togo (0.04)
(4 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.66)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

A Double Machine Learning Approach to Combining Experimental and Observational Data

Morucci, Marco, Orlandi, Vittorio, Parikh, Harsh, Roy, Sudeepa, Rudin, Cynthia, Volfovsky, Alexander

arXiv.org Artificial Intelligence | Jul-3-2023

Experimental and observational studies often lack validity due to untestable assumptions. We propose a double machine learning approach to combine experimental and observational studies, allowing practitioners to test for assumption violations and estimate treatment effects consistently. Our framework tests for violations of external validity and ignorability under milder assumptions. When only one assumption is violated, we provide semi-parametrically efficient treatment effect estimators. However, our no-free-lunch theorem highlights the necessity of accurately identifying the violated assumption for consistent treatment effect estimation. We demonstrate the applicability of our approach in three real-world case studies, highlighting its relevance for practical settings.

artificial intelligence, estimator, machine learning, (17 more...)

arXiv:2307.01449

Country:

North America > United States > Tennessee (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Probabilities of Causation: Adequate Size of Experimental and Observational Samples

Li, Ang, Mao, Ruirui, Pearl, Judea

arXiv.org Artificial Intelligence | Oct-10-2022

The probabilities of causation are commonly used to solve decision-making problems. Tian and Pearl derived sharp bounds for the probability of necessity and sufficiency (PNS), the probability of sufficiency (PS), and the probability of necessity (PN) using experimental and observational data. The assumption is that one is in possession of a large enough sample to permit an accurate estimation of the experimental and observational distributions. In this study, we present a method for determining the sample size needed for such estimation when a given confidence interval (CI) is specified. We further show by simulation that the proposed sample size yields stable estimates of the bounds of PNS.
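As a rough illustration of what a sample-size calculation of this kind looks like, the sketch below uses the textbook normal-approximation bound for a single Bernoulli proportion. It is an assumption-laden stand-in, not the method derived in the paper: the function name `bernoulli_sample_size`, the worst-case default `p=0.5`, and the interpretation of the CI as a symmetric margin of error are all introduced here for illustration.

```python
import math
from statistics import NormalDist

def bernoulli_sample_size(margin, confidence=0.95, p=0.5):
    """Smallest n such that z * sqrt(p * (1 - p) / n) <= margin.

    Uses the normal approximation to the sampling distribution of a
    proportion; p = 0.5 is the worst case (largest variance).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return math.ceil(z * z * p * (1 - p) / margin ** 2)

# Samples needed to estimate one probability to within +/- 0.05 and +/- 0.01
# at 95% confidence, under the assumptions stated above.
n_coarse = bernoulli_sample_size(0.05)
n_fine = bernoulli_sample_size(0.01)
print(n_coarse, n_fine)
```

Because the required n grows with the inverse square of the margin, tightening the margin fivefold multiplies the sample size by roughly twenty-five.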

artificial intelligence, bernoulli, machine learning, (17 more...)

arXiv:2210.05027

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.32)
North America > United States > Wisconsin > Dane County > Madison (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

GEAR: On Optimal Decision Making with Auxiliary Data

Cai, Hengrui, Song, Rui, Lu, Wenbin

arXiv.org Machine Learning | Apr-21-2021

Personalized optimal decision making, finding the optimal decision rule (ODR) based on individual characteristics, has attracted increasing attention recently in many fields, such as education, economics, and medicine. Current ODR methods usually require the primary outcome of interest in samples for assessing treatment effects, namely the experimental sample. However, in many studies, treatments may have a long-term effect, and as such the primary outcome of interest cannot be observed in the experimental sample due to the limited duration of experiments, which makes the estimation of ODR impossible. This paper addresses this challenge by using an auxiliary sample to facilitate the estimation of ODR in the experimental sample. We propose an auGmented inverse propensity weighted Experimental and Auxiliary sample-based decision Rule (GEAR) by maximizing the augmented inverse propensity weighted value estimator over a class of decision rules using the experimental sample, with the primary outcome being imputed based on the auxiliary sample. The asymptotic properties of the proposed GEAR estimators and their associated value estimators are established. Simulation studies, together with a real AIDS application, are conducted to demonstrate its empirical validity.
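The augmented inverse propensity weighted (AIPW) value estimator mentioned in this abstract can be sketched in a few lines. This is a generic textbook form, not the GEAR procedure itself: it assumes a binary treatment, a known propensity of 0.5 (as in a randomized experiment), and an outcome `y` that is already available for every row (in GEAR it would be imputed from the auxiliary sample). The toy data, the deliberately crude `outcome_model`, and the three candidate rules are all hypothetical.

```python
def aipw_value(data, rule, propensity, outcome_model):
    """AIPW estimate of the mean outcome if everyone followed `rule`.

    data: iterable of (x, a, y) with covariate x, binary treatment a, outcome y.
    propensity(x): P(A = 1 | X = x).
    outcome_model(x, a): estimate of E[Y | X = x, A = a].
    """
    total, n = 0.0, 0
    for x, a, y in data:
        d = rule(x)                                   # treatment the rule recommends
        p_d = propensity(x) if d == 1 else 1.0 - propensity(x)
        mu = outcome_model(x, d)                      # model prediction under d
        followed = 1.0 if a == d else 0.0             # did the observed arm match?
        total += mu + followed / p_d * (y - mu)       # prediction plus weighted residual
        n += 1
    return total / n

# Hypothetical toy data: (covariate, treatment, outcome).
data = [(1.0, 1, 2.0), (-1.0, 0, 0.0), (1.0, 0, 1.0), (-1.0, 1, -1.0)]

def propensity(x):
    return 0.5          # assumed known, as in a randomized experiment

def outcome_model(x, a):
    return 1.0          # deliberately crude constant model

candidate_rules = {
    "treat_if_positive": lambda x: 1 if x > 0 else 0,
    "treat_all": lambda x: 1,
    "treat_none": lambda x: 0,
}
values = {name: aipw_value(data, rule, propensity, outcome_model)
          for name, rule in candidate_rules.items()}
best_rule = max(values, key=values.get)
print(values, best_rule)
```

Maximizing over a small dictionary of rules stands in for the search over a class of decision rules; with a correct propensity the estimate stays consistent even when the outcome model is as crude as the constant used here.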

auxiliary sample, experimental sample, long-term outcome, (15 more...)

arXiv:2104.10573

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Tennessee (0.04)
North America > United States > North Carolina (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.67)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.66)
Health & Medicine > Therapeutic Area > Immunology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

2020 Summer Intern Projects

#artificialintelligence | Dec-9-2020, 18:50:00 GMT

Thank you to all the 2020 summer interns who worked with the Stitch Fix Algorithms team. For the first time, the internship program was fully remote, but that hasn't stopped them from working on impactful projects. This post summarizes some of the projects they worked on. We appreciate all your contributions and insights! This summer, I worked on the Merchandise Algorithms team.

experiment, long-term outcome, stitch fix, (13 more...)

Country:

North America > United States > California (0.15)
North America > United States > Illinois > Cook County > Chicago (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.31)

Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index

Athey, Susan, Chetty, Raj, Imbens, Guido, Kang, Hyunseung

arXiv.org Machine Learning | Jun-4-2016

Estimating the long-term effects of treatments is of interest in many fields. A common challenge in estimating such treatment effects is that long-term outcomes are unobserved in the time frame needed to make policy decisions. One approach to overcome this missing data problem is to analyze treatment effects on an intermediate outcome, often called a statistical surrogate, if it satisfies the condition that treatment and outcome are independent conditional on the statistical surrogate. The validity of the surrogacy condition is often controversial. Here we exploit the fact that in modern datasets, researchers often observe a large number, possibly hundreds or thousands, of intermediate outcomes, thought to lie on or close to the causal chain between the treatment and the long-term outcome of interest. Even if none of the individual proxies satisfies the statistical surrogacy criterion by itself, using multiple proxies can be useful in causal inference. We focus primarily on a setting with two samples, an experimental sample containing data about the treatment indicator and the surrogates and an observational sample containing information about the surrogates and the primary outcome. We state assumptions under which the average treatment effect can be identified and estimated with a high-dimensional vector of proxies that collectively satisfy the surrogacy assumption, derive the bias from violations of the surrogacy assumption, and show that even if the primary outcome is also observed in the experimental sample, there is still information to be gained from using surrogates.
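The two-sample setup described here can be sketched as follows. This is a simplified illustration, not the paper's estimator: it assumes a single scalar surrogate instead of a high-dimensional vector, a linear fit for the conditional mean of the outcome given the surrogate, and hand-made toy data. The names `fit_line` and `surrogate_index_ate` are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def surrogate_index_ate(obs_sample, exp_sample):
    """Estimate the effect on the long-term outcome via a surrogate index.

    obs_sample: (surrogate, outcome) pairs; no treatment information.
    exp_sample: (treatment, surrogate) pairs; no long-term outcome.
    """
    # Step 1: learn the surrogate index E[Y | S] in the observational sample.
    slope, intercept = fit_line([s for s, _ in obs_sample],
                                [y for _, y in obs_sample])
    # Step 2: impute the index in the experimental sample and compare arms.
    treated = [intercept + slope * s for w, s in exp_sample if w == 1]
    control = [intercept + slope * s for w, s in exp_sample if w == 0]
    return mean(treated) - mean(control)

# Toy data: outcome is exactly 1 + 2 * surrogate; treatment shifts the surrogate by 1.
obs_sample = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
exp_sample = [(1, 1.0), (1, 2.0), (1, 3.0), (0, 0.0), (0, 1.0), (0, 2.0)]
ate = surrogate_index_ate(obs_sample, exp_sample)
print(ate)
```

The estimate is only as good as the surrogacy assumption: if treatment affects the outcome through a path that bypasses the surrogate, the imputed difference misses that part of the effect.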

artificial intelligence, assumption, machine learning, (18 more...)

arXiv:1603.09326

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Tennessee (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Education > Educational Setting (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
