Using Large Language Models to Create AI Personas for Replication and Prediction of Media Effects: An Empirical Test of 133 Published Experimental Research Findings
Yeykelis, Leo, Pichai, Kaavya, Cummings, James J., Reeves, Byron
ABSTRACT

This report analyzes the potential for large language models (LLMs) to expedite accurate replication of published message effects studies. We tested LLM-powered participants (personas) by replicating 133 experimental findings from 14 papers containing 45 recent studies in the Journal of Marketing (January 2023-May 2024). We used a new software tool, Viewpoints AI (https://viewpoints.ai/), that takes study designs, stimuli, and measures as input, automatically generates prompts for LLMs to act as a specified sample of unique personas, and collects their responses to produce a final output in the form of a complete dataset and statistical analysis. The underlying LLM used was Anthropic's Claude Sonnet 3.5. We generated 19,447 AI personas to replicate these studies with the exact same sample attributes, study designs, stimuli, and measures reported in the original human research. Our LLM replications successfully reproduced 76% of the original main effects (84 out of 111), demonstrating strong potential for AI-assisted replication of studies in which people respond to media stimuli. When including interaction effects, the overall replication rate was 68% (90 out of 133). The use of LLMs to replicate and accelerate marketing research on media effects is discussed with respect to the replication crisis in social science, potential solutions to generalizability problems in sampling subjects and experimental conditions, and the ability to rapidly test consumer responses to various media stimuli. We also address the limitations of this approach, particularly in replicating complex interaction effects in media response studies, and suggest areas for future research and improvement in AI-assisted experimental replication of media effects.

STUDY OVERVIEW AND RELATED WORK

Research on the effectiveness of media messages is increasingly difficult. The difficulty is attributable both to administrative challenges (e.g., stimulus acquisition and creation, the data management demands of digital trace data, and the recruitment of participants, especially those in special groups such as children, minorities, and international samples) and to new and critical challenges to the very nature of social research, exemplified by existential issues of replication and reproducibility and by the ability to generalize findings across people, media stimuli, and experimental contexts. We briefly review these issues with an eye toward our current test of whether new LLM tools may help solve these problems, and do so with significant advantages in cost, time, and research personnel.
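The abstract describes the pipeline only at a high level: personas are specified, an LLM is prompted to answer as each persona, and the responses are assembled into a dataset. As a rough illustration of that general pattern (not the Viewpoints AI implementation, whose internals are not published here), a single persona trial against Claude Sonnet 3.5 via the public Anthropic Python SDK might look like the sketch below; the persona, stimulus, and survey item are hypothetical.

```python
# Minimal sketch of one LLM-persona trial, assuming the public Anthropic
# Python SDK (pip install anthropic). This illustrates the general pattern
# described in the abstract, not the Viewpoints AI implementation.
# The persona, stimulus, and survey item below are hypothetical examples.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

persona = (
    "You are a 34-year-old woman from Ohio who shops online weekly "
    "and is skeptical of advertising claims."
)
stimulus = "Ad copy: 'New EverFresh detergent -- 50% more cleaning power.'"
item = (
    "On a scale from 1 (not at all likely) to 7 (extremely likely), "
    "how likely are you to buy this product? Reply with a single number."
)

message = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # the paper reports Claude Sonnet 3.5
    max_tokens=10,
    system=persona,  # the persona is injected as the system prompt
    messages=[{"role": "user", "content": f"{stimulus}\n\n{item}"}],
)
response = int(message.content[0].text.strip())  # one cell of the dataset
print(response)
```

Looping such a call over thousands of sampled personas and experimental conditions would yield the kind of dataset to which the original studies' statistical tests can then be applied.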
The Case Against Registered Reports
Registered reports have been proposed as a way to move away from eye-catching and surprising results and toward methodologically sound practices and interesting research questions. However, none of the top-twenty artificial intelligence journals support registered reports, and no traces of registered reports can be found in the field of artificial intelligence. Is this because they do not provide value for the type of research that is conducted in the field of artificial intelligence? Registered reports have been touted as one of the solutions to the problems surrounding the reproducibility crisis. They promote good research practices and combat data dredging [1].
DNA shows Native Americans and Polynesians hooked up 800 years ago
Paris – Native Americans and Polynesians bridged vast expanses of open ocean around the year 1200 and mingled, leaving incontrovertible proof of their encounter in the DNA of present-day populations, scientists revealed Wednesday. Whether peoples from what is today Colombia or Ecuador drifted thousands of kilometers to tiny islands in the middle of the Pacific, or whether seafaring Polynesians sailed upwind to South America and then back again, is still unknown. But what is certain, according to a study in Nature, is that the hook-up took place hundreds of years before Europeans set foot in either region, and left individuals scattered across French Polynesia with signature traces of the New World in their DNA. "These findings change our understanding of one of the most unknown chapters in the history of our species' great continental expansions," said senior author Andreas Moreno-Estrada, principal investigator at Mexico's National Laboratory of Genomics for Biodiversity. Archaeologists and historians have tussled for decades over whether Oceania's islanders and Native Americans crossed paths during the Middle Ages and, if they did, how that contact might have unfolded.
An Inability to Reproduce
Science has always hinged on the idea that researchers must be able to prove and reproduce the results of their research. Simply put, that is what makes science...science. Yet in recent years, as computing power has increased, the cloud has taken shape, and data sets have grown, a problem has appeared: it has become increasingly difficult to generate the same results consistently, even when researchers use the same dataset. "One basic requirement of scientific results is reproducibility: shake an apple tree, and apples will fall downwards each and every time," observes Kai Zhang, an associate professor in the department of statistics and operations research at The University of North Carolina, Chapel Hill. "The problem today is that in many cases, researchers cannot replicate existing findings in the literature and they cannot produce the same conclusions. This is undermining the credibility of scientists and science. It is producing a crisis."
Medicine Under the Magnifying Glass – Towards Data Science
In Part 1, I'll introduce the problem of bad medicine. As you review the evidence for it, you'll also get a good sense of why it's so entrenched. In Part 2, I'll examine the problem's implications for artificial intelligence. In Bad Medicine, his history of medical failure, David Wootton argued, "We can only think about medical progress if we start with the long tradition of medical failure."
The Trouble With Scientists - Issue 54: The Unspoken
Sometimes it seems surprising that science functions at all. In 2005, medical science was shaken by a paper with the provocative title "Why most published research findings are false" [1]. Written by John Ioannidis, a professor of medicine at Stanford University, it didn't actually show that any particular result was wrong. Instead, it showed that the statistics of reported positive findings were not consistent with how often one should expect to find them. As Ioannidis concluded more recently, "many published research findings are false or exaggerated, and an estimated 85 percent of research resources are wasted" [2]. It's likely that some researchers are consciously cherry-picking data to get their work published.
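Ioannidis's core argument is Bayesian arithmetic about base rates: if only a small fraction of tested hypotheses are true, then even well-powered studies at conventional significance levels produce many false positives. A back-of-the-envelope sketch follows; the formula is the positive predictive value from the 2005 paper, while the input values are illustrative assumptions.

```python
# Positive predictive value of a "significant" finding, following the logic
# of Ioannidis (2005): PPV = (1 - beta) * R / ((1 - beta) * R + alpha),
# where R is the prior odds that a tested hypothesis is true. The input
# values below are illustrative assumptions, not figures from the paper.
def ppv(prior_odds, power=0.8, alpha=0.05):
    """Share of statistically significant findings that are actually true."""
    true_positives = power * prior_odds   # true effects correctly detected
    false_positives = alpha               # null effects wrongly "detected"
    return true_positives / (true_positives + false_positives)

print(f"1 in 10 tested hypotheses true:  PPV = {ppv(0.10):.2f}")  # ~0.62
print(f"1 in 100 tested hypotheses true: PPV = {ppv(0.01):.2f}")  # ~0.14
```

Under the second assumption, roughly six of every seven published positive findings would be false, without any fraud or cherry-picking at all.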
17 More Must-Know Data Science Interview Questions and Answers, Part 2
Editor's note: See also part 1 of 17 More Must-Know Data Science Interview Questions and Answers. Overfitting is when you build a predictive model that fits the data "too closely", so that it captures the random noise in the data rather than true patterns. As a result, the model predictions will be wrong when applied to new data. We frequently hear about studies that report unusual results (especially if you listen to Wait Wait Don't Tell Me), or see findings like "an orange used car is least likely to be a lemon", or learn that studies overturn previous established findings (eggs are no longer bad for you). Many such studies produce questionable results that cannot be repeated.
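The overfitting point lends itself to a quick demonstration. The sketch below is a hypothetical toy example (synthetic data, NumPy only): a straight line and a high-degree polynomial are fit to the same 20 noisy points whose true pattern is linear, and the polynomial wins on training error while losing on fresh test data.

```python
# Toy overfitting demo: a 15th-degree polynomial fits the training noise
# and generalizes worse than a simple line. Data are synthetic.
# (NumPy may warn that the high-degree fit is poorly conditioned --
# that is part of the point.)
import numpy as np

rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0, 1, 20))
y_train = 2 * x_train + rng.normal(0, 0.2, 20)    # true pattern is linear
x_test = np.sort(rng.uniform(0, 1, 200))
y_test = 2 * x_test + rng.normal(0, 0.2, 200)

for degree in (1, 15):
    coefs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    train_mse = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

The degree-15 fit typically shows near-zero training error and a much larger test error than the line: exactly the "random noise rather than true patterns" failure described above.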
Cross-validation failure: small sample sizes lead to large error bars
Predictive models ground many state-of-the-art developments in statistical brain image analysis: decoding, MVPA, searchlight, or extraction of biomarkers. The principled approach to establishing their validity and usefulness is cross-validation: testing prediction on unseen data. Here, I would like to raise awareness of the error bars of cross-validation, which are often underestimated. Simple experiments show that the sample sizes of many neuroimaging studies inherently lead to large error bars, e.g., ±10% for 100 samples. The standard error across folds strongly underestimates them. These large error bars compromise the reliability of conclusions drawn with predictive models, such as biomarkers or methods developments where, unlike with cognitive neuroimaging MVPA approaches, more samples cannot be acquired by repeating the experiment across many subjects. Solutions to increase sample size must be investigated, tackling possible increases in the heterogeneity of the data.
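The ±10% figure is what plain binomial variability predicts, as a quick simulation (a synthetic setup, NumPy only, not the author's experiments) shows: even a classifier whose true accuracy is fixed at 70% will, when scored on 100 held-out samples, return accuracy estimates spread over roughly a 20-point range.

```python
# Sanity check on the abstract's claim that ~100 samples imply error bars
# of roughly +/-10 percentage points. Synthetic setup, not the author's
# experiments: a classifier whose true accuracy is exactly 0.70, scored
# on 100 independent test samples, repeated many times.
import numpy as np

rng = np.random.default_rng(0)
n_samples, true_acc, n_repeats = 100, 0.70, 10_000

# Each repetition draws 100 correct/incorrect predictions at random.
estimates = rng.binomial(n_samples, true_acc, n_repeats) / n_samples
low, high = np.percentile(estimates, [2.5, 97.5])
print(f"true accuracy 0.70; 95% of estimates in [{low:.2f}, {high:.2f}]")
```

With these assumptions the 95% interval comes out near [0.61, 0.79], about ±9 percentage points, consistent with the abstract's warning that the much narrower standard error computed across folds understates the real uncertainty.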
'Silicon Valley arrogance'? Google misfires as it strives to turn Star Trek fiction into reality
Google employees, squeezed onto metal risers and standing in the back of a meeting room, erupted in cheers as newly arrived executive Andrew Conrad announced they would try to turn science fiction into reality: The tech giant had formed a biotech venture to create a futuristic device like Star Trek's iconic "Tricorder" diagnostic wizard -- and use it to cure cancer. Conrad, recalled an employee who was present, displayed images on the room's big screens showing nanoparticles tracking down cancer cells in the bloodstream and flashing signals to a Fitbit-style wristband. He promised a working prototype of the cancer early-detection device within six months. That was three years ago. Recently departed employees said the prototype didn't work as hoped, and the Tricorder project is floundering. Tricorder is not the only misfire for Google's ambitious and extravagantly funded biotech venture, now named Verily Life Sciences. It has announced three signature projects meant to transform medicine, and a STAT examination found that all of them are plagued by serious, if not fatal, scientific shortcomings, even as Verily has vigorously promoted their promise.