statistician
"Rebuilding" Statistics in the Age of AI: A Town Hall Discussion on Culture, Infrastructure, and Training
Donoho, David L., Kang, Jian, Lin, Xihong, Mukherjee, Bhramar, Nettleton, Dan, Nugent, Rebecca, Rodriguez, Abel, Xing, Eric P., Zheng, Tian, Zhu, Hongtu
This article presents the full, original record of the 2024 Joint Statistical Meetings (JSM) town hall, "Statistics in the Age of AI," which convened leading statisticians to discuss how the field is evolving in response to advances in artificial intelligence, foundation models, large-scale empirical modeling, and data-intensive infrastructures. The town hall was structured around open panel discussion and extensive audience Q&A, with the aim of eliciting candid, experience-driven perspectives rather than formal presentations or prepared statements. This document preserves the extended exchanges among panelists and audience members, with minimal editorial intervention, and organizes the conversation around five recurring questions concerning disciplinary culture and practices, data curation and "data work," engagement with modern empirical modeling, training for large-scale AI applications, and partnerships with key AI stakeholders. By providing an archival record of this discussion, the preprint aims to support transparency, community reflection, and ongoing dialogue about the evolving role of statistics in the data- and AI-centric future.
- Europe > United Kingdom (0.14)
- North America > United States > North Carolina (0.04)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (0.46)
- Personal > Interview (0.34)
- Government (1.00)
- Information Technology (0.68)
- Education > Educational Setting > Higher Education (0.67)
- Health & Medicine > Health Care Technology > Medical Record (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
Neural sequence models based on the transformer architecture have demonstrated remarkable in-context learning (ICL) abilities, where they can perform new tasks when prompted with training and test examples, without any parameter update to the model. This work first provides a comprehensive statistical theory for transformers to perform ICL. Concretely, we show that transformers can implement a broad class of standard machine learning algorithms in context, such as least squares, ridge regression, Lasso, learning generalized linear models, and gradient descent on two-layer neural networks, with near-optimal predictive power on various in-context data distributions. Using an efficient implementation of in-context gradient descent as the underlying mechanism, our transformer constructions admit mild size bounds, and can be learned with polynomially many pretraining sequences. Building on these "base" ICL algorithms, intriguingly, we show that transformers can implement more complex ICL procedures involving in-context algorithm selection, akin to what a statistician can do in real life: a single transformer can adaptively select different base ICL algorithms, or even perform qualitatively different tasks, on different input sequences, without any explicit prompting of the right algorithm or task. We both establish this in theory by explicit constructions, and also observe this phenomenon experimentally. In theory, we construct two general mechanisms for algorithm selection with concrete examples: pre-ICL testing, and post-ICL validation. As an example, we use the post-ICL validation mechanism to construct a transformer that can perform nearly Bayes-optimal ICL on a challenging task: noisy linear models with mixed noise levels. Experimentally, we demonstrate the strong in-context algorithm selection capabilities of standard transformer architectures.
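The idea of selecting a base algorithm by validation can be illustrated outside of any transformer. The sketch below is only one plausible reading of "post-ICL validation", written as ordinary NumPy: several ridge regressions (the candidate base algorithms) are fit on a training split of the in-context examples, and the one with the lowest loss on a held-out split is kept. The function names, the split size, and the candidate penalties are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def select_by_validation(X, y, lams, n_train):
    """Fit one ridge model per penalty on the first n_train examples,
    then keep the model with the smallest held-out mean squared error."""
    X_tr, y_tr = X[:n_train], y[:n_train]
    X_val, y_val = X[n_train:], y[n_train:]
    best = None
    for lam in lams:
        w = ridge_fit(X_tr, y_tr, lam)
        val_mse = float(np.mean((X_val @ w - y_val) ** 2))
        if best is None or val_mse < best[1]:
            best = (lam, val_mse, w)
    return best

# Hypothetical in-context dataset: a noisy linear model.
rng = np.random.default_rng(0)
n, d = 60, 5
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.5 * rng.normal(size=n)

candidate_lams = [0.01, 1.0, 100.0]  # assumed candidate penalties
lam, val_mse, w_hat = select_by_validation(X, y, candidate_lams, n_train=40)
print(f"selected lambda={lam}, validation MSE={val_mse:.3f}")
```

When the noise level varies across tasks, the best penalty varies with it, which is why a validation step of this kind is a natural selection rule for mixed noise levels.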
Empowering Clinical Trial Design through AI: A Randomized Evaluation of PowerGPT
Lu, Yiwen, Li, Lu, Zhang, Dazheng, Jian, Xinyao, Wang, Tingyin, Chen, Siqi, Lei, Yuqing, Tong, Jiayi, Xi, Zhaohan, Chu, Haitao, Luo, Chongliang, Ogdie, Alexis, Athey, Brian, Turan, Alparslan, Abramoff, Michael, Cappelleri, Joseph C, Xu, Hua, Lu, Yun, Berlin, Jesse, Sessler, Daniel I., Asch, David A., Jiang, Xiaoqian, Chen, Yong
Sample size calculations for power analysis are critical for clinical research and trial design, yet their complexity and reliance on statistical expertise create barriers for many researchers. We introduce PowerGPT, an AI-powered system integrating large language models (LLMs) with statistical engines to automate test selection and sample size estimation in trial design. In a randomized trial to evaluate its effectiveness, PowerGPT significantly improved task completion rates (99.3% vs. 88.9% for test selection, 99.3% vs. 77.8% for sample size calculation) and accuracy (94.1% vs. 55.4% in sample size estimation, p < 0.001), while reducing average completion time (4.0 vs. 9.3 minutes, p < 0.001). These gains were consistent across various statistical tests and benefited both statisticians and non-statisticians, bridging expertise gaps. Already under deployment across multiple institutions, PowerGPT represents a scalable AI-driven approach that enhances accessibility, efficiency, and accuracy in statistical power analysis for clinical research.
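For readers unfamiliar with the kind of calculation being automated, here is a minimal sketch of one textbook case: the per-group sample size for a two-sided, two-sample comparison of means under the normal approximation, n = 2 (z_{1-alpha/2} + z_{1-beta})^2 sigma^2 / delta^2. This is a standard formula and not PowerGPT's implementation; the function name and default values are assumptions for illustration.

```python
import math
from statistics import NormalDist

def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample z-test of means.

    delta: smallest difference in means worth detecting
    sigma: common standard deviation (assumed known)
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the target power
    n = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2
    return math.ceil(n)                 # round up to a whole participant

# Medium standardized effect (delta / sigma = 0.5), 5% level, 80% power.
print(sample_size_per_group(delta=0.5, sigma=1.0))  # 63
```

A t-test based calculation would give a slightly larger figure, because the normal approximation ignores the uncertainty in estimating sigma.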
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.15)
- North America > United States > Texas > Harris County > Houston (0.14)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Testing for LLM response differences: the case of a composite null consisting of semantically irrelevant query perturbations
Acharyya, Aranyak, Priebe, Carey E., Helm, Hayden S.
Given an input query, generative models such as large language models produce a random response drawn from a response distribution. Given two input queries, it is natural to ask if their response distributions are the same. While traditional statistical hypothesis testing is designed to address this question, the response distribution induced by an input query is often sensitive to semantically irrelevant perturbations to the query, so much so that a traditional test of equality might indicate that two semantically equivalent queries induce statistically different response distributions. As a result, the outcome of the statistical test may not align with the user's requirements. In this paper, we address this misalignment by incorporating into the testing procedure consideration of a collection of semantically similar queries. In our setting, the mapping from the collection of user-defined semantically similar queries to the corresponding collection of response distributions is not known a priori and must be estimated, with a fixed budget. Although the problem we address is quite general, we focus our analysis on the setting where the responses are binary, show that the proposed test is asymptotically valid and consistent, and discuss important practical considerations with respect to power and computation.
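To make the baseline concrete, the following sketch implements the traditional test of equality for the binary-response setting: a pooled two-proportion z-test comparing how often each of two queries yields a "1" response. It illustrates only the classical test the abstract contrasts against, not the proposed procedure; the function name and the example counts are hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test of H0: p1 == p2 for binary responses.

    x1, x2: number of '1' responses observed for query 1 and query 2
    n1, n2: number of responses sampled for each query
    Returns (z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0.0:
        # All responses identical across both queries: no evidence of a difference.
        return 0.0, 1.0
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 100 sampled responses per query.
z, p = two_proportion_z_test(90, 100, 50, 100)
print(f"z = {z:.2f}, p = {p:.2g}")
```

With enough samples, a test like this rejects even for small differences between response rates, which is how two semantically equivalent phrasings can end up flagged as statistically different.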
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > Maryland > Baltimore (0.04)
Nonparametric learning of heterogeneous graphical model on network-linked data
Wang, Yuwen, Liu, Changyu, He, Xin, Wang, Junhui
Graphical models are widely used to capture conditional independence structure in multivariate data, but they are often built upon independent and identically distributed observations, which limits their applicability to complex datasets such as network-linked data. This paper proposes a nonparametric graphical model that addresses these limitations by accommodating heterogeneous graph structures without imposing any specific distributional assumptions. The proposed estimation method effectively integrates network embedding with nonparametric graphical model estimation. It further transforms the graph learning task into solving a finite-dimensional linear equation system by leveraging the properties of vector-valued reproducing kernel Hilbert space. Moreover, theoretical guarantees are established for the proposed method in terms of the estimation consistency and exact recovery of the heterogeneous graph structures. Its effectiveness is also demonstrated through a variety of simulated examples and a real application to the statistician coauthorship dataset.
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Hong Kong (0.04)
- Information Technology > Artificial Intelligence > Systems & Languages (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
An Overview of Large Language Models for Statisticians
Ji, Wenlong, Yuan, Weizhe, Getzen, Emily, Cho, Kyunghyun, Jordan, Michael I., Mei, Song, Weston, Jason E, Su, Weijie J., Xu, Jing, Zhang, Linjun
Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI), exhibiting remarkable capabilities across diverse tasks such as text generation, reasoning, and decision-making. While their success has primarily been driven by advances in computational power and deep learning architectures, emerging problems -- in areas such as uncertainty quantification, decision-making, causal inference, and distribution shift -- require a deeper engagement with the field of statistics. This paper explores potential areas where statisticians can make important contributions to the development of LLMs, particularly those that aim to engender trustworthiness and transparency for human users. Thus, we focus on issues such as uncertainty quantification, interpretability, fairness, privacy, watermarking and model adaptation. We also consider possible roles for LLMs in statistical analysis. By bridging AI and statistics, we aim to foster a deeper collaboration that advances both the theoretical foundations and practical applications of LLMs, ultimately shaping their role in addressing complex societal challenges.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > United States > New York (0.04)
- Workflow (1.00)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
On the Edge by Nate Silver review – the art of risk-taking
Nothing is more interesting to poker players and less interesting to everyone else than a breathless recounting of who bet how much with a jack and six of clubs in some game years ago. There's an awful lot of that kind of thing in this book, which celebrates poker players as paradigmatic citizens of a global intellectual community it calls "the River", which also counts among its inhabitants venture capitalists, crypto traders, fashionable philosophers and mild-mannered statisticians. One such statistician, Nate Silver himself, came to public prominence as a data-driven analyst of political polls at his website FiveThirtyEight, which predicted the results of US elections in 2008 and 2012 with seemingly uncanny accuracy. But before that he was a poker player, making money especially in the nascent internet-casino business, until Congress banned online poker in 2006. That, he has said, was his political awakening.
- Banking & Finance (0.90)
- Leisure & Entertainment > Games (0.77)
- Government > Regional Government > North America Government > United States Government (0.35)
Statistics and explainability: a fruitful alliance
In this paper, we propose standard statistical tools as a solution to commonly highlighted problems in the explainability literature. Indeed, leveraging statistical estimators allows for a proper definition of explanations, enabling theoretical guarantees and the formulation of evaluation metrics to quantitatively assess the quality of explanations. This approach circumvents, among other things, the subjective human assessment currently prevalent in the literature. Moreover, we argue that uncertainty quantification is essential for providing robust and trustworthy explanations, and it can be achieved in this framework through classical statistical procedures such as the bootstrap. However, it is crucial to note that while Statistics offers valuable contributions, it is not a panacea for resolving all the challenges. Future research avenues could focus on open problems, such as defining a purpose for the explanations or establishing a statistical framework for counterfactual or adversarial scenarios.
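As a rough illustration of the bootstrap step mentioned above, the sketch below treats an explanation as a statistical estimator and attaches a percentile bootstrap interval to it. The "explanation" used here, the least-squares slope of a model's output on one feature, is a stand-in chosen for simplicity and is an assumption, as are the function names and the synthetic data; the paper's own definitions may differ.

```python
import random
import statistics

def slope(xs, ys):
    """Least-squares slope of ys on xs: a crude linear-surrogate explanation."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def bootstrap_ci(xs, ys, stat, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for stat(xs, ys), resampling pairs."""
    rng = random.Random(seed)
    n = len(xs)
    reps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample pairs with replacement
        reps.append(stat([xs[i] for i in idx], [ys[i] for i in idx]))
    reps.sort()
    lo = reps[int((1 - level) / 2 * n_boot)]
    hi = reps[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi

# Synthetic stand-in for (feature value, model output) pairs.
data_rng = random.Random(1)
xs = [data_rng.gauss(0, 1) for _ in range(200)]
ys = [2.0 * x + data_rng.gauss(0, 1) for x in xs]

est = slope(xs, ys)
lo, hi = bootstrap_ci(xs, ys, slope)
print(f"slope = {est:.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```

Reporting the interval alongside the point estimate lets a reader judge whether an apparent feature effect is distinguishable from sampling noise.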
- North America > United States > New York (0.04)
- Europe > Switzerland (0.04)
Batched Nonparametric Contextual Bandits
We study nonparametric contextual bandits under batch constraints, where the expected reward for each action is modeled as a smooth function of covariates, and the policy updates are made at the end of each batch of observations. We establish a minimax regret lower bound for this setting and propose Batched Successive Elimination with Dynamic Binning (BaSEDB) that achieves optimal regret (up to logarithmic factors). In essence, BaSEDB dynamically splits the covariate space into smaller bins, carefully aligning their widths with the batch size. We also show the suboptimality of static binning under batch constraints, highlighting the necessity of dynamic binning. Additionally, our results suggest that a nearly constant number of policy updates can attain optimal regret in the fully online setting.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)