Peng, Kenny
Sparse Autoencoders for Hypothesis Generation
Movva, Rajiv, Peng, Kenny, Garg, Nikhil, Kleinberg, Jon, Pierson, Emma
We describe HypotheSAEs, a general method to hypothesize interpretable relationships between text data (e.g., headlines) and a target variable (e.g., clicks). HypotheSAEs has three steps: (1) train a sparse autoencoder on text embeddings to produce interpretable features describing the data distribution, (2) select features that predict the target variable, and (3) generate a natural language interpretation of each feature (e.g., "mentions being surprised or shocked") using an LLM. Each interpretation serves as a hypothesis about what predicts the target variable. Compared to baselines, our method better identifies reference hypotheses on synthetic datasets (at least +0.06 in F1) and produces more predictive hypotheses on real datasets (~twice as many significant findings), despite requiring 1-2 orders of magnitude less compute than recent LLM-based methods. HypotheSAEs also produces novel discoveries on two well-studied tasks: explaining partisan differences in Congressional speeches and identifying drivers of engagement with online headlines.
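To make the three-step pipeline concrete, here is a minimal sketch in Python. It is an illustrative reconstruction based only on the abstract, not the authors' implementation: the top-k sparsity mechanism, the L1-regularized feature selection, and all hyperparameters are assumptions, and the LLM interpretation step is left as a commented prompt.

```python
# Sketch of the three-step HypotheSAEs pipeline (illustrative, not the authors' code).
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

# Step 0: stand-in data (in practice: text embeddings and a target, e.g. clicks).
rng = np.random.default_rng(0)
X = torch.tensor(rng.normal(size=(1000, 64)), dtype=torch.float32)  # embeddings
y = rng.binomial(1, 0.3, size=1000)                                 # target variable

# Step 1: train a sparse autoencoder on the embeddings.
class SparseAutoencoder(nn.Module):
    def __init__(self, d_in, d_hidden, k):
        super().__init__()
        self.enc = nn.Linear(d_in, d_hidden)
        self.dec = nn.Linear(d_hidden, d_in)
        self.k = k  # number of active features per example (assumed top-k sparsity)

    def encode(self, x):
        a = torch.relu(self.enc(x))
        topk = torch.topk(a, self.k, dim=-1)
        mask = torch.zeros_like(a).scatter_(-1, topk.indices, 1.0)
        return a * mask  # keep only the k largest activations per example

    def forward(self, x):
        return self.dec(self.encode(x))

sae = SparseAutoencoder(d_in=64, d_hidden=256, k=8)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = ((sae(X) - X) ** 2).mean()  # reconstruction loss
    loss.backward()
    opt.step()

# Step 2: select features that predict the target.
Z = sae.encode(X).detach().numpy()  # sparse feature activations
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(Z, y)
selected = np.argsort(-np.abs(clf.coef_[0]))[:10]  # most predictive features

# Step 3: interpret each selected feature with an LLM.
for j in selected:
    top_examples = np.argsort(-Z[:, j])[:20]  # texts where feature j activates most
    # Prompt an LLM with the corresponding texts, e.g.:
    #   "What do these texts have in common? Answer with a short phrase."
    # The returned phrase (e.g. "mentions being surprised or shocked") serves as
    # the hypothesis associated with feature j.
```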
A No Free Lunch Theorem for Human-AI Collaboration
Peng, Kenny, Garg, Nikhil, Kleinberg, Jon
The gold standard in human-AI collaboration is complementarity: combined performance that exceeds that of both the human and the algorithm alone. We investigate this challenge in binary classification settings where the goal is to maximize 0-1 accuracy. Given two or more agents who can make calibrated probabilistic predictions, we show a "No Free Lunch"-style result. Any deterministic collaboration strategy (a function mapping calibrated probabilities into binary classifications) that does not essentially always defer to the same agent will sometimes perform worse than the least accurate agent. In other words, complementarity cannot be achieved "for free." The result does suggest one model of collaboration with guarantees, in which one agent identifies "obvious" errors of the other agent. We also use the result to understand the conditions necessary for other collaboration techniques to succeed, providing guidance for human-AI collaboration.
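For readability, the core statement can be compressed roughly as follows. This paraphrase is ours, based only on the abstract; the precise definition of "essentially always defers to the same agent" and the supporting construction are given in the paper.

```latex
% Informal restatement of the "No Free Lunch" result (two-agent case).
\[
f : [0,1]^2 \to \{0,1\} \ \text{deterministic and not essentially always deferring to one agent}
\;\Longrightarrow\;
\exists \ \text{calibrated agents } 1, 2 \ \text{such that}\quad
\mathrm{acc}(f) \,<\, \min\{\mathrm{acc}_1,\, \mathrm{acc}_2\}.
\]
```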
Topics, Authors, and Networks in Large Language Model Research: Trends from a Survey of 17K arXiv Papers
Movva, Rajiv, Balachandar, Sidhika, Peng, Kenny, Agostini, Gabriel, Garg, Nikhil, Pierson, Emma
Large language model (LLM) research is dramatically impacting society, making it essential to understand the topics and values it prioritizes, the authors and institutions driving it, and its networks of collaboration. Due to the recent growth of the field, many of these fundamental attributes lack systematic description. We gather, annotate, and analyze a new dataset of 16,979 LLM-related arXiv papers, focusing on changes in 2023 vs. 2018-2022. We show that LLM research increasingly focuses on societal impacts: the Computers and Society sub-arXiv has seen 20x growth in its proportion of LLM-related papers in 2023. This change is driven in part by an influx of new authors: a majority of 2023 papers are first-authored by researchers who have not previously written an LLM-related paper, and these papers focus particularly on applications and societal considerations. While a handful of companies hold outsize influence, academia publishes a much larger fraction of papers than industry overall, and this gap widens in 2023. LLM research is also being shaped by social dynamics: there are gender and academic/industry differences in the topics authors prioritize, and a stark U.S./China schism in the collaboration network. Overall, our analysis documents how LLM research both shapes and is shaped by society, attesting to the necessity of sociotechnical lenses; we discuss implications for researchers and policymakers.
REFORMS: Reporting Standards for Machine Learning Based Science
Kapoor, Sayash, Cantrell, Emily, Peng, Kenny, Pham, Thanh Hien, Bail, Christopher A., Gundersen, Odd Erik, Hofman, Jake M., Hullman, Jessica, Lones, Michael A., Malik, Momin M., Nanayakkara, Priyanka, Poldrack, Russell A., Raji, Inioluwa Deborah, Roberts, Michael, Salganik, Matthew J., Serra-Garcia, Marta, Stewart, Brandon M., Vandewiele, Gilles, Narayanan, Arvind
Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways across disciplines. Motivated by this observation, our goal is to provide clear reporting standards for ML-based science. Drawing from an extensive review of past literature, we present the REFORMS checklist (Reporting Standards For Machine Learning Based Science). It consists of 32 questions and a paired set of guidelines. REFORMS was developed based on a consensus of 19 researchers across computer science, data science, mathematics, social sciences, and biomedical sciences. REFORMS can serve as a resource for researchers when designing and implementing a study, for referees when reviewing papers, and for journals when enforcing standards for transparency and reproducibility.