
 Education


Nobel prizewinner Omar Yaghi says his invention will change the world

New Scientist

Chemist Omar Yaghi invented materials called MOFs, a few grams of which have the surface area of a football field. In school, we learn about the Stone Age, the Bronze Age - and we are currently in a silicon age characterised by computers and phones. What might define the next age? Omar Yaghi at the University of California, Berkeley, thinks a family of materials he helped pioneer in the 1990s has a good shot. They are metal-organic frameworks (MOFs), and working out how to make them earned him a share of the 2025 Nobel prize in chemistry.


"Rebuilding" Statistics in the Age of AI: A Town Hall Discussion on Culture, Infrastructure, and Training

arXiv.org Machine Learning

This article presents the full, original record of the 2024 Joint Statistical Meetings (JSM) town hall, "Statistics in the Age of AI," which convened leading statisticians to discuss how the field is evolving in response to advances in artificial intelligence, foundation models, large-scale empirical modeling, and data-intensive infrastructures. The town hall was structured around open panel discussion and extensive audience Q&A, with the aim of eliciting candid, experience-driven perspectives rather than formal presentations or prepared statements. This document preserves the extended exchanges among panelists and audience members, with minimal editorial intervention, and organizes the conversation around five recurring questions concerning disciplinary culture and practices, data curation and "data work," engagement with modern empirical modeling, training for large-scale AI applications, and partnerships with key AI stakeholders. By providing an archival record of this discussion, the preprint aims to support transparency, community reflection, and ongoing dialogue about the evolving role of statistics in the data- and AI-centric future.


Covariate-assisted Grade of Membership Models via Shared Latent Geometry

arXiv.org Machine Learning

The grade of membership model is a flexible latent variable model for analyzing multivariate categorical data through individual-level mixed membership scores. In many modern applications, auxiliary covariates are collected alongside responses and encode information about the same latent structure. Traditional approaches to incorporating such covariates typically rely on fully specified joint likelihoods, which are computationally intensive and sensitive to misspecification. We introduce a covariate-assisted grade of membership model that integrates response and covariate information by exploiting their shared low-rank simplex geometry, rather than modeling their joint distribution. We propose a likelihood-free spectral estimation procedure that combines heterogeneous data sources through a balance parameter controlling their relative contribution. To accommodate high-dimensional and heteroskedastic noise, we employ heteroskedastic principal component analysis before performing simplex-based geometric recovery. Our theoretical analysis establishes weaker identifiability conditions than those required in the covariate-free model, and further derives finite-sample, entrywise error bounds for both mixed membership scores and item parameters. These results demonstrate that auxiliary covariates can provably improve latent structure recovery, yielding faster convergence rates in high-dimensional regimes. Simulation studies and an application to educational assessment data illustrate the computational efficiency, statistical accuracy, and interpretability gains of the proposed method. The code for reproducing these results is open-source and available at https://github.com/Toby-X/Covariate-Assisted-GoM.


Inside OpenAI's big play for science

MIT Technology Review

An exclusive conversation with Kevin Weil, head of OpenAI for Science, a new in-house team that wants to make scientists more productive. In the three years since ChatGPT's explosive debut, OpenAI's technology has upended a remarkable range of everyday activities at home, at work, in schools--anywhere people have a browser open or a phone out, which is everywhere. Now OpenAI is making an explicit play for scientists. In October, the firm announced that it had launched a whole new team, called OpenAI for Science, dedicated to exploring how its large language models could help scientists and tweaking its tools to support them. The last couple of months have seen a slew of social media posts and academic publications in which mathematicians, physicists, biologists, and others have described how LLMs (and OpenAI's GPT-5 in particular) have helped them make a discovery or nudged them toward a solution they might otherwise have missed. In part, OpenAI for Science was set up to engage with this community.


The power of sound in a virtual world

MIT Technology Review

In the digital age, sound is proving to be the greatest connector of all, say Erik Vaveris, vice president of product management and CMO at Shure, and Brian Scholl, director of the Perception and Cognition Laboratory at Yale University. In an era where business, education, and even casual conversations occur via screens, sound has become a differentiating factor. We obsess over lighting, camera angles, and virtual backgrounds, but how we sound can be just as critical to credibility, trust, and connection. Both see audio as more than a technical layer: It's a human factor shaping how people perceive intelligence, trustworthiness, and authority in virtual settings. "If you're willing to take a little bit of time with your audio setup, you can really get across the full power of your message and the full power of who you are to your peers, to your employees, your boss, your suppliers, and of course, your customers," says Vaveris. Scholl's research shows that poor audio quality can make a speaker seem less persuasive, less hireable, and even less credible. "We know that [poor] sound doesn't reflect the people themselves, but we really just can't stop ourselves from having those impressions," says Scholl. "We all understand intuitively that if we're having difficulty being understood while we're talking, then that's bad. But we sort of think that as long as you can make out the words I'm saying, then that's probably all fine. And this research showed in a somewhat surprising way, to a surprising degree, that this is not so." For organizations navigating hybrid work, training, and marketing, the stakes have become high. Vaveris points out that the pandemic was a watershed moment for audio technology. As classrooms, boardrooms, and conferences shifted online almost overnight, demand accelerated for advanced noise suppression, echo cancellation, and AI-driven processing tools that make meetings more seamless.
Today, machine learning algorithms can strip away keyboard clicks or reverberation and isolate a speaker's voice in noisy environments. That clarity underpins the accuracy of AI meeting assistants that can step in to transcribe, summarize, and analyze discussions. The implications across industries are rippling. It empowers executives and creators alike to produce broadcast-quality content from the comfort of their home office. And it offers companies new ways to build credibility with customers and employees without the costly overhead of traditional production.


A Scalable Measure of Loss Landscape Curvature for Analyzing the Training Dynamics of LLMs

arXiv.org Machine Learning

Understanding the curvature evolution of the loss landscape is fundamental to analyzing the training dynamics of neural networks. The most commonly studied measure, Hessian sharpness ($λ_{\max}^H$) -- the largest eigenvalue of the loss Hessian -- determines local training stability and interacts with the learning rate throughout training. Despite its significance in analyzing training dynamics, direct measurement of Hessian sharpness remains prohibitive for Large Language Models (LLMs) due to high computational cost. We analyze $\textit{critical sharpness}$ ($λ_c$), a computationally efficient measure requiring fewer than $10$ forward passes given the update direction $Δ\mathbf{θ}$. Critically, this measure captures well-documented Hessian sharpness phenomena, including progressive sharpening and Edge of Stability. Using this measure, we provide the first demonstration of these sharpness phenomena at scale, up to $7$B parameters, spanning both pre-training and mid-training of OLMo-2 models. We further introduce $\textit{relative critical sharpness}$ ($λ_c^{1\to 2}$), which quantifies the curvature of one loss landscape while optimizing another, to analyze the transition from pre-training to fine-tuning and guide data mixing strategies. Critical sharpness provides practitioners with a practical tool for diagnosing curvature dynamics and informing data composition choices at scale. More broadly, our work shows that scalable curvature measures can provide actionable insights for large-scale training.
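
As a rough illustration of the Hessian sharpness mentioned in the abstract (not the paper's critical sharpness, whose definition is not given here), the largest Hessian eigenvalue can be estimated by power iteration on Hessian-vector products. The sketch below assumes a toy quadratic loss with a hand-picked Hessian `A`; the names `hessian_sharpness` and `gd_final_loss` are hypothetical. It also shows the standard result that plain gradient descent on a quadratic is stable only when the learning rate is below 2 divided by the largest eigenvalue.

```python
import numpy as np

def hessian_sharpness(hvp, dim, iters=200, seed=0):
    """Estimate the largest-magnitude Hessian eigenvalue by power iteration.

    hvp: function mapping a vector v to the Hessian-vector product H @ v.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(iters):
        hv = hvp(v)
        lam = float(v @ hv)          # Rayleigh quotient for the current unit vector
        norm = np.linalg.norm(hv)
        if norm == 0.0:              # v lies in the null space; nothing to amplify
            break
        v = hv / norm
    return lam

# Toy quadratic loss L(w) = 0.5 * w^T A w, whose Hessian is exactly A (an assumption
# made for illustration; a real network would need autodiff for the product H @ v).
A = np.diag([4.0, 1.0, 0.25])
lam_max = hessian_sharpness(lambda v: A @ v, dim=3)

def gd_final_loss(lr, steps=100):
    """Run plain gradient descent on the toy loss and return the final loss."""
    w = np.ones(3)
    for _ in range(steps):
        w = w - lr * (A @ w)         # gradient of the quadratic is A @ w
    return 0.5 * float(w @ A @ w)

initial_loss = 0.5 * float(np.ones(3) @ A @ np.ones(3))
stable_loss = gd_final_loss(lr=0.8 * 2.0 / lam_max)    # below the 2 / lam_max threshold
unstable_loss = gd_final_loss(lr=1.2 * 2.0 / lam_max)  # above the threshold
```

On this toy problem, `stable_loss` falls below `initial_loss` while `unstable_loss` grows far above it. Each power-iteration step costs one Hessian-vector product, which is part of why direct measurement becomes expensive at LLM scale.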


NTSB will investigate why Waymo's robotaxis are illegally passing school buses

Engadget

The safety probe comes after Waymo did a voluntary software recall late last year addressing the same issue. Waymo has caught the attention of the National Transportation Safety Board as the federal agency launched an official investigation into the company for its robotaxis improperly passing school buses in Austin, Texas. The NTSB said on X that it would examine the interaction between Waymo vehicles and school buses stopped for loading and unloading students. The latest federal probe stems from a preliminary evaluation by the National Highway Traffic Safety Administration that looked into how Waymo reacts to stopped school buses in the Texas city. That report led to Waymo's voluntary software recall in December.


Forgotten, priceless medieval book found in school library

Popular Science

The hermit and mystic Richard Rolle was basically a bestselling author in the Middle Ages. Richard Rolle (depicted in this medieval illustration c. 1400) was a famous hermit and Christian mystic. For generations, a misidentified medieval manuscript was hidden in a 474-year-old English boarding school's library. After a careful new analysis, a medieval literature researcher can confirm the manuscript is actually the oldest and only known edition of Richard Rolle's () written in its original Latin.


Tired? You may have social jetlag.

Popular Science

When life gets in the way of your body's ideal sleep schedule, things get messy. Waking up early can be bad for you. Hours before sunrise, society's earliest larks begin their day.


Revealed: The hilarious slang used in London 300 years ago - so, do YOU know your 'fuddle cups' from your 'cackling farts'?

Daily Mail - Science & tech

From '6,7' to 'vibe-coding', new slang words and phrases seem to pop up on an almost daily basis. But it's time to wind the clock back, as a 327-year-old dictionary reveals the slang used in London in the 17th century. The glossary of terms, titled the 'New Dictionary of the Terms..of the Canting Crew' was published in 1699 to help stop naive visitors to London from getting mugged or even killed.