 Scientific Discovery


Foundation Models for Scientific Discovery: From Paradigm Enhancement to Paradigm Transition

Liu, Fan, Han, Jindong, Lyu, Tengfei, Zhang, Weijia, Yang, Zhe-Rui, Dai, Lu, Liu, Cancheng, Liu, Hao

arXiv.org Artificial Intelligence

Foundation models (FMs), such as GPT-4 and AlphaFold, are reshaping the landscape of scientific research. Beyond accelerating tasks such as hypothesis generation, experimental design, and result interpretation, they prompt a more fundamental question: Are FMs merely enhancing existing scientific methodologies, or are they redefining the way science is conducted? In this paper, we argue that FMs are catalyzing a transition toward a new scientific paradigm. We introduce a three-stage framework to describe this evolution: (1) Meta-Scientific Integration, where FMs enhance workflows within traditional paradigms; (2) Hybrid Human-AI Co-Creation, where FMs become active collaborators in problem formulation, reasoning, and discovery; and (3) Autonomous Scientific Discovery, where FMs operate as independent agents capable of generating new scientific knowledge with minimal human intervention. Through this lens, we review current applications and emerging capabilities of FMs across existing scientific paradigms. We further identify risks and future directions for FM-enabled scientific discovery. This position paper aims to support the scientific community in understanding the transformative role of FMs and to foster reflection on the future of scientific discovery. Our project is available at https://github.com/usail-hkust/Awesome-Foundation-Models-for-Scientific-Discovery.


The scientific discoveries that prove God does exist, according to best-selling French book based on insights from 62 Nobel Prize winners

Daily Mail - Science & tech

It's a question that has been asked since the beginning of time: does God really exist? Traditionally, science has been the counterargument for the existence of a divine creator. However, French mathematicians Olivier Bonnassies and Michel-Yves Bollore now say that science 'has become God's ally'. In a new book, the duo have distilled insights from 62 Nobel Prize winners and more than 100 leading scientists to pinpoint the scientific discoveries that could prove God is real.


Rise of the Robochemist

Zhu, Jihong, Huang, Kefeng, Pipe, Jonathon, Horbaczewsky, Chris, Tyrrell, Andy, Fairlamb, Ian J. S.

arXiv.org Artificial Intelligence

Chemistry, a long-standing discipline, has historically relied on manual and often time-consuming processes. While some automation exists, the field is now on the cusp of a significant evolution driven by the integration of robotics and artificial intelligence (AI), giving rise to the concept of the robochemist: a new paradigm where autonomous systems assist in designing, executing, and analyzing experiments. Robochemists integrate mobile manipulators, advanced perception, teleoperation, and data-driven protocols to execute experiments with greater adaptability, reproducibility, and safety. Rather than a fully automated replacement for human chemists, we envision the robochemist as a complementary partner that works collaboratively to enhance discovery, enabling a more efficient exploration of chemical space and accelerating innovation in pharmaceuticals, materials science, and sustainable manufacturing. This article traces the technologies, applications, and challenges that define this transformation, highlighting both the opportunities and the responsibilities that accompany the emergence of the robochemist. Ultimately, the future of chemistry is argued to lie in a symbiotic partnership where human intuition and expertise are amplified by robotic precision and AI-driven insight. The field of chemistry, a cornerstone of modern science and industry, has long been characterized by a blend of theoretical insight and practical, hands-on experimentation.


Spec-Driven AI for Science: The ARIA Framework for Automated and Reproducible Data Analysis

Chen, Chuke, Luo, Biao, Li, Nan, Wang, Boxiang, Yang, Hang, Guo, Jing, Xu, Ming

arXiv.org Artificial Intelligence

The rapid expansion of scientific data has widened the gap between analytical capability and research intent. Existing AI-based analysis tools, ranging from AutoML frameworks to agentic research assistants, either favor automation over transparency or depend on manual scripting that hinders scalability and reproducibility. We present ARIA (Automated Research Intelligence Assistant), a spec-driven, human-in-the-loop framework for automated and interpretable data analysis. ARIA integrates six interoperable layers, namely Command, Context, Code, Data, Orchestration, and AI Module, within a document-centric workflow that unifies human reasoning and machine execution. Through natural-language specifications, researchers define analytical goals while ARIA autonomously generates executable code, validates computations, and produces transparent documentation. Beyond achieving high predictive accuracy, ARIA can rapidly identify optimal feature sets and select suitable models, minimizing redundant tuning and repetitive experimentation. In the Boston Housing case, ARIA discovered 25 key features and determined XGBoost as the best-performing model ($R^2 = 0.93$) with minimal overfitting. Evaluations across heterogeneous domains demonstrate ARIA's strong performance, interpretability, and efficiency compared with state-of-the-art systems. By combining AI for research and AI for science principles within a spec-driven architecture, ARIA establishes a new paradigm for transparent, collaborative, and reproducible scientific discovery.



Non-iid hypothesis testing: from classical to quantum

De Palma, Giacomo, Fanizza, Marco, Mowry, Connor, O'Donnell, Ryan

arXiv.org Artificial Intelligence

We study hypothesis testing (aka state certification) in the non-identically distributed setting. A recent work (Garg et al. 2023) considered the classical case, in which one is given (independent) samples from $T$ unknown probability distributions $p_1, \dots, p_T$ on $[d] = \{1, 2, \dots, d\}$, and one wishes to accept/reject the hypothesis that their average $p_{\mathrm{avg}}$ equals a known hypothesis distribution $q$. Garg et al. showed that if one has just $c = 2$ samples from each $p_i$, and provided $T \gg \frac{\sqrt{d}}{ε^2} + \frac{1}{ε^4}$, one can (whp) distinguish $p_{\mathrm{avg}} = q$ from $d_{\mathrm{TV}}(p_{\mathrm{avg}},q) > ε$. This nearly matches the optimal result for the classical iid setting (namely, $T \gg \frac{\sqrt{d}}{ε^2}$). Besides optimally improving this result (and generalizing to tolerant testing with more stringent distance measures), we study the analogous problem of hypothesis testing for non-identical quantum states. Here we uncover an unexpected phenomenon: for any $d$-dimensional hypothesis state $σ$, and given just a single copy ($c = 1$) of each state $ρ_1, \dots, ρ_T$, one can distinguish $ρ_{\mathrm{avg}} = σ$ from $D_{\mathrm{tr}}(ρ_{\mathrm{avg}},σ) > ε$ provided $T \gg d/ε^2$. (Again, we generalize to tolerant testing with more stringent distance measures.) This matches the optimal result for the iid case, which is surprising because doing this with $c = 1$ is provably impossible in the classical case. We also show that the analogous phenomenon happens for the non-iid extension of identity testing between unknown states. A technical tool we introduce may be of independent interest: an Efron-Stein inequality, and more generally an Efron-Stein decomposition, in the quantum setting.


Gini-based Model Monitoring: A General Framework with an Application to Non-life Insurance Pricing

Brauer, Alexej, Menzel, Paul

arXiv.org Machine Learning

In a dynamic landscape where portfolios and environments evolve, maintaining the accuracy of pricing models is critical. To the best of our knowledge, this is the first study to systematically examine concept drift in non-life insurance pricing. We (i) provide an overview of the relevant literature and commonly used methodologies, clarify the distinction between virtual drift and concept drift, and explain their implications for long-run model performance; (ii) review and formalize common performance measures, including the Gini index and deviance loss, and articulate their interpretation; (iii) derive the asymptotic distribution of the Gini index, enabling valid inference and hypothesis testing; and (iv) present a standardized monitoring procedure that indicates when refitting is warranted. We illustrate the framework using a modified real-world portfolio with induced concept drift and discuss practical considerations and pitfalls.
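To make the Gini index mentioned in this abstract concrete, here is a minimal sketch of one common way to compute it for a pricing model: policies are ranked by predicted premium, and the index is one minus twice the area under the resulting Lorenz-type curve of actual losses. The function name `gini_index` and the toy data are hypothetical, and the exact definition used in the paper may differ. The asymptotic distribution and the monitoring procedure the authors derive are not reproduced here.

```python
def gini_index(actual, predicted):
    """Gini index of a ranking model (illustrative definition, an assumption).

    Policies are sorted by predicted premium (ascending). The Lorenz-type
    curve plots the cumulative share of actual losses against the cumulative
    share of policies. Gini = 1 - 2 * (area under that curve).
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equally long")
    total = float(sum(actual))
    if total <= 0:
        raise ValueError("total actual losses must be positive")

    # Rank policies from lowest to highest predicted premium
    # (sorted() is stable, so ties keep their input order).
    order = sorted(range(n), key=lambda i: predicted[i])

    area = 0.0      # trapezoidal area under the Lorenz-type curve
    cum_prev = 0.0  # cumulative loss share at the previous point
    running = 0.0   # running sum of actual losses
    for i in order:
        running += actual[i]
        cum = running / total
        area += (cum_prev + cum) / (2.0 * n)  # trapezoid of width 1/n
        cum_prev = cum
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    # Hypothetical toy portfolio: one large loss, ranked highest by the model.
    losses = [0, 0, 0, 10]
    print(gini_index(losses, [1, 2, 3, 4]))  # model ranks the loss last (best case)
    print(gini_index(losses, [4, 3, 2, 1]))  # model ranks the loss first (worst case)
```

Under this definition, a model that ranks risks well yields a positive value, a reversed ranking yields a negative one, and a portfolio with identical losses yields zero. Comparing the value on a reference period with the value on a later period is one plausible way to watch for drift, but deciding whether a drop is significant needs the inferential results described in the abstract.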



Looking Beyond the Known: Towards a Data Discovery Guided Open-World Object Detection

Majee, Anay, Gangrade, Amitesh, Iyer, Rishabh

arXiv.org Artificial Intelligence

Open-World Object Detection (OWOD) enriches traditional object detectors by enabling continual discovery and integration of unknown objects via human guidance. However, existing OWOD approaches frequently suffer from semantic confusion between known and unknown classes, alongside catastrophic forgetting, leading to diminished unknown recall and degraded known-class accuracy. To overcome these challenges, we propose Combinatorial Open-World Detection (CROWD), a unified framework reformulating unknown object discovery and adaptation as an interwoven combinatorial (set-based) data-discovery (CROWD-Discover) and representation learning (CROWD-Learn) task. CROWD-Discover strategically mines unknown instances by maximizing Submodular Conditional Gain (SCG) functions, selecting representative examples distinctly dissimilar from known objects. Subsequently, CROWD-Learn employs novel combinatorial objectives that jointly disentangle known and unknown representations while maintaining discriminative coherence among known classes, thus mitigating confusion and forgetting. Extensive evaluations on OWOD benchmarks illustrate that CROWD achieves improvements of 2.83% and 2.05% in known-class accuracy on M-OWODB and S-OWODB, respectively, and nearly 2.4x unknown recall compared to leading baselines.
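The idea of mining unknown instances by maximizing a submodular conditional gain can be illustrated with a small greedy sketch. It assumes a facility-location function f(A) = sum_i max_{j in A} sim[i][j] as a stand-in and uses the conditional gain f(A | P) = f(A ∪ P) - f(P), where P is the set of known examples. The function names, the similarity matrix, and the choice of facility location are all assumptions made for illustration, not the specific SCG functions or pipeline used by CROWD.

```python
def facility_location(selected, sim):
    """f(A) = sum over all items i of the best similarity to any item in A."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)


def greedy_conditional_gain(sim, known, candidates, budget):
    """Greedily pick up to `budget` candidates maximizing f(A | known).

    Each step adds the candidate with the largest marginal gain
    f(known + chosen + [c]) - f(known + chosen). Items already well
    covered by `known` earn little gain, so the picks tend to be
    representative of the data yet dissimilar from the known set.
    """
    chosen = []
    for _ in range(budget):
        base = facility_location(known + chosen, sim)
        best, best_gain = None, 0.0
        for c in candidates:
            if c in chosen:
                continue
            gain = facility_location(known + chosen + [c], sim) - base
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:  # no candidate adds positive gain
            break
        chosen.append(best)
    return chosen


# Hypothetical 5-item similarity matrix: item 0 is a known object,
# item 1 is nearly identical to it, items 2 and 3 form a separate cluster,
# and item 4 is loosely related to everything.
sim = [
    [1.0, 0.9, 0.1, 0.1, 0.2],
    [0.9, 1.0, 0.1, 0.1, 0.2],
    [0.1, 0.1, 1.0, 0.8, 0.3],
    [0.1, 0.1, 0.8, 1.0, 0.1],
    [0.2, 0.2, 0.3, 0.1, 1.0],
]
picked = greedy_conditional_gain(sim, known=[0], candidates=[1, 2, 3, 4], budget=2)
print(picked)  # [2, 4]
```

In this toy case the near-duplicate of the known item (item 1) is never selected, because it adds almost nothing once item 0 is conditioned on. Item 2 is picked first as the representative of the unseen cluster, followed by item 4.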


DeepScientist: Advancing Frontier-Pushing Scientific Findings Progressively

Weng, Yixuan, Zhu, Minjun, Xie, Qiujie, Sun, Qiyao, Lin, Zhen, Liu, Sifan, Zhang, Yue

arXiv.org Artificial Intelligence

While previous AI Scientist systems can generate novel findings, they often lack the focus to produce scientifically valuable contributions that address pressing human-defined challenges. We introduce DeepScientist, a system designed to overcome this by conducting goal-oriented, fully autonomous scientific discovery over month-long timelines. It formalizes discovery as a Bayesian Optimization problem, operationalized through a hierarchical evaluation process consisting of "hypothesize, verify, and analyze". Leveraging a cumulative Findings Memory, this loop intelligently balances the exploration of novel hypotheses with exploitation, selectively promoting the most promising findings to higher-fidelity levels of validation. Consuming over 20,000 GPU hours, the system generated about 5,000 unique scientific ideas and experimentally validated approximately 1,100 of them, ultimately surpassing human-designed state-of-the-art (SOTA) methods on three frontier AI tasks by 183.7%, 1.9%, and 7.9%. This work provides the first large-scale evidence of an AI achieving discoveries that progressively surpass human SOTA on scientific tasks, producing valuable findings that genuinely push the frontier of scientific discovery.

Figure 1: Comparison of research progress timelines for AI text detection on the RAID benchmark (Dugan et al., 2024). The right panel shows that DeepScientist achieves progress in two weeks that is comparable to three years of human research (Su et al.; Bao et al., a;b; Hu et al., 2023) (left panel). All zero-shot methods, including the system-generated T-Detect, TDT, and PA-Detect, uniformly adopt Falcon-7B (Almazrouei et al., 2023) as the base model. Additionally, all methods produced by DeepScientist demonstrate higher throughput than the previous SOTA method, Binoculars (Hans et al., 2024).
Scientific discovery is inherently a process of continuous exploration and trial-and-error, where vast amounts of time and effort are invested to push the boundaries of human knowledge forward by a small step. This principle of persistent, incremental advancement is visible across the history of technology. For example, the decades-long optimization of semiconductor manufacturing has seen the feature size of transistors systematically reduced from micrometers to single-digit nanometers (Moore, 1965). Similarly, the efficiency of photovoltaic cells has been continuously advanced over half a century, with myriad material and architectural innovations pushing conversion rates from nascent single-digit percentages ever closer to their theoretical limits (Green, 1993). These historical trajectories underscore a process where human scientists engage in decades of goal-directed, iterative work to advance the SOTA artifacts continuously. Recently, the emergence of Large Language Models (LLMs) has propelled automated scientific discovery, where LLM-based AI Scientist systems take the lead in exploration (Xie et al., 2025b).