scientific method
Two ways to knowledge?
Tucny, Jean-Michel, Ganguly, Abhisek, Ansumali, Santosh, Succi, Sauro
It is shown that the weight matrices of transformer-based machine learning models applied to two representative physical problems show a random-like character which bears no directly recognizable link to the physical and mathematical structure of the problem under study. This suggests that machine learning and the scientific method may represent two distinct and potentially complementary paths to knowledge, even though a strict notion of explainability in terms of direct correspondence between network parameters and physical structures may remain out of reach. It is also observed that drawing a parallel between transformer operation and (generalized) path-integration techniques may account for the random-like nature of the weights, but still does not resolve the tension with explainability. We conclude with some general comments on the hazards of gleaning knowledge without the benefit of Insight.
From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery
Zheng, Tianshi, Deng, Zheye, Tsang, Hong Ting, Wang, Weiqi, Bai, Jiaxin, Wang, Zihao, Song, Yangqiu
Large Language Models (LLMs) are catalyzing a paradigm shift in scientific discovery, evolving from task-specific automation tools into increasingly autonomous agents and fundamentally redefining research processes and human-AI collaboration. This survey systematically charts this burgeoning field, placing a central focus on the changing roles and escalating capabilities of LLMs in science. Through the lens of the scientific method, we introduce a foundational three-level taxonomy (Tool, Analyst, and Scientist) to delineate their escalating autonomy and evolving responsibilities within the research lifecycle. We further identify pivotal challenges and future research trajectories such as robotic automation, self-improvement, and ethical governance. Overall, this survey provides a conceptual architecture and strategic foresight to navigate and shape the future of AI-driven scientific discovery, fostering both rapid innovation and responsible advancement. GitHub Repository: https://github.com/HKUST-KnowComp/Awesome-LLM-Scientific-Discovery.
Is the end of Insight in Sight?
Tucny, Jean-Michel, Durve, Mihir, Succi, Sauro
The rise of deep learning challenges the longstanding scientific ideal of insight: the human capacity to understand phenomena by uncovering underlying mechanisms. In many modern applications, accurate predictions no longer require interpretable models, prompting debate about whether explainability is a realistic or even meaningful goal. From our perspective in physics, we examine this tension through a concrete case study: a physics-informed neural network (PINN) trained on a rarefied gas dynamics problem governed by the Boltzmann equation. Despite the system's clear structure and well-understood governing laws, the trained network's weights resemble Gaussian-distributed random matrices, with no evident trace of the physical principles involved. This suggests that deep learning and traditional simulation may follow distinct cognitive paths to the same outcome: one grounded in mechanistic insight, the other in statistical interpolation. Our findings raise critical questions about the limits of explainable AI and whether interpretability can, or should, remain a universal standard in artificial reasoning.
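As a rough illustration of what "weights resembling Gaussian-distributed random matrices" could mean operationally, the sketch below compares the sample skewness and excess kurtosis of a flattened weight matrix against the Gaussian reference values (both zero). It is not the authors' analysis: the matrices are synthetic stand-ins, and the names `moments` and `looks_gaussian` and the tolerance `tol` are assumptions chosen for illustration.

```python
import random
import statistics


def moments(values):
    """Return (skewness, excess kurtosis) of a sequence of floats."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values, mu=mean)
    n = len(values)
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n - 3.0
    return skew, kurt


def looks_gaussian(matrix, tol=0.3):
    """Crude check: both moments within `tol` of the Gaussian values (0, 0).

    `tol` is an arbitrary illustrative threshold, not a statistical test.
    """
    flat = [w for row in matrix for w in row]
    skew, kurt = moments(flat)
    return abs(skew) < tol and abs(kurt) < tol


# Synthetic stand-ins for trained weights (hypothetical data, fixed seed).
rng = random.Random(0)
gaussian_weights = [[rng.gauss(0.0, 1.0) for _ in range(100)] for _ in range(100)]
uniform_weights = [[rng.uniform(-1.0, 1.0) for _ in range(100)] for _ in range(100)]

print(looks_gaussian(gaussian_weights))
print(looks_gaussian(uniform_weights))
```

Matching two moments is only a necessary condition for Gaussianity; a more careful study would also look at the full histogram or the singular-value spectrum of the matrix.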
Chatbots and Zero Sales Resistance
Not a day goes by without our hearing of the latest AI breakthroughs, such as chatbots that write texts or generate images increasingly hard to tell apart from their human-made counterparts. These headlines come with a heavy load of hype, but even with hype factored out, a highly seductive promise stands tall: the promise to capture levels of complexity largely out of grasp for our best theories, models and simulations. Briefly, AI would supplant the time-honored Scientific Method as we have known it since Galileo's time [1, 2]. While heavily pumped up, this promise is not empty, addressing as it does, among others, one of the most vexing Achilles' heels of the scientific method, the infamous Curse of Dimensionality (CoD) [3]. Indeed, CoD compounds with a profound hallmark of Complexity, namely the fact that complex systems are sneaky: they inhabit ultra-dimensional spaces but don't fill them up [4, 5, 6]. On the contrary, "interesting things" take place in ultrathin and often highly scattered portions of the huge state space available to them. Nature likes to play hide and seek, and big time so. An illuminating example can be found in the book of Frenkel and Smit [7], where we learn that the chance of making a sensible Monte Carlo move in the state space of a hundred hard spheres (please note, a hundred, not Avogadro's) is about 10
The Use of AI-Robotic Systems for Scientific Discovery
Gower, Alexander H., Korovin, Konstantin, Brunnsåker, Daniel, Kronström, Filip, Reder, Gabriel K., Tiukova, Ievgeniia A., Reiserer, Ronald S., Wikswo, John P., King, Ross D.
The process of developing theories and models and testing them with experiments is fundamental to the scientific method. Automating the entire scientific method then requires not only automation of the induction of theories from data, but also experimentation from design to implementation. This is the idea behind a robot scientist -- a coupled system of AI and laboratory robotics that has agency to test hypotheses with real-world experiments. In this chapter we explore some of the fundamentals of robot scientists in the philosophy of science. We also map the activities of a robot scientist to machine learning paradigms, and argue that the scientific method shares an analogy with active learning. We demonstrate these concepts using examples from previous robot scientists, and also from Genesis: a next generation robot scientist designed for research in systems biology, comprising a micro-fluidic system with 1000 computer-controlled micro-bioreactors and interpretable models based in controlled vocabularies and logic.
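The analogy between the scientific method and active learning can be sketched with a toy loop: the learner keeps the set of hypotheses consistent with past experiments, picks the experiment expected to be most informative, runs it, and updates. This is only a minimal illustration, not the design of Genesis or of any earlier robot scientist; the hidden threshold, the `run_experiment` oracle, and the interval-halving query rule are all assumptions made for the example.

```python
def run_experiment(dose, true_threshold=0.62):
    """Hypothetical wet-lab oracle: does the response occur at this dose?"""
    return dose >= true_threshold


def active_learning(oracle, low=0.0, high=1.0, budget=10):
    """Narrow down an unknown threshold using at most `budget` experiments.

    Hypothesis space: every threshold in (low, high]. Each query is placed at
    the midpoint, where the remaining hypotheses disagree most, so every
    outcome rules out half of them.
    """
    history = []
    for _ in range(budget):
        query = (low + high) / 2.0   # design the experiment
        outcome = oracle(query)      # run it
        history.append((query, outcome))
        if outcome:                  # update the hypothesis set
            high = query
        else:
            low = query
    return low, high, history


low, high, history = active_learning(run_experiment)
print(f"threshold lies in ({low:.4f}, {high:.4f}] after {len(history)} experiments")
```

Ten chosen experiments shrink the interval to a width of 1/1024, whereas a fixed, evenly spaced grid would need roughly a thousand experiments for the same resolution. That gap is the motivation for letting the system choose its own experiments.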
Explain the Black Box for the Sake of Science: Revisiting the Scientific Method in the Era of Generative Artificial Intelligence
The scientific method is the cornerstone of human progress across all branches of the natural and applied sciences, from understanding the human body to explaining how the universe works. The scientific method is based on identifying systematic rules or principles that describe the phenomenon of interest in a reproducible way that can be validated through experimental evidence. In the era of artificial intelligence (AI), there are discussions on how AI systems may discover new knowledge. We argue that, before the advent of artificial general intelligence, human complex reasoning for scientific discovery remains of vital importance. Yet, AI can be leveraged for scientific discovery via explainable AI. More specifically, knowing what data AI systems used to make decisions can be a point of contact with domain experts and scientists, that can lead to divergent or convergent views on a given scientific problem. Divergent views may spark further scientific investigations leading to new scientific knowledge. Convergent views may instead reassure that the AI system is operating within bounds deemed reasonable to humans. The latter point addresses the trustworthiness requirement that is indispensable for critical applications in the applied sciences, such as medicine.
ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
Baek, Jinheon, Jauhar, Sujay Kumar, Cucerzan, Silviu, Hwang, Sung Ju
Scientific Research, vital for improving human life, is hindered by its inherent complexity, slow pace, and the need for specialized experts. To enhance its productivity, we propose a ResearchAgent, a large language model-powered research idea writing agent, which automatically generates problems, methods, and experiment designs while iteratively refining them based on scientific literature. Specifically, starting with a core paper as the primary focus to generate ideas, our ResearchAgent is augmented not only with relevant publications through connecting information over an academic graph but also entities retrieved from an entity-centric knowledge store based on their underlying concepts, mined and shared across numerous papers. In addition, mirroring the human approach to iteratively improving ideas with peer discussions, we leverage multiple ReviewingAgents that provide reviews and feedback iteratively. Further, they are instantiated with human preference-aligned large language models whose criteria for evaluation are derived from actual human judgments. We experimentally validate our ResearchAgent on scientific publications across multiple disciplines, showcasing its effectiveness in generating novel, clear, and valid research ideas based on human and model-based evaluation results.
We're Not Using AI to Its Fullest Human Potential
We should be living in a golden age of science. For centuries, the scientific method was defined by two pillars--theory and experiment. Now, we live in the age of Artificial Intelligence, which adds a vital third pillar. Without advanced computation, according to leading scientific bodies, discoveries of the past decade, such as the detection of the Higgs boson, the discovery of new drugs like halicin, which can kill strains of bacteria resistant to all known antibiotics, or the observation of gravitational waves, "would have been impossible". But despite these advances, scientific innovation today is too often defined by new use cases for existing technologies or refining previous advancements, rather than the creation of entirely new fields of discovery.
Physical Computing for Materials Acceleration Platforms
Peterson, Erik, Lavin, Alexander
A "technology lottery" describes a research idea or technology succeeding over others because it is suited to the available software and hardware, not necessarily because it is superior to alternative directions; examples abound, from the synergies of deep learning and GPUs to the disconnect of urban design and autonomous vehicles. The nascent field of Self-Driving Laboratories (SDL), particularly those implemented as Materials Acceleration Platforms (MAPs), is at risk of an analogous pitfall: the next logical step for building MAPs is to take existing lab equipment and workflows and mix in some AI and automation. In this whitepaper, we argue that the same simulation and AI tools that will accelerate the search for new materials, as part of the MAPs research program, also make possible the design of fundamentally new computing mediums. We need not be constrained by existing biases in science, mechatronics, and general-purpose computing, but rather we can pursue new vectors of engineering physics with advances in cyber-physical learning and closed-loop, self-optimizing systems. Here we outline a simulation-based MAP program to design computers that use physics itself to solve optimization problems. Such systems mitigate the hardware-software-substrate-user information losses present in every other class of MAPs, and they perfect the alignment between computing problems and computing mediums, eliminating any technology lottery. We offer concrete steps toward early "Physical Computing (PC)-MAP" advances and the longer-term cyber-physical R&D which we expect to introduce a new era of innovative collaboration between materials researchers and computer scientists.