AITopics | scientific discovery

Collaborating Authors

scientific discovery

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Roadkill is a surprising and untapped source for scientists

Millions of animals unfortunately die on roads each year, but the casualties hold important data. Breakthroughs, discoveries, and DIY tips sent six days a week. As much as people try to avoid it (and not contribute to it), the untimely animal deaths are an unfortunate, inevitable byproduct of a society reliant on cars. In Brazil alone, it's estimated that anywhere between two and eight million birds and mammals are killed on roadways every year. In Europe, the potential tally may climb as high as 194 million .

artificial intelligence, physics popular science video space, scientific discovery, (10 more...)

Popular Science

Country:

South America > Brazil (0.25)
Oceania > Australia (0.05)
North America > United States > Oregon (0.05)
(6 more...)

Genre: Research Report > New Finding (0.36)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.36)

Add feedback

Expert-level protocol translation for self-driving labs

Neural Information Processing SystemsDec-25-2025, 22:32:45 GMT

Recent development in Artificial Intelligence (AI) models has propelled their application in scientific discovery, but the validation and exploration of these discoveries require subsequent empirical experimentation. The concept of self-driving laboratories promises to automate and thus boost the experimental process following AI-driven discoveries. However, the transition of experimental protocols, originally crafted for human comprehension, into formats interpretable by machines presents significant challenges, which, within the context of specific expert domain, encompass the necessity for structured as opposed to natural language, the imperative for explicit rather than tacit knowledge, and the preservation of causality and consistency throughout protocol steps. Presently, the task of protocol translation predominantly requires the manual and labor-intensive involvement of domain experts and information technology specialists, rendering the process time-intensive. To address these issues, we propose a framework that automates the protocol translation process through a three-stage workflow, which incrementally constructs Protocol Dependence Graphs (PDGs) that approach structured on the syntax level, completed on the semantics level, and linked on the execution level. Quantitative and qualitative evaluations have demonstrated its performance at par with that of human experts, underscoring its potential to significantly expedite and democratize the process of scientific discovery by elevating the automation capabilities within self-driving laboratories.

artificial intelligence, name change, proceedings, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback

DiscoveryWorld: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents

Neural Information Processing SystemsDec-24-2025, 01:06:13 GMT

Automated scientific discovery promises to accelerate progress across scientific domains, but evaluating an agent's capacity for end-to-end scientific reasoning is challenging as running real-world experiments is often prohibitively expensive or infeasible. In this work we introduce DiscoveryWorld, a virtual environment that enables benchmarking an agent's ability to perform complete cycles of novel scientific discovery in an inexpensive, simulated, multi-modal, long-horizon, and fictional setting.DiscoveryWorld consists of 24 scientific tasks across three levels of difficulty, each with parametric variations that provide new discoveries for agents to make across runs. Tasks require an agent to form hypotheses, design and run experiments, analyze results, and act on conclusions. Task difficulties are normed to range from straightforward to challenging for human scientists with advanced degrees. DiscoveryWorld further provides three automatic metrics for evaluating performance, including: (1) binary task completion, (2) fine-grained report cards detailing procedural scoring of task-relevant actions, and (3) the accuracy of discovered explanatory knowledge.While simulated environments such as DiscoveryWorld are low-fidelity compared to the real world, we find that strong baseline agents struggle on most DiscoveryWorld tasks, highlighting the utility of using simulated environments as proxy tasks for near-term development of scientific discovery competency in agents.

artificial intelligence, proceedings, scientific discovery, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)

Add feedback

Virtual Collaboration

Communications of the ACMDec-17-2025, 19:01:27 GMT

The holy grail for scientists is to focus on their research to enhance and produce scientific discoveries while offloading time-consuming tasks. So-called artificial intelligence (AI) co-scientists are helping to make this possible. These collaborative AI systems are designed to assist human researchers by accelerating scientific discovery, enhancing collaboration, analyzing data, and going beyond human intuition. An AI co-scientist performs various scientific tasks, especially in areas like hypothesis generation, experimental design, verification, and literature review. It uses the results to learn to improve its ability to generate and refine hypotheses.

discovery, scientist, virtual collaboration, (7 more...)

Communications of the ACM

Genre:

Research Report (1.00)
Personal > Honors (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.66)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.40)

Add feedback

AI-Newton: A Concept-Driven Physical Law Discovery System without Prior Physical Knowledge

Fang, You-Le, Jian, Dong-Shan, Li, Xiang, Ma, Yan-Qing

arXiv.org Artificial IntelligenceDec-12-2025

Advances in artificial intelligence (AI) have made AI-driven scientific discovery a highly promising new paradigm [1]. Although AI has achieved remarkable results in tackling domain-specific challenges [2, 3], the ultimate aspiration from a paradigm-shifting perspective still lies in developing reliable AI systems capable of autonomous scientific discovery directly from a large collection of raw data without supervision [4, 5]. Current approaches to automated physics discovery focus on individual experiments, employing either neural network (NN)-based methods [6-25] or symbolic techniques [26-33]. By analyzing data from a single experiment, these methods can construct a specific model capable of predicting future data from the same experiment; if sufficiently simple, such a model may even be expressed in symbolic form [34-36]. Although these methods represent a crucial and successful stage towards automated scientific discovery, they have not yet reached a discovery capacity comparable to that of human physicists.

artificial intelligence, experiment, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2504.01538

Country:

Asia > China > Beijing > Beijing (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.96)

Add feedback

PRiSM: An Agentic Multimodal Benchmark for Scientific Reasoning via Python-Grounded Evaluation

Imani, Shima, Moon, Seungwhan, Ahmadyan, Adel, Zhang, Lu, Ahmed, Kirmani, Damavandi, Babak

arXiv.org Artificial IntelligenceDec-8-2025

Evaluating vision-language models (VLMs) in scientific domains like mathematics and physics poses unique challenges that go far beyond predicting final answers. These domains demand conceptual understanding, symbolic reasoning, and adherence to formal laws, requirements that most existing benchmarks fail to address. In particular, current datasets tend to be static, lacking intermediate reasoning steps, robustness to variations, or mechanisms for verifying scientific correctness. To address these limitations, we introduce PRiSM, a synthetic, fully dynamic, and multimodal benchmark for evaluating scientific reasoning via grounded Python code. PRiSM includes over 24,750 university-level physics and math problems, and it leverages our scalable agent-based pipeline, PrismAgent, to generate well-structured problem instances. Each problem contains dynamic textual and visual input, a generated figure, alongside rich structured outputs: executable Python code for ground truth generation and verification, and detailed step-by-step reasoning. The dynamic nature and Python-powered automated ground truth generation of our benchmark allow for fine-grained experimental auditing of multimodal VLMs, revealing failure modes, uncertainty behaviors, and limitations in scientific reasoning. To this end, we propose five targeted evaluation tasks covering generalization, symbolic program synthesis, perturbation robustness, reasoning correction, and ambiguity resolution. Through comprehensive evaluation of existing VLMs, we highlight their limitations and showcase how PRiSM enables deeper insights into their scientific reasoning capabilities.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2512.0593

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

From Task Executors to Research Partners: Evaluating AI Co-Pilots Through Workflow Integration in Biomedical Research

Weidener, Lukas, Brkić, Marko, Bacci, Chiara, Jovanović, Mihailo, Ulgac, Emre, Dobrin, Alex, Weniger, Johannes, Vlas, Martin, Singh, Ritvik, Meduri, Aakaash

arXiv.org Artificial IntelligenceDec-5-2025

Artificial intelligence systems are increasingly deployed in biomedical research. However, current evaluation frameworks may inadequately assess their effectiveness as research collaborators. This rapid review examines benchmarking practices for AI systems in preclinical biomedical research. Three major databases and two preprint servers were searched from January 1, 2018 to October 31, 2025, identifying 14 benchmarks that assess AI capabilities in literature understanding, experimental design, and hypothesis generation. The results revealed that all current benchmarks assess isolated component capabilities, including data analysis quality, hypothesis validity, and experimental protocol design. However, authentic research collaboration requires integrated workflows spanning multiple sessions, with contextual memory, adaptive dialogue, and constraint propagation. This gap implies that systems excelling on component benchmarks may fail as practical research co-pilots. A process-oriented evaluation framework is proposed that addresses four critical dimensions absent from current benchmarks: dialogue quality, workflow orchestration, session continuity, and researcher experience. These dimensions are essential for evaluating AI systems as research co-pilots rather than as isolated task executors.

data mining, large language model, machine learning, (23 more...)

arXiv.org Artificial Intelligence

2512.04854

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Colorado (0.04)
(4 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength High (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.92)
Government > Regional Government > North America Government > United States Government (0.67)
Health & Medicine > Epidemiology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(6 more...)

Add feedback

The Station: An Open-World Environment for AI-Driven Discovery

Chung, Stephen, Du, Wenyu

arXiv.org Artificial IntelligenceDec-2-2025

We introduce the STATION, an open-world multi-agent environment for autonomous scientific discovery. The Station simulates a complete scientific ecosystem, where agents can engage in long scientific journeys that include reading papers from peers, formulating hypotheses, collaborating with peers, submitting experiments, and publishing results. Importantly, there is no centralized system coordinating their activities. Utilizing their long context, agents are free to choose their own actions and develop their own narratives within the Station. Experiments demonstrate that AI agents in the Station achieve new state-of-the-art performance on a wide range of benchmarks, spanning mathematics, computational biology, and machine learning, notably surpassing AlphaEvolve in circle packing. A rich tapestry of unscripted narratives emerges, such as agents collaborating and analyzing other works rather than pursuing myopic optimization. From these emergent narratives, novel methods arise organically, such as a new density-adaptive algorithm for scRNA-seq batch integration that borrows concepts from another domain. The Station marks a first step towards autonomous scientific discovery driven by emergent behavior in an open-world environment, representing a new paradigm that moves beyond rigid pipelines.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.06309

Country:

South America > Peru > Amazonas Department (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

Add feedback

Hierarchical AI-Meteorologist: LLM-Agent System for Multi-Scale and Explainable Weather Forecast Reporting

Sukhorukov, Daniil, Zakharov, Andrei, Glazkov, Nikita, Yanchanka, Katsiaryna, Kirilin, Vladimir, Dubovitsky, Maxim, Sultimov, Roman, Maksimov, Yuri, Makarov, Ilya

arXiv.org Artificial IntelligenceDec-1-2025

We present the Hierarchical AI-Meteorologist, an LLM-agent system that generates explainable weather reports using a hierarchical forecast reasoning and weather keyword generation. Unlike standard approaches that treat forecasts as flat time series, our framework performs multi-scale reasoning across hourly, 6-hour, and daily aggregations to capture both short-term dynamics and long-term trends. Its core reasoning agent converts structured meteorological inputs into coherent narratives while simultaneously extracting a few keywords effectively summarizing the dominant meteorological events. These keywords serve as semantic anchors for validating consistency, temporal coherence and factual alignment of the generated reports. Using OpenWeather and Meteostat data, we demonstrate that hierarchical context and keyword-based validation substantially improve interpretability and robustness of LLM-generated weather narratives, offering a reproducible framework for semantic evaluation of automated meteorological reporting and advancing agent-based scientific reasoning.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.23387

Country:

North America > United States (0.48)
Asia > Russia (0.15)
Asia > Philippines > Luzon > National Capital Region > City of Manila (0.15)
(5 more...)

Genre: Research Report (0.40)

Industry: Government > Regional Government > North America Government > United States Government (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.66)

Add feedback

MA-COIR: Leveraging Semantic Search Index and Generative Models for Ontology-Driven Biomedical Concept Recognition

Liu, Shanshan, Nishida, Noriki, Munne, Rumana Ferdous, Tokunaga, Narumi, Yamagata, Yuki, Kozaki, Kouji, Matsumoto, Yuji

arXiv.org Artificial IntelligenceNov-26-2025

Recognizing biomedical concepts in the text is vital for ontology refinement, knowledge graph construction, and concept relationship discovery. However, traditional concept recognition methods, relying on explicit mention identification, often fail to capture complex concepts not explicitly stated in the text. To overcome this limitation, we introduce MA-COIR, a framework that reformulates concept recognition as an indexing-recognition task. By assigning semantic search indexes (ssIDs) to concepts, MA-COIR resolves ambiguities in ontology entries and enhances recognition efficiency. Using a pretrained BART-based model fine-tuned on small datasets, our approach reduces computational requirements to facilitate adoption by domain experts. Furthermore, we incorporate large language models (LLMs)-generated queries and synthetic data to improve recognition in low-resource settings. Experimental results on three scenarios (CDR, HPO, and HOIP) highlight the effectiveness of MA-COIR in recognizing both explicit and implicit concepts without the need for mention-level annotations during inference, advancing ontology-driven concept recognition in biomedical domain applications. Our code and constructed data are available at https://github.com/sl-633/macoir-master.

large language model, ma-coir, machine learning, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2025.acl-srw.39

2505.12964

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry:

Media > News (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Add feedback