Scientific Innovation
LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context
Ruan, Kai; Wang, Xuan; Hong, Jixiang; Wang, Peng; Liu, Yang; Sun, Hao
While Large Language Models (LLMs) have demonstrated remarkable capabilities in scientific tasks, existing evaluation frameworks primarily assess their performance using rich contextual inputs, overlooking their ability to generate novel ideas from minimal information. We introduce LiveIdeaBench, a comprehensive benchmark that evaluates LLMs' scientific creativity and divergent thinking capabilities using single-keyword prompts. Drawing from Guilford's creativity theory, our framework employs a dynamic panel of state-of-the-art LLMs to assess generated ideas across four key dimensions: originality, feasibility, fluency, and flexibility. Through extensive experimentation with 20 leading models across 1,180 keywords spanning 18 scientific domains, we reveal that scientific creative ability shows distinct patterns from general intelligence metrics. Notably, our results demonstrate that models like QwQ-32B-preview achieve comparable creative performance to top-tier models like o1-preview, despite significant gaps in their general intelligence scores. These findings highlight the importance of specialized evaluation frameworks for scientific creativity and suggest that the development of creative capabilities in LLMs may follow different trajectories than traditional problem-solving abilities.
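The panel-based scoring described above can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's actual implementation: the function names, the 0-10 rating scale, and the example judge ratings are all hypothetical; only the four dimensions (originality, feasibility, fluency, flexibility) and the single-keyword setup come from the abstract.

```python
import statistics

# Hypothetical sketch of panel-based creativity scoring (names, scale, and
# data are assumptions): each model generates an idea from a single keyword,
# and a panel of judge models rates it on four dimensions drawn from
# Guilford's creativity theory.

DIMENSIONS = ("originality", "feasibility", "fluency", "flexibility")

def aggregate_panel_scores(panel_ratings: list[dict]) -> dict:
    """Average each dimension's rating across the judge panel."""
    return {
        dim: statistics.mean(rating[dim] for rating in panel_ratings)
        for dim in DIMENSIONS
    }

# Example: two hypothetical judges rating one idea generated for the
# keyword "catalysis".
ratings = [
    {"originality": 8, "feasibility": 6, "fluency": 7, "flexibility": 7},
    {"originality": 6, "feasibility": 8, "fluency": 7, "flexibility": 5},
]
scores = aggregate_panel_scores(ratings)
```

Averaging over a dynamic panel of judges, rather than relying on a single judge model, reduces the influence of any one judge's bias on a given dimension.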
Toward a Cohesive AI and Simulation Software Ecosystem for Scientific Innovation
Heroux, Michael A.; Shende, Sameer; McInnes, Lois Curfman; Gamblin, Todd; Willenbring, James M.
Sameer Shende, ParaTools, Inc.; Lois Curfman McInnes, Argonne National Laboratory; Todd Gamblin, Lawrence Livermore National Laboratory; James M. Willenbring, Sandia National Laboratories.
In this document, we outline key considerations for the next-generation software stack that will support scientific applications integrating AI and modeling & simulation (ModSim) to provide a unified AI/ModSim software stack. The scientific computing community needs a cohesive AI/ModSim software stack, and this stack must support binary distributions to enable emerging scientific workflows.
A Cohesive Software Stack for AI and Modeling & Simulation
To address future scientific challenges, the next-generation scientific software stack must provide a cohesive portfolio of libraries and tools that facilitate AI and ModSim approaches. As scientific research becomes increasingly interdisciplinary, scientists require both of these toolsets to address complex, data-rich problems in domains such as climate modeling, materials discovery, and energy optimization.
Young STEM Student Uses Artificial Intelligence To Help Emerging Nations
In emerging markets, many health and economic problems stem from limited access to clean water and sanitation infrastructure, creating issues related to hygiene, pollution, agriculture, disease, food, and education. ISEF competitor Arya Tschand, 17, from Marlboro, New Jersey, has seen these problems firsthand. "I developed the idea for my project on a trip to India visiting family," he says. "I saw the land was affected by water shortages and wastage, which affects people worldwide."