SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Patrick Emami, Zhaonan Li, Saumya Sinha, Truc Nguyen
Data-driven simulation surrogates help computational scientists study complex systems and can help inform impactful policy decisions. We introduce a learning framework for surrogate modeling in which language is used to interface with the underlying system being simulated. We call a language description of a system a "system caption", or SysCap. To address the lack of datasets that pair natural language SysCaps with simulation runs, we use large language models (LLMs) to synthesize high-quality captions. Using our framework, we train multimodal text and timeseries regression models for two real-world simulators of complex energy systems. Our experiments demonstrate that it is feasible to design language interfaces for real-world surrogate models with accuracy comparable to standard baselines. We show qualitatively and quantitatively that SysCaps unlock text-prompt-style surrogate modeling and new generalization abilities beyond what was previously possible. We will release the generated SysCaps datasets and our code to support follow-on studies.