Revealing economic facts: LLMs know more than they say

Buckmann, Marcus, Nguyen, Quynh Anh, Hill, Edward

arXiv.org Artificial Intelligence 

During training, generative large language models (LLMs) are exposed to vast amounts of information, including data relevant to economic modelling, such as geospatial statistics and firm-level financial metrics. If LLMs can effectively retrieve and utilise this knowledge, they could reduce dependence on external data sources that are time-consuming to access, clean, and merge, or that incur financial costs. Moreover, if LLMs accurately represent data, they could support downstream tasks like data imputation and outlier detection. In this study, we evaluate whether and how LLMs can be used for typical economic data processes. Not all knowledge within an LLM may be explicit and retrievable in natural language by prompting the model.