Data-driven Discovery with Large Generative Models