VCBench: Benchmarking LLMs in Venture Capital
Chen, Rick; Ternasky, Joseph; Kwesi, Afriyie Samuel; Griffin, Ben; Yin, Aaron Ontoyin; Salifu, Zakari; Amoaba, Kelvin; Mu, Xianling; Alican, Fuat; Ihlamur, Yigit
arXiv.org Artificial Intelligence
Benchmarks such as SWE-bench and ARC-AGI demonstrate how shared datasets accelerate progress toward artificial general intelligence (AGI). We introduce VCBench, the first benchmark for predicting founder success in venture capital (VC), a domain where signals are sparse, outcomes are uncertain, and even top investors perform modestly. At inception, the market index achieves a precision of 1.9%. Y Combinator outperforms the index by a factor of 1.7x, while tier-1 firms are 2.9x better. VCBench provides 9,000 anonymized founder profiles, standardized to preserve predictive features while resisting identity leakage, with adversarial tests showing more than 90% reduction in re-identification risk. We evaluate nine state-of-the-art large language models (LLMs). DeepSeek-V3 delivers over six times the baseline precision, GPT-4o achieves the highest F0.5, and most models surpass human benchmarks. Designed as a public and evolving resource available at vcbench.com, VCBench establishes a community-driven standard for reproducible and privacy-preserving evaluation of AGI in early-stage venture forecasting.
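The metrics quoted in the abstract can be made concrete with a small sketch. It assumes the standard F-beta definition (with beta = 0.5, precision is weighted more heavily than recall) and treats the 1.7x and 2.9x figures as multiples of the 1.9% market-index precision. The function names and the recall values in the usage lines are hypothetical, chosen only for illustration; they are not results from VCBench.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favors precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Market-index precision at inception, as stated in the abstract (1.9%).
MARKET_INDEX_PRECISION = 0.019


def implied_precision(multiple: float,
                      baseline: float = MARKET_INDEX_PRECISION) -> float:
    """Precision implied by a stated multiple of the baseline (assumed reading)."""
    return multiple * baseline


yc_precision = implied_precision(1.7)     # Y Combinator: about 3.2%
tier1_precision = implied_precision(2.9)  # tier-1 firms: about 5.5%

# Hypothetical precision/recall pairs, only to show how F0.5 rewards precision.
precise_model = f_beta(precision=0.20, recall=0.10)
recall_heavy_model = f_beta(precision=0.10, recall=0.20)
```

With these illustrative inputs, the higher-precision model scores higher on F0.5 even though both pairs would have the same F1, which matches the intuition that false positives are costly when selecting investments.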
Sep-19-2025
- Country:
  - Asia > India (0.04)
  - Europe > United Kingdom
    - England > Oxfordshire > Oxford (0.04)
  - North America > United States
    - California (0.04)
- Genre:
  - Research Report (1.00)
- Industry:
  - Banking & Finance
    - Capital Markets (1.00)
    - Trading (0.68)
  - Education (1.00)
- Technology: