VCBench: Benchmarking LLMs in Venture Capital

Chen, Rick; Ternasky, Joseph; Kwesi, Afriyie Samuel; Griffin, Ben; Yin, Aaron Ontoyin; Salifu, Zakari; Amoaba, Kelvin; Mu, Xianling; Alican, Fuat; Ihlamur, Yigit

Sep-19-2025–arXiv.org Artificial Intelligence

Benchmarks such as SWE-bench and ARC-AGI demonstrate how shared datasets accelerate progress toward artificial general intelligence (AGI). We introduce VCBench, the first benchmark for predicting founder success in venture capital (VC), a domain where signals are sparse, outcomes are uncertain, and even top investors perform modestly. At inception, the market index achieves a precision of 1.9%. Y Combinator outperforms the index by a factor of 1.7, and tier-1 firms by a factor of 2.9. VCBench provides 9,000 anonymized founder profiles, standardized to preserve predictive features while resisting identity leakage, with adversarial tests showing more than 90% reduction in re-identification risk. We evaluate nine state-of-the-art large language models (LLMs). DeepSeek-V3 delivers over six times the baseline precision, GPT-4o achieves the highest F0.5 score, and most models surpass human benchmarks. Designed as a public and evolving resource available at vcbench.com, VCBench establishes a community-driven standard for reproducible and privacy-preserving evaluation of AGI in early-stage venture forecasting.
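The abstract reports results as precision, as multiples of the index precision, and as F0.5. The following is a minimal sketch of how these quantities relate, using the standard F-beta definition, which weights precision more heavily than recall when beta is 0.5. It is not the benchmark's own evaluation code. The function names and the confusion counts in the usage lines are hypothetical, and the implied precisions are plain arithmetic on the figures quoted in the abstract.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted successes that were actual successes."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual successes that were predicted."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_beta(p: float, r: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favors precision over recall."""
    b2 = beta * beta
    denom = b2 * p + r
    return (1 + b2) * p * r / denom if denom else 0.0


# Index precision quoted in the abstract (1.9%).
INDEX_PRECISION = 0.019

# Precisions implied by the multiples quoted in the abstract.
implied_precision = {
    "Y Combinator": 1.7 * INDEX_PRECISION,   # about 3.2%
    "tier-1 firms": 2.9 * INDEX_PRECISION,   # about 5.5%
}

# Hypothetical confusion counts, for illustration only.
tp, fp, fn = 10, 10, 30
p = precision(tp, fp)   # 0.5
r = recall(tp, fn)      # 0.25
score = f_beta(p, r)    # F0.5 for these made-up counts

if __name__ == "__main__":
    for name, value in implied_precision.items():
        print(f"{name}: {value:.2%}")
    print(f"precision={p:.2f} recall={r:.2f} F0.5={score:.4f}")
```

With beta set to 0.5, a model that trades recall for precision is rewarded, which suits a setting where false positives (backing founders who do not succeed) are the costlier error.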

Genre:
- Research Report (1.00)

Industry:
- Education (1.00)
- Banking & Finance
  - Capital Markets (1.00)
  - Trading (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.92)
