Sumil Khosla on LinkedIn: FinancialAdvisor.AI
Is it worth training your own large language model (LLM) from scratch on domain-specific data? Researchers at Bloomberg did just that and shared a detailed technical report describing the dataset, model configuration, and training procedure. In my experience, training from scratch makes total sense if we want to apply LLMs to novel data sources (e.g., protein amino acid sequences, as ProtBERT demonstrated). BloombergGPT is a 50-billion-parameter language model for finance, trained on 363 billion tokens of financial data and 345 billion tokens from general-purpose datasets.
Apr-14-2023, 07:12:57 GMT