Deep Learning Models that Write Code - Issue #6
Simply put, a language model is a statistical model that learns the distribution or probabilities of words in a sequence. It turns out that if we can achieve such a model with high fidelity, we can solve a few interesting tasks. For example, if we know that a word is likely to occur given some sequence of words, we can implement some useful functionality like email autocomplete (e.g., given the sequence "Have a great " .. we can predict that the next likely word is "day"). When these statistical models are derived using large neural networks with billions of parameters (hence the term large language models or LLMs), the results and application areas are even more impressive. Results from transformer-based model architectures like BERT, GPT etc., show that these models excel at several complex tasks e.g., they can mimic creative writing, predict sentiment, identify topics within sentences with few examples, meaningfully summarize lengthy documents, translate languages etc.
Mar-14-2022, 18:30:06 GMT
- Technology: