Every day, new organizations announce how AI is revolutionizing their industry with disruptive results. As more and more business decisions are based on AI and advanced data analytics, it is critical to provide transparency into the inner workings of that technology. According to a recent McKinsey Global Institute analysis, the financial services sector is a leading adopter of AI and has the most ambitious AI investment plans. In a related Harvard Business Review article, adoption will center on AI technologies such as neural-network-based machine learning and natural language processing, because those are the technologies that are beginning to mature and prove their value. Below, we explore a challenge and opportunity unique to the rapid adoption of machine learning.
Nowadays, NLP has become almost synonymous with deep learning. But deep learning is not the "magic bullet" for every NLP task. In sentence classification, for example, a simple linear classifier can work reasonably well, especially if you have a small training dataset. Other NLP tasks, however, flourish with deep learning.
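To make the point concrete, here is a minimal sketch of a linear sentence classifier: a bag-of-words perceptron written in pure Python. The sentences and labels are invented toy data, and the perceptron is just one simple choice of linear model.

```python
from collections import Counter

# Hypothetical toy data for sentence classification (1 = positive, 0 = negative).
train = [
    ("a truly great and enjoyable film", 1),
    ("great acting and a great story", 1),
    ("a dull and boring waste of time", 0),
    ("boring plot and dull characters", 0),
]

def featurize(sentence):
    """Bag-of-words feature counts for one sentence."""
    return Counter(sentence.split())

# Train a simple perceptron: one weight per word, mistake-driven updates.
weights = Counter()
for _ in range(10):                       # a few passes over the data
    for sentence, label in train:
        feats = featurize(sentence)
        score = sum(weights[w] * c for w, c in feats.items())
        pred = 1 if score > 0 else 0
        if pred != label:                 # update only on mistakes
            for w, c in feats.items():
                weights[w] += (1 if label == 1 else -1) * c

def predict(sentence):
    score = sum(weights[w] * c for w, c in featurize(sentence).items())
    return 1 if score > 0 else 0

print(predict("a great film"))   # → 1
print(predict("a boring film"))  # → 0
```

On a small dataset like this, the linear model separates the classes after a couple of passes; no deep network is needed.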
A language model is a probability distribution over sequences of words: it predicts the next word based on all the previous words. This is useful for a variety of AI applications, such as auto-completion in your email client or a chatbot service. GPT-3 is a very large language model, with 175 billion parameters, that uses deep learning to produce human-like text. Many researchers and news articles have described GPT-3 as "one of the most interesting and important AI systems ever produced".
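The "probability distribution over the next word" idea can be shown at its smallest scale with a bigram model, which conditions only on the single previous word rather than on all previous words. The corpus below is a hypothetical toy; GPT-3 operates on the same principle at vastly larger scale.

```python
from collections import defaultdict, Counter

# Tiny toy corpus; a real language model trains on far more text.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count bigrams: how often each word follows each previous word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word_probs(prev):
    """P(next word | previous word) as a dict of probabilities."""
    counts = follows[prev]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probs("the"))
# → {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
print(next_word_probs("sat"))
# → {'on': 1.0}
```

Each distribution sums to one, and generating text is just repeatedly sampling from these conditional distributions.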
A seemingly simple task, reducing all the words in an article to a compact sequence of words that captures the article's central point, is among the benchmark tasks in deep learning where Amazon's Alexa AI scientists say they can best the efforts of vastly larger computer programs from DeepMind, Google, Meta, OpenAI, and others. The work has implications for energy use and carbon footprint. Two threads of research dominate machine learning these days: making programs more general in their approach, so they can handle potentially any task, and making them bigger. The biggest neural nets, as measured by their parameters, or "weights," now clock in at over half a trillion weights; models such as Google's Pathways Language Model (PaLM) and Nvidia and Microsoft's Megatron-Turing NLG 530B are among the biggest, with 540 billion and 530 billion parameters, respectively. The cognoscenti of AI insist the path for parameter count is definitely up and to the right, toward a trillion parameters and well beyond in the not-too-distant future. The figure of 100 trillion is a kind of magical target because it is believed to be the number of synapses in a human brain, so it serves as a benchmark of sorts.
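Alexa AI's actual approach is not reproduced here, but the summarization task itself can be illustrated with the classic frequency-based extractive baseline: score each sentence by how frequent its words are across the whole article and keep the top-scoring ones. A minimal sketch (the scoring scheme is a simple assumption for illustration, not the method from the research described above):

```python
import re
from collections import Counter

def summarize(text, n_sentences=1):
    """Frequency-based extractive summary: keep the sentences whose
    words are most frequent across the whole article."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        toks = re.findall(r"\w+", sentence.lower())
        return sum(freq[t] for t in toks) / max(len(toks), 1)

    # Rank by score, then re-emit the chosen sentences in original order.
    top = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    return " ".join(s for s in sentences if s in top)

article = "Deep learning is popular. Deep learning models keep growing. Cats are cute."
print(summarize(article))  # → Deep learning is popular.
```

A baseline like this runs on a laptop in microseconds, which is exactly why efficiency-focused results on this benchmark matter against half-trillion-parameter competitors.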
Massive scale, both in terms of data availability and computation, enables significant breakthroughs in key application areas of deep learning such as natural language processing (NLP) and computer vision. There is emerging evidence that scale may be a key ingredient in scientific deep learning, but the importance of physical priors in scientific domains makes the strategies and benefits of scaling uncertain. Here, we investigate neural scaling behavior in large chemical models by varying model and dataset sizes over many orders of magnitude, studying models with over one billion parameters, pre-trained on datasets of up to ten million datapoints. We consider large language models for generative chemistry and graph neural networks for machine-learned interatomic potentials. To enable large-scale scientific deep learning studies under resource constraints, we develop the Training Performance Estimation (TPE) framework to reduce the costs of scalable hyperparameter optimization by up to 90%.
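Neural scaling studies typically assume a power-law relationship of the form L(N) = a * N**(-b) between loss and model (or dataset) size, which becomes a straight line in log-log space and can be fit by ordinary least squares. A minimal sketch with synthetic numbers (the data points below are invented; real exponents depend on the domain and model family, and this is not the TPE framework itself):

```python
import math

# Hypothetical (parameter count, loss) pairs following an exact power law.
data = [(1e6, 4.0), (1e7, 2.0), (1e8, 1.0), (1e9, 0.5)]

# Fit log L = log a - b * log N with ordinary least squares.
xs = [math.log(n) for n, _ in data]
ys = [math.log(l) for _, l in data]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
b = -sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = math.exp(my + b * mx)

# Extrapolate the fitted law to a model size not in the data.
pred = a * 1e10 ** (-b)
print(round(pred, 4))  # → 0.25
```

The value of such fits is exactly this kind of extrapolation: estimating what a larger model would achieve before paying to train it.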
This article discusses three techniques that practitioners can use to start working effectively with natural language processing (NLP). It should also give a good overview to readers who want a sense of what NLP is about; if you are an expert, please feel free to connect, comment, or make suggestions. At erreVol, we leverage similar tools to extract useful insights from transcripts of earnings reports of public corporations; interested readers can go test the platform. Note that we will present lines of code for readers interested in replicating or using what is presented below; otherwise, feel free to skip those technical lines, as the reading should remain seamless.
Since the advent of Transformers in 2017, Large Language Models (LLMs) have completely changed how ML models are trained for language tasks. Earlier, for a given task and dataset, we would experiment with various models such as RNNs, LSTMs, and decision trees, training each on a subset of the data and testing on the rest; whichever model gave the best accuracy was chosen as the winner. Of course, many model hyperparameters also had to be tuned and experimented with, and for many problems, feature engineering was necessary as well.
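That old "train several candidates, evaluate on held-out data, pick the winner" loop can be sketched in a few lines. Everything here is a toy stand-in: the dataset is synthetic, and a majority-class baseline and a 1-nearest-neighbour rule play the role of the RNNs and decision trees of a real experiment.

```python
# Hypothetical toy dataset: x in [0, 1], label = 1 when x > 0.5.
data = [(i / 100, int(i > 50)) for i in range(100)]
test = data[::5]                          # hold out every 5th example
train = [d for d in data if d not in test]

def majority(train):
    """Baseline: always predict the most common training label."""
    label = round(sum(y for _, y in train) / len(train))
    return lambda x: label

def one_nn(train):
    """1-nearest-neighbour: copy the label of the closest training point."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(model, examples):
    return sum(model(x) == y for x, y in examples) / len(examples)

# Train each candidate, score it on the held-out set, keep the winner.
candidates = {"majority": majority(train), "1-NN": one_nn(train)}
best = max(candidates, key=lambda name: accuracy(candidates[name], test))
print(best)  # → 1-NN
```

With LLMs, much of this per-task model search collapses into choosing a pre-trained model and adapting it, which is exactly the shift the paragraph above describes.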
Despite the significant advances made in computer science in recent decades, which are improving medical services and research daily, patient care has always been about human-to-human interaction and empathy. Through artificial intelligence, however, medical professionals can obtain more accurate patient information and make better decisions. The application of sophisticated mathematical algorithms goes far beyond the collection of information: artificial intelligence can now learn, distinguish patterns, and find substantive inconsistencies usually invisible to the human eye. IBM Watson, the supercomputer, is one of the best-known examples of the practical application of AI in healthcare.