"The field of Machine Learning seeks to answer these questions: How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?"
– from The Discipline of Machine Learning by Tom Mitchell. CMU-ML-06-108, 2006.
In the last post on Hidden Markov models (HMMs), we never solved the problem of finding the most probable sequence of coins used. If you haven't read the post on HMMs, I'd highly encourage you to do so; for those who haven't, here's an outline of the problem. Say a guru comes up to you and tells you to pick a coin from a bag (there are only two coins in the bag) and flip it. You'll observe either a head or a tail.
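The decoding problem left open here is exactly what the Viterbi algorithm solves. As a minimal sketch (the transition and emission probabilities below are made up for illustration, not taken from the original post), here is Viterbi decoding for a two-coin HMM in plain Python:

```python
# Viterbi decoding for a two-coin HMM. All probabilities are illustrative
# placeholders, not values from the original post.

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations."""
    # V[t][s] = probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    path = {s: [s] for s in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for s in states:
            # Pick the best previous state leading into s
            prob, prev = max((V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                             for p in states)
            V[t][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path
    best_prob, best_state = max((V[-1][s], s) for s in states)
    return path[best_state], best_prob

# Hypothetical parameters: coin1 is fair, coin2 is biased toward heads.
states = ("coin1", "coin2")
start_p = {"coin1": 0.5, "coin2": 0.5}
trans_p = {"coin1": {"coin1": 0.7, "coin2": 0.3},
           "coin2": {"coin1": 0.4, "coin2": 0.6}}
emit_p = {"coin1": {"H": 0.5, "T": 0.5},
          "coin2": {"H": 0.9, "T": 0.1}}

seq, prob = viterbi(["H", "H", "T"], states, start_p, trans_p, emit_p)
print(seq, prob)
```

With these made-up parameters, the two heads are best explained by the biased coin and the final tail by the fair one; swapping in the real probabilities from the post would change the answer but not the algorithm.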
More than 200,000 candidate materials were virtually screened by the system at Osaka University in Japan. The team of researchers then synthesized one of the most promising candidates and found its properties consistent with the system's predictions. Machine learning allows computers to make predictions about complex situations, as long as the algorithms are supplied with sufficient example data. This is especially useful for complicated problems in materials science, such as designing molecules for organic solar cells, the researchers said, since performance can depend on a vast array of factors and unknown molecular structures. It could take humans years to sift through the data to find underlying patterns, and even longer to test all the possible candidate combinations of 'donor' polymers and 'acceptor' molecules that make up organic solar cells.
Artificial intelligence – AI – was a mere computational theory back in the 1950s, when Alan Turing proposed the Turing test to measure a machine's intelligence. Today, AI inhabits consumer electronics in the form of Siri, Cortana, Alexa and Google Assistant – it lives behind our internet browsers, within the relative confines of wireless networks and circuit boards. We interact with AI all the time – Google's auto-suggest function, customer service bots and YouTube's search algorithm are all examples of AI. In just over half a century, AI's role in society has become firmly established. Developments in software programming and IT have facilitated important innovations in AI.
Open-source workflow managers are popular because they make it easy to orchestrate machine learning (ML) jobs for production. Taking models into production following a GitOps pattern (a practice commonly called MLOps) is best managed by a container-friendly workflow manager. Kubeflow Pipelines (KFP) is one of the Kubernetes-based workflow managers in use today. However, it doesn't provide all the functionality you need for a best-in-class data science and ML engineering experience. A common issue when developing ML models is gaining access to tensor-level metadata about how the job is performing.
Professional traders expect artificial intelligence and machine learning to be the most influential technologies over the next three years. JP Morgan's flagship survey reveals that more than half of professional and institutional traders expect machine learning to be the leading technology. Currently, a third of client traders predict mobile trading applications will be the most influential this year. Certainly the Reddit GameStop rally, powered by low-cost trading platforms, is already testament to just how quickly the environment has changed. Electronic trading picked up last year, and all those surveyed expect to increase electronic volumes this year.
Key-value databases offer a level of speed and performance that we mostly cannot reach with relational databases. Like Cassandra, Redis is a fast key-value store. In this post, we are going to adopt Redis to build a high-performance face recognition application. The same approach could also be adapted to NLP studies or any reverse image search case, such as Google Images. The official Redis distribution is available for Linux and macOS here.
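Before wiring in Redis itself, the core recognition step, comparing a query face embedding against stored embeddings, can be sketched on its own. In the snippet below a plain dict stands in for Redis (in a real deployment the store would be a Redis hash written and read with redis-py), and the toy 4-dimensional embeddings and the threshold are invented for illustration:

```python
import math

# A plain dict stands in for Redis here so the lookup logic runs anywhere;
# in production this would be a Redis hash of identity -> serialized embedding.
store = {}

def register(name, embedding):
    """Store a face embedding under an identity key."""
    store[name] = embedding

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recognize(query, threshold=0.8):
    """Return the closest stored identity, or None if nothing is similar enough."""
    best_name, best_score = None, threshold
    for name, embedding in store.items():
        score = cosine_similarity(query, embedding)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Toy 4-dimensional embeddings; real face models emit 128- or 512-d vectors.
register("alice", [0.9, 0.1, 0.0, 0.2])
register("bob", [0.1, 0.8, 0.3, 0.0])
print(recognize([0.85, 0.15, 0.05, 0.25]))
```

The same nearest-embedding lookup is what makes the approach transferable to reverse image search or NLP: only the model that produces the embeddings changes.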
We are going to work on a specific sub-task of NLP called text classification: the process of recognizing a pattern in a text and assigning it a label. Examples you use in your day-to-day life without even noticing include spam detection (in your mailbox), sentiment analysis (when you review a product or leave a comment) and tagging customer queries (when you fill in a contact form on a website). What we will try to do is classify science-fiction books into different subgenres (dystopia, cyberpunk, space opera, …) based on their plots. In the end, we want a model that takes a book plot as input and outputs the subgenres detected in the text, along with the model's confidence that each subgenre is present. The demonstrator can take up to a minute to open because I use a free version of Heroku to host my app; it goes to sleep when nobody uses it, which is better for the planet! This kind of algorithm could help an online marketplace classify the books it receives to make better recommendations, or help a librarian organize the books in an original way, by subgenre instead of alphabetically, to create an experience in the library. Data is one of the most important (if not the most important) things in data science.
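As a rough illustration of the input/output contract described above (not the actual trained model), here is a toy multi-label classifier that scores each subgenre by keyword overlap with the plot; the keyword lists and the detection threshold are invented for the example:

```python
# A drastically simplified stand-in for a trained text classifier: each
# subgenre is scored by keyword overlap, and the overlap ratio serves as a
# crude confidence. The keyword sets below are made up for illustration.
GENRE_KEYWORDS = {
    "dystopia": {"regime", "surveillance", "oppression", "totalitarian", "rebellion"},
    "cyberpunk": {"hacker", "implant", "megacorporation", "neon", "cyberspace"},
    "space opera": {"galaxy", "starship", "empire", "fleet", "alien"},
}

def classify(plot, threshold=0.2):
    """Return (subgenre, confidence) pairs detected in a plot summary."""
    words = set(plot.lower().split())
    results = []
    for genre, keywords in GENRE_KEYWORDS.items():
        confidence = len(words & keywords) / len(keywords)
        if confidence >= threshold:
            results.append((genre, round(confidence, 2)))
    # Most confident subgenre first
    return sorted(results, key=lambda r: -r[1])

plot = "A hacker rebels against the megacorporation regime ruling cyberspace"
print(classify(plot))
```

Note the multi-label shape of the output: a plot can be detected as both cyberpunk and dystopia at once, each with its own confidence, which is exactly what the real model should return.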
Deep learning is the current darling of AI. Used by behemoths such as Microsoft, Google and Amazon, it leverages artificial neural networks that "learn" through exposure to immense amounts of data. By immense we mean internet-scale amounts -- or billions of documents at a minimum. If your project draws upon publicly available data, deep learning can be a valuable tool. The same is true if budget isn't an issue.
AbbVie is a research-based biopharmaceutical company that serves more than 30 million patients in 175 countries. With its global scale, AbbVie partnered with Intel to optimize processes for its more than 47,000 employees. This whitepaper highlights two use cases that are important to AbbVie's research. The first is Abbelfish Machine Translation, AbbVie's language translation service based on the Transformer NLP model, which leverages second-generation Intel Xeon Scalable processors and the Intel Optimization for TensorFlow with the Intel oneAPI Deep Neural Network Library (oneDNN). AbbVie was able to achieve a 1.9x improvement in throughput for Abbelfish language translation using Intel Optimization for TensorFlow 1.15 with oneDNN when compared to TensorFlow 1.15 without oneDNN.
It is essential to understand how different Machine Learning algorithms work to succeed in your Data Science projects. I have written this story as part of a series that dives into each ML algorithm, explaining its mechanics, supplemented by Python code examples and intuitive visualizations. Support Vector Machines (SVMs) are most frequently used for solving classification problems, which fall under the supervised machine learning category. The exact place of these algorithms is displayed in the diagram below. Let's assume we have a set of points that belong to two separate classes.
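To make "two separate classes" concrete, here is a perceptron, a simpler relative of the SVM that also searches for a separating hyperplane w·x + b = 0 but, unlike the SVM, does not maximize the margin. The data points are toy values chosen to be linearly separable, and this is an illustration rather than the SVM itself:

```python
# Perceptron: repeatedly nudge (w, b) toward misclassified points until a
# separating line is found. Guaranteed to terminate on linearly separable data.

def perceptron(points, labels, epochs=100):
    """Return (w, b) defining a hyperplane that separates the two classes."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), y in zip(points, labels):
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:  # wrong side (or on the line)
                w[0] += y * x1
                w[1] += y * x2
                b += y
                errors += 1
        if errors == 0:  # every point classified correctly: done
            break
    return w, b

def predict(w, b, point):
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else -1

# Two toy clusters labeled -1 and +1.
points = [(1, 1), (1, 2), (2, 1), (5, 5), (6, 5), (5, 6)]
labels = [-1, -1, -1, 1, 1, 1]
w, b = perceptron(points, labels)
print([predict(w, b, p) for p in points])  # → [-1, -1, -1, 1, 1, 1]
```

The perceptron stops at the first line that separates the classes; the SVM discussed in this series instead picks, among all such lines, the one with the largest margin to the nearest points.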