prediction service
Active Assessment of Prediction Services as Accuracy Surface Over Attribute Combinations
Our goal is to evaluate the accuracy of a black-box classification model, not as a single aggregate on a given test data distribution, but as a surface over a large number of combinations of attributes characterizing multiple test data distributions. Such attributed accuracy measures become important as machine learning models get deployed as a service, where the training data distribution is hidden from clients, and different clients may be interested in diverse regions of the data distribution. Each attribute combination, called an 'arm', is associated with a Beta density from which the service's accuracy is sampled. We expect a Gaussian process (GP) to smooth the parameters of the Beta density over related arms to mitigate sparsity. We show that an obvious application of GPs cannot address the challenge of heteroscedastic uncertainty over a huge attribute space that is sparsely and unevenly populated.
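The per-arm Beta density mentioned above can be illustrated with a minimal sketch. It is not the paper's estimator: it omits the GP smoothing entirely and simply updates an independent Beta(1, 1) prior per arm from labeled outcomes. The arm names, the uniform prior, and the helper functions `beta_posteriors` and `beta_mean_var` are all assumptions made for illustration.

```python
from collections import defaultdict


def beta_posteriors(observations, prior_a=1.0, prior_b=1.0):
    """Return {arm: (a, b)} Beta parameters after counting outcomes per arm.

    observations: iterable of (arm, was_prediction_correct) pairs.
    The Beta(prior_a, prior_b) prior is an assumption of this sketch.
    """
    counts = defaultdict(lambda: [0, 0])  # arm -> [correct, wrong]
    for arm, correct in observations:
        counts[arm][0 if correct else 1] += 1
    return {arm: (prior_a + c, prior_b + w) for arm, (c, w) in counts.items()}


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) density."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


# Hypothetical arms: each arm is one combination of attribute values.
observations = [
    (("outdoor", "day"), True),
    (("outdoor", "day"), True),
    (("outdoor", "day"), True),
    (("outdoor", "day"), False),
    (("indoor", "night"), True),  # sparsely populated arm
]

posteriors = beta_posteriors(observations)
for arm, (a, b) in posteriors.items():
    mean, var = beta_mean_var(a, b)
    print(arm, f"mean={mean:.3f}", f"var={var:.4f}")
```

In this toy data both arms end up with the same posterior mean, but the arm with a single observation has a noticeably larger variance. That uneven uncertainty across sparsely populated arms is the heteroscedasticity the abstract refers to.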
PlasmoFAB: A Benchmark to Foster Machine Learning for Plasmodium falciparum Protein Antigen Candidate Prediction
Ditz, Jonas Christian, Wistuba-Hamprecht, Jacqueline, Maier, Timo, Fendel, Rolf, Pfeifer, Nico, Reuter, Bernhard
Motivation: Machine learning methods can be used to support scientific discovery in healthcare-related research fields. However, these methods can only be reliably used if they can be trained on high-quality and curated datasets. Currently, no such dataset for the exploration of Plasmodium falciparum protein antigen candidates exists. The parasite Plasmodium falciparum causes the infectious disease malaria. Thus, identifying potential antigens is of utmost importance for the development of antimalarial drugs and vaccines. Since exploring antigen candidates experimentally is an expensive and time-consuming process, applying machine learning methods to support this process has the potential to accelerate the development of drugs and vaccines, which are needed for fighting and controlling malaria. Results: We developed PlasmoFAB, a curated benchmark that can be used to train machine learning methods for the exploration of Plasmodium falciparum protein antigen candidates. We combined an extensive literature search with domain expertise to create high-quality labels for Plasmodium falciparum specific proteins that distinguish between antigen candidates and intracellular proteins. Additionally, we used our benchmark to compare different well-known prediction models and available protein localization prediction services on the task of identifying protein antigen candidates. We show that available general-purpose services are unable to provide sufficient performance on identifying protein antigen candidates and are outperformed by our models that were trained on this tailored data. Availability: PlasmoFAB is publicly available on Zenodo with DOI 10.5281/zenodo.7433087. Furthermore, all scripts that were used in the creation of PlasmoFAB and the training and evaluation of machine learning models are open source and publicly available on GitHub here: https://github.com/msmdev/PlasmoFAB.
Running a Stable Diffusion Cluster on GCP with tensorflow-serving (Part 2)
In part 1, we learned how to use Terraform to set up and manage our infrastructure conveniently. In this part, we will continue our journey and deploy a running Stable Diffusion model on the provisioned cluster. Note: You can follow this tutorial end-to-end even if you're a free user (as long as you have some free-tier credits left). Let's take a look at what the final result will be. If you gradually add a bit of noise to an image over many steps, you will end up with an image that is essentially just noise.
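The noising process described in the last sentence can be sketched in a few lines. This is a simplified illustration and not code from the tutorial: it assumes the variance-preserving update common in DDPM-style diffusion models, uses a constant noise level `beta` instead of a schedule, and treats the "image" as a short list of pixel values. The function names are hypothetical.

```python
import math
import random


def add_noise_step(pixels, beta, rng):
    """One noising step: shrink the signal slightly and add Gaussian noise."""
    keep = math.sqrt(1.0 - beta)
    noise = math.sqrt(beta)
    return [keep * p + noise * rng.gauss(0.0, 1.0) for p in pixels]


def forward_diffusion(pixels, steps, beta=0.02, seed=0):
    """Apply the noising step `steps` times (constant beta is an assumption)."""
    rng = random.Random(seed)
    for _ in range(steps):
        pixels = add_noise_step(pixels, beta, rng)
    return pixels


def remaining_signal(steps, beta=0.02):
    """Factor by which the original pixel values are scaled after `steps` steps."""
    return (1.0 - beta) ** (steps / 2.0)


image = [1.0, 0.5, -0.5, -1.0]  # toy "image" with four pixels
noisy = forward_diffusion(image, steps=1000)

print("noisy pixels:", [round(p, 3) for p in noisy])
print("fraction of original signal left:", remaining_signal(1000))
```

After enough steps the scaling factor on the original pixels is close to zero, so the result is dominated by the accumulated noise. A diffusion model is trained to reverse this process step by step.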
iMedBot: A Web-based Intelligent Agent for Healthcare Related Prediction and Deep Learning
Background: Breast cancer is a multifactorial disease; both genetic and environmental factors affect its incidence probability. Breast cancer metastasis is one of the main causes of breast cancer related deaths reported by the American Cancer Society (ACS). Method: The iMedBot is a web application that we developed using the Python Flask web framework and deployed on Amazon Web Services. It contains a frontend and a backend. The backend is supported by a Python program we developed using the Keras and scikit-learn packages, which can be used to learn deep feedforward neural network (DFNN) models. Result: The iMedBot can provide two main services: 1. It can predict 5-, 10-, or 15-year breast cancer metastasis based on a set of clinical information provided by a user. The prediction is done using a set of pretrained DFNN models. 2. It can train DFNN models for a user using a user-provided dataset. The trained model is evaluated using AUC, and both the AUC value and the ROC curve are provided. Conclusion: The iMedBot web application provides a user-friendly interface for user-agent interaction in conducting personalized prediction and model training. It is an initial attempt to convert results of deep learning research into an online tool that may stir further research interest in this direction. Keywords: Deep learning, Breast Cancer, Web application, Model training.
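As an illustration of the AUC evaluation mentioned in the abstract, the following sketch computes the area under the ROC curve directly from its pairwise definition: the probability that a randomly chosen positive example is scored above a randomly chosen negative one. It is not the iMedBot code. The abstract states that the backend uses scikit-learn, but this standalone version avoids that dependency, and the labels and scores are made-up values.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g. metastasis), 0 for the negative class.
    scores: model outputs, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four patients.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

The nested loop is quadratic in the number of examples, which is fine for a small illustration; rank-based implementations are preferable for larger datasets.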
Machine Learning at Scale with Databricks and Kubernetes
Machine Learning Operationalisation (ML Ops) is a set of practices that aim to quickly and reliably build, deploy and monitor machine learning applications. Many organizations standardize around certain tools to develop a platform to enable these goals. One combination of tools includes using Databricks to build and manage machine learning models and Kubernetes to deploy models. This article will explore how to design this solution on Microsoft Azure followed by step-by-step instructions on how to implement this solution as a proof-of-concept. This approach aims to use common open source technologies and can easily be adapted for other cloud platforms.
Best Practices for MLOps Documentation
Throughout history, technical documentation has been needed as a medium for passing on information or a collection of instructions on using specific tools. This dates back to the oldest example recorded in the western world, the Rhind Papyrus (ca. 1650 B.C.), which contained material on ancient Egyptian mathematics. To paint a mental picture of the significance of this subject, let us create an imaginary scenario. Consider how difficult it would be if, for example, you purchased furniture parts from Ikea and they did not come with a manual, and you could not find any material online either. Really think for a second about how challenging it would be to assemble such furniture; this scenario should help keep things in context. Communication is a recurring theme in all of the above practices.
GitHub - bentoml/BentoML: Model Serving Made Easy
BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models. By providing a standard interface for describing a prediction service, BentoML abstracts away how to run model inference efficiently and how model serving workloads can integrate with cloud infrastructures. Be sure to check out the deployment overview doc to understand which deployment option is best suited for your use case. BentoML provides APIs for defining a prediction service, a servable model so to speak, which includes the trained ML model itself, plus its pre-processing and post-processing code, input/output specifications, and dependencies. The generated BentoML bundle is a file directory that contains all the code files, serialized models, and configs required for reproducing this prediction service for inference. BentoML automatically captures all the Python dependency information and keeps everything versioned and managed together in one place.
AI predicts sales from weather to cut food waste in Fukuoka test
Fukuoka – The city of Fukuoka, jointly with the Japan Weather Association, is conducting an experiment to reduce food waste using artificial intelligence. In the experiment, AI is used to predict sales of products in line with weather conditions, allowing stores to adjust their order and production volumes. Participating stores were able to reduce waste and boost sales in the fiscal year that ended in March. The experiment uses the JWA's weather-based demand prediction service, which analyzes mainly weather conditions, temperatures, social media posts and past retail sales data to predict demand for more than 660 products, including fresh food and prepared food, in seven stages. In the experiment last fiscal year, six of the eight participating companies in the city saw their food waste decline, while seven logged increased sales.
How Uber Implements CI/CD Of Machine Learning Models
The ride-hailing giant Uber is currently present in 10,000 cities across 71 countries, and its platform is used by 93 million customers and 3.5 million drivers globally. Every quarter, the ride-hailing platform completes nearly 1.44 billion trips. However, as a result of a global pandemic and travel restrictions, the total number of quarterly Uber trips decreased by 24.21% in 2020. "At Uber, we have witnessed a significant increase in ML adoption across various organisations and use-cases over the last few years," said the company in its latest blog post co-authored by Yi Zhang, Joseph Wang, Jia Li, and Yunfeng Bai. The blog post further highlighted various pain points and explained how Uber implemented continuous integration (CI) and continuous deployment (CD) of machine learning models to address them.