Shapash: Python Library To Make Machine Learning Interpretable


The quote above is interesting, and it rings true. Most of us come from a technical background, so we probably know what machine learning is: the technology that currently dominates the digital world. If you are familiar with machine learning, you have come across terms such as data, train, test, and accuracy, and many of you can write machine learning scripts. You may have noticed, though, that we never see the calculations behind a model's predictions, because machine learning models are not interpretable by default. Many people call them black-box models: we give an input, a lot of calculation happens inside, and we get an output, without seeing which features that calculation actually relied on. Suppose we give a model five features as input. Some features may be increasing the output while others are decreasing it, but we are not able to see this. Python, however, has a library, Shapash, that makes a machine learning model interpretable and lets us understand those hidden calculations. It was developed by a group of MAIF data scientists.
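To make the five-feature scenario concrete, here is a minimal hand-rolled sketch in plain Python. It does not use Shapash or reproduce its API; it only illustrates the kind of per-feature breakdown that explainability tools report. It assumes a linear model, where each feature's contribution to one prediction can be written as its weight times the distance of its value from an average value. The feature names, weights, averages, and sample values are all hypothetical.

```python
# Illustration only: NOT Shapash's API. For a linear model, a feature's
# contribution to one prediction is weight * (value - average value).

feature_names = ["age", "income", "tenure", "visits", "balance"]  # hypothetical
weights = [0.5, -1.0, 2.0, 0.0, -0.5]   # hypothetical fitted coefficients
intercept = 1.0                          # hypothetical fitted intercept
baseline = [2.0, 1.0, 1.0, 3.0, 4.0]     # hypothetical feature averages
sample = [4.0, 3.0, 2.0, 5.0, 2.0]       # the single input being explained


def predict(x):
    """Linear model output: intercept plus the weighted sum of the features."""
    return intercept + sum(w * v for w, v in zip(weights, x))


def contributions(x):
    """Per-feature contribution relative to the baseline (average) input."""
    return [w * (v - b) for w, v, b in zip(weights, x, baseline)]


contribs = contributions(sample)

# Print features from most to least influential, with the direction of effect.
for name, c in sorted(zip(feature_names, contribs), key=lambda p: -abs(p[1])):
    direction = "raises" if c > 0 else "lowers" if c < 0 else "does not change"
    print(f"{name:>8}: {c:+.2f} ({direction} the prediction)")

# The contributions add up to the gap between this prediction and the baseline.
print("prediction gap:", predict(sample) - predict(baseline))
print("sum of contributions:", sum(contribs))
```

For non-linear models there is no such closed form, which is why dedicated libraries estimate the contributions instead; the reading of the result is the same, though: a signed amount per feature showing whether it pushed the output up or down.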

Top Stories, Mar 29 – Apr 4: Top 10 Python Libraries Data Scientists should know in 2021; Shapash: Making Machine Learning Models Understandable - KDnuggets


Shapash: Making Machine Learning Models Understandable, by Yann Golhen; What's ETL?, by Omer Mahmood; Easy AutoML in Python, by Dylan Sherry; Deep Learning Is Becoming Overused, by Michael Grogan; The 8 Most Common Data Scientists, by JABDE; How To Overcome The Fear of Math and Learn Math For Data Science, by Arnuld On Data; More Data Science Cheatsheets, by Matthew Mayo; How to Succeed in Becoming a Freelance Data Scientist, by Devin Partida; Top 10 Python Libraries Data Scientists should know in 2021, by Terence Shin; Are You Still Using Pandas to Process Big Data in 2021?

The connection between transparency, auditability, and AI


Until recently, we accepted the "black box" narrative surrounding AI as a necessary evil that could not be separated from AI as a concept. We understood that tradeoffs were sometimes necessary to achieve performance accuracy at the expense of transparency and explainability. Fortunately, advancements in the last few years have made it technologically feasible to explain why AI models reach their decisions, which represents an inflection point for the future of this transformative technology. For AI to become mainstream, we must have transparency and insight into how AI-enabled decisions were reached, meaning we should be able to pinpoint which factors contributed to a decision. Today's approach is full of risk, as organizations rely on already overburdened data scientists to document and explain their work.

Data Readiness Report Artificial Intelligence

Data exploration and quality analysis are an important yet tedious part of the AI pipeline. Current practices of data cleaning and data readiness assessment for machine learning tasks are mostly conducted in an arbitrary manner, which limits their reuse and results in loss of productivity. We introduce the concept of a Data Readiness Report as accompanying documentation for a dataset that allows data consumers to get detailed insights into the quality of input data. Data characteristics and challenges on various quality dimensions are identified and documented, keeping in mind the principles of transparency and explainability. The Data Readiness Report also serves as a record of all data assessment operations, including applied transformations. This provides a detailed lineage for the purpose of data governance and management. In effect, the report captures and documents the actions taken by various personas in a data readiness and assessment workflow. Over time, this becomes a repository of best practices and can potentially drive a recommendation system for building automated data readiness workflows along the lines of AutoML [8]. We anticipate that, together with the Datasheets [9], Dataset Nutrition Label [11], FactSheets [1], and Model Cards [15], the Data Readiness Report makes significant progress towards Data and AI lifecycle documentation.

A Methodology for Creating AI FactSheets Artificial Intelligence

As AI models and services are used in a growing number of high-stakes areas, a consensus is forming around the need for a clearer record of how these models and services are developed in order to increase trust. Several proposals for higher-quality and more consistent AI documentation have emerged to address ethical and legal concerns and the general social impacts of such systems. However, there is little published work on how to create this documentation. This is the first work to describe a methodology for creating the form of AI documentation we call FactSheets. We have used this methodology to create useful FactSheets for nearly two dozen models. This paper describes the methodology and shares the insights we have gathered. Within each step of the methodology, we describe the issues to consider and the questions to explore with the relevant people in an organization who will be creating and consuming the AI facts in a FactSheet. This methodology will accelerate the broader adoption of transparent AI documentation.