Paradoxes in Data Science
Paradoxes are a class of phenomena that arise when, starting from premises known to be true, we derive a logically unreasonable result. Because Machine Learning models build knowledge from data, they are susceptible to paradoxes that can surface between training and testing. One of the most common paradoxes in Data Science is Simpson's Paradox. As an example, consider a thought experiment: we have carried out a research study to find out whether daily physical exercise helps reduce cholesterol levels (in mg/dL), and we are now starting to examine the results. First, we divide our population sample into two main categories based on each individual's age (under or over 60 years old), and then we plot the subjects' cholesterol levels against the number of hours they exercised per day.
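Simpson's Paradox occurs when a trend that holds within each subgroup weakens or reverses once the subgroups are pooled. The thought experiment above can be sketched in Python as follows. This is only an illustration on synthetic data: every number here (sample size, baselines, slopes, noise) is made up, as is the assumption that older subjects exercise more and start from higher cholesterol. None of it comes from a real study.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 200  # hypothetical number of subjects per age group

# Synthetic, made-up data: within each group, more exercise lowers cholesterol.
# Assumption: over-60 subjects exercise more and have a higher baseline.
hours_under_60 = rng.uniform(0.0, 2.0, n)
chol_under_60 = 190.0 - 10.0 * hours_under_60 + rng.normal(0.0, 5.0, n)

hours_over_60 = rng.uniform(2.0, 4.0, n)
chol_over_60 = 250.0 - 10.0 * hours_over_60 + rng.normal(0.0, 5.0, n)


def slope(hours, cholesterol):
    """Least-squares slope of cholesterol (mg/dL) per hour of daily exercise."""
    return np.polyfit(hours, cholesterol, 1)[0]


# Trend inside each age group
slope_under_60 = slope(hours_under_60, chol_under_60)
slope_over_60 = slope(hours_over_60, chol_over_60)

# Trend when age is ignored and both groups are pooled
slope_pooled = slope(
    np.concatenate([hours_under_60, hours_over_60]),
    np.concatenate([chol_under_60, chol_over_60]),
)

print(f"under 60: {slope_under_60:+.1f} mg/dL per hour")
print(f"over 60:  {slope_over_60:+.1f} mg/dL per hour")
print(f"pooled:   {slope_pooled:+.1f} mg/dL per hour")
```

With these assumed parameters, the fitted slope is negative inside each age group but positive in the pooled data, because age drives both exercise time and cholesterol in the simulation. Looking only at the pooled plot would wrongly suggest that exercise raises cholesterol.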
Mar-1-2022, 11:05:47 GMT