MIT study finds labelling errors in datasets used to test AI
A team led by computer scientists from MIT examined ten of the most-cited datasets used to test machine learning systems. They found that, on average, around 3.4 percent of the labels were inaccurate, which can distort the measured accuracy of AI systems evaluated on these benchmarks and even change how competing models rank against each other. The datasets, which have each been cited more than 100,000 times, include text-based ones from newsgroups, Amazon and IMDb. Errors included Amazon product reviews mislabeled as positive when they were actually negative, and vice versa. Some of the image-based errors came from mixing up animal species.
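To see why a 3.4 percent label error rate matters, a small simulation helps: even a model that is genuinely 95 percent accurate will appear less accurate when scored against a test set whose labels are partly wrong. This is a minimal sketch, not the study's actual methodology; the accuracy figures and binary-classification setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Ground-truth binary labels for a hypothetical test set.
true_labels = rng.integers(0, 2, n)

# A model that agrees with the true label 95% of the time.
predictions = np.where(rng.random(n) < 0.95, true_labels, 1 - true_labels)

# Corrupt 3.4% of the test labels, mirroring the study's average error rate.
noisy_labels = true_labels.copy()
flip = rng.random(n) < 0.034
noisy_labels[flip] = 1 - noisy_labels[flip]

# Accuracy against clean labels vs. accuracy as measured on the noisy test set.
true_acc = (predictions == true_labels).mean()
measured_acc = (predictions == noisy_labels).mean()
```

With these numbers, the measured accuracy lands near 0.92 rather than the true 0.95, because most of the mislabeled examples count correct predictions as wrong. When two models' true accuracies differ by less than the label error rate, the noisy benchmark can rank them in the wrong order.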