Leveraging Unlabeled Data

Communications of the ACM 

Despite the rapid advances it has made over the past decade, deep learning presents many industrial users with problems when they try to implement the technology, issues that the Internet giants have worked around through brute force. "The challenge that today's systems face is the amount of data they need for training," says Tim Ensor, head of artificial intelligence (AI) at U.K.-based technology company Cambridge Consultants. "On top of that, it needs to be structured data."

Most of the commercial applications and algorithm benchmarks used to test deep neural networks (DNNs) consume copious quantities of labeled data: for example, images or pieces of text that have already been tagged in some way by a human to indicate what each sample represents. The Internet giants, who have collected the most data for use in training deep learning systems, have often resorted to crowdsourcing measures, such as asking people to prove they are human during logins by identifying objects in a collection of images, or simply buying manual labor through services such as Amazon's Mechanical Turk.
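To see why labels are so central, consider a minimal sketch of supervised training. This toy example (not from the article; the perceptron stands in for a full DNN) learns the logical AND function from human-tagged (input, label) pairs. Note that the weight-update rule literally cannot execute without a label `y` for each sample, which is why unlabeled data is useless to this kind of learner.

```python
# Toy illustration: supervised learning requires a label y for every
# input x -- the update rule below is undefined without one.

def train_perceptron(samples, epochs=20, lr=0.1):
    """Train a single perceptron on (x, y) pairs; x is a tuple, y is 0 or 1."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:  # y is the human-provided label
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred    # error signal only exists because y exists
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Labeled data: each input has been tagged by a human with its class.
labeled = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(labeled)

def predict(x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
```

Collecting the `labeled` list is exactly the step the Internet giants crowdsource at scale; techniques that leverage unlabeled data aim to reduce how many such pairs are needed.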
