Why you need to improve your training data, and how to do it

@machinelearnbot 

Andrej Karpathy showed this slide as part of his talk at Train AI, and I loved it! Academic papers are almost entirely focused on new and improved models, with datasets usually chosen from a small set of public archives. Everyone I know who uses deep learning as part of an actual application spends most of their time worrying about the training data instead. There are lots of good reasons why researchers are so fixated on model architectures, but it does mean that there are very few resources available to guide people who are focused on deploying machine learning in production.

To address that, my talk at the conference was on "the unreasonable effectiveness of training data", and I want to expand on that a bit in this blog post, explaining why data is so important and offering some practical tips on improving it. As part of my job I work closely with a lot of researchers and product teams, and my belief in the power of data improvements comes from the massive gains I've seen them achieve when they concentrate on that side of their model building.
