This New Technique Called Distillation Can Vastly Speed Up Today's Neural Networks

May-2-2018, 01:56:13 GMT–#artificialintelligence

Common Crawl is an open repository of web crawl data. The researchers used English language documents which have long paragraphs because they wanted a data which allowed modelling of long range dependencies. The researchers constructed batches of 32 word pieces. To begin with, the goal was to determine the maximum number of GPU workers which can be employed for SGD. The researchers also tried asynchronous SGD with 32 and 128 workers and found that with large number of workers it is difficult to keep training stable.

artificial intelligence, machine learning, neural network, (6 more...)

#artificialintelligence

May-2-2018, 01:56:13 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found