Graphics chip giant Nvidia mopped the floor with its competition in a set of benchmark tests released Wednesday afternoon, demonstrating better performance on a host of artificial intelligence tasks. The results of the benchmark, called MLPerf, were announced by the MLPerf organization, an industry consortium that administers the tests, and showed Nvidia getting better speed on a variety of tasks that use neural networks, from categorizing images to recommending which products a person might like. Predictions are the part of AI where a trained neural network produces output on real data, as opposed to the training phase, when the neural network system is first being refined. Benchmark results on training tasks were announced by MLPerf back in July. Many of the scores in the test results pertain to Nvidia's T4 chip, which has been on the market for some time, but even more impressive results were reported for its A100 chips, unveiled in May.
Nvidia and Google on Wednesday each announced that they had aced a series of tests called MLPerf to be the biggest and best in hardware and software to crunch common artificial intelligence tasks. The devil's in the details, but both companies' achievements show the trend in AI continues to be that of bigger and bigger machine learning endeavors, backed by more-brawny computers. Benchmark tests are never without controversy, and some upstart competitors of Nvidia and Google, notably Cerebras Systems and Graphcore, continued to avoid the benchmark competition. In the results announced Wednesday by the MLPerf organization, an industry consortium that administers the tests, Nvidia took top marks across the board for a variety of machine learning "training" tasks, meaning the computing operations required to develop a machine learning neural network from scratch. The full roster of results can be seen in spreadsheet form.
Google used a cluster of 2,048 TPUs to train the largest-ever version of its BERT natural language program, consisting of 481 billion parameters, in 19 hours, as a submission to the MLPerf benchmark competition. The deep learning world of artificial intelligence continues to be obsessed with size. As ZDNet has reported, the state of the art in deep learning programs such as OpenAI's GPT-3 is to keep using more and more GPU chips, from Nvidia and AMD, or novel kinds of accelerator chips, to build ever-larger software programs. In general, the accuracy of the programs increases with size, researchers contend. That concern with size was on full display Wednesday in the latest industry benchmark results reported by MLCommons, which sets the standard for measuring how quickly computer chips can crunch deep learning code.
MLPerf, the benchmark suite of tests for how long it takes to train a computer to perform machine learning tasks, has a new contender with the release Wednesday of results showing Graphcore, the Bristol, U.K.-based startup, notching respectable times versus the two consistent heavyweights, Nvidia and Google. Graphcore, which was founded five years ago and has $710 million in financing, didn't take the top score in any of the MLPerf tests, but it reported results that are significant when compared with the other two in terms of number of chips used. Moreover, when leaving aside Google's submission, which isn't commercially available, Graphcore was the only competitor to enter into the top five commercially available results alongside Nvidia. "It's called the democratization of AI," said Matt Fyles, the head of software for Graphcore, in a press briefing. Companies that want to use AI, he said, "can get a very respectable result as an alternative to Nvidia, and it only gets better over time, we'll keep pushing our system."
Every few months, the artificial intelligence industry has a bake-off of the latest machine learning computer systems. The meet-up, which has been going on for several years, is typically focused on the best performance in multi-processor computers put together by chip vendors such as Nvidia and Qualcomm and their partners such as Dell, measured against a set of benchmark tasks such as object detection and image classification. This year, the bake-off has a novel twist: an examination of how much energy such massively parallel computer systems consume, as a kind of proxy for how energy-efficient the products are. The test, MLPerf, has now added industry-standard measures of how much electricity, in either watts or joules, is drawn for a given task. "One of the things I'm really excited about is the MLPerf Power Project, which is how do we do full-system power measurement," said Kanter in a press briefing to discuss the MLPerf results, which were announced via press release on Wednesday.