BERT Fine-Tuning Benchmark on Quadro RTX 8000 GPUs


For this post, we measured fine-tuning performance (training and inference) of the TensorFlow implementation of BERT (Bidirectional Encoder Representations from Transformers) on NVIDIA Quadro RTX 8000 GPUs. Our test system was an Exxact Valence Workstation fitted with 4x Quadro RTX 8000 GPUs connected with NVLink, for a total of 192 GB of GPU memory (48 GB per GPU). These tests measure performance for a popular use case for BERT, and for NLP in general, and are meant to show typical GPU performance for such a task. We slightly modified the training benchmark script to collect metrics at the larger batch sizes. The script runs multiple tests on the SQuAD v1.1 dataset using batch sizes of 1, 2, 4, 8, 16, 32, and 64 for training, and 1, 2, 4, and 8 for inference.
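To make the shape of such a batch-size sweep concrete, here is a minimal Python sketch. It is not the benchmark script we ran: the names `measure_throughput`, `sweep`, and `dummy_step` are hypothetical, and `dummy_step` is a CPU-only placeholder standing in for a real BERT training or inference step. Only the batch-size lists come from the tests described above.

```python
import time
from typing import Callable, Dict, Iterable

# Batch sizes taken from the post; everything else here is illustrative.
TRAIN_BATCH_SIZES = [1, 2, 4, 8, 16, 32, 64]
INFER_BATCH_SIZES = [1, 2, 4, 8]


def measure_throughput(step_fn: Callable[[int], None], batch_size: int,
                       num_steps: int = 10, warmup_steps: int = 2) -> float:
    """Return examples processed per second for one batch size."""
    # Warm-up steps are run but not timed, so one-off setup cost is excluded.
    for _ in range(warmup_steps):
        step_fn(batch_size)
    start = time.perf_counter()
    for _ in range(num_steps):
        step_fn(batch_size)
    elapsed = time.perf_counter() - start
    return batch_size * num_steps / elapsed


def sweep(step_fn: Callable[[int], None], batch_sizes: Iterable[int],
          num_steps: int = 10, warmup_steps: int = 2) -> Dict[int, float]:
    """Measure throughput at every batch size, in the order given."""
    return {
        bs: measure_throughput(step_fn, bs, num_steps, warmup_steps)
        for bs in batch_sizes
    }


def dummy_step(batch_size: int) -> None:
    """Hypothetical placeholder workload; replace with a real model step."""
    sum(i * i for i in range(1000 * batch_size))


if __name__ == "__main__":
    for label, sizes in (("training", TRAIN_BATCH_SIZES),
                         ("inference", INFER_BATCH_SIZES)):
        for bs, rate in sweep(dummy_step, sizes).items():
            print(f"{label:9s} batch={bs:2d}  {rate:10.1f} examples/sec")
```

Swapping `dummy_step` for a function that runs one real training or inference step would produce per-batch-size throughput figures in the same form; for GPU work, that step function would also need to wait for the device to finish before returning so the timing is meaningful.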
