Improving QA Efficiency with DistilBERT: Fine-Tuning and Inference on Mobile Intel CPUs

Yinkfu, Ngeyen

arXiv.org Artificial Intelligence 

Question answering (QA) systems have become a cornerstone of natural language processing (NLP), enabling machines to extract precise answers from textual contexts. The Stanford Question Answering Dataset (SQuAD) v1.1 [Rajpurkar et al., 2016] is a widely adopted benchmark for evaluating QA models, comprising over 87,000 training examples of context-question-answer triples. While transformer-based models like BERT [Devlin et al., 2019] have achieved state-of-the-art performance on SQuAD, their computational complexity often demands GPU acceleration, limiting deployment on resource-constrained devices such as mid-range CPUs. This study addresses the challenge of developing a transformer-based QA model optimized for inference on a 13th Gen Intel i7-1355U CPU, a 10-core processor with a 5.0 GHz turbo frequency. We focus on DistilBERT [Sanh et al., 2020], a lightweight transformer, to balance performance, measured by F1 score and accuracy, with inference speed. Our contributions include:
- Comprehensive exploratory data analysis (EDA) of SQuAD v1.1 to inform modeling decisions.
- Data augmentation strategies to enhance model robustness to low-overlap question-context pairs.
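As a concrete reference for the F1 metric mentioned above, the following is a minimal sketch of token-level answer-span F1 in the style of the official SQuAD v1.1 evaluation (lowercasing, stripping punctuation and articles before comparing tokens); function names here are illustrative, not from the paper's code.

```python
import re
import string
from collections import Counter

def normalize(text):
    # Lowercase, drop punctuation and English articles, collapse whitespace,
    # following the normalization used in SQuAD-style evaluation.
    text = "".join(ch for ch in text.lower() if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def token_f1(prediction, ground_truth):
    # Token-overlap F1 between a predicted answer span and a reference span.
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `token_f1("the Eiffel Tower", "Eiffel Tower in Paris")` yields 2/3: the article is stripped, both predicted tokens match (precision 1.0), but only two of the four reference tokens are covered (recall 0.5).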
