Improving QA Efficiency with DistilBERT: Fine-Tuning and Inference on Mobile Intel CPUs
arXiv.org Artificial Intelligence
Question answering (QA) systems have become a cornerstone of natural language processing (NLP), enabling machines to extract precise answers from textual contexts. The Stanford Question Answering Dataset (SQuAD) v1.1 [Rajpurkar et al., 2016] is a widely adopted benchmark for evaluating QA models, comprising over 87,000 training examples of context-question-answer triples. While transformer-based models such as BERT [Devlin et al., 2019] have achieved state-of-the-art performance on SQuAD, their computational complexity often demands GPU acceleration, limiting deployment on resource-constrained devices such as mid-range CPUs. This study addresses the challenge of developing a transformer-based QA model optimized for inference on a 13th Gen Intel i7-1355U, a 10-core mobile CPU with a 5.0 GHz turbo frequency. We focus on DistilBERT [Sanh et al., 2020], a lightweight transformer, to balance performance (measured by F1 score and accuracy) with inference speed. Our contributions include:
- Comprehensive exploratory data analysis (EDA) of SQuAD v1.1 to inform modeling decisions.
- Data augmentation strategies to enhance model robustness to low-overlap question-context pairs.
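The "low-overlap question-context pairs" mentioned above can be quantified with a simple lexical-overlap statistic during EDA. The abstract does not specify the paper's exact metric, so the helper below is a hypothetical sketch: the fraction of question tokens that also appear in the context, where values near zero flag pairs that require paraphrase-level understanding rather than string matching.

```python
import re

def lexical_overlap(question: str, context: str) -> float:
    """Fraction of unique question tokens that also occur in the context.

    Hypothetical EDA helper; tokenization here is a simple lowercase
    alphanumeric split, not the paper's (unspecified) scheme.
    """
    tokenize = lambda s: set(re.findall(r"[a-z0-9]+", s.lower()))
    q_tokens, c_tokens = tokenize(question), tokenize(context)
    return len(q_tokens & c_tokens) / len(q_tokens) if q_tokens else 0.0

# Example on a SQuAD-style pair: most content words are shared,
# so the overlap score is high.
context = ("The Stanford Question Answering Dataset (SQuAD) v1.1 contains "
           "over 87,000 training examples of context-question-answer triples.")
question = "How many training examples does SQuAD v1.1 contain?"
print(round(lexical_overlap(question, context), 3))
```

Thresholding such a score (e.g. keeping pairs below some cutoff) is one way to select candidates for the augmentation strategies described above.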
May-30-2025