Improving QA Efficiency with DistilBERT: Fine-Tuning and Inference on Mobile Intel CPUs

Yinkfu, Ngeyen

arXiv.org Artificial Intelligence 

Question answering (QA) systems have become a cornerstone of natural language processing (NLP), enabling machines to extract precise answers from textual contexts. The Stanford Question Answering Dataset (SQuAD) v1.1 [Rajpurkar et al., 2016] is a widely adopted benchmark for evaluating QA models, comprising over 87,000 training examples of context-question-answer triples. While transformer-based models like BERT [Devlin et al., 2019] have achieved state-of-the-art performance on SQuAD, their computational complexity often demands GPU acceleration, limiting deployment on resource-constrained devices such as mid-range CPUs. This study addresses the challenge of developing a transformer-based QA model optimized for inference on a 13th Gen Intel i7-1355U CPU, a 10-core processor with a 5.0 GHz turbo frequency. We focus on DistilBERT [Sanh et al., 2020], a lightweight transformer, to balance performance, measured by F1 score and accuracy, with inference speed. Our contributions include:
- Comprehensive exploratory data analysis (EDA) of SQuAD v1.1 to inform modeling decisions.
- Data augmentation strategies to enhance model robustness to low-overlap question-context pairs.
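As a concrete reference for the F1 metric mentioned above, the following is a minimal sketch of token-level answer-span F1 in the style of the official SQuAD v1.1 evaluation (lowercasing, stripping punctuation and articles before comparing tokens); function names here are illustrative, not from the paper's code.

```python
import re
import string
from collections import Counter

def normalize(text):
    # Lowercase, drop punctuation and English articles, collapse whitespace,
    # following the normalization used in SQuAD-style evaluation.
    text = "".join(ch for ch in text.lower() if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def token_f1(prediction, ground_truth):
    # Token-overlap F1 between a predicted answer span and a reference span.
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `token_f1("the Eiffel Tower", "Eiffel Tower in Paris")` yields 2/3: the article is stripped, both predicted tokens match (precision 1.0), but only two of the four reference tokens are covered (recall 0.5).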
