RMT-KD: Random Matrix Theoretic Causal Knowledge Distillation
Davide Ettori, Nastaran Darabi, Sureshkumar Senthilkumar, Amit Ranjan Trivedi
arXiv.org Artificial Intelligence
Large deep learning models such as BERT and ResNet achieve state-of-the-art performance but are costly to deploy at the edge due to their size and compute demands. We present RMT-KD, a compression method that leverages Random Matrix Theory (RMT) for knowledge distillation to iteratively reduce network size. Instead of pruning or heuristic rank selection, RMT-KD preserves only informative directions identified via the spectral properties of hidden representations. RMT-based causal reduction is applied layer by layer with self-distillation to maintain stability and accuracy. On GLUE, AG News, and CIFAR-10, RMT-KD achieves up to 80% parameter reduction with only 2% accuracy loss, delivering 2.8x faster inference and nearly halved power consumption. These results establish RMT-KD as a mathematically grounded approach to network distillation.
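The abstract says RMT-KD keeps only "informative directions identified via the spectral properties of hidden representations" rather than using heuristic rank selection. A common RMT criterion for this is the Marchenko-Pastur law: eigenvalues of the sample covariance of hidden activations that fall inside the noise bulk are discarded, and only those above the bulk's upper edge are kept. The sketch below illustrates that idea; the Marchenko-Pastur threshold, the noise-variance estimate, and the function name are assumptions for illustration, not the paper's exact causal-reduction rule.

```python
import numpy as np

def rmt_rank_selection(H, sigma2=None):
    """Keep hidden-representation directions whose eigenvalues exceed the
    Marchenko-Pastur bulk edge (illustrative criterion; the paper's exact
    layer-wise causal reduction is not specified in the abstract)."""
    n, p = H.shape                       # n samples, p hidden units
    Hc = H - H.mean(axis=0)              # center the activations
    C = Hc.T @ Hc / n                    # sample covariance of representations
    evals, evecs = np.linalg.eigh(C)     # ascending eigenvalues
    if sigma2 is None:
        sigma2 = np.median(evals)        # crude noise-level estimate (assumed)
    q = p / n
    lam_plus = sigma2 * (1 + np.sqrt(q)) ** 2  # MP upper bulk edge
    keep = evals > lam_plus              # directions above the noise bulk
    return evecs[:, keep]                # projection onto informative directions

# Toy check: a rank-3 signal buried in noise should yield ~3 kept directions.
rng = np.random.default_rng(0)
signal = rng.normal(size=(1000, 3)) @ rng.normal(size=(3, 64)) * 3.0
H = signal + rng.normal(size=(1000, 64))
P = rmt_rank_selection(H)
print(P.shape)  # (64, k) with k near the true signal rank
```

In a distillation loop, such a projection would replace a layer's weight matrix with its restriction to the kept subspace, then self-distill to recover accuracy before moving to the next layer.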
Sep-30-2025