LoRO: Real-Time on-Device Secure Inference for LLMs via TEE-Based Low Rank Obfuscation

Jun-15-2026, 09:33:27 GMT–Neural Information Processing Systems

Although Large Language Models (LLMs) have achieved remarkable success, they are at constant risk of being stolen when deployed on untrusted edge devices. Secure inference based on a Trusted Execution Environment (TEE) has been proposed as a solution to protect this valuable model property. However, we identify a statistical vulnerability in existing protection methods and further compromise their security guarantees with the proposed Model Stealing Attack with Prior. To eliminate this vulnerability, this paper presents LoRO, which leverages dense masks to completely obfuscate parameters. LoRO includes two innovations: (1) Low Rank Mask, which uses low-rank factors to generate dense masks efficiently. The computational complexity in the TEE is thereby reduced by an exponential amount, which speeds up inference while providing robust model confidentiality.
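The Low Rank Mask idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the mask is additive, that is, the untrusted device holds W + AB while the factors A and B stay inside the TEE, and the abstract does not confirm that scheme. All function names (`make_low_rank_mask`, `obfuscate`, `secure_linear`) are hypothetical, and plain Python lists stand in for tensors.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rand_matrix(rows, cols, rng):
    """Random matrix with entries in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def make_low_rank_mask(d_in, d_out, rank, rng):
    """Hypothetical helper: factors A (d_in x rank) and B (rank x d_out).
    Their product A @ B is a dense d_in x d_out mask."""
    return rand_matrix(d_in, rank, rng), rand_matrix(rank, d_out, rng)

def obfuscate(w, a, b):
    """Assumed additive scheme: the untrusted device only ever sees W + A @ B."""
    mask = matmul(a, b)
    return [[wij + mij for wij, mij in zip(wr, mr)] for wr, mr in zip(w, mask)]

def secure_linear(x, w_obf, a, b):
    """Untrusted side computes x @ (W + AB); trusted side removes (x @ A) @ B."""
    y_masked = matmul(x, w_obf)            # heavy product on the untrusted device
    correction = matmul(matmul(x, a), b)   # cheap low-rank product inside the TEE
    return [[ym - c for ym, c in zip(yr, cr)] for yr, cr in zip(y_masked, correction)]

rng = random.Random(0)
d_in, d_out, rank = 8, 6, 2
w = rand_matrix(d_in, d_out, rng)          # stand-in for one weight matrix
a, b = make_low_rank_mask(d_in, d_out, rank, rng)
w_obf = obfuscate(w, a, b)

x = rand_matrix(3, d_in, rng)              # batch of 3 input rows
y_secure = secure_linear(x, w_obf, a, b)
y_plain = matmul(x, w)
max_err = max(abs(p - q) for rp, rq in zip(y_plain, y_secure) for p, q in zip(rp, rq))
```

Under this assumption, the trusted side performs about n·r·(d_in + d_out) multiply-adds for n input rows instead of the n·d_in·d_out needed to apply a full dense mask, which is where a low-rank factorization saves work when r is small.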

large language model, machine learning, natural language

Country:
- North America > United States > Minnesota (0.28)

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks (1.00)
