ProDiF: Protecting Domain-Invariant Features to Secure Pre-Trained Models Against Extraction

Categories: cs.CR, cs.LG

Authors

Tong Zhou, Shijin Duan, Gaowen Liu, Charles Fleming, Ramana Rao Kompella, Shaolei Ren, Xiaolin Xu
Project Resources

ArXiv Paper (source: arXiv)
Semantic Scholar Paper (source: Semantic Scholar)
Abstract

Pre-trained models are valuable intellectual property, capturing both domain-specific and domain-invariant features within their weight spaces. However, model extraction attacks threaten these assets by enabling unauthorized source-domain inference and facilitating cross-domain transfer via the exploitation of domain-invariant features. In this work, we introduce **ProDiF**, a novel framework that leverages targeted weight space manipulation to secure pre-trained models against extraction attacks. **ProDiF** quantifies the transferability of filters and perturbs the weights of critical filters in unsecured memory, while preserving actual critical weights in a Trusted Execution Environment (TEE) for authorized users. A bi-level optimization further ensures resilience against adaptive fine-tuning attacks. Experimental results show that **ProDiF** reduces source-domain accuracy to near-random levels and decreases cross-domain transferability by 74.65%, providing robust protection for pre-trained models. This work offers comprehensive protection for pre-trained DNN models and highlights the potential of weight space manipulation as a novel approach to model security.
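
The split described in the abstract (perturbed critical filters in unsecured memory, original weights kept aside for authorized users) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `protect`, `restore`, and `secure_store` are hypothetical, the per-filter transferability scores are assumed to be supplied by the caller, and plain Gaussian noise stands in for the perturbation that the paper obtains through bi-level optimization. A Python dictionary stands in for TEE storage.

```python
import random

def protect(filters, scores, k, noise_scale=1.0, seed=0):
    """Perturb the k highest-scoring filters; return (public_filters, secure_store)."""
    rng = random.Random(seed)
    # Indices of the k filters with the largest (assumed, caller-supplied) score.
    critical = sorted(range(len(filters)), key=lambda i: scores[i], reverse=True)[:k]
    # Hypothetical stand-in for TEE storage: original weights of critical filters.
    secure_store = {i: list(filters[i]) for i in critical}
    public = [list(f) for f in filters]
    for i in critical:
        # Assumption: Gaussian noise replaces the paper's optimized perturbation.
        public[i] = [w + rng.gauss(0.0, noise_scale) for w in public[i]]
    return public, secure_store

def restore(public, secure_store):
    """Authorized path: swap the protected original weights back in."""
    restored = [list(f) for f in public]
    for i, original in secure_store.items():
        restored[i] = list(original)
    return restored

# Toy example: four filters with two weights each, and made-up scores.
filters = [[0.1, -0.2], [0.5, 0.4], [-0.3, 0.9], [0.0, 0.7]]
scores = [0.2, 0.9, 0.1, 0.6]
public, secure_store = protect(filters, scores, k=2)
```

In this toy setup, an extractor reading `public` sees altered weights for the two highest-scoring filters, while `restore(public, secure_store)` recovers the original model for an authorized user. The sketch does not model resilience against adaptive fine-tuning.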
