IACR News item: 22 November 2024

THOR: Secure Transformer Inference with Homomorphic Encryption

Jungho Moon, Dongwoo Yoo, Xiaoqian Jiang, Miran Kim

ePrint Report

As language models are increasingly deployed in cloud environments, privacy concerns have become a significant issue. To address this, we design THOR, a secure inference framework for transformer models on encrypted data. Specifically, we first propose new fast matrix multiplication algorithms based on diagonal-major order encoding and extend them to parallel matrix computation through the compact ciphertext packing technique. Second, we design efficient protocols for the secure computation of four non-linear functions (softmax, LayerNorm, GELU, and Tanh) by integrating advanced underlying approximation methods with tailored optimizations. Our matrix multiplication algorithms reduce the number of key-switching operations in the linear layers of the attention block in the BERT-base model by up to 14.5x, compared to the state-of-the-art HE-based secure inference protocol (Park et al., Preprint). Combined with cryptographic optimizations, our experimental results demonstrate that THOR provides secure inference for the BERT-base model with a latency of 10.43 minutes on a single GPU, while maintaining comparable inference accuracy on the MRPC dataset.
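
To illustrate the general idea behind diagonal-major encoding, the following is a minimal plaintext sketch of the classic diagonal method for a matrix-vector product. It is not THOR's algorithm, which the abstract does not specify; the function names `diagonals`, `rotate`, and `diag_matvec` are hypothetical, and NumPy arrays stand in for ciphertext slots. In slot-packed HE schemes, each rotation of an encrypted vector typically requires a key-switching operation, which is why the number of rotations is a natural cost measure.

```python
import numpy as np

def diagonals(M):
    """Return the n generalized diagonals of a square matrix M,
    where diagonal k holds d_k[i] = M[i, (i + k) % n]."""
    n = M.shape[0]
    idx = np.arange(n)
    return [M[idx, (idx + k) % n] for k in range(n)]

def rotate(v, k):
    """Cyclic left rotation by k slots (plaintext stand-in for a
    homomorphic slot rotation)."""
    return np.roll(v, -k)

def diag_matvec(M, v):
    """Compute M @ v using only slot-wise products and rotations:
    M @ v = sum_k d_k * rotate(v, k)."""
    n = M.shape[0]
    acc = np.zeros(n, dtype=float)
    for k, d in enumerate(diagonals(M)):
        acc = acc + d * rotate(v, k)  # slot-wise multiply, then accumulate
    return acc

if __name__ == "__main__":
    M = np.arange(16, dtype=float).reshape(4, 4)
    v = np.array([1.0, 2.0, 3.0, 4.0])
    print(diag_matvec(M, v))
    print(M @ v)  # reference result for comparison
```

This naive version uses one rotation per non-zero diagonal. Reducing that count is the kind of saving the abstract reports in terms of key-switching operations, though the sketch makes no claim about how THOR achieves it.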


International Association for Cryptologic Research
