IACR News item: 14 September 2024
Dongjin Park, Eunsang Lee, Joon-Woo Lee
ePrint Report
We propose an efficient non-interactive privacy-preserving Transformer inference architecture called Powerformer. Since softmax is a non-algebraic operation, previous studies have attempted to modify it to be HE-friendly, but these methods suffer either accuracy degradation or prolonged execution times caused by multiple bootstrappings. We propose replacing softmax with a new ReLU-based function, the \textit{Batch Rectifier-Power max} (BRPmax) function, which uses no unstable approximation methods, even outperforms the original BERT on the BERT-Large model, and requires fewer levels, allowing it to operate with only a single bootstrapping. We also present matrix multiplication algorithms specialized for the attention block that reduce the number of key-switchings by 35% to 91% compared to existing state-of-the-art methods. We design a clear end-to-end HE-based implementation of a private Transformer model; our implementation of Powerformer on the BERT-tiny model using RNS-CKKS takes 503 seconds on a single-threaded CPU, and to the best of our knowledge, this is the first end-to-end non-interactive Transformer implementation using HE.
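The abstract does not spell out the BRPmax formula, but as a rough plaintext illustration of the kind of replacement it describes (a rectifier plus an integer power instead of the exponential in softmax), a minimal sketch might look like the following. The function name, the exponent p, and the row-wise normalization here are assumptions for illustration only, not the paper's exact definition, and under RNS-CKKS the rectifier and the normalization would themselves be handled with HE-specific techniques.

```python
# Hypothetical plaintext sketch of a ReLU/power-based softmax substitute in the
# spirit of BRPmax. The exponent p and the normalization are assumed for
# illustration; the exact BRPmax definition is given in the paper.
import numpy as np

def relu_power_max(scores: np.ndarray, p: int = 3, eps: float = 1e-6) -> np.ndarray:
    """Map attention scores to non-negative weights without the exponential.

    Replacing exp with a rectifier and a low integer power keeps the
    multiplicative depth small, which is what makes this style of replacement
    attractive for HE evaluation; this plaintext version only shows the
    functional shape, not the encrypted-domain evaluation.
    """
    r = np.maximum(scores, 0.0)          # rectify: drop negative scores
    w = r ** p                           # integer power sharpens the distribution
    return w / (w.sum(axis=-1, keepdims=True) + eps)  # normalize rows to sum to ~1

# Toy usage on a single attention-score matrix (sequence length 4).
scores = np.random.randn(4, 4)
weights = relu_power_max(scores)
print(weights.sum(axis=-1))  # each row sums to ~1, as softmax outputs would
```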