IACR News item: 24 June 2024

Peng Yang, Zoe Lin Jiang, Jiehang Zhuang, Junbin Fang, Siu Ming Yiu, Xuan Wang
ePrint Report
Neural network inference as a service is a setting in which a cloud server holding a model answers inference requests from a client. Secure neural network inference built on this service protects the privacy of both the cloud server's model and the client's data. A binarized neural network (BNN) is a neural network with binary weights and activations, which is expected to accelerate inference. When realizing secure BNN inference with multi-party computation, the issue of non-uniform bitwidths must be addressed: secure computation protocols cannot directly operate on values of different bitwidths, so bitwidth conversion is required. Existing bitwidth conversion schemes have to expand the bitwidths of weights and activations, incurring significant communication and computation overhead.
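The following toy, plaintext sketch (our illustration, not taken from the paper) shows where the non-uniform bitwidths arise in BNN inference: weights and activations are 1-bit values in {-1, +1}, but the dot-product accumulator needs a wider integer type, so a secure protocol must convert between the two representations.

```python
import numpy as np

rng = np.random.default_rng(0)

def sign_activation(x):
    # Binarize: map a wide-bitwidth accumulator back to 1-bit values {-1, +1}.
    return np.where(x >= 0, 1, -1).astype(np.int8)

# 1-bit weights and activations, stored here as int8 values in {-1, +1}.
W = sign_activation(rng.standard_normal((128, 256)))
a = sign_activation(rng.standard_normal(256))

# The matrix-vector product of +/-1 entries can reach +/-256, which no longer
# fits in one bit; this is the wide accumulator whose bitwidth a secure
# protocol would otherwise have to expand.
z = W.astype(np.int32) @ a.astype(np.int32)

# The next layer again consumes 1-bit activations.
a_next = sign_activation(z)
print(z.dtype, a_next.dtype, np.unique(a_next))
```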

To address these issues, we propose FSSiBNN, a secure BNN inference framework with free bitwidth conversion based on function secret sharing (FSS). Specifically, leveraging the fact that FSS supports arbitrary input and output bitwidths, we propose a bitwidth-conversion embedding scheme: the conversion is embedded directly into the FSS-based secure activation and max-pooling computations, avoiding the additional computation and communication that bitwidth conversion would otherwise introduce. Moreover, we combine and convert multiple BNN layer functions into fewer matrix multiplication and comparison operations, and precompute multiplication tuples and FSS keys in an offline phase to achieve constant-round online inference.
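As a rough illustration of the offline/online split described above (a minimal two-party sketch of our own, not the paper's protocol), the snippet below uses a trusted-dealer-style multiplication triple generated offline so that the online phase evaluates a secret-shared matrix product in a single round of openings; the subsequent sign/comparison step is where an FSS gate with a freely chosen output bitwidth would be evaluated, which is only indicated in a comment here.

```python
import numpy as np

MOD = 1 << 32                      # additive secret sharing over Z_{2^32}
rng = np.random.default_rng(1)

def share(x):
    # Split x into two additive shares modulo 2^32.
    r = rng.integers(0, MOD, size=x.shape, dtype=np.uint64)
    return r, (x - r) % MOD

def reconstruct(x0, x1):
    return (x0 + x1) % MOD

def offline_matmul_triple(m, k, n):
    # Offline phase: a matrix multiplication triple (A, B, C = A @ B), shared.
    A = rng.integers(0, MOD, size=(m, k), dtype=np.uint64)
    B = rng.integers(0, MOD, size=(k, n), dtype=np.uint64)
    C = (A @ B) % MOD
    return share(A), share(B), share(C)

def online_matmul(X_sh, Y_sh, triple):
    # Online phase, one round: open E = X - A and F = Y - B, then combine locally.
    (A0, A1), (B0, B1), (C0, C1) = triple
    E = reconstruct((X_sh[0] - A0) % MOD, (X_sh[1] - A1) % MOD)
    F = reconstruct((Y_sh[0] - B0) % MOD, (Y_sh[1] - B1) % MOD)
    Z0 = (E @ F + E @ B0 + A0 @ F + C0) % MOD   # party 0's share of X @ Y
    Z1 = (E @ B1 + A1 @ F + C1) % MOD           # party 1's share of X @ Y
    return Z0, Z1

X = rng.integers(0, MOD, size=(4, 8), dtype=np.uint64)
Y = rng.integers(0, MOD, size=(8, 2), dtype=np.uint64)
triple = offline_matmul_triple(4, 8, 2)
Z0, Z1 = online_matmul(share(X), share(Y), triple)
assert np.array_equal(reconstruct(Z0, Z1), (X @ Y) % MOD)
# An FSS comparison gate evaluated on the shares of Z would next produce the
# 1-bit activations directly, embedding the bitwidth conversion as described.
```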

In our experiments, we test on various datasets and models and compare against state-of-the-art work. Compared to the best existing two-party framework, XONN (USENIX Security '19), our work is approximately 7$\times$ faster in inference time and reduces communication overhead by about 577$\times$. Compared with the best existing three-party frameworks, SecureBiNN (ESORICS '22) and FLEXBNN (TIFS '23), our work is approximately 2.5$\times$ faster in inference time and reduces communication overhead by 1.3$\times$ to 16.4$\times$.
