Pei Cao

CryptoDB

Pei Cao

Publications and invited talks

Year

Venue

Title

2024

TCHES

Optimized Hardware-Software Co-Design for Kyber and Dilithium on RISC-V SoC FPGA Abstract

Tengfei Wang Chi Zhang Xiaolin Zhang Dawu Gu Pei Cao

Kyber and Dilithium are both lattice-based post-quantum cryptography (PQC) algorithms that have been selected for standardization by the American National Institute of Standards and Technology (NIST). NIST recommends them as two primary algorithms to be implemented for most use cases. As the applications of RISC-V processors move from specialized scenarios to general scenarios, efficient implementations of PQC algorithms on general-purpose RISC-V platforms are required. In this work, we present an optimized hardware-software co-design for Kyber and Dilithium on the industry’s first RISC-V System-on-Chip (SoC) Field Programmable Gate Array (FPGA) platform. The performance of both algorithms is enhanced through the utilization of hardware acceleration and software optimization, while a certain level of flexibility is still maintained. The polynomial arithmetic operations in Kyber and Dilithium are accelerated by the customized accelerators. We employ a unified high-level architecture to depict their shared characteristics and design dedicated underlying modular multipliers to explore their distinctive features. The hashing functions are optimized using RISC-V assembly instructions, resulting in improved performance and reduced code size without additional hardware resources. For other operations involving matrices and vectors, we present a multi-core acceleration scheme based on the multi-core RISC-V Microprocessor Sub-System (MSS). Combining these acceleration and optimization methods, experimental results show that the overall performance of Kyber and Dilithium across different security levels improves by 3 to 5 times, while the utilized FPGA resources account for less than 5% of the total resources provided by the platform.

2021

TCHES

Pay Attention to Raw Traces: A Deep Learning Architecture for End-to-End Profiling Attacks 📺 Abstract

Xiangjun Lu Chi Zhang Pei Cao Dawu Gu Haining Lu

With the renaissance of deep learning, the side-channel community also notices the potential of this technology, which is highly related to the profiling attacks in the side-channel context. Many papers have recently investigated the abilities of deep learning in profiling traces. Some of them also aim at the countermeasures (e.g., masking) simultaneously. Nevertheless, so far, all of these papers work with an (implicit) assumption that the number of time samples in raw traces can be reduced before the profiling, i.e., the position of points of interest (PoIs) can be manually located. This is arguably the most challenging part of a practical black-box analysis targeting an implementation protected by masking. Therefore, we argue that to fully utilize the potential of deep learning and get rid of any manual intervention, the end-to-end profiling directly mapping raw traces to target intermediate values is demanded.In this paper, we propose a neural network architecture that consists of encoders, attention mechanisms and a classifier, to conduct the end-to-end profiling. The networks built by our architecture could directly classify the traces that contain a large number of time samples (i.e., raw traces without manual feature extraction) while whose underlying implementation is protected by masking. We validate our networks on several public datasets, i.e., DPA contest v4 and ASCAD, where over 100,000 time samples are directly used in profiling. To our best knowledge, we are the first that successfully carry out end-to-end profiling attacks. The results on the datasets indicate that our networks could get rid of the tricky manual feature extraction. Moreover, our networks perform even systematically better (w.r.t. the number of traces in attacks) than those trained on the reduced traces. These validations imply our approach is not only a first but also a concrete step towards end-to-end profiling attacks in the side-channel context.

2021

TCHES

Cross-Device Profiled Side-Channel Attack with Unsupervised Domain Adaptation 📺 Abstract

Pei Cao Chi Zhang Xiangjun Lu Dawu Gu

Deep learning (DL)-based techniques have recently proven to be very successful when applied to profiled side-channel attacks (SCA). In a real-world profiled SCA scenario, attackers gain knowledge about the target device by getting access to a similar device prior to the attack. However, most state-of-the-art literature performs only proof-of-concept attacks, where the traces intended for profiling and attacking are acquired consecutively on the same fully-controlled device. This paper reminds that even a small discrepancy between the profiling and attack traces (regarded as domain discrepancy) can cause a successful single-device attack to completely fail. To address the issue of domain discrepancy, we propose a Cross-Device Profiled Attack (CDPA), which introduces an additional fine-tuning phase after establishing a pretrained model. The fine-tuning phase is designed to adjust the pre-trained network, such that it can learn a hidden representation that is not only discriminative but also domain-invariant. In order to obtain domain-invariance, we adopt a maximum mean discrepancy (MMD) loss as a constraint term of the classic cross-entropy loss function. We show that the MMD loss can be easily calculated and embedded in a standard convolutional neural network. We evaluate our strategy on both publicly available datasets and multiple devices (eight Atmel XMEGA 8-bit microcontrollers and three SAKURA-G evaluation boards). The results demonstrate that CDPA can improve the performance of the classic DL-based SCA by orders of magnitude, which significantly eliminates the impact of domain discrepancy caused by different devices.