Michiel Van Beirendonck

CryptoDB

Michiel Van Beirendonck

Publications and invited talks

Year

Venue

Title

2025

TCHES

OPTIMSM: FPGA hardware accelerator for Zero-Knowledge MSM Abstract

Xander Pottier Thomas de Ruijter Jonas Bertels Wouter Legiest Michiel Van Beirendonck Ingrid Verbauwhede

The Multi-Scalar Multiplication (MSM) is the main barrier to accelerating Zero-Knowledge applications. In recent years, hardware acceleration of this algorithm on both FPGA and GPU has become a popular research topic and the subject of a multi-million dollar prize competition (ZPrize). This work presents OPTIMSM: Optimized Processing Through Iterative Multi-Scalar Multiplication. This novel accelerator focuses on the acceleration of the MSM algorithm for any Elliptic Curve (EC) by improving upon the Pippenger algorithm. A new iteration technique is introduced to decouple the required buckets from the window size, resulting in fewer EC computations for the same on-chip memory resources. Furthermore, we combine known optimizations from the literature for the first time to achieve additional latency improvements. Our enhanced MSM implementation significantly reduces computation time, achieving a speedup of up to x12.77 compared to recent FPGA implementations. Specifically, for the BLS12-381 curve, we reduce the computation time for an MSM of size 224 to 914 ms using a single compute unit on the U55C FPGA or to 231 ms using four U55C devices. These results indicate a substantial improvement in efficiency, paving the way for more scalable and efficient Zero-Knowledge proof systems.

2023

TCHES

BASALISC: Programmable Hardware Accelerator for BGV Fully Homomorphic Encryption Abstract

Robin Geelen Michiel Van Beirendonck Hilder V. L. Pereira Brian Huffman Tynan McAuley Ben Selfridge Daniel Wagner Georgios Dimou Ingrid Verbauwhede Frederik Vercauteren David W. Archer

Fully Homomorphic Encryption (FHE) allows for secure computation on encrypted data. Unfortunately, huge memory size, computational cost and bandwidth requirements limit its practicality. We present BASALISC, an architecture family of hardware accelerators that aims to substantially accelerate FHE computations in the cloud. BASALISC is the first to implement the BGV scheme with fully-packed bootstrapping – the noise removal capability necessary for arbitrary-depth computation. It supports a customized version of bootstrapping that can be instantiated with hardware multipliers optimized for area and power.BASALISC is a three-abstraction-layer RISC architecture, designed for a 1 GHz ASIC implementation and underway toward 150mm2 die tape-out in a 12nm GF process. BASALISC’s four-layer memory hierarchy includes a two-dimensional conflict-free inner memory layer that enables 32 Tb/s radix-256 NTT computations without pipeline stalls. Its conflict-resolution permutation hardware is generalized and re-used to compute BGV automorphisms without throughput penalty. BASALISC also has a custom multiply-accumulate unit to accelerate BGV key switching.The BASALISC toolchain comprises a custom compiler and a joint performance and correctness simulator. To evaluate BASALISC, we study its physical realizability, emulate and formally verify its core functional units, and we study its performance on a set of benchmarks. Simulation results show a speedup of more than 5,000× over HElib – a popular software FHE library.

2022

TCHES

Masked Accelerators and Instruction Set Extensions for Post-Quantum Cryptography Abstract

Tim Fritzmann Michiel Van Beirendonck Debapriya Basu Roy Patrick Karl Thomas Schamberger Ingrid Verbauwhede Georg Sigl

Side-channel attacks can break mathematically secure cryptographic systems leading to a major concern in applied cryptography. While the cryptanalysis and security evaluation of Post-Quantum Cryptography (PQC) have already received an increasing research effort, a cost analysis of efficient side-channel countermeasures is still lacking. In this work, we propose a masked HW/SW codesign of the NIST PQC finalists Kyber and Saber, suitable for their different characteristics. Among others, we present a novel masked ciphertext compression algorithm for non-power-of-two moduli. To accelerate linear performance bottlenecks, we developed a generic Number Theoretic Transform (NTT) multiplier, which, in contrast to previously published accelerators, is also efficient and suitable for schemes not based on NTT. For the critical non-linear operations, masked HW accelerators were developed, allowing a secure execution using RISC-V instruction set extensions. With the proposed design, we achieved a cycle count of K:214k/E:298k/D:313k for Kyber and K:233k/E:312k/D:351k for Saber with NIST Level III parameter sets. For the same parameter sets, the masking overhead for the first-order secure decapsulation operation including randomness generation is a factor of 4.48 for Kyber (D:1403k)and 2.60 for Saber (D:915k).

2022

TCHES

Higher-Order Masked Ciphertext Comparison for Lattice-Based Cryptography Abstract

Jan-Pieter D’Anvers Daniel Heinz Peter Pessl Michiel Van Beirendonck Ingrid Verbauwhede

Checking the equality of two arrays is a crucial building block of the Fujisaki-Okamoto transformation, and as such it is used in several post-quantum key encapsulation mechanisms including Kyber and Saber. While this comparison operation is easy to perform in a black box setting, it is hard to efficiently protect against side-channel attacks. For instance, the hash-based method by Oder et al. is limited to first-order masking, a higher-order method by Bache et al. was shown to be flawed, and a very recent higher-order technique by Bos et al. suffers in runtime. In this paper, we first demonstrate that the hash-based approach, and likely many similar first-order techniques, succumb to a relatively simple side-channel collision attack. We can successfully recover a Kyber512 key using just 6000 traces. While this does not break the security claims, it does show the need for efficient higher-order methods. We then present a new higher-order masked comparison algorithm based on the (insecure) higher-order method of Bache et al. Our new method is 4.2x, resp. 7.5x, faster than the method of Bos et al. for a 2nd, resp. 3rd, -order masking on the ARM Cortex-M4, and unlike the method of Bache et al., the new technique takes ciphertext compression into account. We prove correctness, security, and masking security in detail and provide performance numbers for 2nd and 3rd-order implementations. Finally, we verify our the side-channel security of our implementation using the test vector leakage assessment (TVLA) methodology.

2021

TCHES

Analysis and Comparison of Table-based Arithmetic to Boolean Masking 📺 Abstract

Michiel Van Beirendonck Jan-Pieter D’Anvers Ingrid Verbauwhede

Masking is a popular technique to protect cryptographic implementations against side-channel attacks and comes in several variants including Boolean and arithmetic masking. Some masked implementations require conversion between these two variants, which is increasingly the case for masking of post-quantum encryption and signature schemes. One way to perform Arithmetic to Boolean (A2B) mask conversion is a table-based approach first introduced by Coron and Tchulkine, and later corrected and adapted by Debraize in CHES 2012. In this work, we show both analytically and experimentally that the table-based A2B conversion algorithm proposed by Debraize does not achieve the claimed resistance against differential power analysis due to a non-uniform masking of an intermediate variable. This non-uniformity is hard to find analytically but leads to clear leakage in experimental validation. To address the non-uniform masking issue, we propose two new A2B conversions: one that maintains efficiency at the cost of additional memory and one that trades efficiency for a reduced memory footprint. We give analytical and experimental evidence for their security, and will make their implementations, which are shown to be free from side-channel leakage in 100.000 power traces collected on the ARM Cortex-M4, available online. We conclude that when designing side-channel protection mechanisms, it is of paramount importance to perform both a theoretical analysis and an experimental validation of the method.

2021

TCHES

Attacking and Defending Masked Polynomial Comparison for Lattice-Based Cryptography 📺 Abstract

Shivam Bhasin Jan-Pieter D’Anvers Daniel Heinz Thomas Pöppelmann Michiel Van Beirendonck

In this work, we are concerned with the hardening of post-quantum key encapsulation mechanisms (KEM) against side-channel attacks, with a focus on the comparison operation required for the Fujisaki-Okamoto (FO) transform. We identify critical vulnerabilities in two proposals for masked comparison and successfully attack the masked comparison algorithms from TCHES 2018 and TCHES 2020. To do so, we use first-order side-channel attacks and show that the advertised security properties do not hold. Additionally, we break the higher-order secured masked comparison from TCHES 2020 using a collision attack, which does not require side-channel information. To enable implementers to spot such flaws in the implementation or underlying algorithms, we propose a framework that is designed to test the re-encryption step of the FO transform for information leakage. Our framework relies on a specifically parametrized t-test and would have identified the previously mentioned flaws in the masked comparison. Our framework can be used to test both the comparison itself and the full decapsulation implementation.