Rishub Nagpal

CryptoDB

Rishub Nagpal

Publications and invited talks

Year

Venue

Title

2025

CIC

On Loopy Belief Propagation for SASCAs Abstract

Rishub Nagpal Gaëtan Cassiers Robert Primas Christian Knoll Franz Pernkopf Stefan Mangard

Profiled power analysis is one of the most powerful forms of passive side-channel attacks. Over the last two decades, many works have analyzed their impact on cryptographic implementations as well as corresponding countermeasure techniques. To date, the most advanced variants of profiled power analysis are based on Soft-analytical Side-Channel Attacks (SASCA). After the initial profiling phase, a SASCA adversary creates a probabilistic graphical model, called a factor graph, of the target implementation and encodes the results of the previous step as prior information. Then, an inference algorithm such as loopy Belief Propagation (BP) can be used to recover the distribution of a target variable in the graph, i.e., sensitive data/keys. Designers of cryptographic implementations aim to reduce information leakage as much as possible and assess how much leakage can be allowed without compromising security requirements. Despite the existence of many works on profiled power analysis, it is still notoriously difficult to state under which conditions a cryptographic implementation provides sufficient protection against a profiling attacker with certain capabilities. In particular, it is unknown when a BP-based attack is optimal or whether tuning some heuristics in that algorithm may significant strengthen the attack. This knowledge gap led us to investigate the effectiveness of BP for SASCAs by studying the modes of failures of BP in the context of the SASCA, and systematically analyzing the behavior of BP on practically-relevant factor graphs. We use exact inference to gauge the quality of the approximation provided by BP. Through this assessment, we show that there exists a significant disparity between BP and exact inference in terms of guessing entropy when performing SASCAs on several classes of factor graphs. We further review and analyze various BP improvement heuristics from the literature.

2024

TCHES

Compress: Generate Small and Fast Masked Pipelined Circuits Abstract

Gaëtan Cassiers Barbara Gigerl Stefan Mangard Charles Momin Rishub Nagpal

Masking is an effective countermeasure against side-channel attacks. It replaces every logic gate in a computation by a gadget that performs the operation over secret sharings of the circuit’s variables. When masking is implemented in hardware, care should be taken to protect against leakage from glitches, which could otherwise undermine the security of masking. This is generally done by adding registers, which stop the propagation of glitches, but introduce additional latency and area cost. In masked pipeline circuits, a high latency further increases the area overheads of masking, due to the need for additional registers that synchronize signals between pipeline stages. In this work, we propose a technique to minimize the number of such pipeline registers, which relies on optimizing the scheduling of the computations across the pipeline stages. We release an implementation of this technique as an open-source tool, Compress. Further, we introduce other optimizations to deduplicate logic between gadgets, perform an optimal selection of masked gadgets, and introduce new gadgets with smaller area. Overall, our optimizations lead to circuits that improve the state-of-the art in area and achieve state-of-the-art latency. For example, a masked AES based on an S-box generated by Compress reduces latency by 19% and area by 27% over a state-of-the-art implementation, or, for the same latency, reduces area by 45%.

2022

TCHES

Riding the Waves Towards Generic Single-Cycle Masking in Hardware Abstract

Rishub Nagpal Barbara Gigerl Robert Primas Stefan Mangard

Research on the design of masked cryptographic hardware circuits in the past has mostly focused on reducing area and randomness requirements. However, many embedded devices like smart cards and IoT nodes also need to meet certain performance criteria, which is why the latency of masked hardware circuits also represents an important metric for many practical applications.The root cause of latency in masked hardware circuits is the need for additional register stages that synchronize the propagation of shares. Otherwise, glitches would violate the basic assumptions of the used masking scheme. This issue can be addressed to some extent, e.g., by using lightweight cryptographic algorithms with low-degree Sboxes, however, many applications still require the usage of schemes with higher-degree S-boxes like AES. Several recent works have already proposed solutions that help reduce this latency yet they either come with noticeably increased area/randomness requirements, limitations on masking orders, or specific assumptions on the general architecture of the crypto core.In this work, we introduce a generic and efficient method for designing single-cycle glitch-resistant (higher-order) masked hardware of cryptographic S-boxes. We refer to this technique as (generic) Self-Synchronized Masking (“SESYM”). The main idea of our approach is to replace register stages with a partial dual-rail encoding of masked signals that ensures synchronization within the circuit. More concretely, we show that WDDL gates and Muller C-elements can be used in combination with standard masking schemes to design single-cycle S-box circuits that, especially in case of higher-degree S-boxes, have noticeably lower requirements in terms of area and online randomness. We apply our method to DOM-based S-boxes of Ascon and AES and compare the resulting circuits to existing latency optimized circuits based on TI, GLM, and LMDPL. The latency of all three designs is reduced to single-cycle operation and are dth-order secure. Compared to GLM-masked Ascon, our approach comes with a 6.4 times reduction in online randomness for all protection orders. Compared to 1st-order LMDPL-masked AES, our approach achieves comparable results, while it is more generic, amongst others, by also supporting higher-order designs. We also underline the practical protection of our constructions against power analysis attacks via empirical and formal verification approaches.