Thorben Moos

CryptoDB

Thorben Moos

ORCID: 0000-0003-3809-9803

Publications and invited talks

Year

Venue

Title

2025

TCHES

mid-pSquare: Leveraging the Strong Side-Channel Security of Prime-Field Masking in Software Abstract

Brieuc Balon Lorenzo Grassi Pierrick Méaux Thorben Moos François-Xavier Standaert Matthias Johann Steiner

Efficiently protecting embedded software implementations of standard symmetric cryptographic primitives against side-channel attacks has been shown to be a considerable challenge in practice. This is, in part, due to the most natural countermeasure for such ciphers, namely Boolean masking, not amplifying security well in the absence of sufficient physical noise in the measurements. So-called primefield masking has been demonstrated to provide improved theoretical guarantees in this context, and the Feistel for Prime Masking (FPM) family of Tweakable Block Ciphers (TBCs) has been recently introduced by Grassi et al. (Eurocrypt’24) to efficiently leverage these advantages. However, it was so far only instantiated for and empirically evaluated in a hardware implementation context, by using a small (7-bit) prime modulus.In this paper, we build on the theoretical incentive to increase the prime field size to obtain improved side-channel (Faust et al., Eurocrypt’24) and fault (Moos et al., CHES’24) resistance, as well as on the practical incentive to instantiate an FPM instance with optimized performance on 32-bit software platforms. We introduce mid-pSquare for this purpose, a lightweight TBC operating over a 31-bit Mersenne prime field. We first provide an in-depth black-box security analysis with a particular focus on algebraic attacks – which, contrary to the cryptanalysis of instances over smaller primes, are more powerful than statistical ones in our setting. We also design a strong tweak schedule to account for potential related-tweak algebraic attacks which, so far, are almost unknown in the literature. We then demonstrate that mid-pSquare implementations deliver very competitive performance results on the target platform compared to analogous binary TBCs regardless of masked or unmasked implementation (we use fix-sliced SKINNY for our comparisons). Finally, we experimentally establish the side-channel security improvements that masked mid-pSquare can lead to, reaching unmatched resistance to profiled horizontal attacks on lightweight 32-bit processors (ARM Cortex-M4).

2024

EUROCRYPT

Generalized Feistel Ciphers for Efficient Prime Field Masking Abstract

Lorenzo Grassi Loïc Masure Pierrick Méaux Thorben Moos François-Xavier Standaert

A recent work from Eurocrypt 2023 suggests that prime-field masking has excellent potential to improve the efficiency vs. security tradeoff of masked implementations against side-channel attacks, especially in contexts where physical leakages show low noise. We pick up on the main open challenge that this seed result leads to, namely the design of an optimized prime cipher able to take advantage of this potential. Given the interest of tweakable block ciphers with cheap inverses in many leakage-resistant designs, we start by describing the FPM (Feistel for Prime Masking) family of tweakable block ciphers based on a generalized Feistel structure. We then propose a first instantiation of FPM, which we denote as small-pSquare. It builds on the recent observation that the square operation (which is non-linear in Fp) can lead to masked gadgets that are more efficient than those for multiplication, and is tailored for efficient masked implementations in hardware. We analyze the mathematical security of the FPM family of ciphers and the small-pSquare instance, trying to isolate the parts of our study that can be re-used for other instances. We additionally evaluate the implementation features of small-pSquare by comparing the efficiency vs. security tradeoff of masked FPGA circuits against those of a state-of-the art binary cipher, namely SKINNY, confirming significant gains in relevant contexts.

2024

CIC

Randomness Generation for Secure Hardware Masking – Unrolled Trivium to the Rescue Abstract

Gaëtan Cassiers Loïc Masure Charles Momin Thorben Moos Amir Moradi François-Xavier Standaert

<p>Masking is a prominent strategy to protect cryptographic implementations against side-channel analysis. Its popularity arises from the exponential security gains that can be achieved for (approximately) quadratic resource utilization. Many variants of the countermeasure tailored for different optimization goals have been proposed. The common denominator among all of them is the implicit demand for robust and high entropy randomness. Simply assuming that uniformly distributed random bits are available, without taking the cost of their generation into account, leads to a poor understanding of the efficiency vs. security tradeoff of masked implementations. This is especially relevant in case of hardware masking schemes which are known to consume large amounts of random bits per cycle due to parallelism. Currently, there seems to be no consensus on how to most efficiently derive many pseudo-random bits per clock cycle from an initial seed and with properties suitable for masked hardware implementations. In this work, we evaluate a number of building blocks for this purpose and find that hardware-oriented stream ciphers like Trivium and its reduced-security variant Bivium B outperform most competitors when implemented in an unrolled fashion. Unrolled implementations of these primitives enable the flexible generation of many bits per cycle, which is crucial for satisfying the large randomness demands of state-of-the-art masking schemes. According to our analysis, only Linear Feedback Shift Registers (LFSRs), when also unrolled, are capable of producing long non-repetitive sequences of random-looking bits at a higher rate per cycle for the same or lower cost as Trivium and Bivium B. Yet, these instances do not provide black-box security as they generate only linear outputs. We experimentally demonstrate that using multiple output bits from an LFSR in the same masked implementation can violate probing security and even lead to harmful randomness cancellations. Circumventing these problems, and enabling an independent analysis of randomness generation and masking, requires the use of cryptographically stronger primitives like stream ciphers. As a result of our studies, we provide an evidence-based estimate for the cost of securely generating $n$ fresh random bits per cycle. Depending on the desired level of black-box security and operating frequency, this cost can be as low as $20n$ to $30n$ ASIC gate equivalents (GE) or $3n$ to $4n$ FPGA look-up tables (LUTs), where $n$ is the number of random bits required. Our results demonstrate that the cost per bit is (sometimes significantly) lower than estimated in previous works, incentivizing parallelism whenever exploitable. This provides further motivation to potentially move low randomness usage from a primary to a secondary design goal in hardware masking research. </p>

2024

TCHES

Static Leakage in Dual-Rail Precharge Logics Abstract

Bijan Fadaeinia Thorben Moos Amir Moradi

In recent research studies, an observable dependency has been found between the static power consumption of a Complementary Metal-Oxide-Semiconductor (CMOS) chip and its internally stored and processed data. For the most part, these studies have focused on utilizing the leakage currents as a side channel to conduct key-recovery attacks on cryptographic devices. There are two main reasons why information leakage through the static power side channel is considered particularly harmful for the security of implementations, namely 1) the low influence of noise due to averaging over time and 2) the ability to target secrets even outside of the time window that they are actively computed upon (data is leaked for as long as it is saved anywhere in the circuit). Hence, developing effective countermeasures against this threat is of significant importance for the security of cryptographic hardware. Hiding techniques known as Dual-Rail Precharge (DRP) logic have been proposed and studied in literature as an instrument to equalize a circuit’s dynamic power consumption irrespective of the processed data. The specific instance called improved Masked Dual-Rail Precharge Logic (iMDPL) is – despite its high overhead – known as one of the most potent and attractive DRP-based Side-Channel Analysis (SCA) countermeasures. While its ability to prevent data extraction through the dynamic power consumption is well studied and documented, we thoroughly analyze its susceptibility to Static Power Side-Channel Analysis (SPSCA) attacks in this work. To conduct our study we have taped-out a custom Application-Specific Integrated Circuit (ASIC) prototype in 65nm CMOS technology which contains multiple cryptographic co-processors protected by iMDPL, partially combined with other countermeasures. Additionally, it contains circuits protected by a new variant of iMDPL that we specifically hardened against SPSCA, which we call Static Robust iMDPL (SRiMDPL). Our careful experiments performed in a controlled environment under exploitation of voltage and temperature dependencies show that SRiMDPL circuits combined with modern hardware masking offer an extremely high level of security against both dynamic and static power SCA attacks. While the cost of such combinations is admittedly significant (≈ 108 kGE post-layout area for a corresponding PRESENT core), we obtain the strongest combined resistance to both power side channels that has been experimentally demonstrated on real silicon so far. In summary, we believe that our analysis can assist hardware designers in making important decisions on the trade-offs between cost and security that such countermeasures facilitate.

2024

TCHES

Prime Masking vs. Faults - Exponential Security Amplification against Selected Classes of Attacks Abstract

Thorben Moos Sayandeep Saha François-Xavier Standaert

Fault injection attacks are a serious concern for cryptographic hardware. Adversaries may extract sensitive information from the faulty output that is produced by a cryptographic circuit after actively disturbing its computation. Alternatively, the information whether an output would have been faulty, even if it is withheld from being released, may be exploited. The former class of attacks, which requires the collection of faulty outputs, such as Differential Fault Analysis (DFA), then either exploits some knowledge about the position of the injected fault or about its value. The latter class of attacks, which can be applied without ever obtaining faulty outputs, such as Statistical Ineffective Fault Attacks (SIFA), then either exploits a dependency between the effectiveness of the fault injection and the value to be faulted (e.g., an LSB stuck-at-0 only affecting odd numbers), denoted as SIFA-1, or a conditional propagation of a faulted value based on a sensitive intermediate (e.g., multiplication of a faulted value by 0 prevents propagation), denoted as SIFA-2. The aptitude of additive masking schemes, which were designed to prevent side-channel analysis, to also thwart fault attacks is typically assumed to be limited. Common fault models, such as toggle/bit-flip, stuck-at-0 or stuck-at-1 survive the recombination of Boolean shares well enough for generic attacks to succeed. More precisely, injecting a fault into one or multiple Boolean shares often results in the same, or at least a predictable, error appearing in the sensitive variable after recombination. In this work, we show that additive masking in prime-order fields breaks such relationships, causing frequently exploited biases to decrease exponentially in the number of shares. As a result, prime masking offers surprisingly strong protection against generic statistical attacks, which require a dependency between the effectiveness of an injected fault and the secret variable that is manipulated, such as SIFA-1. Operation-dependent statistical attacks, such as SIFA-2 and Fault Template Attacks (FTA), may still be performed against certain prime-field structures, even if they are masked with many shares. Yet, we analyze the corresponding cases and are able to provide specific guidelines on how to avoid vulnerabilities either at the cipher design or implementation level by making informed decisions about the primes, non-linear mappings and masked gadgets used. Since prime-field masking appears to be one of the rare instances of affordable countermeasures that naturally provide sound protection against side-channel analysis and certain fault injection attacks, we believe there is a strong incentive for developing new ciphers to leverage these advantages.

2023

EUROCRYPT

Effective and Efficient Masking with Low Noise using Small-Mersenne-Prime Ciphers Abstract

Loïc Masure Pierrick Méaux Thorben Moos François-Xavier Standaert

Embedded devices used in security applications are natural targets for physical attacks. Thus, enhancing their side-channel resistance is an important research challenge. A standard solution for this purpose is the use of Boolean masking schemes, as they are well adapted to current block ciphers with efficient bitslice representations. Boolean masking guarantees that the security of an implementation grows exponentially in the number of shares under the assumption that leakages are sufficiently noisy (and independent). Unfortunately, it has been shown that this noise assumption is hardly met on low-end devices. In this paper, we therefore investigate techniques to mask cryptographic algorithms in such a way that their resistance can survive an almost complete lack of noise. Building on seed theoretical results of Dziembowski et al., we put forward that arithmetic encodings in prime fields can reach this goal. We first exhibit the gains that such encodings lead to thanks to a simulated information theoretic analysis of their leakage (with up to six shares). We then provide figures showing that on platforms where optimized arithmetic adders and multipliers are readily available (i.e., most MCUs and FPGAs), performing masked operations in small to medium Mersenne-prime fields as opposed to binary extension fields will not lead to notable implementation overheads. We compile these observations into a new AES-like block cipher, called AES-prime, which is well-suited to illustrate the remarkable advantages of masking in prime fields. We also confirm the practical relevance of our findings by evaluating concrete software (ARM Cortex-M3) and hardware (Xilinx Spartan-6) implementations. Our experimental results show that security gains over Boolean masking (and, more generally, binary encodings) can reach orders of magnitude despite the same amount of information being leaked per share.

2023

TCHES

Prime-Field Masking in Hardware and its Soundness against Low-Noise SCA Attacks Abstract

Gaëtan Cassiers Loïc Masure Charles Momin Thorben Moos François-Xavier Standaert

A recent study suggests that arithmetic masking in prime fields leads to stronger security guarantees against passive physical adversaries than Boolean masking. Indeed, it is a common observation that the desired security amplification of Boolean masking collapses when the noise level in the measurements is too low. Arithmetic encodings in prime fields can help to maintain an exponential increase of the attack complexity in the number of shares even in such a challenging context. In this work, we contribute to this emerging topic in two main directions. First, we propose novel masked hardware gadgets for secure squaring in prime fields (since squaring is non-linear in non-binary fields) which prove to be significantly more resource-friendly than corresponding masked multiplications. We then formally show their local and compositional security for arbitrary orders. Second, we attempt to >experimentally evaluate the performance vs. security tradeoff of prime-field masking. In order to enable a first comparative case study in this regard, we exemplarily consider masked implementations of the AES as well as the recently proposed AESprime. AES-prime is a block cipher partially resembling the standard AES, but based on arithmetic operations modulo a small Mersenne prime. We present cost and performance figures for masked AES and AES-prime implementations, and experimentally evaluate their susceptibility to low-noise side-channel attacks. We consider both the dynamic and the static power consumption for our low-noise analyses and emulate strong adversaries. Static power attacks are indeed known as a threat for side-channel countermeasures that require a certain noise level to be effective because of the adversary’s ability to reduce the noise through intra-trace averaging. Our results show consistently that for the noise levels in our practical experiments, the masked prime-field implementations provide much higher security for the same number of shares. This compensates for the overheads prime computations lead to and remains true even if / despite leaking each share with a similar Signal-to-Noise Ratio (SNR) as their binary equivalents. We hope our results open the way towards new cipher designs tailored to best exploit the advantages of prime-field masking.

2022

TCHES

Beware of Insufficient Redundancy: An Experimental Evaluation of Code-based FI Countermeasures Abstract

Timo Bartkewitz Sven Bettendorf Thorben Moos Amir Moradi Falk Schellenberg

Fault injection attacks pose a serious threat to cryptographic implementations. Countermeasures beyond sensors and shields usually deploy some form of redundancy to detect or even correct errors. A few years ago, a novel design methodology called Impeccable Circuits has been introduced on how to correctly integrate Concurrent Error Detection (CED) schemes, based on Error-Detection Codes (EDCs), into cryptographic hardware circuits. The underlying adversary model limits attackers to inject at most t single-bit faults. By additionally considering the propagation of faults in combinational circuits, the countermeasure guarantees detection of any faulty computation caused by up to t single-bit faults.In this work, we present an experimental analysis of the Impeccable Circuits countermeasure and its underlying assumptions in modern semiconductor technology. More precisely, we have taken hardware implementations of the lightweight block cipher SKINNY equipped with various forms of the EDC-based CED schemes and realized them as cryptographic co-processors on a 40nm ASIC to experimentally evaluate their resistance to Laser Fault Injection (LFI) attacks. In short, our results show that it is fairly simple to overcome the protection offered by the integrated countermeasures when the length of the code n is smaller than twice its rank k (i.e., no full redundancy). This is not caused by any flaw in the underlying design methodology or concept, but merely demonstrates how easily the defined adversary model can be overcome. In our case, a standard black-box scan over the target using a common single-shot LFI setup is sufficient to occasionally inject more single-bit faults than those bounded by the underlying adversary model when n < 2k. The probability of such events proved to be large enough to perform successful key-recovery attacks via Differential Fault Analysis (DFA) in a matter of hours. Thus, we caution against limiting the redundancy in code-based FI countermeasures to less than the number of bits per word, especially in nanometer technologies, and point out that less-complex countermeasures like duplication showed a higher level of resistance in our experiments at a lower cost.

2021

TCHES

Let’s Take it Offline: Boosting Brute-Force Attacks on iPhone’s User Authentication through SCA 📺 Abstract

Oleksiy Lisovets David Knichel Thorben Moos Amir Moradi

In recent years, smartphones have become an increasingly important storage facility for personal sensitive data ranging from photos and credentials up to financial and medical records like credit cards and person’s diseases. Trivially, it is critical to secure this information and only provide access to the genuine and authenticated user. Smartphone vendors have already taken exceptional care to protect user data by the means of various software and hardware security features like code signing, authenticated boot chain, dedicated co-processor and integrated cryptographic engines with hardware fused keys. Despite these obstacles, adversaries have successfully broken through various software protections in the past, leaving only the hardware as the last standing barrier between the attacker and user data. In this work, we build upon existing software vulnerabilities and break through the final barrier by performing the first publicly reported physical Side-Channel Analysis (SCA) attack on an iPhone in order to extract the hardware-fused devicespecific User Identifier (UID) key. This key – once at hand – allows the adversary to perform an offline brute-force attack on the user passcode employing an optimized and scalable implementation of the Key Derivation Function (KDF) on a Graphics Processing Unit (GPU) cluster. Once the passcode is revealed, the adversary has full access to all user data stored on the device and possibly in the cloud.As the software exploit enables acquisition and processing of hundreds of millions oftraces, this work further shows that an attacker being able to query arbitrary many chosen-data encryption/decryption requests is a realistic model, even for compact systems with advanced software protections, and emphasizes the need for assessing resilience against SCA for a very high number of traces.

2021

TCHES

DL-LA: Deep Learning Leakage Assessment: A modern roadmap for SCA evaluations 📺 Abstract

Thorben Moos Felix Wegener Amir Moradi

In recent years, deep learning has become an attractive ingredient to side-channel analysis (SCA) due to its potential to improve the success probability or enhance the performance of certain frequently executed tasks. One task that is commonly assisted by machine learning techniques is the profiling of a device’s leakage behavior in order to carry out a template attack. At CHES 2019, deep learning has also been applied to non-profiled scenarios for the first time, extending its reach within SCA beyond template attacks. The proposed method, called DDLA, has some tempting advantages over traditional SCA due to merits inherited from (convolutional) neural networks. Most notably, it greatly reduces the need for pre-processing steps< when the SCA traces are misaligned or when the leakage is of a multivariate nature. However, similar to traditional attack scenarios the success of this approach highly depends on the correct choice of a leakage model and the intermediate value to target. In this work we explore, for the first time in literature, whether deep learning can similarly be used as an instrument to advance another crucial (non-profiled) discipline of SCA which is inherently independent of leakage models and targeted intermediates, namely leakage assessment. In fact, given the simple classification-based nature of common leakage assessment techniques, in particular distinguishing two groups fixed-vs-random or fixed-vs-fixed, it comes as a surprise that machine learning has not been brought into this context, yet. Our contribution is the development of the first full leakage assessment methodology based on deep learning. It gives the evaluator the freedom to not worry about location, alignment and statistical order of the leakages and easily covers multivariate and horizontal patterns as well. We test our approach against a number of case studies based on FPGA, ASIC and μC implementations of the PRESENT block cipher, equipped with state-of-the-art SCA countermeasures. Our results clearly show that the proposed methodology and network structures are robust across all case studies and outperform the classical detection approaches (t-test and X2-test) in all considered scenarios.

2021

TCHES

Countermeasures against Static Power Attacks: – Comparing Exhaustive Logic Balancing and Other Protection Schemes in 28 nm CMOS – 📺 Abstract

Thorben Moos Amir Moradi

In recent years it has been demonstrated convincingly that the standby power of a CMOS chip reveals information about the internally stored and processed data. Thus, for adversaries who seek to extract secrets from cryptographic devices via side-channel analysis, the static power has become an attractive quantity to obtain. Most works have focused on the destructive side of this subject by demonstrating attacks. In this work, we examine potential solutions to protect circuits from silently leaking sensitive information during idle times. We focus on countermeasures that can be implemented using any common digital standard cell library and do not consider solutions that require full-custom or analog design flow. In particular, we evaluate and compare a set of five distinct standard-cell-based hiding countermeasures, including both, randomization and equalization techniques. We then combine the hiding countermeasures with state-of-the-art hardware masking in order to amplify the noise level and achieve a high resistance against attacks. An important part of our contribution is the proposal and evaluation of the first ever standard-cell-based balancing scheme which achieves perfect data-independence on paper, i.e., in absence of intra-die process variations and aging effects. We call our new countermeasure Exhaustive Logic Balancing (ELB). While this scheme, applied to a threshold implementation, provides the highest level of resistance in our experiments, it may not be the most cost effective option due to the significant resource overhead associated. All evaluated countermeasures and combinations thereof are applied to a serialized hardware implementation of the PRESENT block cipher and realized as cryptographic co-processors on a 28nm CMOS ASIC prototype. Our experimental results are obtained through real-silicon measurements of a fabricated die of the ASIC in a temperature-controlled environment using a source measure unit (SMU). We believe that our elaborate comparison serves as a useful guideline for hardware designers to find a proper tradeoff between security and cost for almost any application.

2021

TCHES

The SPEEDY Family of Block Ciphers: Engineering an Ultra Low-Latency Cipher from Gate Level for Secure Processor Architectures 📺 Abstract

Gregor Leander Thorben Moos Amir Moradi Shahram Rasoolzadeh

We introduce SPEEDY, a family of ultra low-latency block ciphers. We mix engineering expertise into each step of the cipher’s design process in order to create a secure encryption primitive with an extremely low latency in CMOS hardware. The centerpiece of our constructions is a high-speed 6-bit substitution box whose coordinate functions are realized as two-level NAND trees. In contrast to other low-latency block ciphers such as PRINCE, PRINCEv2, MANTIS and QARMA, we neither constrain ourselves by demanding decryption at low overhead, nor by requiring a super low area or energy. This freedom together with our gate- and transistor-level considerations allows us to create an ultra low-latency cipher which outperforms all known solutions in single-cycle encryption speed. Our main result, SPEEDY-6-192, is a 6-round 192-bit block and 192-bit key cipher which can be executed faster in hardware than any other known encryption primitive (including Gimli in Even-Mansour scheme and the Orthros pseudorandom function) and offers 128-bit security. One round more, i.e., SPEEDY-7-192, provides full 192-bit security. SPEEDY primarily targets hardware security solutions embedded in high-end CPUs, where area and energy restrictions are secondary while high performance is the number one priority.

2020

TCHES

Unrolled Cryptography on Silicon: A Physical Security Analysis 📺 Abstract

Thorben Moos

Cryptographic primitives with low-latency performance have gained momentum lately due to an increased demand for real-time applications. Block ciphers such as PRINCE enable data encryption (resp. decryption) within a single clock cycle at a moderately high operating frequency when implemented in a fully-unrolled fashion. Unsurprisingly, many typical environments for unrolled ciphers require protection against physical adversaries as well. Yet, recent works suggest that most common SCA countermeasures are hard to apply to low-latency circuits. Hardware masking, for example, requires register stages to offer resistance, thus adding delay and defeating the purpose of unrolling. On another note, it has been indicated that unrolled primitives without any additional means of protection offer an intrinsic resistance to SCA attacks due to their parallelism, asynchronicity and speed of execution. In this work, we take a closer look at the physical security properties provided by unrolled cryptographic IC implementations. We are able to confirm that the nature of unrolling indeed bears the potential to decrease the susceptibility of cipher implementations significantly when reset methods are applied. With respect to certain adversarial models, e.g., ciphertext-only access, an amazingly high level of protection can be achieved. While this seems to be a great result for cryptographic hardware engineers, there is an attack vector hidden in plain sight which still threatens the security of unrolled implementations remarkably – namely the static power consumption of CMOS-based circuits. We point out that essentially all reasons which make it hard to extract meaningful information from the dynamic behavior of unrolled primitives are not an issue when exploiting the static currents for key recovery. Our evaluation is based on real-silicon measurements of an unrolled PRINCE core in a custom 40nm ASIC. The presented results serve as a neat educational case study to demonstrate the broad differences between dynamic and static power information leakage in the light of technological advancement.

2019

TCHES

Glitch-Resistant Masking Revisited 📺 ★ Abstract

Thorben Moos Amir Moradi Tobias Schneider François-Xavier Standaert

Implementing the masking countermeasure in hardware is a delicate task. Various solutions have been proposed for this purpose over the last years: we focus on Threshold Implementations (TIs), Domain-Oriented Masking (DOM), the Unified Masking Approach (UMA) and Generic Low Latency Masking (GLM). The latter generally come with innovative ideas to cope with physical defaults such as glitches. Yet, and in contrast to the situation in software-oriented masking, these schemes have not been formally proven at arbitrary security orders and their composability properties were left unclear. So far, only a 2-cycle implementation of the seminal masking scheme by Ishai, Sahai and Wagner has been shown secure and composable in the robust probing model – a variation of the probing model aimed to capture physical defaults such as glitches – for any number of shares.In this paper, we argue that this lack of proofs for TIs, DOM, UMA and GLM makes the interpretation of their security guarantees difficult as the number of shares increases. For this purpose, we first put forward that the higher-order variants of all these schemes are affected by (local or composability) security flaws in the (robust) probing model, due to insufficient refreshing. We then show that composability and robustness against glitches cannot be analyzed independently. We finally detail how these abstract flaws translate into concrete (experimental) attacks, and discuss the additional constraints robust probing security implies on the need of registers. Despite not systematically leading to improved complexities at low security orders, e.g., with respect to the required number of measurements for a successful attack, we argue that these weaknesses provide a case for the need of security proofs in the robust probing model (or a similar abstraction) at higher security orders.

2019

TCHES

Static Power SCA of Sub-100 nm CMOS ASICs and the Insecurity of Masking Schemes in Low-Noise Environments 📺 Abstract

Thorben Moos

Semiconductor technology scaling faced tough engineering challenges while moving towards and beyond the deep sub-micron range. One of the most demanding issues, limiting the shrinkage process until the present day, is the difficulty to control the leakage currents in nanometer-scaled field-effect transistors. Previous articles have shown that this source of energy dissipation, at least in case of digital CMOS logic, can successfully be exploited as a side-channel to recover the secrets of cryptographic implementations. In this work, we present the first fair technology comparison with respect to static power side-channel measurements on real silicon and demonstrate that the effect of down-scaling on the potency of this security threat is huge. To this end, we designed two ASICs in sub-100nm CMOS nodes (90 nm, 65 nm) and got them fabricated by one of the leading foundries. Our experiments, which we performed at different operating conditions, show consistently that the ASIC technology with the smaller minimum feature size (65 nm) indeed exhibits substantially more informative leakages (factor of ~10) than the 90nm one, even though all targeted instances have been derived from identical RTL code. However, the contribution of this work extends well beyond a mere technology comparison. With respect to the real-world impact of static power attacks, we present the first realistic scenarios that allow to perform a static power side-channel analysis (including noise reduction) without requiring control over the clock signal of the target. Furthermore, as a follow-up to some proof-of-concept work indicating the vulnerability of masking schemes to static powerattacks, we perform a detailed study on how the reduction of the noise level in static leakage measurements affects the security provided by masked implementations. As a result of this study, we do not only find out that the threat for masking schemes is indeed real, but also that common leakage assessment techniques, such as the Welch’s t-test, together with essentially any moment-based analysis of the leakage traces, is simply not sufficient in low-noise contexts. In fact, we are able to show that either a conversion (resp. compression) of the leakage order or the recently proposed X2 test need to be considered in assessment and attack to avoid false negatives.

2019

TCHES

Exploring the Effect of Device Aging on Static Power Analysis Attacks 📺 Abstract

Naghmeh Karimi Thorben Moos Amir Moradi

Vulnerability of cryptographic devices to side-channel analysis attacks, and in particular power analysis attacks has been extensively studied in the recent years. Among them, static power analysis attacks have become relevant with moving towards smaller technology nodes for which the static power is comparable to the dynamic power of a chip, or even dominant in future technology generations. The magnitude of the static power of a chip depends on the physical characteristics of transistors (e.g., the dimensions) as well as operating conditions (e.g., the temperature) and the electrical specifications such as the threshold voltage. In fact, the electrical specifications of transistors deviate from their originally intended ones during device lifetime due to aging mechanisms. Although device aging has been extensively investigated from reliability point of view, the impact of aging on the security of devices, and in particular on the vulnerability of devices to power analysis attacks are yet to be considered.This paper fills the gap and investigates how device aging can affect the susceptibility of a chip exposed to static power analysis attacks. To this end, we conduct both, simulation and practical experiments on real silicon. The experimental results are extracted from a realization of the PRESENT cipher fabricated using a 65nm commercial standard cell library. The results show that the amount of exploitable leakage through the static power consumption as a side channel is reduced when the device is aged. This can be considered as a positive development which can (even slightly) harden such static power analysis attacks. Additionally, this result is of great interest to static power side-channel adversaries since state-of-the-art leakage current measurements are conducted over long time periods under increased working temperatures and supply voltages to amplify the exploitable information, which certainly fuels aging-related device degradation.