## CryptoDB

### Papers from Transactions on Cryptographic Hardware and Embedded Systems 2019

Year
Venue
Title
2019
TCHES
Security on Plastics: Fake or Real?
Electronic devices on plastic foil, also referred to as flexible electronics, are making their way into mainstream applications. In the near future, flexible electronic labels can be embedded in smart blisters, but also used as mainstream technology for flexible medical patches. A key technology for flexible electronics is based on thin-film transistors, which have the potential to be manufactured at low cost, making them an ideal candidate for these applications. Yet, up to now, no-one is taking digital security into account in the design of flexible electronics.In this paper, we present, to our knowledge, the first cryptographic core on plastic foil. Two main research challenges arise. The first challenge is related to the reliability of the circuit, which typically decreases when the circuit area increases. By integrating cryptographic modules, we explore the limits of the technology, since the smallest lightweight block ciphers feature a larger area than the largest digital circuit on flex foil reported up to now. The second challenge is related to key hiding. The relatively large features on the chip and the fact that electronic chips on plastics are used as bare dies, i.e. they are not packaged, make it easy to read out the value of the stored secret key. Because there is no dedicated non-volatile memory technology yet, existing methods for writing data to the flexible chip after fabrication are based on wire cutting with a laser or inkjet printing. With these techniques, however, it is extremely easy to “see” the value of the secret key under a microscope. We propose a novel solution that allows us to invisibly program the key after fabrication.
2019
TCHES
Improving CEMA using Correlation Optimization
Sensitive cryptographic information, e.g. AES secret keys, can be extracted from the electromagnetic (EM) leakages unintentionally emitted by a device using techniques such as Correlation Electromagnetic Analysis (CEMA). In this paper, we introduce Correlation Optimization (CO), a novel approach that improves CEMA attacks by formulating the selection of useful EM leakage samples in a trace as a machine learning optimization problem. To this end, we propose the correlation loss function, which aims to maximize the Pearson correlation between a set of EM traces and the true AES key during training. We show that CO works with high-dimensional and noisy traces, regardless of time-domain trace alignment and without requiring prior knowledge of the power consumption characteristics of the cryptographic hardware. We evaluate our approach using the ASCAD benchmark dataset and a custom dataset of EM leakages from an Arduino Duemilanove, captured with a USRP B200 SDR. Our results indicate that the masked AES implementation used in all three ASCAD datasets can be broken with a shallow Multilayer Perceptron model, whilst requiring only 1,000 test traces on average. A similar methodology was employed to break the unprotected AES implementation from our custom dataset, using 22,000 unaligned and unfiltered test traces.
2019
TCHES
Fully Automated Differential Fault Analysis on Software Implementations of Block Ciphers
Differential Fault Analysis (DFA) is considered as the most popular fault analysis method. While there are techniques that provide a fault analysis automation on the cipher level to some degree, it can be shown that when it comes to software implementations, there are new vulnerabilities, which cannot be found by observing the cipher design specification.This work bridges the gap by providing a fully automated way to carry out DFA on assembly implementations of symmetric block ciphers. We use a customized data flow graph to represent the program and develop a novel fault analysis methodology to capture the program behavior under faults. We establish an effective description of DFA as constraints that are passed to an SMT solver. We create a tool that takes assembly code as input, analyzes the dependencies among instructions, automatically attacks vulnerable instructions using SMT solver and outputs the attack details that recover the last round key (and possibly the earlier keys). We support our design with evaluations on lightweight ciphers SIMON, SPECK, and PRIDE, and a current NIST standard, AES. By automated assembly analysis, we were able to find new efficient DFA attacks on SPECK and PRIDE, exploiting implementation specific vulnerabilities, and previously published DFA on SIMON and AES. Moreover, we present a novel DFA on multiplication operation that has never been shown for symmetric block ciphers before. Our experimental evaluation also shows reasonable execution times that are scalable to current cipher designs and can easily outclass the manual analysis. Moreover, we present a method to check the countermeasure-protected implementations in a way that helps implementers to decide how many rounds should be protected. We note that this is the first work that automatically carries out DFA on cipher implementations without any plaintext or ciphertext information and therefore, can be generally applied to any input data to the cipher.
2019
TCHES
AuCPace: Efficient verifier-based PAKE protocol tailored for the IIoT
Increasingly connectivity becomes integrated in products and devices that previously operated in a stand-alone setting. This observation holds for many consumer applications in the so-called "Internet of Things" (IoT) as well as for corresponding industry applications (IIoT), such as industrial process sensors. Often the only practicable means for authentication of human users is a password. The security of password-based authentication schemes frequently forms the weakest point of the security infrastructure. In this paper we first explain why a tailored protocol designed for the IIoT use case is considered necessary. The differences between IIoT and the conventional Internet use-cases result in largely modified threats and require special procedures for allowing both, convenient and secure use in the highly constrained industrial setting. Specifically the use of a verifier-based password-authenticated key-exchange (V-PAKE) protocol as a hedge against public-key-infrastructure (PKI) failures is considered important. Availability concerns for the case of failures of (part of) the communication infrastructure makes local storage of access credentials mandatory. The larger threat of physical attacks makes it important to use memory-hard password hashing. This paper presents a corresponding tailored protocol, AuCPace, together with a security proof within the Universal Composability (UC) framework considering fully adaptive adversaries. We also introduce a new security notion of partially augmented PAKE that provides specific performance advantages and makes them suitable for a larger set of IIoT applications. We also present an actual instantiation of our protocol, AuCPace25519, and present performance results on ARM Cortex-M0 and Cortex-M4 microcontrollers. Our implementation realizes new speed-records for PAKE and X25519 Diffie-Hellman for the ARM Cortex M4 architecture.
2019
TCHES
Non-Profiled Deep Learning-based Side-Channel attacks with Sensitivity Analysis
Deep Learning has recently been introduced as a new alternative to perform Side-Channel analysis [MPP16]. Until now, studies have been focused on applying Deep Learning techniques to perform Profiled Side-Channel attacks where an attacker has a full control of a profiling device and is able to collect a large amount of traces for different key values in order to characterize the device leakage prior to the attack. In this paper we introduce a new method to apply Deep Learning techniques in a Non-Profiled context, where an attacker can only collect a limited number of side-channel traces for a fixed unknown key value from a closed device. We show that by combining key guesses with observations of Deep Learning metrics, it is possible to recover information about the secret key. The main interest of this method is that it is possible to use the power of Deep Learning and Neural Networks in a Non-Profiled scenario. We show that it is possible to exploit the translation-invariance property of Convolutional Neural Networks [CDP17] against de-synchronized traces also during Non-Profiled side-channel attacks. In this case, we show that this method can outperform classic Non-Profiled attacks such as Correlation Power Analysis. We also highlight that it is possible to break masked implementations in black-box, without leakages combination pre-preprocessing and with no assumptions nor knowledge about the masking implementation. To carry the attack, we introduce metrics based on Sensitivity Analysis that can reveal both the secret key value as well as points of interest, such as leakages and masks locations in the traces. The results of our experiments demonstrate the interests of this new method and show that this attack can be performed in practice.
2019
TCHES
Consolidating Security Notions in Hardware Masking
In this paper, we revisit the security conditions of masked hardware implementations. We describe a new, succinct, information-theoretic condition called d-glitch immunity which is both necessary and sufficient for security in the presence of glitches. We show that this single condition includes, but is not limited to, previous security notions such as those used in higher-order threshold implementations and in abstractions using ideal gates. As opposed to these previously known necessary conditions, our new condition is also sufficient. On the other hand, it excludes avoidable notions such as uniformity. We also treat the notion of (strong) noninterference from an information-theoretic point-of-view in order to unify the different security concepts and pave the way to the verification of composability in the presence of glitches. We conclude the paper by demonstrating how the condition can be used as an efficient and highly generic flaw detection mechanism for a variety of functions and schemes based on different operations.
2019
TCHES
3-Share Threshold Implementation of AES S-box without Fresh Randomness
Threshold implementation is studied as a countermeasure against sidechannel attack. There had been no threshold implementation for the AES and Keccak S-boxes that satisfies an important property called uniformity. In the conventional implementations, intermediate values are remasked to compensate for the lack of uniformity. The remasking consumes thousands of fresh random bits and its implementation cost is a serious concern. Daemen recently proposed a 3-share uniform threshold implementation of the Keccak S-box. This is enabled by a new technique called the changing of the guards which can be applied to any invertible functions. Subsequently, Wegener et al. proposed a 4-share threshold implementation of the AES S-box based on the changing of the guards technique. However, a 3-share threshold implementation of AES S-box remains open. The difficulty stays in 2-input multiplication, used in decomposed S-box representations, which is non-invertible because of different input and output sizes. In this study, this problem is addressed by introducing a certain generalization of the changing of the guards technique. The proposed method provides a generic way to construct a uniform sharing for a target function having different input and output sizes. The key idea is to transform a target function into an invertible one by adding additional inputs and outputs. Based on the proposed technique, the first 3-share threshold implementation of AES S-box without fresh randomness is presented. Performance evaluation and simulation-based leakage assessment of the implementation are also presented.
2019
TCHES
On-Device Power Analysis Across Hardware Security Domains.
Side-channel power analysis is a powerful method of breaking secure cryptographic algorithms, but typically power analysis is considered to require specialized measurement equipment on or near the device. Assuming an attacker first gained the ability to run code on the unsecure side of a device, they could trigger encryptions and use the on-board ADC to capture power traces of that hardware encryption engine.This is demonstrated on a SAML11 which contains a M23 core with a TrustZone-M implementation as the hardware security barrier. This attack requires 160 × 106 traces, or approximately 5 GByte of data. This attack does not use any external measurement equipment, entirely performing the power analysis using the ADC on-board the microcontroller under attack. The attack is demonstrated to work both from the non-secure and secure environment on the chip, being a demonstration of a cross-domain power analysis attack.To understand the effect of noise and sample rate reduction, an attack is mounted on the SAML11 hardware AES peripheral using classic external equipment, and results are compared for various sample rates and hardware setups. A discussion on how users of this device can help prevent such remote attacks is also presented, along with metrics that can be used in evaluating other devices. Complete copies of all recorded power traces and scripts used by the authors are publicly presented.
2019
TCHES
Deep Learning to Evaluate Secure RSA Implementations
This paper presents the results of several successful profiled side-channel attacks against a secure implementation of the RSA algorithm. The implementation was running on a ARM Core SC 100 completed with a certified EAL4+ arithmetic co-processor. The analyses have been conducted by three experts’ teams, each working on a specific attack path and exploiting information extracted either from the electromagnetic emanation or from the power consumption. A particular attention is paid to the description of all the steps that are usually followed during a security evaluation by a laboratory, including the acquisitions and the observations preprocessing which are practical issues usually put aside in the literature. Remarkably, the profiling portability issue is also taken into account and different device samples are involved for the profiling and testing phases. Among other aspects, this paper shows the high potential of deep learning attacks against secure implementations of RSA and raises the need for dedicated countermeasures.
2019
TCHES
Return of the Hidden Number Problem. A Widespread and Novel Key Extraction Attack on ECDSA and DSA
Side channels have long been recognized as a threat to the security of cryptographic applications. Implementations can unintentionally leak secret information through many channels, such as microarchitectural state changes in processors, changes in power consumption, or electromagnetic radiation. As a result of these threats, many implementations have been hardened to defend against these attacks. Despite these mitigations, this work presents a novel side-channel attack against ECDSA and DSA. The attack targets a common implementation pattern that is found in many cryptographic libraries. In fact, about half of the libraries that were tested exhibited the vulnerable pattern. This pattern is exploited in a full proof of concept attack against OpenSSL, demonstrating that it is possible to extract a 256-bit ECDSA private key using a simple cache attack after observing only a few thousand signatures. The target of this attack is a previously unexplored part of (EC)DSA signature generation, which explains why mitigations are lacking and the issue is so widespread. Finally, estimates are provided for the minimum number of signatures needed to perform the attack, and countermeasures are suggested to protect against this attack.
2019
TCHES
Make Some Noise. Unleashing the Power of Convolutional Neural Networks for Profiled Side-channel Analysis
Profiled side-channel analysis based on deep learning, and more precisely Convolutional Neural Networks, is a paradigm showing significant potential. The results, although scarce for now, suggest that such techniques are even able to break cryptographic implementations protected with countermeasures. In this paper, we start by proposing a new Convolutional Neural Network instance able to reach high performance for a number of considered datasets. We compare our neural network with the one designed for a particular dataset with masking countermeasure and we show that both are good designs but also that neither can be considered as a superior to the other one.Next, we address how the addition of artificial noise to the input signal can be actually beneficial to the performance of the neural network. Such noise addition is equivalent to the regularization term in the objective function. By using this technique, we are able to reduce the number of measurements needed to reveal the secret key by orders of magnitude for both neural networks. Our new convolutional neural network instance with added noise is able to break the implementation protected with the random delay countermeasure by using only 3 traces in the attack phase. To further strengthen our experimental results, we investigate the performance with a varying number of training samples, noise levels, and epochs. Our findings show that adding noise is beneficial throughout all training set sizes and epochs.
2019
TCHES
Fast and simple constant-time hashing to the BLS12-381 elliptic curve
Pairing-friendly elliptic curves in the Barreto-Lynn-Scott family are seeing a resurgence in popularity because of the recent result of Kim and Barbulescu that improves attacks against other pairing-friendly curve families. One particular Barreto-Lynn-Scott curve, called BLS12-381, is the locus of significant development and deployment effort, especially in blockchain applications. This effort has sparked interest in using the BLS12-381 curve for BLS signatures, which requires hashing to one of the groups of the bilinear pairing defined by BLS12-381.While there is a substantial body of literature on the problem of hashing to elliptic curves, much of this work does not apply to Barreto-Lynn-Scott curves. Moreover, the work that does apply has the unfortunate property that fast implementations are complex, while simple implementations are slow.In this work, we address these issues. First, we show a straightforward way of adapting the “simplified SWU” map of Brier et al. to BLS12-381. Second, we describe optimizations to this map that both simplify its implementation and improve its performance; these optimizations may be of interest in other contexts. Third, we implement and evaluate. We find that our work yields constant-time hash functions that are simple to implement, yet perform within 9% of the fastest, non–constant-time alternatives, which require much more complex implementations.
2019
TCHES
Towards Globally Optimized Masking: From Low Randomness to Low Noise Rate
We improve the state-of-the-art masking schemes in two important directions. First, we propose a new masked multiplication algorithm that satisfies a recently introduced notion called Probe-Isolating Non-Interference (PINI). It captures a sufficient requirement for designing masked implementations in a trivial way, by combining PINI multiplications and linear operations performed share by share. Our improved algorithm has the best reported randomness complexity for large security orders (while the previous PINI multiplication was best for small orders). Second, we analyze the security of most existing multiplication algorithms in the literature against so-called horizontal attacks, which aim to reduce the noise of the actual leakages measured by an adversary, by combining the information of multiple target intermediate values. For this purpose, we leave the (abstract) probing model and consider a specialization of the (more realistic) noisy leakage / random probing models. Our (still partially heuristic but quantitative) analysis allows confirming the improved security of an algorithm by Battistello et al. from CHES 2016 in this setting. We then use it to propose new improved algorithms, leading to better tradeoffs between randomness complexity and noise rate, and suggesting the possibility to design efficient masked multiplication algorithms with constant noise rate in F2.
2019
TCHES
Implementing RLWE-based Schemes Using an RSA Co-Processor
We repurpose existing RSA/ECC co-processors for (ideal) lattice-based cryptography by exploiting the availability of fast long integer multiplication. Such co-processors are deployed in smart cards in passports and identity cards, secured microcontrollers and hardware security modules (HSM). In particular, we demonstrate an implementation of a variant of the Module-LWE-based Kyber Key Encapsulation Mechanism (KEM) that is tailored for high performance on a commercially available smart card chip (SLE 78). To benefit from the RSA/ECC co-processor we use Kronecker substitution in combination with schoolbook and Karatsuba polynomial multiplication. Moreover, we speed-up symmetric operations in our Kyber variant using the AES co-processor to implement a PRNG and a SHA-256 co-processor to realise hash functions. This allows us to execute CCA-secure Kyber768 key generation in 79.6 ms, encapsulation in 102.4 ms and decapsulation in 132.7 ms.
2019
TCHES
Sapphire: A Configurable Crypto-Processor for Post-Quantum Lattice-based Protocols
Public key cryptography protocols, such as RSA and elliptic curve cryptography, will be rendered insecure by Shor’s algorithm when large-scale quantum computers are built. Cryptographers are working on quantum-resistant algorithms, and lattice-based cryptography has emerged as a prime candidate. However, high computational complexity of these algorithms makes it challenging to implement lattice-based protocols on low-power embedded devices. To address this challenge, we present Sapphire – a lattice cryptography processor with configurable parameters. Efficient sampling, with a SHA-3-based PRNG, provides two orders of magnitude energy savings; a single-port RAM-based number theoretic transform memory architecture is proposed, which provides 124k-gate area savings; while a low-power modular arithmetic unit accelerates polynomial computations. Our test chip was fabricated in TSMC 40nm low-power CMOS process, with the Sapphire cryptographic core occupying 0.28 mm2 area consisting of 106k logic gates and 40.25 KB SRAM. Sapphire can be programmed with custom instructions for polynomial arithmetic and sampling, and it is coupled with a low-power RISC-V micro-processor to demonstrate NIST Round 2 lattice-based CCA-secure key encapsulation and signature protocols Frodo, NewHope, qTESLA, CRYSTALS-Kyber and CRYSTALS-Dilithium, achieving up to an order of magnitude improvement in performance and energy-efficiency compared to state-of-the-art hardware implementations. All key building blocks of Sapphire are constant-time and secure against timing and simple power analysis side-channel attacks. We also discuss how masking-based DPA countermeasures can be implemented on the Sapphire core without any changes to the hardware.
2019
TCHES
NTTRU: Truly Fast NTRU Using NTT
We present NTTRU – an IND-CCA2 secure NTRU-based key encapsulation scheme that uses the number theoretic transform (NTT) over the cyclotomic ring Z7681[X]/(X768−X384+1) and produces public keys and ciphertexts of approximately 1.25 KB at the 128-bit security level. The number of cycles on a Skylake CPU of our constant-time AVX2 implementation of the scheme for key generation, encapsulation and decapsulation is approximately 6.4K, 6.1K, and 7.9K, which is more than 30X, 5X, and 8X faster than these respective procedures in the NTRU schemes that were submitted to the NIST post-quantum standardization process. These running times are also, by a large margin, smaller than those for all the other schemes in the NIST process as well as the KEMs based on elliptic curve Diffie-Hellman. We additionally give a simple transformation that allows one to provably deal with small decryption errors in OW-CPA encryption schemes (such as NTRU) when using them to construct an IND-CCA2 key encapsulation.
2019
TCHES
Novel Side-Channel Attacks on Quasi-Cyclic Code-Based Cryptography
Chou suggested a constant-time implementation for quasi-cyclic moderatedensity parity-check (QC-MDPC) code-based cryptography to mitigate timing attacks at CHES 2016. This countermeasure was later found to become vulnerable to a differential power analysis (DPA) in private syndrome computation, as described by Rossi et al. at CHES 2017. The proposed DPA, however, still could not completely recover accurate secret indices, requiring further solving linear equations to obtain entire secret information. In this paper, we propose a multiple-trace attack which enables to completely recover accurate secret indices. We further propose a singletrace attack which can even work when using ephemeral keys or applying Rossi et al.’s DPA countermeasures. Our experiments show that the BIKE and LEDAcrypt may become vulnerable to our proposed attacks. The experiments are conducted using power consumption traces measured from ChipWhisperer-Lite XMEGA (8-bit processor) and ChipWhisperer UFO STM32F3 (32-bit processor) target boards.
2019
TCHES
Shaping the Glitch: Optimizing Voltage Fault Injection Attacks
Voltage fault injection is a powerful active side channel attack that modifies the execution-flow of a device by creating disturbances on the power supply line. The attack typically aims at skipping security checks or generating side-channels that gradually leak sensitive data, including the firmware code. In this paper we propose a new voltage fault injection technique that generates fully arbitrary voltage glitch waveforms using off-the-shelf and low cost equipment. To show the effectiveness of our setup, we present new, unpublished firmware extraction attacks on six microcontrollers from three major manufacturers: STMicroelectronics, Texas Instruments and Renesas Electronics that, in 2016 declared a market of $1.5 billion,$800 million and \$2.5 billion on units sold, respectively. Among the presented attacks, the most challenging ones exploit multiple vulnerabilities and inject over one million glitches, heavily leveraging on the performance and repeatability of the new proposed technique. We perform a thorough evaluation of arbitrary glitch waveforms by comparing the attack performance against two other major V-FI techniques in the literature. Along a responsible disclosure policy, all the vulnerabilities have been timely reported to the manufacturers.
2019
TCHES
Static Power SCA of Sub-100 nm CMOS ASICs and the Insecurity of Masking Schemes in Low-Noise Environments
Semiconductor technology scaling faced tough engineering challenges while moving towards and beyond the deep sub-micron range. One of the most demanding issues, limiting the shrinkage process until the present day, is the difficulty to control the leakage currents in nanometer-scaled field-effect transistors. Previous articles have shown that this source of energy dissipation, at least in case of digital CMOS logic, can successfully be exploited as a side-channel to recover the secrets of cryptographic implementations. In this work, we present the first fair technology comparison with respect to static power side-channel measurements on real silicon and demonstrate that the effect of down-scaling on the potency of this security threat is huge. To this end, we designed two ASICs in sub-100nm CMOS nodes (90 nm, 65 nm) and got them fabricated by one of the leading foundries. Our experiments, which we performed at different operating conditions, show consistently that the ASIC technology with the smaller minimum feature size (65 nm) indeed exhibits substantially more informative leakages (factor of ~10) than the 90nm one, even though all targeted instances have been derived from identical RTL code. However, the contribution of this work extends well beyond a mere technology comparison. With respect to the real-world impact of static power attacks, we present the first realistic scenarios that allow to perform a static power side-channel analysis (including noise reduction) without requiring control over the clock signal of the target. Furthermore, as a follow-up to some proof-of-concept work indicating the vulnerability of masking schemes to static powerattacks, we perform a detailed study on how the reduction of the noise level in static leakage measurements affects the security provided by masked implementations. As a result of this study, we do not only find out that the threat for masking schemes is indeed real, but also that common leakage assessment techniques, such as the Welch’s t-test, together with essentially any moment-based analysis of the leakage traces, is simply not sufficient in low-noise contexts. In fact, we are able to show that either a conversion (resp. compression) of the leakage order or the recently proposed X2 test need to be considered in assessment and attack to avoid false negatives.
2019
TCHES
The Curse of Class Imbalance and Conflicting Metrics with Machine Learning for Side-channel Evaluations
We concentrate on machine learning techniques used for profiled sidechannel analysis in the presence of imbalanced data. Such scenarios are realistic and often occurring, for instance in the Hamming weight or Hamming distance leakage models. In order to deal with the imbalanced data, we use various balancing techniques and we show that most of them help in mounting successful attacks when the data is highly imbalanced. Especially, the results with the SMOTE technique are encouraging, since we observe some scenarios where it reduces the number of necessary measurements more than 8 times. Next, we provide extensive results on comparison of machine learning and side-channel metrics, where we show that machine learning metrics (and especially accuracy as the most often used one) can be extremely deceptive. This finding opens a need to revisit the previous works and their results in order to properly assess the performance of machine learning in side-channel analysis.
2019
TCHES
Cache-Timing Attacks on RSA Key Generation
During the last decade, constant-time cryptographic software has quickly transitioned from an academic construct to a concrete security requirement for real-world libraries. Most of OpenSSL’s constant-time code paths are driven by cryptosystem implementations enabling a dedicated flag at runtime. This process is perilous, with several examples emerging in the past few years of the flag either not being set or software defects directly mishandling the flag. In this work, we propose a methodology to analyze security-critical software for side-channel insecure code path traversal. Applying our methodology to OpenSSL, we identify three new code paths during RSA key generation that potentially leak critical algorithm state. Exploiting one of these leaks, we design, implement, and mount a single trace cache-timing attack on the GCD computation step. We overcome several hurdles in the process, including but not limited to: (1) granularity issues due to word-size operands to the GCD function; (2) bulk processing of desynchronized trace data; (3) non-trivial error rate during information extraction; and (4) limited high-confidence information on the modulus factors. Formulating lattice problem instances after obtaining and processing this limited information, our attack achieves roughly a 27% success rate for key recovery using the empirical data from 10K trials.
2019
TCHES
Analysis and Improvement of Differential Computation Attacks against Internally-Encoded White-Box Implementations
White-box cryptography is the last security barrier for a cryptographic software implementation deployed in an untrusted environment. The principle of internal encodings is a commonly used white-box technique to protect block cipher implementations. It consists in representing an implementation as a network of look-up tables which are then encoded using randomly generated bijections (the internal encodings). When this approach is implemented based on nibble (i.e. 4-bit wide) encodings, the protected implementation has been shown to be vulnerable to differential computation analysis (DCA). The latter is essentially an adaptation of differential power analysis techniques to computation traces consisting of runtime information, e.g., memory accesses, of the target software. In order to thwart DCA, it has then been suggested to use wider encodings, and in particular byte encodings, at least to protect the outer rounds of the block cipher which are the prime targets of DCA.In this work, we provide an in-depth analysis of when and why DCA works. We pinpoint the properties of the target variables and the encodings that make the attack (in)feasible. In particular, we show that DCA can break encodings wider than 4-bit, such as byte encodings. Additionally, we propose new DCA-like attacks inspired from side-channel analysis techniques. Specifically, we describe a collision attack particularly effective against the internal encoding countermeasure. We also investigate mutual information analysis (MIA) which naturally applies in this context. Compared to the original DCA, these attacks are also passive and they require very limited knowledge of the attacked implementation, but they achieve significant improvements in terms of trace complexity. All the analyses of our work are experimentally backed up with various attack simulation results. We also verified the practicability of our analyses and attack techniques against a publicly available white-box AES implementation protected with byte encodings –which DCA has failed to break before– and against a “masked” white-box AES implementation –which intends to resist DCA.
2019
TCHES
Exploring the Effect of Device Aging on Static Power Analysis Attacks
Vulnerability of cryptographic devices to side-channel analysis attacks, and in particular power analysis attacks has been extensively studied in the recent years. Among them, static power analysis attacks have become relevant with moving towards smaller technology nodes for which the static power is comparable to the dynamic power of a chip, or even dominant in future technology generations. The magnitude of the static power of a chip depends on the physical characteristics of transistors (e.g., the dimensions) as well as operating conditions (e.g., the temperature) and the electrical specifications such as the threshold voltage. In fact, the electrical specifications of transistors deviate from their originally intended ones during device lifetime due to aging mechanisms. Although device aging has been extensively investigated from reliability point of view, the impact of aging on the security of devices, and in particular on the vulnerability of devices to power analysis attacks are yet to be considered.This paper fills the gap and investigates how device aging can affect the susceptibility of a chip exposed to static power analysis attacks. To this end, we conduct both, simulation and practical experiments on real silicon. The experimental results are extracted from a realization of the PRESENT cipher fabricated using a 65nm commercial standard cell library. The results show that the amount of exploitable leakage through the static power consumption as a side channel is reduced when the device is aged. This can be considered as a positive development which can (even slightly) harden such static power analysis attacks. Additionally, this result is of great interest to static power side-channel adversaries since state-of-the-art leakage current measurements are conducted over long time periods under increased working temperatures and supply voltages to amplify the exploitable information, which certainly fuels aging-related device degradation.
2019
TCHES
Error Amplification in Code-based Cryptography
Code-based cryptography is one of the main techniques enabling cryptographic primitives in a post-quantum scenario. In particular, the MDPC scheme is a basic scheme from which many other schemes have been derived. These schemes rely on iterative decoding in the decryption process and thus have a certain small probability p of having a decryption (decoding) error.In this paper we show a very fundamental and important property of code-based encryption schemes. Given one initial error pattern that fails to decode, the time needed to generate another message that fails to decode is strictly much less than 1/p. We show this by developing a method for fast generation of undecodable error patterns (error pattern chaining), which additionally proves that a measure of closeness in ciphertext space can be exploited through its strong linkage to the difficulty of decoding these messages. Furthermore, if side-channel information is also available (time to decode), then the initial error pattern no longer needs to be given since one can be easily generated in this case.These observations are fundamentally important because they show that a, say, 128- bit encryption scheme is not inherently safe from reaction attacks even if it employs a decoder with a failure rate of 2−128. In fact, unless explicit protective measures are taken, having a failure rate at all – of any magnitude – can pose a security problem because of the error amplification effect of our method.A key-recovery reaction attack was recently shown on the MDPC scheme as well as similar schemes, taking advantage of decoding errors in order to recover the secret key. It was also shown that knowing the number of iterations in the iterative decoding step, which could be received in a timing attack, would also enable and enhance such an attack. In this paper we apply our error pattern chaining method to show how to improve the performance of such reaction attacks in the CPA case. We show that after identifying a single decoding error (or a decoding step taking more time than expected in a timing attack), we can adaptively create new error patterns that have a much higher decoding error probability than for a random error. This leads to a significant improvement of the attack based on decoding errors in the CPA case and it also gives the strongest known attack on MDPC-like schemes, both with and without using side-channel information.
2019
TCHES
The Interpose PUF: Secure PUF Design against State-of-the-art Machine Learning Attacks
The design of a silicon Strong Physical Unclonable Function (PUF) that is lightweight and stable, and which possesses a rigorous security argument, has been a fundamental problem in PUF research since its very beginnings in 2002. Various effective PUF modeling attacks, for example at CCS 2010 and CHES 2015, have shown that currently, no existing silicon PUF design can meet these requirements. In this paper, we introduce the novel Interpose PUF (iPUF) design, and rigorously prove its security against all known machine learning (ML) attacks, including any currently known reliability-based strategies that exploit the stability of single CRPs (we are the first to provide a detailed analysis of when the reliability based CMA-ES attack is successful and when it is not applicable). Furthermore, we provide simulations and confirm these in experiments with FPGA implementations of the iPUF, demonstrating its practicality. Our new iPUF architecture so solves the currently open problem of constructing practical, silicon Strong PUFs that are secure against state-of-the-art ML attacks.
2019
TCHES
M&amp;M: Masks and Macs against Physical Attacks
Cryptographic implementations on embedded systems need to be protected against physical attacks. Today, this means that apart from incorporating countermeasures against side-channel analysis, implementations must also withstand fault attacks and combined attacks. Recent proposals in this area have shown that there is a big tradeoff between the implementation cost and the strength of the adversary model. In this work, we introduce a new combined countermeasure M&amp;M that combines Masking with information-theoretic MAC tags and infective computation. It works in a stronger adversary model than the existing scheme ParTI, yet is a lot less costly to implement than the provably secure MPC-based scheme CAPA. We demonstrate M&amp;M with a SCA- and DFA-secure implementation of the AES block cipher. We evaluate the side-channel leakage of the second-order secure design with a non-specific t-test and use simulation to validate the fault resistance.
2019
TCHES
Implementing the masking countermeasure in hardware is a delicate task. Various solutions have been proposed for this purpose over the last years: we focus on Threshold Implementations (TIs), Domain-Oriented Masking (DOM), the Unified Masking Approach (UMA) and Generic Low Latency Masking (GLM). The latter generally come with innovative ideas to cope with physical defaults such as glitches. Yet, and in contrast to the situation in software-oriented masking, these schemes have not been formally proven at arbitrary security orders and their composability properties were left unclear. So far, only a 2-cycle implementation of the seminal masking scheme by Ishai, Sahai and Wagner has been shown secure and composable in the robust probing model – a variation of the probing model aimed to capture physical defaults such as glitches – for any number of shares.In this paper, we argue that this lack of proofs for TIs, DOM, UMA and GLM makes the interpretation of their security guarantees difficult as the number of shares increases. For this purpose, we first put forward that the higher-order variants of all these schemes are affected by (local or composability) security flaws in the (robust) probing model, due to insufficient refreshing. We then show that composability and robustness against glitches cannot be analyzed independently. We finally detail how these abstract flaws translate into concrete (experimental) attacks, and discuss the additional constraints robust probing security implies on the need of registers. Despite not systematically leading to improved complexities at low security orders, e.g., with respect to the required number of measurements for a successful attack, we argue that these weaknesses provide a case for the need of security proofs in the robust probing model (or a similar abstraction) at higher security orders.
2019
TCHES
Software Toolkit for HFE-based Multivariate Schemes
In 2017, NIST shook the cryptographic world by starting a process for standardizing post-quantum cryptography. Sixty-four submissions have been considered for the first round of the on-going NIST Post-Quantum Cryptography (PQC) process. Multivariate cryptography is a classical post-quantum candidate that turns to be the most represented in the signature category. At this stage of the process, it is of primary importance to investigate efficient implementations of the candidates. This article presents MQsoft, an efficient library which permits to implement HFE-based multivariate schemes submitted to the NIST PQC process such as GeMSS, Gui and DualModeMS. The library is implemented in C targeting Intel 64-bit processors and using avx2 set instructions. We present performance results for our library and its application to GeMSS, Gui and DualModeMS. In particular, we optimize several crucial parts for these schemes. These include root finding for HFE polynomials and evaluation of multivariate quadratic systems in F2. We propose a new method which accelerates root finding for specific HFE polynomials by a factor of two. For GeMSS and Gui, we obtain a speed-up of a factor between 2 and 19 for the keypair generation, between 1.2 and 2.5 for the signature generation, and between 1.6 and 2 for the verifying process. We have also improved the arithmetic in F2n by a factor of 4 compared to the NTL library. Moreover, a large part of our implementation is protected against timing attacks.
2019
TCHES
Practical Evaluation of Protected Residue Number System Scalar Multiplication
The Residue Number System (RNS) arithmetic is gaining grounds in public key cryptography, because it offers fast, efficient and secure implementations over large prime fields or rings of integers. In this paper, we propose a generic, thorough and analytic evaluation approach for protected scalar multiplication implementations with RNS and traditional Side Channel Attack (SCA) countermeasures in an effort to assess the SCA resistance of RNS. This paper constitutes the first robust evaluation of RNS software for Elliptic Curve Cryptography against electromagnetic (EM) side-channel attacks. Four different countermeasures, namely scalar and point randomization, random base permutations and random moduli operation sequence, are implemented and evaluated using the Test Vector Leakage Assessment (TVLA) and template attacks. More specifically, variations of RNS-based Montgomery Powering Ladder scalar multiplication algorithms are evaluated on an ARM Cortex A8 processor using an EM probe for acquisition of the traces. We show experimentally and theoretically that new bounds should be put forward when TVLA evaluations on public key algorithms are performed. On the security of RNS, our data and location dependent template attacks show that even protected implementations are vulnerable to these attacks. A combination of RNS-based countermeasures is the best way to protect against side-channel leakage.
2019
TCHES
Reducing a Masked Implementation’s Effective Security Order with Setup Manipulations
Couplings are a type of physical default that can violate the independence assumption needed for the secure implementation of the masking countermeasure. Two recent works by De Cnudde et al. put forward qualitatively that couplings can cause information leakages of lower order than theoretically expected. However, the (quantitative) amplitude of these lower-order leakages (e.g., measured as the amplitude of a detection metric such as Welch’s T statistic) was usually lower than the one of the (theoretically expected) dth order leakages. So the actual security level of these implementations remained unaffected. In addition, in order to make the couplings visible, the authors sometimes needed to amplify them internally (e.g., by tweaking the placement and routing or iterating linear operations on the shares). In this paper, we first show that the amplitude of low-order leakages in masked implementations can be amplified externally, by tweaking side-channel measurement setups in a way that is under control of a power analysis adversary. Our experiments put forward that the “effective security order” of both hardware (FPGA) and software (ARM-32) implementations can be reduced, leading to concrete reductions of their security level. For this purpose, we move from the detection-based analyzes of previous works to attack-based evaluations, allowing to confirm the exploitability of the lower-order leakages that we amplify. We also provide a tentative explanation for these effects based on couplings, and describe a model that can be used to predict them in function of the measurement setup’s external resistor and implementation’s supply voltage. We posit that the effective security orders observed are mainly due to “externally-amplified couplings” that can be systematically exploited by actual adversaries.
2019
TCHES
New Insights to Key Derivation for Tamper-Evident Physical Unclonable Functions
Several publications presented tamper-evident Physical Unclonable Functions (PUFs) for secure storage of cryptographic keys and tamper-detection. Unfortunately, previously published PUF-based key derivation schemes do not sufficiently take into account the specifics of the underlying application, i.e., an attacker that tampers with the physical parameters of the PUF outside of an idealized noise error model. This is a notable extension of existing schemes for PUF key derivation, as they are typically concerned about helper data leakage, i.e., by how much the PUF’s entropy is diminished when gaining access to its helper data.To address the specifics of tamper-evident PUFs, we formalize the aspect of tamper-sensitivity, thereby providing a new tool to rate by how much an attacker is allowed to tamper with the PUF. This complements existing criteria such as effective number of secret bits for entropy and failure rate for reliability. As a result, it provides a fair comparison among different schemes and independent of the PUF implementation, as its unit is based on the noise standard deviation of the underlying PUF measurement. To overcome the limitations of previous schemes, we then propose an Error-Correcting Code (ECC) based on the Lee metric, i.e., a distance metric well-suited to describe the distance between q-ary symbols as output from an equidistant quantization, i.e., a higher-order alphabet PUF. This novel approach is required, as the underlying symbols’ bits are not i.i.d. which hinders applying previous state-of-the-art approaches. We present the concept for our scheme and demonstrate its feasibility based on an empirical PUF distribution. The benefits of our approach are an increase by over 21% in effective secret bit compared to previous approaches based on equidistant quantization. At the same time, we improve tamper-sensitivity compared to an equiprobable quantization while ensuring similar reliability and entropy. Hence, this work opens up a new direction of how to interpret the PUF output and details a practically relevant scheme outperforming all previous constructions.
2019
TCHES
Leaky Noise: New Side-Channel Attack Vectors in Mixed-Signal IoT Devices
2019
TCHES
Multi-Tuple Leakage Detection and the Dependent Signal Issue
Leakage detection is a common tool to quickly assess the security of a cryptographic implementation against side-channel attacks. The Test Vector Leakage Assessment (TVLA) methodology using Welch’s t-test, proposed by Cryptography Research, is currently the most popular example of such tools, thanks to its simplicity and good detection speed compared to attack-based evaluations. However, as any statistical test, it is based on certain assumptions about the processed samples and its detection performances strongly depend on parameters like the measurement’s Signal-to-Noise Ratio (SNR), their degree of dependency, and their density, i.e., the ratio between the amount of informative and non-informative points in the traces. In this paper, we argue that the correct interpretation of leakage detection results requires knowledge of these parameters which are a priori unknown to the evaluator, and, therefore, poses a non-trivial challenge to evaluators (especially if restricted to only one test). For this purpose, we first explore the concept of multi-tuple detection, which is able to exploit differences between multiple informative points of a trace more effectively than tests relying on the minimum p-value of concurrent univariate tests. To this end, we map the common Hotelling’s T2-test to the leakage detection setting and, further, propose a specialized instantiation of it which trades computational overheads for a dependency assumption. Our experiments show that there is not one test that is the optimal choice for every leakage scenario. Second, we highlight the importance of the assumption that the samples at each point in time are independent, which is frequently considered in leakage detection, e.g., with Welch’s t-test. Using simulated and practical experiments, we show that (i) this assumption is often violated in practice, and (ii) deviations from it can affect the detection performances, making the correct interpretation of the results more difficult. Finally, we consolidate our findings by providing guidelines on how to use a combination of established and newly-proposed leakage detection tools to infer the measurements parameters. This enables a better interpretation of the tests’ results than the current state-of-the-art (yet still relying on heuristics for the most challenging evaluation scenarios).
2019
TCHES
Fast constant-time gcd computation and modular inversion
This paper introduces streamlined constant-time variants of Euclid’s algorithm, both for polynomial inputs and for integer inputs. As concrete applications, this paper saves time in (1) modular inversion for Curve25519, which was previously believed to be handled much more efficiently by Fermat’s method, and (2) key generation for the ntruhrss701 and sntrup4591761 lattice-based cryptosystems.
2019
TCHES
Best Information is Most Successful
Using information-theoretic tools, this paper establishes a mathematical link between the probability of success of a side-channel attack and the minimum number of queries to reach a given success rate, valid for any possible distinguishing rule and with the best possible knowledge on the attacker’s side. This link is a lower bound on the number of queries highly depends on Shannon’s mutual information between the traces and the secret key. This leads us to derive upper bounds on the mutual information that are as tight as possible and can be easily calculated. It turns out that, in the case of an additive white Gaussian noise, the bound on the probability of success of any attack is directly related to the signal to noise ratio. This leads to very easy computations and predictions of the success rate in any leakage model.
2019
TCHES
Secure Physical Enclosures from Covers with Tamper-Resistance
Ensuring physical security of multiple-chip embedded systems on a PCB is challenging, since the attacker can control the device in a hostile environment. To detect physical intruders as part of a layered approach to security, it is common to create a physical security boundary that is difficult to penetrate or remove, e.g., enclosures created from tamper-respondent envelopes or covers. Their physical integrity is usually checked by active sensing, i.e., a battery-backed circuit continuously monitors the enclosure. However, adoption is often hampered by the disadvantages of a battery and due to specialized equipment which is required to create the enclosure. In contrast, we present a batteryless tamper-resistant cover made from standard flexPCB technology, i.e., a commercially widespread, scalable, and proven technology. The cover comprises a fine mesh of electrodes and an evaluation unit underneath the cover checks their integrity by detecting short and open circuits. Additionally, it measures the capacitances between the electrodes of the mesh. Once its preliminary integrity is confirmed, a cryptographic key is derived from the capacitive measurements representing a PUF, to decrypt and authenticate sensitive data of the enclosed system. We demonstrate the feasibility of our concept, provide details on the layout, electrical properties of the cover, and explain the underlying security architecture. Practical results including statistics over a set of 115 flexPCB covers, physical attacks, and environmental testing support our design rationale. Hence, our work opens up a new direction of counteracting physical tampering without the need of batteries, while aiming at a physical security level comparable to FIPS 140-2 level 3.
2019
TCHES
Electromagnetic Information Extortion from Electronic Devices Using Interceptor and Its Countermeasure
The problem of information leakage through electromagnetic waves for various devices has been extensively discussed in literature. Conventionally, devices that are under such a threat suffer from potential electromagnetic information leakage during their operation. Further, the information inside the devices can be obtained by monitoring the electromagnetic waves leaking at the boundaries of the devices. The leakage of electromagnetic waves, however, was not observed for some devices, and such devices were not the target of the threat discussed above. In light of this circumstance, this paper discusses an “interceptor” that forces the leakage of information through electromagnetic waves, even from devices in which potential electromagnetic leakage does not occur. The proposed interceptor is a small circuit consisting of an affordable semiconductor chip and wiring and is powered by electromagnetic waves that irradiate from the outside of a device as its driving energy. The distance at which information is obtained is controlled by increasing the intensity of the irradiated electromagnetic waves. The paper presents the structure of the circuit for implementing the proposed interceptor to be used in major input–output devices and cryptographic modules, mounting a pathway designed on the basis of the construction method onto each device. Moreover, it is shown that it is possible to forcefully cause information leakage through electromagnetic waves. To detect the aforementioned threat, the paper also focuses on the changes in a device itself and the surrounding electromagnetic environment as a result of mounting an interceptor and considers a method of detecting an interceptor by both passive and active monitoring methods.
2019
TCHES
Fast, Furious and Insecure: Passive Keyless Entry and Start Systems in Modern Supercars
The security of immobiliser and Remote Keyless Entry systems has been extensively studied over many years. Passive Keyless Entry and Start systems, which are currently deployed in luxury vehicles, have not received much attention besides relay attacks. In this work we fully reverse engineer a Passive Keyless Entry and Start system and perform a thorough analysis of its security.Our research reveals several security weaknesses. Specifically, we document the use of an inadequate proprietary cipher using 40-bit keys, the lack of mutual authentication in the challenge-response protocol, no firmware readout protection features enabled and the absence of security partitioning.In order to validate our findings, we implement a full proof of concept attack allowing us to clone a Tesla Model S key fob in a matter of seconds with low cost commercial off the shelf equipment. Our findings most likely apply to other manufacturers of luxury vehicles including McLaren, Karma and Triumph motorcycles as they all use the same system developed by Pektron.
2019
TCHES
Secure Data Retrieval on the Cloud: Homomorphic Encryption meets Coresets
Secure report is the problem of a client that retrieves all records matching specified attributes from a database table at the server (e.g. cloud), as in SQL SELECT queries, but where the query and the database are encrypted. Here, only the client has the secret key, but still the server is expected to compute and return the encrypted result. Secure report is theoretically possible with Fully Homomorphic Encryption (FHE). However, the current state-of-the-art solutions are realized by a polynomial of degree that is at least linear in the number m of records, which is too slow in practice even for very small databases. We present the first solution that is realized by a polynomial that attains degree independent of the number of records m, as well as the first implementation of an FHE solution to Secure report. This is by suggesting a novel paradigm that forges a link between cryptography and modern data summarization techniques known as coresets (core-sets), and sketches in particular. The key idea is to compute only a coreset of the desired report. Since the coreset is small, the client can quickly decode the desired report that the server computes after decrypting the coreset. We implemented our main reporting system in an open source library. This is the first implemented system that can answer such database queries when processing only FHE encrypted data and queries. As our analysis promises, the experimental results show that we can run Secure report queries on billions records in minutes on an Amazon EC2 server, compared to less than a hundred-thousands in previous FHE based solutions.
2019
TCHES
Covert Gates: Protecting Integrated Circuits with Undetectable Camouflaging
Integrated circuit (IC) camouflaging has emerged as a promising solution for protecting semiconductor intellectual property (IP) against reverse engineering. Existing methods of camouflaging are based on standard cells that can assume one of many Boolean functions, either through variation of transistor threshold voltage or contact configurations. Unfortunately, such methods lead to high area, delay and power overheads, and are vulnerable to invasive as well as non-invasive attacks based on Boolean satisfiability/VLSI testing. In this paper, we propose, fabricate, and demonstrate a new cell camouflaging strategy, termed as ‘covert gate’ that leverages doping and dummy contacts to create camouflaged cells that are indistinguishable from regular standard cells under modern imaging techniques. We perform a comprehensive security analysis of covert gate, and show that it achieves high resiliency against SAT and test-based attacks at very low overheads. We also derive models to characterize the covert cells, and develop measures to incorporate them into a gate-level design. Simulation results of overheads and attacks are presented on benchmark circuits.
2019
TCHES
New Circuit Minimization Techniques for Smaller and Faster AES SBoxes
In this paper we consider various methods and techniques to find the smallest circuit realizing a given linear transformation on n input signals and m output signals, with a constraint of a maximum depth, maxD, of the circuit. Additional requirements may include that input signals can arrive to the circuit with different delays, and output signals may be requested to be ready at a different depth. We apply these methods and also improve previous results in order to find hardware circuits for forward, inverse, and combined AES SBoxes, and for each of them we provide the fastest and smallest combinatorial circuits. Additionally, we propose a novel technique with “floating multiplexers” to minimize the circuit for the combined SBox, where we have two different linear matrices (forward and inverse) combined with multiplexers. The resulting AES SBox solutions are the fastest and smallest to our knowledge.
2019
TCHES
SMT Attack: Next Generation Attack on Obfuscated Circuits with Capabilities and Performance Beyond the SAT Attacks
In this paper, we introduce the Satisfiability Modulo Theory (SMT) attack on obfuscated circuits. The proposed attack is the superset of Satisfiability (SAT) attack, with many additional features. It uses one or more theory solvers in addition to its internal SAT solver. For this reason, it is capable of modeling far more complex behaviors and could formulate much stronger attacks. In this paper, we illustrate that the use of theory solvers enables the SMT to carry attacks that are not possible by SAT formulated attacks. As an example of its capabilities, we use the SMT attack to break a recent obfuscation scheme that uses key values to alter delay properties (setup and hold time) of a circuit to remain SAT hard. Considering that the logic delay is not a Boolean logical property, the targeted obfuscation mechanism is not breakable by a SAT attack. However, in this paper, we illustrate that the proposed SMT attack, by deploying a simple graph theory solver, can model and break this obfuscation scheme in few minutes. We describe how the SMT attack could be used in one of four different attack modes: (1) We explain how SMT attack could be reduced to a SAT attack, (2) how the SMT attack could be carried out in Eager, and (3) Lazy approach, and finally (4) we introduce the Accelerated SMT (AccSMT) attack that offers significant speed-up to SAT attack. Additionally, we explain how AccSMT attack could be used as an approximate attack when facing SMT-Hard obfuscation schemes.