Aein Rezaei Shahmirzadi
New First-Order Secure AES Performance Records 📺
Being based on a sound theoretical basis, masking schemes are commonly applied to protect cryptographic implementations against Side-Channel Analysis (SCA) attacks. Constructing SCA-protected AES, as the most widely deployed block cipher, has been naturally the focus of several research projects, with a direct application in industry. The majority of SCA-secure AES implementations introduced to the community opted for low area and latency overheads considering Application-Specific Integrated Circuit (ASIC) platforms. Albeit a few, those which particularly targeted Field Programmable Gate Arrays (FPGAs) as the implementation platform yield either a low throughput or a not-highly secure design.In this work, we fill this gap by introducing first-order glitch-extended probing secure masked AES implementations highly optimized for FPGAs, which support both encryption and decryption. Compared to the state of the art, our designs efficiently map the critical non-linear parts of the masked S-box into the built-in Block RAMs (BRAMs).The most performant variant of our constructions accomplishes five first-order secure AES encryptions/decryptions simultaneously in 50 clock cycles. Compared to the equivalent state-of-the-art designs, this leads to at least 70% reduction in utilization of FPGA resources (slices) at the cost of occupying BRAMs. Last but not least, we provide a wide range of such secure and efficient implementations supporting a large set of applications, ranging from low-area to high-throughput.
Second-Order SCA Security with almost no Fresh Randomness 📺
Masking schemes are among the most popular countermeasures against Side-Channel Analysis (SCA) attacks. Realization of masked implementations on hardware faces several difficulties including dealing with glitches. Threshold Implementation (TI) is known as the first strategy with provable security in presence of glitches. In addition to the desired security order d, TI defines the minimum number of shares to also depend on the algebraic degree of the target function. This may lead to unaffordable implementation costs for higher orders.For example, at least five shares are required to protect the smallest nonlinear function against second-order attacks. By cuttingsuch a dependency, the successor schemes are able to achieve the same security level by just d + 1 shares, at the cost of high demand for fresh randomness, particularly at higher orders. In this work, we provide a methodology to realize the second-order glitch-extended probing-secure implementation of a group of quadratic functions with three shares and no fresh randomness. This allows us to construct second-order secure implementations of several cryptographic primitives with very limited number of fresh masks, including Keccak, SKINNY, Midori, PRESENT, and PRINCE.
Low-Latency Keccak at any Arbitrary Order 📺
Correct application of masking on hardware implementation of cryptographic primitives necessitates the instantiation of registers in order to achieve the non-completeness (commonly said to stop the propagation of glitches). This sometimes leads to a high latency overhead, making the implementation not necessarily suitable for the underlying application. As a concrete example, this holds for Keccak. Application of d + 1 Domain Oriented Masking (DOM) on a round-based implementation of Keccak leads to the introduction of two register stages per round, i.e., two times higher latency. On the other hand, Rhythmic-Keccak, introduced in CHES 2018, unrolls two rounds to half the latency compared to an unprotected ordinary round-based implementation. To that end, td + 1 masking is used which requires a notable area, and – apart from the difficulty to construct – its extension to higher orders seems beyond the bounds of feasibility.In this paper, we focus on d + 1 masking and introduce a methodology which enables us to stay with the latency of an unprotected round-based implementation, i.e., one register stage per round. While being secure under glitch-extended probing model, we provide a general design where the desired security order can be easily adjusted without any effect on the above-given latency. Compared to the Rhythmic-Keccak, the synthesis results show that our first-order design is able to accomplish the entire operations of Keccak-f in the same period of time while decreasing the area by 74.5%. Notably, our implementations achieve around 30% less delay compared to the corresponding original DOM-Keccak designs.
FIVER – Robust Verification of Countermeasures against Fault Injections 📺
Fault Injection Analysis is seen as a powerful attack against implementations of cryptographic algorithms. Over the last two decades, researchers proposed a plethora of countermeasures to secure such implementations. However, the design process and implementation are still error-prone, complex, and manual tasks which require long-standing experience in hardware design and physical security. Moreover, the validation of the claimed security is often only done by empirical testing in a very late stage of the design process. To prevent such empirical testing strategies, approaches based on formal verification are applied instead providing the designer early feedback.In this work, we present a fault verification framework to validate the security of countermeasures against fault-injection attacks designed for ICs. The verification framework works on netlist-level, parses the given digital circuit into a model based on Binary Decision Diagrams, and performs symbolic fault injections. This verification approach constitutes a novel strategy to evaluate protected hardware designs against fault injections offering new opportunities as performing full analyses under a given fault models.Eventually, we apply the proposed verification framework to real-world implementations of well-established countermeasures against fault-injection attacks. Here, we consider protected designs of the lightweight ciphers CRAFT and LED-64 as well as AES. Due to several optimization strategies, our tool is able to perform more than 90 million fault injections in a single-round CRAFT design and evaluate the security in under 50 min while the symbolic simulation approach considers all 2128 primary inputs.
Re-Consolidating First-Order Masking Schemes: Nullifying Fresh Randomness 📺
Application of masking, known as the most robust and reliable countermeasure to side-channel analysis attacks, on various cryptographic algorithms has dedicated a lion’s share of research to itself. The difficulty originates from the fact that the overhead of application of such an algorithmic-level countermeasure might not be affordable. This includes the area- and latency overheads and the amount of fresh randomness required to fulfill the resulting design’s security properties. There are already techniques applicable in hardware platforms that consider glitches into account. Among them, classical threshold implementations force the designers to use at least three shares in the underlying masking. The other schemes, which can deal with two shares, often necessitates the use of fresh randomness.Here, in this work, we present a technique allowing us to use two shares to realize the first-order glitch-extended probing secure masked realization of several functions, including the S-box of Midori, PRESENT, PRINCE, and AES ciphers without any fresh randomness.