Lightweight Iterative MDS Matrices: How Small Can We Go? 📺
As perfect building blocks for the diffusion layers of many symmetric-key primitives, the construction of MDS matrices with lightweight circuits has received much attention from the symmetric-key community. One promising way of realizing low-cost MDS matrices is based on the iterative construction: a low-cost matrix becomes MDS after rising it to a certain power. To be more specific, if At is MDS, then one can implement A instead of At to achieve the MDS property at the expense of an increased latency with t clock cycles. In this work, we identify the exact lower bound of the number of nonzero blocks for a 4 × 4 block matrix to be potentially iterative-MDS. Subsequently, we show that the theoretically lightest 4 × 4 iterative MDS block matrix (whose entries or blocks are 4 × 4 binary matrices) with minimal nonzero blocks costs at least 3 XOR gates, and a concrete example achieving the 3-XOR bound is provided. Moreover, we prove that there is no hope for previous constructions (GFS, LFS, DSI, and spares DSI) to beat this bound. Since the circuit latency is another important factor, we also consider the lower bound of the number of iterations for certain iterative MDS matrices. Guided by these bounds and based on the ideas employed to identify them, we explore the design space of lightweight iterative MDS matrices with other dimensions and report on improved results. Whenever we are unable to find better results, we try to determine the bound of the optimal solution. As a result, the optimality of some previous results is proved.
Constructing Low-latency Involutory MDS Matrices with Lightweight Circuits 📺
MDS matrices are important building blocks providing diffusion functionality for the design of many symmetric-key primitives. In recent years, continuous efforts are made on the construction of MDS matrices with small area footprints in the context of lightweight cryptography. Just recently, Duval and Leurent (ToSC 2018/FSE 2019) reported some 32 × 32 binary MDS matrices with branch number 5, which can be implemented with only 67 XOR gates, whereas the previously known lightest ones of the same size cost 72 XOR gates.In this article, we focus on the construction of lightweight involutory MDS matrices, which are even more desirable than ordinary MDS matrices, since the same circuit can be reused when the inverse is required. In particular, we identify some involutory MDS matrices which can be realized with only 78 XOR gates with depth 4, whereas the previously known lightest involutory MDS matrices cost 84 XOR gates with the same depth. Notably, the involutory MDS matrix we find is much smaller than the AES MixColumns operation, which requires 97 XOR gates with depth 8 when implemented as a block of combinatorial logic that can be computed in one clock cycle. However, with respect to latency, the AES MixColumns operation is superior to our 78-XOR involutory matrices, since the AES MixColumns can be implemented with depth 3 by using more XOR gates.We prove that the depth of a 32 × 32 MDS matrix with branch number 5 (e.g., the AES MixColumns operation) is at least 3. Then, we enhance Boyar’s SLP-heuristic algorithm with circuit depth awareness, such that the depth of its output circuit is limited. Along the way, we give a formula for computing the minimum achievable depth of a circuit implementing the summation of a set of signals with given depths, which is of independent interest. We apply the new SLP heuristic to a large set of lightweight involutory MDS matrices, and we identify a depth 3 involutory MDS matrix whose implementation costs 88 XOR gates, which is superior to the AES MixColumns operation with respect to both lightweightness and latency, and enjoys the extra involution property.