IACR News
Here you can see all recent updates to the IACR webpage. These updates are also available:
27 July 2024
Munich, Germany, 23 June - 26 June 2025
Event Calendar
Event date: 23 June to 26 June 2025
Submission deadline: 9 September 2024
Notification: 11 November 2024
Submission deadline: 9 September 2024
Notification: 11 November 2024
Bhilai, India, 9 January - 11 February 2025
Event Calendar
Event date: 9 January to 11 February 2025
Submission deadline: 15 August 2024
Notification: 30 September 2024
Submission deadline: 15 August 2024
Notification: 30 September 2024
Taipei, Taiwan, 8 April - 10 April 2025
Event Calendar
Event date: 8 April to 10 April 2025
Submission deadline: 25 October 2024
Notification: 6 January 2025
Submission deadline: 25 October 2024
Notification: 6 January 2025
Toronto, Canada, 13 August - 15 August 2025
Event Calendar
Event date: 13 August to 15 August 2025
25 July 2024
JAIPUR, India, 16 December - 20 December 2024
Event Calendar
Event date: 16 December to 20 December 2024
Xinyi Ji, Jiankuo Dong, Junhao Huang, Zhijian Yuan, Wangchen Dai, Fu Xiao, Jingqiang Lin
ePrint Report
The field of post-quantum cryptography (PQC) is continuously evolving. Many researchers are exploring efficient PQC implementation on various platforms, including x86, ARM, FPGA, GPU, etc. In this paper, we present an Efficient CryptOgraphy CRYSTALS (ECO-CRYSTALS) implementation on standard 64-bit RISC-V Instruction Set Architecture (ISA). The target schemes are two winners of the National Institute of Standards and Technology (NIST) PQC competition: CRYSTALS-Kyber and CRYSTALS-Dilithium, where the two most time-consuming operations are Keccak and polynomial multiplication. Notably, this paper is the first to deploy Kyber and Dilithium on the 64-bit RISC-V ISA. Firstly, we propose a better scheduling strategy for Keccak, which is specifically tailored for the 64-bit dual-issue RISC-V architecture. Our 24-round Keccak permutation (Keccak-$p$[1600,24]) achieves a 59.18% speed-up compared to the reference implementation. Secondly, we apply two modular arithmetic (Montgomery arithmetic and Plantard arithmetic) in the polynomial multiplication of Kyber and Dilithium to get a better lazy reduction. Then, we propose a flexible dual-instruction-issue scheme of Number Theoretic Transform (NTT). As for the matrix-vector multiplication, we introduce a row-to-column processing methodology to minimize the expensive memory access operations. Compared to the reference implementation, we obtain a speedup of 53.85%$\thicksim$85.57% for NTT, matrix-vector multiplication, and INTT in our ECO-CRYSTALS. Finally, our ECO-CRYSTALS implementation for key generation, encapsulation, and decapsulation in Kyber achieves 399k, 448k, and 479k cycles respectively, achieving speedups of 60.82%, 63.93%, and 65.56% compared to the NIST reference implementation. Similarly, our ECO-CRYSTALS implementation for key generation, sign, and verify in Dilithium reaches 1,364k, 3,191k, and 1,369k cycles, showcasing speedups of 54.84%, 64.98%, and 57.20%, respectively.
Qianqian Yang, Ling Song, Nana Zhang, Danping Shi, Libo Wang, Jiahao Zhao, Lei Hu, Jian Weng
ePrint Report
The rectangle attack has shown to be a very powerful form of cryptanalysis against block ciphers. Given a rectangle distinguisher, one expects to mount key recovery attacks as efficiently as possible. In the literature, there have been four algorithms for rectangle key recovery attacks. However, their performance varies from case to case. Besides, numerous are the applications where the attacks lack optimality. In this paper, we delve into the rectangle key recovery and propose a unified and generic key recovery algorithm, which supports any possible attacking parameters. Not only does it encompass the four existing rectangle key recovery algorithms, but it also reveals five new types of attacks that were previously overlooked. Further, we put forward a counterpart for boomerang key recovery attacks, which supports any possible attacking parameters as well. Along with these new key recovery algorithms, we propose a framework to automatically determine the best parameters for the attack. To demonstrate the efficiency of the new key recovery algorithms, we apply them to \serpent, \aes-192, \craft, \skinny, and \deoxysbc-256 based on existing distinguishers, yielding a series of improved attacks.
Peihan Miao, Xinyi Shi, Chao Wu, Ruofan Xu
ePrint Report
Privacy-preserving machine learning (PPML) enables multiple distrusting parties to jointly train ML models on their private data without revealing any information beyond the final trained models. In this work, we study the client-aided two-server setting where two non-colluding servers jointly train an ML model on the data held by a large number of clients. By involving the clients in the training process, we develop efficient protocols for training algorithms including linear regression, logistic regression, and neural networks. In particular, we introduce novel approaches to securely computing inner product, sign check, activation functions (e.g., ReLU, logistic function), and division on secret shared values, leveraging lightweight computation on the client side. We present constructions that are secure against semi-honest clients and further enhance them to achieve security against malicious clients. We believe these new client-aided techniques may be of independent interest.
We implement our protocols and compare them with the two-server PPML protocols presented in SecureML (Mohassel and Zhang, S&P'17) across various settings and ABY2.0 (Patra et al., Usenix Security'21) theoretically. We demonstrate that with the assistance of untrusted clients in the training process, we can significantly improve both the communication and computational efficiency by orders of magnitude. Our protocols compare favorably in all the training algorithms on both LAN and WAN networks.
We implement our protocols and compare them with the two-server PPML protocols presented in SecureML (Mohassel and Zhang, S&P'17) across various settings and ABY2.0 (Patra et al., Usenix Security'21) theoretically. We demonstrate that with the assistance of untrusted clients in the training process, we can significantly improve both the communication and computational efficiency by orders of magnitude. Our protocols compare favorably in all the training algorithms on both LAN and WAN networks.
Jianming Lin, Chang-An Zhao, Yuhao Zheng
ePrint Report
For many pairing-based cryptographic protocols such as Direct Anonymous Attestation (DAA) schemes, the arithmetic on the first pairing subgroup $\mathbb{G}_1$ is more fundamental. Such operations heavily depend on the sizes of prime fields. At the 192-bit security level, Gasnier and Guillevic presented a curve named GG22D7-457 with CM-discriminant $D = 7$ and embedding degree $k = 22$. Compared to other well-known pairing-friendly curves at the same security level, the curve GG22D7-457 has smaller prime field size and $\rho$-value, which benefits from the fast operations on $\mathbb{G}_1$. However, the pairing computation on GG22D7-457 is not efficient.
In this paper, we investigate to derive a higher performance for the pairing computation on GG22D7-457. We first propose novel formulas of the super-optimal pairing on this curve by utilizing a $2$-isogeny as GLV-endomorphism. Besides, this tool can be generalized to more generic families of pairing-friendly curves with $n$-isogenies as endomorphisms. In our paper, we provide the explicit formulas for the super-optimal pairings exploiting $2, 3$-isogenies. Finally, we make a concrete computational cost analysis and implement the pairing computations on curve GG22D7-457 using our approaches. In terms of Miller function evaluation, employing the techniques in this paper obtain a saving of $24.44\% $ in $\mathbb{F}_p$-multiplications compared to the optimal ate pairing. As for the running time, the experimental results illustrate that the Miller loop on GG22D7-457 by utilizing our methods is $26.0\%$ faster than the state-of-the-art. Additionally, the performance of the super-optimal pairing on GG22D7-457 is competitive compared to the well-known pairing-friendly curves at the 192-bit security level. These results show that GG22D7-457 becomes an attractive candidate for the pairing-based protocols. Furthermore, our techniques have the potential to enhance the applications of super-optimal pairings on more pairing-friendly curves.
Rafael Carrera Rodriguez, Emanuele Valea, Florent Bruguier, Pascal Benoit
ePrint Report
The rapid evolution of post-quantum cryptography, spurred by standardization efforts such as those led by NIST, has highlighted the prominence of lattice-based cryptography, notably exemplified by CRYSTALS-Kyber. However, concerns persist regarding the security of cryptographic implementations, particularly in the face of Side-Channel Attacks (SCA). The usage of operations like the Number Theoretic
Transform (NTT) in CRYSTALS-Kyber introduces vulnerabilities to SCA, especially single-trace ones, such as soft-analytical side-channel attacks. To address this threat, Ravi et al. proposed local masking as a countermeasure by randomizing the NTT’s twiddle factors, but its implementation and security implications require further investigation. This paper presents a hardware implementation of the NTT with local masking, evaluating its performance, area utilization, and security impacts. Additionally, it analyzes the vulnerabilities inherent in local masking and assesses its practical security effectiveness through non-specific t-tests, showing that there are configurations of local masking that are more prone to leakage than others.
Hugues RANDRIAMBOLOLONA
ePrint Report
We present a new distinguisher for alternant and Goppa codes, whose complexity is subexponential in the error-correcting capability, hence better than that of generic decoding algorithms. Moreover it does not suffer from the strong regime limitations of the previous distinguishers or structure recovery algorithms: in particular, it applies to the codes used in the Classic McEliece candidate for postquantum cryptography standardization. The invariants that allow us to distinguish are graded Betti numbers of the homogeneous coordinate ring of a shortening of the dual code.
Since its introduction in 1978, this is the first time an analysis of the McEliece cryptosystem breaks the exponential barrier.
Since its introduction in 1978, this is the first time an analysis of the McEliece cryptosystem breaks the exponential barrier.
Amin Abdulrahman, Felix Oberhansl, Hoang Nguyen Hien Pham, Jade Philipoom, Peter Schwabe, Tobias Stelzer, Andreas Zankl
ePrint Report
This paper presents extensions to the OpenTitan hardware root of trust that aim at enabling high-performance lattice-based cryptography. We start by carefully optimizing ML-KEM and ML-DSA - the two primary algorithms selected by NIST for standardization - in software targeting the OTBN accelerator. Based on profiling results of these implementations, we propose tightly integrated extensions to OTBN, specifically an interface from OTBN to OpenTitan's Keccak accelerator (KMAC core) and extensions to the OTBN ISA to support operations on 256-bit vectors. We implement these extensions in hardware and show that we achieve a speedup by a factor between 6 and 9 for different operations and parameter sets of ML-KEM and ML-DSA compared to our baseline implementation on unmodified OTBN. This speedup is achieved with an increase in cell count of less than 12% in OTBN, which corresponds to an increase of less than 2% for the full Earlgrey OpenTitan core.
Zhengjun Cao, Lihua Liu
ePrint Report
We show that the authentication protocol [IEEE Internet Things J., 2023, 10(1), 867-876] is not correctly specified, because the server cannot complete its computations. To revise, the embedded device needs to compute an extra point multiplication over the underlying elliptic curve. We also find the protocol cannot provide anonymity, not as claimed. It can only provide pseudonymity.
Nan Cheng, Aikaterini Mitrokotsa, Feng Zhang, Frank Hartmann
ePrint Report
Computing the maximum from a list of secret inputs is a widely-used functionality that is employed ei- ther indirectly as a building block in secure computation frameworks, such as ABY (NDSS’15) or directly used in multiple applications that solve optimisation problems, such as secure machine learning or secure aggregation statistics. Incremental distributed point function (I-DPF) is a powerful primitive (IEEE S&P’21) that significantly reduces the client- to-server communication and are employed to efficiently and securely compute aggregation statistics.
In this paper, we investigate whether I-DPF can be used to improve the efficiency of secure two-party computation (2PC) with an emphasis on computing the maximum value and the k-th (with k unknown to the computing parties) ranked value from a list of secret inputs. Our answer is affir- mative, and we propose novel secure 2PC protocols that use I-DPF as a building block, resulting in significant efficiency gains compared to the state-of-the-art. More precisely, our contributions are: (i) We present two new secure computa- tion frameworks that efficiently compute secure aggregation statistics bit-wisely or batch-wisely; (ii) we propose novel protocols to compute the maximum value, the k-th ranked value from a list of secret inputs; (iii) we provide variations of the proposed protocols that can perform batch computations and thus provide further efficiency improvements; and (iv) we provide an extensive performance evaluation for all proposed protocols.
Our protocols have a communication complexity that is independent of the number of secret inputs and linear to the length of the secret input domain. Our experimental re- sults show enhanced efficiency over state-of-the-art solutions, particularly notable when handling large-scale inputs. For instance, in scenarios involving an input set of five million elements with an input domain size of 31 bits, our protocol ΠMax achieves an 18% reduction in online execution time and a 67% decrease in communication volume compared to the most efficient existing solution.
The Espresso Sequencing Network: HotShot Consensus, Tiramisu Data-Availability, and Builder-Exchange
Jeb Bearer, Benedikt Bünz, Philippe Camacho, Binyi Chen, Ellie Davidson, Ben Fisch, Brendon Fish, Gus Gutoski, Fernando Krell, Chengyu Lin, Dahlia Malkhi, Kartik Nayak, Keyao Shen, Alex Xiong, Na ...
ePrint Report
Building a Consensus platform for shared sequencing can power an ecosystem of layer-2 solutions such as rollups which are crucial for scaling blockchains (e.g.,Ethereum). However, it drastically differs from conventional Consensus for blockchains in two key considerations:
• (No) Execution: A shared sequencing platform is not responsible for pre-validating blocks nor for processing state updates. Therefore, agreement is formed on a sequence of certificates of block data-availability (DA) without persisting them or obtaining blocks in full. At the same time, the platform must stream block data with very high efficiency to layer-2 entities for execution, or (in the case of rollups) for proof generation.
• Builder-Exchange: A shared sequencing platform delegates to external entities to build blocks and separates it from the role of a consensus proposer. This allows an ecosystem of specialized builders to pre-validate transactions for diversified rollups, languages, and MEV exploits. However, separating the task of block-building from proposing brings a new challenge. Builders want assurances that their blocks would commit in exchange for revealing their contents, whereas validators/proposers want assurance that the data in committed blocks will be available and fees paid. Neither one trusts the other, hence the shared sequencing platform should facilitate a “fair-exchange” between builders and the sequencing network. The Espresso Sequencing Network is purpose-built to address these unique considerations.
Among the main novelties of the design are (i) a three-layered DA system called Tiramisu, coupled with (ii) a costless integration of the DA with the platform’s consensus core, and (iii) a Builder-Exchange mechanism between builders and the consensus core.
Note that this paper relies substantially on and can be seen as an extension of The Espresso Sequencer: HotShot Consensus and Tiramisu Data Availability [84].
S. M. Dehnavi, M. R. Mirzaee Shamsabad
ePrint Report
In this paper, using the concept of equivalence of mappings we characterize all of the one-XOR matrices which are used in hardware applications and propose a family of lightweight linear mappings for software-oriented applications in symmetric cryptography. Then, we investigate interleaved linear mappings and based upon this study, we present generalized dynamic primitive LFSRs along with dynamic linear components for construction of diffusion layers.
From the mathematical viewpoint, this paper presents involutive sparse binary matrices as well as sparse binary matrices with sparse inverses. Another interesting result of our investigation is that, by our characterization of one-XOR matrices, the search space for finding a $k$ such that $x^n+x^k+1$ is a primitive trinomial could be reduced.
Yaacov Belenky, Hennadii Chernyshchyk, Oleg Karavaev, Oleh Maksymenko, Valery Teper, Daria Ryzhkova, Itamar Levi, Osnat Keren, Yury Kreimer
ePrint Report
Side-channel-analysis (SCA) resistance with cost optimization in AES hardware implementations remains a significant challenge. While traditional masking-based schemes offer provable security, they often incur substantial resource overheads (latency, area, randomness, performance, power consumption). Alternatively, the RAMBAM scheme introduced a redundancy-based approach to control the signal-to-noise ratio, and achieves exponential leakage reduction as redundancy increases. This method results in only a slight increase in area and in power consumption, and a significant decrease in the amount of randomness needed, without any increase in latency. However, it lacks a formal security proof.
In this study, we introduce a scheme, denoted STORM, that synergizes RAMBAM's methodology with the utilization of look-up-tables (LUTs) in memory (ROM/RAM) in a redundant domain. STORM, like RAMBAM, is as fast as a typical unprotected implementation and has the same latency, but has a significantly higher maximal clock frequency than RAMBAM, and consumes less than half the power. RAMBAM and STORM are code-based schemes in the sense that their set of representations is a code in the vector space $GF(2)^{8+d}$. RAMBAM requires a richer structure of a ring on $GF(2)^{8+d}$ and a ring homomorphism whereas STORM utilizes a simple vector space. In code-based-masking (CBM), as in all masking schemes, non-interference based notions (t-S/NI) are fundamental for establishing provable security. RAMBAM and STORM diverge from this approach. While CBM employs codes in vector spaces over $GF(2^8)$ for AES protection, RAMBAM and STORM use codes over $GF(2)$ without the need for t-S/NI-gadgets, leaving them both smaller and more efficient.
Independence in security proofs typically means that in each individual computation (in a clock-cycle), at least one share does not participate. This approach does not work for RAMBAM where several field multiplications are executed sequentially in a cycle. However, in STORM no multiplications are performed due to its memory based tables, leaving only (independent) bitwise-XORs. Therefore, the reasoning necessary for proving security is different and STORM, unlike RAMBAM, enjoys provable security. We consider two distinct scenarios, \emph{both with provable security}: (1) STORM1 --- ``leakage-free’’ memory reads, demonstrating (1,1,0)-robustness for LUTs with redundancy 2 in the 1-probe model and for LUTs with redundancy 6 in the 2-probe model, and (2) STORM2 --- leaky memory reads, where additional protection mechanisms and a notion of memory-read robustness are introduced.
STORM can be implemented not only in HW, but in SW as well. However, this paper and the proofs in it relate to STORM's HW implementations.
In this study, we introduce a scheme, denoted STORM, that synergizes RAMBAM's methodology with the utilization of look-up-tables (LUTs) in memory (ROM/RAM) in a redundant domain. STORM, like RAMBAM, is as fast as a typical unprotected implementation and has the same latency, but has a significantly higher maximal clock frequency than RAMBAM, and consumes less than half the power. RAMBAM and STORM are code-based schemes in the sense that their set of representations is a code in the vector space $GF(2)^{8+d}$. RAMBAM requires a richer structure of a ring on $GF(2)^{8+d}$ and a ring homomorphism whereas STORM utilizes a simple vector space. In code-based-masking (CBM), as in all masking schemes, non-interference based notions (t-S/NI) are fundamental for establishing provable security. RAMBAM and STORM diverge from this approach. While CBM employs codes in vector spaces over $GF(2^8)$ for AES protection, RAMBAM and STORM use codes over $GF(2)$ without the need for t-S/NI-gadgets, leaving them both smaller and more efficient.
Independence in security proofs typically means that in each individual computation (in a clock-cycle), at least one share does not participate. This approach does not work for RAMBAM where several field multiplications are executed sequentially in a cycle. However, in STORM no multiplications are performed due to its memory based tables, leaving only (independent) bitwise-XORs. Therefore, the reasoning necessary for proving security is different and STORM, unlike RAMBAM, enjoys provable security. We consider two distinct scenarios, \emph{both with provable security}: (1) STORM1 --- ``leakage-free’’ memory reads, demonstrating (1,1,0)-robustness for LUTs with redundancy 2 in the 1-probe model and for LUTs with redundancy 6 in the 2-probe model, and (2) STORM2 --- leaky memory reads, where additional protection mechanisms and a notion of memory-read robustness are introduced.
STORM can be implemented not only in HW, but in SW as well. However, this paper and the proofs in it relate to STORM's HW implementations.
Roberto Avanzi, Orr Dunkelman, Kazuhiko Minematsu
ePrint Report
In this note, we introduce the MATTER Tweakable Block Cipher, designed principally for low latency in low-area hardware implementations, but that can also be implemented in an efficient and compact way in software.
MATTER is a 512-bit wide balanced Feistel network with three to six rounds, using the ASCON permutation as the round function. The Feistel network defines a keyed, non-tweakable core, which is made tweakable by using the encryption of the tweak as its key. Key and tweak are 320-bit inputs.
MATTER is particularly suitable for use in an OCB-like mode of operation, with an encrypted checksum for authentication.
MATTER is a 512-bit wide balanced Feistel network with three to six rounds, using the ASCON permutation as the round function. The Feistel network defines a keyed, non-tweakable core, which is made tweakable by using the encryption of the tweak as its key. Key and tweak are 320-bit inputs.
MATTER is particularly suitable for use in an OCB-like mode of operation, with an encrypted checksum for authentication.
Giacomo Borin, Yi-Fu Lai, Antonin Leroux
ePrint Report
We construct two efficient post-quantum ring signatures with anonymity against full key exposure from isogenies, addressing limitations of existing isogeny-based ring signatures.
First, we present an efficient concrete distinguisher for the SQIsign simulator when the signing key is provided using one transcript. This shows that turning SQIsign into an efficient full anonymous ring signature requires some new ideas.
Second, we propose a variant of SQIsign that is resistant to the distinguisher attack with only a $\times 1.33$ increase in size and we render it to a ring signature, that we refer as $\mathsf{Erebor}$. This variant introduces a new zero-knowledge assumption that ensures full anonymity. The efficiency of $\mathsf{Erebor}$ remains comparable to that of SQIsign, with only a proportional increase due to the ring size. This results in a signature size of $0.68 \mathsf{KB}$ for 4 users and $1.35 \mathsf{KB}$ for 8 users, making it the most compact post-quantum ring signature for up to 31 users.
Third, we revisit the GPS signature scheme (Asiacrypt'17), developing efficient subroutines to make the scheme more efficient and significantly reduce the resulting signature size. By integrating our scheme with the paradigm by Beullens, Katsumata, and Pintore (Asiacrypt'20), we achieve an efficient logarithmic ring signature, that we call $\mathsf{Durian}$, resulting in a signature size of $9.87 \mathsf{KB}$ for a ring of size 1024.
Zhaoman Liu, Jianting Ning, Huiying Hou, Yunlei Zhao
ePrint Report
Hyperledger Fabric, an open-source, enterprise-grade consortium platform, employs an endorsement policy wherein a set of endorsers signs transaction proposals from clients to confirm their authenticity. The signatures from endorsers constitute the core component of endorsement. However, when dealing with dynamic transactions with high timeliness and frequent updates (e.g., stock trading, real-time ad delivery, news reporting, etc.), the current endorsement process somewhat slows down the transaction execution. Meanwhile, handling these continuously updated transactions consumes significant resources from endorsers, thereby constraining overall application efficiency.
To address these issues, this paper devises a novel sanitizable and accountable endorsement scheme by proposing a sanitizable multi-signature (SMS) as the theoretical tool. Specifically, we introduce the novel concept of sanitizable multi-signature and detail its instantiation. SMS combines the advantages of multi-signature and sanitizable signature, maintaining the compactness of the signature while allowing the sanitizer to adjust the initial endorsement result to fit the updated transaction content without interacting with the endorsers, so that both the authenticity and timeliness of transactions can be ensured. Additionally, SMS incorporates an innovative accountability mechanism to trace instances of improper data updates, thereby enhancing the security and reliability of the endorsement process.
We demonstrate the security of the proposed scheme through rigorous security analysis. Performance evaluations show that SMS can significantly reduce verification overhead and transaction size compared to the default ECDSA scheme in Fabric. Specifically, when verifying multiple endorsers' endorsements, our scheme exhibits a storage space reduction by approximately 30%-40% and a verification time reduction ranging from 9.2% to nearly 26.3%.
To address these issues, this paper devises a novel sanitizable and accountable endorsement scheme by proposing a sanitizable multi-signature (SMS) as the theoretical tool. Specifically, we introduce the novel concept of sanitizable multi-signature and detail its instantiation. SMS combines the advantages of multi-signature and sanitizable signature, maintaining the compactness of the signature while allowing the sanitizer to adjust the initial endorsement result to fit the updated transaction content without interacting with the endorsers, so that both the authenticity and timeliness of transactions can be ensured. Additionally, SMS incorporates an innovative accountability mechanism to trace instances of improper data updates, thereby enhancing the security and reliability of the endorsement process.
We demonstrate the security of the proposed scheme through rigorous security analysis. Performance evaluations show that SMS can significantly reduce verification overhead and transaction size compared to the default ECDSA scheme in Fabric. Specifically, when verifying multiple endorsers' endorsements, our scheme exhibits a storage space reduction by approximately 30%-40% and a verification time reduction ranging from 9.2% to nearly 26.3%.