IACR News
Here you can see all recent updates to the IACR webpage. These updates are also available:
30 October 2024
Laasya Bangalore, Albert Cheu, Muthuramakrishnan Venkitasubramaniam
ePrint Report
Distributed mean estimation (DME) is a fundamental and important task as it serves as a subroutine in convex optimization, aggregate statistics, and, more generally, federated learning. The inputs for distributed mean estimation (DME) are provided by clients (such as mobile devices), and these inputs often contain sensitive information. Thus, protecting privacy and mitigating the influence of malicious adversaries are critical concerns in DME. A surge of recent works has focused on building multiparty computation (MPC) based protocols tailored for the task of secure aggregation. However, MPC fails to directly address these two issues: (i) the potential manipulation of input by adversaries, and (ii) the leakage of information from the underlying function. This paper presents a novel approach that addresses both these issues. We propose a secure aggregation protocol with a robustness guarantee, effectively protecting the system from "faulty" inputs introduced by malicious clients. Our protocol further ensures differential privacy, so that the underlying function will not leak significant information about individuals.
Notably, this work represents the first comprehensive effort to combine robustness and differential privacy guarantees in the context of DME. In particular, we capture the security of the protocol via a notion of "usefulness" combined with differential privacy inspired by the work of Mironov et al. (CRYPTO 2009) and formally analyze this security guarantee for our protocol.
Daniel Cabarcas, Peigen Li, Javier Verbel, Ricardo Villanueva-Polanco
ePrint Report
SNOVA is a post-quantum digital signature scheme based on multivariate polynomials. It is a first-round candidate in an ongoing NIST standardization process for post-quantum signatures, where it stands out for its efficiency and compactness. Since its initial submission, there have been several improvements to its security analysis, both on key recovery and forgery attacks. All these works reduce to solving a structured system of quadratic polynomials, which we refer to as SNOVA system.
In this work, we propose a polynomial solving algorithm tailored for SNOVA systems, which exploits the stability of the system under the action of a commutative group of matrices. This new algorithm reduces the complexity to solve SNOVA systems, over generic ones. We show how to adapt the reconciliation and direct attacks in order to profit from the new algorithm. Consequently, we improve the reconciliation attack for all SNOVA parameter sets with speedup factors ranging between $2^3$ and $2^{22}$. Our algorithm also reduces the complexity of the direct attack for several parameter sets. It is particularly effective for the parameters that give the best performance to SNOVA $(l=4)$, and which were not taken below NIST's security threshold by previous attacks. Our attack brings these parameter sets $(l=4)$ below that threshold with speedup factors between $2^{33}$ and $2^{52}$, over the state-of-the-art.
In this work, we propose a polynomial solving algorithm tailored for SNOVA systems, which exploits the stability of the system under the action of a commutative group of matrices. This new algorithm reduces the complexity to solve SNOVA systems, over generic ones. We show how to adapt the reconciliation and direct attacks in order to profit from the new algorithm. Consequently, we improve the reconciliation attack for all SNOVA parameter sets with speedup factors ranging between $2^3$ and $2^{22}$. Our algorithm also reduces the complexity of the direct attack for several parameter sets. It is particularly effective for the parameters that give the best performance to SNOVA $(l=4)$, and which were not taken below NIST's security threshold by previous attacks. Our attack brings these parameter sets $(l=4)$ below that threshold with speedup factors between $2^{33}$ and $2^{52}$, over the state-of-the-art.
Phillip Gajland, Jonas Janneck, Eike Kiltz
ePrint Report
Falcon is a winner of NIST's six-year post-quantum cryptography standardisation competition. Based on the celebrated full-domain-hash framework of Gentry, Peikert and Vaikuntanathan (GPV) (STOC'08), Falcon leverages NTRU lattices to achieve the most compact signatures among lattice-based schemes.
Its security hinges on a Rényi divergence-based argument for Gaussian samplers, a core element of the scheme. However, the GPV proof, which uses statistical distance to argue closeness of distributions, fails when applied naively to Falcon due to parameter choices resulting in statistical distances as large as $2^{-34}$. Additional implementation-driven deviations from the GPV framework further invalidate the original proof, leaving Falcon without a security proof despite its selection for standardisation.
This work takes a closer look at Falcon and demonstrates that introducing a few minor, conservative modifications allows for the first formal proof of the scheme in the random oracle model. At the heart of our analysis lies an adaptation of the GPV framework to work with the Rényi divergence, along with an optimised method for parameter selection under this measure. Furthermore, we obtain a provable version of the GPV framework over NTRU rings. Both these tools may be of independent interest.
Unfortunately, our analysis shows that despite our modification of Falcon-512 and Falcon-1024 we do not achieve strong unforgeability for either scheme. For plain unforgeability we are able to show that our modifications to Falcon-512 barely satisfy the claimed 120-bit security target and for Falcon-1024 we confirm the claimed security level. As such we recommend revisiting falcon and its parameters.
Its security hinges on a Rényi divergence-based argument for Gaussian samplers, a core element of the scheme. However, the GPV proof, which uses statistical distance to argue closeness of distributions, fails when applied naively to Falcon due to parameter choices resulting in statistical distances as large as $2^{-34}$. Additional implementation-driven deviations from the GPV framework further invalidate the original proof, leaving Falcon without a security proof despite its selection for standardisation.
This work takes a closer look at Falcon and demonstrates that introducing a few minor, conservative modifications allows for the first formal proof of the scheme in the random oracle model. At the heart of our analysis lies an adaptation of the GPV framework to work with the Rényi divergence, along with an optimised method for parameter selection under this measure. Furthermore, we obtain a provable version of the GPV framework over NTRU rings. Both these tools may be of independent interest.
Unfortunately, our analysis shows that despite our modification of Falcon-512 and Falcon-1024 we do not achieve strong unforgeability for either scheme. For plain unforgeability we are able to show that our modifications to Falcon-512 barely satisfy the claimed 120-bit security target and for Falcon-1024 we confirm the claimed security level. As such we recommend revisiting falcon and its parameters.
Hanzhi Liu, Jingyu Ke, Hongbo Wen, Robin Linus, Lukas George, Manish Bista, Hakan Karakuş, Domo, Junrui Liu, Yanju Chen, Yu Feng
ePrint Report
Bitcoin, while being the most prominent blockchain with the largest market capitalization, suffers from scalability and throughput limitations that impede the development of ecosystem projects like Bitcoin Decentralized Finance (BTCFi). Recent advancements in BitVM propose a promising Layer 2 (L2) solution to enhance Bitcoin's scalability by enabling complex computations off-chain with on-chain verification. However, Bitcoin's constrained programming environment—characterized by its non-Turing-complete Script language lacking loops and recursion, and strict block size limits—makes developing complex applications labor-intensive, error-prone, and necessitates manual partitioning of scripts. Under this complex programming model, subtle mistakes could lead to irreversible damage in a trustless environment like Bitcoin. Ensuring the correctness and security of such programs becomes paramount.
To address these challenges, we introduce the first formal verification tool for BitVM implementations. Our approach involves designing a register-based, higher-level domain-specific language (DSL) that abstracts away complex stack operations, allowing developers to reason about program correctness more effectively while preserving the semantics of the original Bitcoin Script. We present a formal computational model capturing the semantics of BitVM execution and Bitcoin Script, providing a foundation for rigorous verification. To efficiently handle large programs and complex constraints arising from unrolled computations that simulate loops, we summarize repetitive "loop-style" computations using loop invariant predicates in our DSL. We leverage a counterexample-guided inductive synthesis (CEGIS) procedure to lift low-level Bitcoin Script into our DSL, facilitating efficient verification without sacrificing accuracy. Evaluated on 98 benchmarks from BitVM's SNARK verifier, our tool successfully verifies 94% of cases within seconds, demonstrating its effectiveness in enhancing the security and reliability of BitVM.
To address these challenges, we introduce the first formal verification tool for BitVM implementations. Our approach involves designing a register-based, higher-level domain-specific language (DSL) that abstracts away complex stack operations, allowing developers to reason about program correctness more effectively while preserving the semantics of the original Bitcoin Script. We present a formal computational model capturing the semantics of BitVM execution and Bitcoin Script, providing a foundation for rigorous verification. To efficiently handle large programs and complex constraints arising from unrolled computations that simulate loops, we summarize repetitive "loop-style" computations using loop invariant predicates in our DSL. We leverage a counterexample-guided inductive synthesis (CEGIS) procedure to lift low-level Bitcoin Script into our DSL, facilitating efficient verification without sacrificing accuracy. Evaluated on 98 benchmarks from BitVM's SNARK verifier, our tool successfully verifies 94% of cases within seconds, demonstrating its effectiveness in enhancing the security and reliability of BitVM.
Dedy Septono Catur Putranto, Rini Wisnu Wardhani, Jaehan Cho, Howon Kim
ePrint Report
Elliptic Curve Point Multiplication (ECPM) is a key component of the Elliptic Curve Cryptography (ECC) hierarchy protocol. However, the specific estimation of resources required for this process remains underexplored despite its significance in the cryptanalysis of ECC algorithms, particularly binary ECC in GF (2?). Given the extensive use of ECC algorithms in various security protocols and devices, it is essential to conduct this examination to gain valuable insights into its cryptanalysis, specifically in terms of providing precise resource estimations, which serve as a solid basis for further investigation in solving the Elliptic Curve Discrete Logarithm Problem. Expanding on several significant prior research, in this work, we refer to as ECPM cryptanalysis, we estimate quantum resources, including qubits, gates, and circuit depth, by integrating point addition (PA) and point-doubling (PD) into the ECPM scheme, culminating in a Shor’s algorithm-based binary ECC cryptanalysis circuit. Focusing on optimizing depth, we elaborate on and implement the most efficient PD circuit and incorporate optimized Karatsuba multiplication and FLT-based inversion algorithms for PA and PD operations. Compared to the latest PA-only circuits, our preliminary results showcase significant resource optimization for various ECPM implementations, including single-step ECPM, ECPM with combined or selective PA/PD utilization, and total−step ECPM (2? PD+2 PA).
Masayuki Abe, David Balbás, Dung Bui, Miyako Ohkubo, Zehua Shang, Mehdi Tibouchi
ePrint Report
In many multi-round public-coin interactive proof systems, challenges in different rounds serve different roles, but a formulation that actively utilizes this aspect has not been studied extensively. In this paper, we propose new notions called critical-round special honest verifier zero-knowledge and critical-round special soundness. Our notions are simple, intuitive, easy to apply, and capture several practical multi-round proof protocols including, but not limited to, those from the MPC-in-the-Head paradigm.
We demonstrate the usefulness of these notions with two fundamental applications where three-round protocols are known to be useful, but multi-round ones generally fail. First, we show that critical-round proofs yield trapdoor commitment schemes. This result also enables the instantiation of post-quantum secure adaptor signatures and threshold ring signatures from MPCitH, resolving open questions in (Haque and Scafuro, PKC 2020) and in (Liu et al., ASIACRYPT 2024). Second, we show that critical-round proofs can be securely composed using the Cramer-Schoenmakers-Damgård method. This solves an open question posed by Abe et al. in CRYPTO 2024.
Overall, these results shed new light on the potential of multi-round proofs in both theoretical and practical cryptographic protocol design
We demonstrate the usefulness of these notions with two fundamental applications where three-round protocols are known to be useful, but multi-round ones generally fail. First, we show that critical-round proofs yield trapdoor commitment schemes. This result also enables the instantiation of post-quantum secure adaptor signatures and threshold ring signatures from MPCitH, resolving open questions in (Haque and Scafuro, PKC 2020) and in (Liu et al., ASIACRYPT 2024). Second, we show that critical-round proofs can be securely composed using the Cramer-Schoenmakers-Damgård method. This solves an open question posed by Abe et al. in CRYPTO 2024.
Overall, these results shed new light on the potential of multi-round proofs in both theoretical and practical cryptographic protocol design
Toi Tomita, Junji Shikata
ePrint Report
In this paper, we propose an efficient identity-based encryption (IBE) scheme based on the ring learning with errors (RLWE) assumption in the (quantum) random oracle model. Our IBE scheme is (asymptotically) as efficient as the most practical lattice-based IBE scheme proposed by Ducal et al. (ASIACRYPT 2014). Furthermore, our scheme is adaptively and anonymously secure, and its security reduction is tight. We design our IBE scheme by instantiating the framework of Gentry et al. (STOC 2008) using the compact preimage sampling proposed by Yu et al. (CRYPTO 2023). The tightness of our IBE scheme is obtained by combining the proof technique of Katsumata et al. (ASIACRYPT 2018) with the results for ideal lattices developed by Mera et al. (PKC 2022).
Mi-Ying (Miryam) Huang, Baiyu Li, Xinyu Mao, Jiapeng Zhang
ePrint Report
We present an efficient Publicly Verifiable Fully Homomorphic Encryption scheme that, along with being able to evaluate arbitrary boolean circuits over ciphertexts, also generates a succinct proof of correct homomorphic computation. Our scheme is based on FHEW proposed by Ducas and Micciancio (Eurocrypt'15), and we incorporate the GINX homomorphic accumulator (Eurocrypt'16) for improved bootstrapping efficiency. In order to generate the proof efficiently, we generalize the widely used Rank-1 Constraint System (R1CS) to the ring setting and obtain Ring R1CS, to natively express homomorphic computation in FHEW.
In particular, we develop techniques to efficiently express in our Ring R1CS the "non-arithmetic" operations, such as gadget decomposition and modulus switching used in the FHEW construction. We further construct a SNARG for Ring R1CS instances, by translating the Ring R1CS instance into a sum-check protocol over polynomials, and then compiling it into a succinct non-interactive proof by incorporating the lattice-based polynomial commitment scheme of Cini, Malavolta, Nguyen, and Wee (Crypto'24). Putting together, our Publicly Verifiable FHE scheme relies on standard hardness assumptions about lattice problems such that it generates a succinct proof of homomorphic computation of circuit $C$ in time $O(|C|^2\cdot poly(\lambda))$ and of size $O(\log^2{|C|}\cdot poly(\lambda))$. Besides, our scheme achieves the recently proposed IND-SA (indistinguishability under semi-active attack) security by Walter (EPrint 2024/1207) that exactly captures client data privacy when a homomorphic computation can be verified.
Gorjan Alagic, Dana Dachman-Soled, Manasi Shingane, Patrick Struck
ePrint Report
In their seminal work, Gentry and Wichs (STOC'11) established an impossibility result for the task of constructing an adaptively-sound SNARG via black-box reduction from a falsifiable assumption.
An exciting set of recent SNARG constructions demonstrated that, if one adopts a weaker but still quite meaningful notion of adaptive soundness, then impossibility no longer holds (Waters-Wu, Waters-Zhandry, Mathialagan-Peters-Vaikunthanathan ePrint'24). These fascinating new results raise an intriguing possibility: is there a way to remove this slight weakening of adaptive soundness, thereby completely circumventing the Gentry-Wichs impossibility?
A natural route to closing this gap would be to use a quantum black-box reduction, i.e., a reduction that can query the SNARG adversary on superpositions of inputs. This would take advantage of the fact that Gentry-Wichs only consider classical reductions. In this work, we show that this approach cannot succeed. Specifically, we extend the Gentry-Wichs impossibility result to quantum black-box reductions, and thereby establish an important limit on the power of such reductions.
Jingwei Chen, Linhan Yang, Wenyuan Wu, Yang Liu, Yong Feng
ePrint Report
Homomorphically encrypted matrix operations are extensively used in various privacy-preserving applications. Consequently, reducing the cost of encrypted matrix operations is a crucial topic on which numerous studies have been conducted. In this paper, we introduce a novel matrix encoding method, named bicyclic encoding, under which we propose two new algorithms BMM-I and BMM-II for encrypted matrix multiplication. BMM-II outperforms the stat-of-the-art algorithms in theory, while BMM-I, combined with the segmented strategy, performs well in practice, particularly for matrices with high dimensions. Another noteworthy advantage of bicyclic encoding is that it allows for transposing an encrypted matrix entirely free. A comprehensive experimental study based on our proof-of-concept implementation shows that each algorithm introduced in this paper has specific scenarios outperforming existing algorithms, achieving speedups ranging from 2x to 38x.
Hao Cheng, Jiliang Li, Yizhong Liu, Yuan Lu, Weizhi Meng, Zhenfeng Zhang
ePrint Report
Shoup and Smart (SS24) recently introduced a lightweight asynchronous verifiable secret sharing (AVSS) protocol with optimal resilience directly from cryptographic hash functions (JoC 2024), offering plausible quantum resilience and computational efficiency. However, SS24 AVSS only achieves standard secrecy to keep the secret confidential against $n/3$ corrupted parties \textit{if no honest party publishes its share}. In contrast, from ``heavyweight'' public-key cryptography, one can realize so-called \textit{high-threshold} asynchronous verifiable secret sharing (HAVSS), with a stronger \textit{high-threshold} secrecy to tolerate $n/3$ corrupted parties and additional leaked shares from $n/3$ honest parties. This raises the following question: can we bridge the remaining gap to design an efficient HAVSS using only lightweight cryptography?
We answer the question in the affirmative by presenting a lightweight HAVSS with optimal resilience. When executing across $n$ parties to share a secret, it attains a worst-case communication complexity of $\Tilde{\bigO}(\lambda n^3)$ (where $\lambda$ is the cryptographic security parameter) and realizes high-threshold secrecy to tolerate a fully asynchronous adversary that can control $t= \lfloor \frac{n-1}{3} \rfloor$ malicious parties and also learn $t$ additional secret shares from some honest parties. The (worst-case) communication complexity of our lightweight HAVSS protocol matches that of SS24 AVSS---the state-of-the-art lightweight AVSS without high-threshold secrecy. Notably, our design is a direct and concretely efficient reduction to hash functions in the random oracle model, without extra setup assumptions like CRS/PKI or heavy intermediate steps like hash-based zk-STARK.
We answer the question in the affirmative by presenting a lightweight HAVSS with optimal resilience. When executing across $n$ parties to share a secret, it attains a worst-case communication complexity of $\Tilde{\bigO}(\lambda n^3)$ (where $\lambda$ is the cryptographic security parameter) and realizes high-threshold secrecy to tolerate a fully asynchronous adversary that can control $t= \lfloor \frac{n-1}{3} \rfloor$ malicious parties and also learn $t$ additional secret shares from some honest parties. The (worst-case) communication complexity of our lightweight HAVSS protocol matches that of SS24 AVSS---the state-of-the-art lightweight AVSS without high-threshold secrecy. Notably, our design is a direct and concretely efficient reduction to hash functions in the random oracle model, without extra setup assumptions like CRS/PKI or heavy intermediate steps like hash-based zk-STARK.
Henry Corrigan-Gibbs, Alexandra Henzinger, Yael Kalai, Vinod Vaikuntanathan
ePrint Report
We construct somewhat homomorphic encryption schemes from the learning sparse parities with noise (sparse LPN) problem, along with an assumption that implies linearly homomorphic encryption (e.g., the decisional Diffie-Hellman or decisional composite residuosity assumptions). Our resulting schemes support an a-priori bounded number of homomorphic operations: $O(\log \lambda/\log \log \lambda)$ multiplications followed by poly($\lambda$) additions, where $\lambda \in \mathbb{N}$ is a security parameter. These schemes have compact ciphertexts: after homomorphic evaluation, the bit-length of each ciphertext is a fixed polynomial in the security parameter $\lambda$, independent of the number of homomorphic operations applied to it. This gives the first somewhat homomorphic encryption schemes that can evaluate the class of bounded-degree polynomials with a bounded number of monomials without relying on lattice assumptions or bilinear maps.
Much like in the Gentry-Sahai-Waters fully homomorphic encryption scheme, ciphertexts in our scheme are matrices, homomorphic addition is matrix addition, and homomorphic multiplication is matrix multiplication. Moreover, when encrypting many messages at once and performing many homomorphic evaluations at once, the bit-length of ciphertexts in some of our schemes (before and after homomorphic evaluation) can be arbitrarily close to the bit-length of the plaintexts. The main limitation of our schemes is that they require a large evaluation key, whose size scales with the complexity of the homomorphic computation performed, though this key can be re-used across any polynomial number of encryptions and evaluations.
Much like in the Gentry-Sahai-Waters fully homomorphic encryption scheme, ciphertexts in our scheme are matrices, homomorphic addition is matrix addition, and homomorphic multiplication is matrix multiplication. Moreover, when encrypting many messages at once and performing many homomorphic evaluations at once, the bit-length of ciphertexts in some of our schemes (before and after homomorphic evaluation) can be arbitrarily close to the bit-length of the plaintexts. The main limitation of our schemes is that they require a large evaluation key, whose size scales with the complexity of the homomorphic computation performed, though this key can be re-used across any polynomial number of encryptions and evaluations.
Ali Babaei, Taraneh Eghlidos
ePrint Report
With the advent of quantum computers, the security of cryptographic primitives, including digital signature schemes, has been compromised. To deal with this issue, some signature schemes have been introduced to resist against these computers. These schemes are known as post-quantum signature schemes. One group of these schemes is based on the hard problems of coding theory, called code-based cryptographic schemes. Several code-based signature schemes are inspired by the McEliece encryption scheme using three non-singular, parity-check, and permutation matrices as the only components of the private keys, and their product as the public key. In this paper, we focus on the analysis of a class of such signature schemes. For this purpose, we first prove that the linear relationships between the columns of the parity-check/generator matrix appear in the public key matrix, and by exploiting this feature we perform a forgery attack on one of the signature schemes of this class as an evidence. The complexity of this attack is of O(n^4).
Razvan Barbulescu, Mugurel Barcau, Vicentiu Pasol
ePrint Report
Public key cryptography can be based on integer factorization and
the discrete logarithm problem (DLP), applicable in multiplicative groups and
elliptic curves. Regev’s recent quantum algorithm was initially designed for the
factorization and was later extended to the DLP in the multiplicative group.
In this article, we further extend the algorithm to address the DLP for elliptic
curves. Notably, based on celebrated conjectures in Number Theory, Regev’s
algorithm is asymptotically faster than Shor’s algorithm for elliptic curves.
Our analysis covers all cases where Regev’s algorithm can be applied. We
examine the general framework of Regev’s algorithm and offer a geometric
description of its parameters. This preliminary analysis enables us to certify
the success of the algorithm on a particular instance before running it.
In the case of integer factorization, we demonstrate that there exists an in-
finite family of RSA moduli for which the algorithm always fails. On the other
hand, when the parameters align with the Gaussian heuristics, we prove that
Regev’s algorithm succeeds. By noting that the algorithm naturally adapts
to the multidimensional DLP, we proved that it succeeds for a certain range
of parameters.
Alessandro Budroni, Andrea Natale
ePrint Report
In parallel with the standardization of lattice-based cryptosystems, the research community in Post-quantum Cryptography focused on non-lattice-based hard problems for constructing public-key cryptographic primitives. The Linear Code Equivalence (LCE) Problem has gained attention regarding its practical applications and cryptanalysis.
Recent advancements, including the LESS signature scheme and its candidacy in the NIST standardization for additional signatures, supported LCE as a foundation for post-quantum cryptographic primitives. However, recent cryptanalytic results have revealed vulnerabilities in LCE-based constructions when multiple related public keys are available for one specific code rate. In this work, we generalize the proposed attacks to cover all code rates. We show that the complexity of recovering the private key from multiple public keys is significantly reduced for any code rate scenario. Thus, we advise against constructing specific cryptographic primitives using LCE.
Nishat Koti, Varsha Bhat Kukkala, Arpita Patra, Bhavish Raj Gopal
ePrint Report
Privacy-preserving graph analysis allows performing computations on graphs that store sensitive information while ensuring all the information about the topology of the graph, as well as data associated with the nodes and edges, remains hidden. The current work addresses this problem by designing a highly scalable framework, $\mathsf{Graphiti}$, that allows securely realising any graph algorithm. $\mathsf{Graphiti}$ relies on the technique of secure multiparty computation (MPC) to design a generic framework that improves over the state-of-the-art framework of GraphSC by Araki et al. (CCS'21). The key technical contribution is that $\mathsf{Graphiti}$ has round complexity independent of the graph size, which in turn allows attaining the desired scalability. Specifically, this is achieved by (i) decoupling the $\mathsf{Scatter}$ primitive of GraphSC into separate operations of $\mathsf{Propagate}$ and $\mathsf{ApplyE}$, (ii) designing a novel constant-round approach to realise $\mathsf{Propagate}$, as well as (iii) designing a novel constant-round approach to realise the $\mathsf{Gather}$ primitive of GraphSC by leveraging the linearity of the aggregation operation. We benchmark the performance of $\mathsf{Graphiti}$ for the application of contact tracing via BFS for 10 hops and observe that it takes less than 2 minutes when computing over a graph of size $10^7$. Concretely it improves over the state-of-the-art up to a factor of $1034\times$ in online run time. Similar to GraphSC by Araki et al., since $\mathsf{Graphiti}$ relies on a secure protocol for shuffle, we additionally design a shuffle protocol secure against a semi-honest adversary in the 2-party with a helper setting. Given the versatility of shuffle protocol, the designed solution is of independent interest. Hence, we also benchmark the performance of the designed shuffle where we observe improvements of up to $1.83\times$ in online run time when considering an input vector of size $10^7$, in comparison to the state-of-the-art in the considered setting.
Tim Beyne, Clémence Bouvier
ePrint Report
It is shown how bounds on exponential sums derived from modern algebraic geometry, and l-adic cohomology specifically, can be used to upper bound the absolute correlations of linear approximations for cryptographic constructions of low algebraic degree. This is illustrated by applying results of Deligne, Denef and Loeser, and Rojas-León, to obtain correlation bounds for a generalization of the Butterfly construction, three-round Feistel ciphers, and a generalization of the Flystel construction. For each of these constructions, bounds obtained using other methods are significantly weaker. In the case of the Flystel construction, our bounds resolve a conjecture by the designers.
Correlation bounds of this type are relevant for the development of security arguments against linear cryptanalysis, especially in the weak-key setting or for primitives that do not involve a key. Since the methods used in this paper are applicable to constructions defined over arbitrary finite fields, the results are also relevant for arithmetization-oriented primitives such as Anemoi, which uses S-boxes based on the Flystel construction.
Zewen Ye, Junhao Huang, Tianshun Huang, Yudan Bai, Jinze Li, Hao Zhang, Guangyan Li, Donglong Chen, Ray C.C. Cheung, Kejie Huang
ePrint Report
Post-quantum cryptography (PQC) has rapidly evolved in response to the emergence of quantum computers, with the US National Institute of Standards and Technology (NIST) selecting four finalist algorithms for PQC standardization in 2022, including the Falcon digital signature scheme. The latest round of digital signature schemes introduced Hawk, both based on the NTRU lattice, offering compact signatures, fast generation, and verification suitable for deployment on resource-constrained Internet-of-Things (IoT) devices. Despite the popularity of Crystal-Dilithium and Crystal-Kyber, research on NTRU-based schemes has been limited due to their complex algorithms and operations. Falcon and Hawk's performance remains constrained by the lack of parallel execution in crucial operations like the Number Theoretic Transform (NTT) and Fast Fourier Transform (FFT), with data dependency being a significant bottleneck. This paper enhances NTRU-based schemes Falcon and Hawk through hardware/software co-design on a customized Single-Instruction-Multiple-Data (SIMD) processor, proposing new SIMD hardware units and instructions to expedite these schemes along with software optimizations to boost performance. Our NTT optimization includes a novel layer merging technique for SIMD architecture to reduce memory accesses, and the use of modular algorithms (Signed Montgomery and Improved Plantard) targets various modulus data widths to enhance performance. We explore applying layer merging to accelerate fixed-point FFT at the SIMD instruction level and devise a dual-issue parser to streamline assembly code organization to maximize dual-issue utilization. A System-on-chip (SoC) architecture is devised to improve the practical application of the processor in real-world scenarios. Evaluation on 28 nm technology and FPGA platform shows that our design and optimizations can increase the performance of Hawk signature generation and verification by over 7 times.
Zewen Ye, Tianshun Huang, Tianyu Wang, Yonggen Li, Chengxuan Wang, Ray C.C. Cheung, Kejie Huang
ePrint Report
Homomorphic Encryption (HE) technology allows for processing encrypted data, breaking through data isolation barriers and providing a promising solution for privacy-preserving computation. The integration of HE technology into Convolutional Neural Network (CNN) inference shows potential in addressing privacy issues in identity verification, medical imaging diagnosis, and various other applications. The CKKS HE algorithm stands out as a popular option for homomorphic CNN inference due to its capability to handle real number computations. However, challenges such as computational delays and resource overhead present significant obstacles to the practical implementation of homomorphic CNN inference, largely due to the complex nature of HE operations. In addition, current methods for speeding up homomorphic CNN inference primarily address individual images or large batches of input images, lacking a solution for efficiently processing a moderate number of input images with fast homomorphic inference capabilities, which is more suitable for edge computing applications. In response to these challenges, we introduce a novel leveled homomorphic CNN inference scheme aimed at reducing latency and improving throughput using the CKKS scheme. Our proposed inference strategy involves mapping multiple inputs to a set of ciphertext by exploiting the sliding window properties of convolutions to utilize CKKS's inherent Single-Instruction-Multiple-Data (SIMD) capability. To mitigate the delay associated with homomorphic CNN inference, we introduce optimization techniques, including mask-weight merging, rotation multiplexing, stride convolution segmentation, and folding rotations. The efficacy of our homomorphic inference scheme is demonstrated through evaluations carried out on the MNIST and CIFAR-10 datasets. Specifically, results from the MNIST dataset on a single CPU thread show that inference for 163 images can be completed in 10.4 seconds with an accuracy of 98.9%, which is a 6.9 times throughput improvement over state-of-the-art works. Comparative analysis with existing methodologies highlights the superior performance of our proposed inference scheme in terms of latency, throughput, communication overhead, and memory utilization.
29 October 2024
Gaithersburg, USA, 24 September - 26 September 2025
Event Calendar
Event date: 24 September to 26 September 2025