IACR News
Here you can see all recent updates to the IACR webpage.
28 October 2024
Adam Oumar Abdel-Rahman, Sofiane Azogagh, Zelma Aubin Birba, Arthur Tran Van
ePrint Report
Cloud storage offers convenient data access and sharing, but security concerns remain. Existing secure cloud storage solutions often lack essential features like data integrity, multi-cloud support, user-friendly file sharing, and efficient search. This paper proposes a novel secure cloud storage system that addresses these limitations. Our system uses distributed storage and attribute-based encryption to enhance data availability, access control, and user experience. It also enables private and efficient file search and data retrievability verification. This approach overcomes the trade-offs present in prior work, offering a secure and user-friendly solution for cloud data management.
Elli Androulaki, Angelo De Caro, Kaoutar El Khiyaoui, Romain Gay, Rebekah Mercer, Alessandro Sorniotti
ePrint Report
Offline payments present an opportunity for central bank digital currency to address the lack of digital financial inclusion plaguing existing digital payment solutions. However, the design of secure offline payments is a complex undertaking; for example, the lack of connectivity during the payments renders double spending attacks trivial. While the identification of double spenders and penal sanctions may curb attacks by individuals, they may not be sufficient against concerted efforts by states or well-funded institutions. It is hence important to also rely on preventive measures that reduce the scale of such attacks. An example of such a measure is secure elements. These however are limited in compute and storage, making the design of solutions that offer comparable privacy guarantees to those of physical cash challenging.
We address this with a protocol that offloads most of the payment computation to the user’s mobile device and restricts the computation on the secure element to deleting spent tokens and generating a signature with a computation equivalent to that of ECDSA. We claim that the use of mobile devices or enhanced smart card-based devices is required for secure consumer-to-consumer payments. To further harden the protocol, we enable the efficient identification of double spenders on the off chance that an attacker successfully double spends. Finally, we prove its security in the ideal/real world paradigm, and evaluate its performance to demonstrate its practicality.
Prabhanjan Ananth, John Bostanci, Aditya Gulati, Yao-Ting Lin
ePrint Report
We study the (in)feasibility of quantum pseudorandom notions in a quantum analog of the random oracle model, where all the parties, including the adversary, have oracle access to the same Haar random unitary. In this model, we show the following:
• (Unbounded-query secure) pseudorandom unitaries (PRU) exist. Moreover, the PRU construction makes two calls to the Haar oracle.
• We consider constructions of PRUs making a single call to the Haar oracle. In this setting, we show that unbounded-query security is impossible to achieve. We complement this result by showing that bounded-query secure PRUs do exist with a single query to the Haar oracle.
• We show that multi-copy pseudorandom state generators and function-like state generators (with classical query access), making a single call to the Haar oracle, exist.
Our results have two consequences: (a) when the Haar random unitary is instantiated suitably, our results present viable approaches for building quantum pseudorandom objects without relying upon one-way functions, and (b) for the first time, we show that the key length in pseudorandom unitaries can be generically shrunk (relative to the output length). Our results are also some of the first use cases of the new ``path recording'' formalism for Haar random unitaries, introduced in the recent breakthrough work of Ma and Huang.
Bill Allombert, Jean-François Biasse, Jonathan Komada Eriksen, Péter Kutas, Chris Leonardi, Aurel Page, Renate Scheidler, Márton Tot Bagi
ePrint Report
A crucial ingredient for many cryptographic primitives such as key exchange protocols and advanced signature schemes is a commutative group action where the structure of the underlying group can be computed efficiently. SCALLOP provides such a group action, based on oriented supersingular elliptic curves.
We present PEARL-SCALLOP, a variant of SCALLOP that changes several parameter and design choices, thereby improving on both efficiency and security and enabling feasible parameter generation for larger security levels. Within the SCALLOP framework, our parameters are essentially optimal; the orientation is provided by a $2^e$-isogeny, where $2^e$ is roughly equal to the discriminant of the acting class group.
As an important subroutine we present a practical algorithm for generating oriented supersingular elliptic curves. To demonstrate our improvements, we provide a proof-of-concept implementation which instantiates PEARL-SCALLOP at all relevant security levels. Our timings are more than an order of magnitude faster than any previous implementation.
Emanuele Bellini, David GERAULT, Juan Grados, Thomas Peyrin
ePrint Report
The search for optimal differential trails for ARX ciphers is known to be difficult and to scale poorly as the word size (and the branching through the carries of modular additions) increases. To overcome this problem, one may approximate the modular addition with the XOR operation, a process called linearization. The immediate drawback of this approach is that many valid and good trails are discarded. In this work, we explore different partial linearization trade-offs to model the modular addition through the \emph{window heuristic}, which restricts carry propagation to windows of $w_s$ consecutive positions. This strategy enables the exploration of full linearization ($w_s = 0$), normal modelling ($w_s = n$), and all the different trade-offs between completeness and speed in between.
We give the corresponding SAT and MILP models and their parallel versions, and apply them to ChaCha, Speck, LEA, and HIGHT. Our method greatly outperforms all previous models of modular addition. In particular, we find the first differential path for 4 rounds of ChaCha with a probability greater than $2^{-256}$, and a corresponding 6-round boomerang distinguisher. This indicates that purely differential-based attacks have the potential to become competitive with differential-linear attacks, currently the best-known attacks against ChaCha and other ARX ciphers. Finally, we exhibit an improved key recovery attack on reduced LEA.
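As a rough value-level illustration of the window heuristic described above (a sketch only: the paper applies the idea inside SAT/MILP models of differential propagation, and the function name and interface here are ours), the Python snippet below approximates modular addition while truncating any carry chain after $w_s$ positions, so that $w_s = 0$ degenerates to XOR (full linearization) and $w_s = n$ recovers exact addition modulo $2^n$:

    def windowed_add(a, b, n, ws):
        """Approximate (a + b) mod 2^n, letting a carry propagate through at most ws positions."""
        result, carry, run = 0, 0, 0
        for i in range(n):
            ai, bi = (a >> i) & 1, (b >> i) & 1
            result |= (ai ^ bi ^ carry) << i
            carry = (ai & bi) | (ai & carry) | (bi & carry)
            run = run + 1 if carry else 0
            if run > ws:          # carry chain exceeded the window: drop it
                carry, run = 0, 0
        return result

    # ws = 0 gives a ^ b (full linearization), ws = n gives (a + b) % 2**n (normal modelling)
    assert windowed_add(0b1011, 0b0110, 4, 0) == 0b1011 ^ 0b0110
    assert windowed_add(0b1011, 0b0110, 4, 4) == (0b1011 + 0b0110) % 16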
Pedro Branco, Nico Döttling, Abhishek Jain, Giulio Malavolta, Surya Mathialagan, Spencer Peters, Vinod Vaikuntanathan
ePrint Report
We introduce the notion of pseudorandom obfuscation (PRO), a way to obfuscate (keyed) pseudorandom functions $f_K$ in an average-case sense. We introduce several variants of pseudorandom obfuscation and show constructions and applications. For some of our applications that can be achieved using full-fledged indistinguishability obfuscation (iO), we show constructions using lattice-based assumptions alone; the other applications we enable using PRO are simply not known even assuming iO. We briefly summarize our contributions below.
- Constructions of PRO: We show how to construct the strongest version of PRO, assuming the sub-exponential hardness of the learning with errors (LWE) problem, and of the evasive LWE problem (Wee, EUROCRYPT 2022; Tsabary, CRYPTO 2022).
- Applications outside the iO World: We show how to construct a succinct witness encryption scheme from PRO, where the size of the ciphertext is independent of the witness size. Such a witness encryption scheme is not known to exist even assuming iO.
- Applications in the iO World: Our weakest variant of pseudorandom obfuscation, named obfuscation for identical pseudorandom functions (iPRO), is weaker than iO: rather than obfuscating arbitrary circuits as in iO, iPRO only obfuscates circuits computing pseudorandom functions. We show that iPRO already enables several applications of iO, such as unleveled fully homomorphic encryption (without assuming circular security) and succinct randomized encodings.
- From iPRO to iO: Despite being a seemingly weaker notion than iO, we show two pathways to constructing full-fledged iO from iPRO. Our first construction builds iO from iPRO and (standard assumptions on) cryptographic bilinear maps. Combined with our construction of iPRO, this gives us a construction of iO from a new combination of assumptions, namely LWE, evasive LWE and bilinear maps. Our second construction builds iO (and even ideal obfuscation) from iPRO in the pseudorandom oracle model (Jain, Lin, Luo and Wichs, CRYPTO 2023). To our knowledge, this is the first purely lattice-based, and hence plausibly post-quantum secure, construction of iO with a proof of security from LWE and evasive LWE.
Finally, we highlight some barriers in achieving the strongest version of pseudorandom obfuscation.
25 October 2024
Alexander Poremba, Yihui Quek, Peter Shor
ePrint Report
Random classical codes have good error correcting properties, and yet they are notoriously hard to decode in practice. Despite many decades of extensive study, the fastest known algorithms still run in exponential time. The Learning Parity with Noise (LPN) problem, which can be seen as the task of decoding a random linear code in the presence of noise, has thus emerged as a prominent hardness assumption with numerous applications in both cryptography and learning theory.
Is there a natural quantum analog of the LPN problem? In this work, we introduce the Learning Stabilizers with Noise (LSN) problem, the task of decoding a random stabilizer code in the presence of local depolarizing noise. We give both polynomial-time and exponential-time quantum algorithms for solving LSN in various depolarizing noise regimes, ranging from extremely low noise, to low constant noise rates, and even higher noise rates up to a threshold. Next, we provide concrete evidence that LSN is hard. First, we show that LSN includes LPN as a special case, which suggests that it is at least as hard as its classical counterpart. Second, we prove a worst-case to average-case reduction for variants of LSN. We then ask: what is the computational complexity of solving LSN? Because the task features quantum inputs, its complexity cannot be characterized by traditional complexity classes. Instead, we show that the LSN problem lies in a recently introduced (distributional and oracle) unitary synthesis class. Finally, we identify several applications of our LSN assumption, ranging from the construction of quantum bit commitment schemes to the computational limitations of learning from quantum data.
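For readers less familiar with the classical side of this analogy, the snippet below generates LPN instances, i.e. noisy samples from a random linear code over GF(2). It is a minimal sketch of the classical problem only (parameter names are ours) and says nothing about the quantum LSN formulation itself:

    import numpy as np

    def lpn_samples(n, m, tau, rng=None):
        """m LPN samples: random A in GF(2)^(m x n), labels b = A s + e with Bernoulli(tau) noise."""
        rng = rng or np.random.default_rng()
        s = rng.integers(0, 2, n)                   # secret vector
        A = rng.integers(0, 2, (m, n))              # random linear code / query matrix
        e = (rng.random(m) < tau).astype(np.int64)  # sparse noise
        b = (A @ s + e) % 2
        return A, b, s

    A, b, s = lpn_samples(n=128, m=512, tau=0.125)  # recovering s from (A, b) is the LPN problem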
Florian Krieger, Florian Hirner, Ahmet Can Mert, Sujoy Sinha Roy
ePrint Report
Modern cryptographic techniques such as fully homomorphic encryption (FHE) have recently gained broad attention. Most of these cryptosystems rely on lattice problems wherein polynomial multiplication forms the computational bottleneck. A popular method to accelerate these polynomial multiplications is the Number-Theoretic Transform (NTT). Recent works aim to improve the practical deployability of NTT and propose toolchains supporting the NTT hardware accelerator design processes. However, existing design tools do not provide on-the-fly twiddle factor generation (TFG), which leads to high memory demands. Inspired by this situation, we present OpenNTT, a fully automated, open-source framework to compile NTT hardware accelerators with TFG for various NTT types and parameter sets. We address the challenge of combining conflict-free memory accesses and efficient, linear twiddle factor generation through a dedicated NTT processing order. Following this order, we develop a flexible twiddle factor generation method with minimal memory usage. These core concepts together with a frequency-optimized hardware architecture form our OpenNTT framework. We use OpenNTT to compile and test NTT hardware designs with various parameter sets on FPGAs. The obtained results show a clear memory reduction due to TFG and a speedup of 2.7× in latency and 2.2× in area-time product, compared to prior art.
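The following reference NTT (a plain Python sketch, not taken from OpenNTT; the function name and interface are ours) illustrates what on-the-fly twiddle factor generation means in the simplest setting: within each butterfly stage the twiddle is updated by one modular multiplication instead of being read from a precomputed table.

    def ntt(a, q, root):
        """In-place iterative radix-2 NTT over Z_q. len(a) must be a power of two and
        root a primitive len(a)-th root of unity mod q. Twiddles are generated on the fly."""
        n = len(a)
        j = 0                                    # bit-reversal permutation
        for i in range(1, n):
            bit = n >> 1
            while j & bit:
                j ^= bit
                bit >>= 1
            j |= bit
            if i < j:
                a[i], a[j] = a[j], a[i]
        length = 2
        while length <= n:
            w_m = pow(root, n // length, q)      # principal length-th root of unity
            for start in range(0, n, length):
                w = 1                            # twiddle generated incrementally, no table
                for k in range(length // 2):
                    u = a[start + k]
                    v = a[start + k + length // 2] * w % q
                    a[start + k] = (u + v) % q
                    a[start + k + length // 2] = (u - v) % q
                    w = w * w_m % q
            length <<= 1
        return a

    # example: q = 17, n = 4, and 4 is a primitive 4th root of unity mod 17
    assert ntt([1, 2, 3, 4], 17, 4) == [10, 7, 15, 6]

In hardware, the difficulty is producing these twiddles in the right order with minimal storage while keeping memory accesses conflict-free, which is the problem OpenNTT's dedicated processing order addresses.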
Miranda Christ, Sam Gunn, Tal Malkin, Mariana Raykova
ePrint Report
The recent explosion of high-quality language models has necessitated new methods for identifying AI-generated text. Watermarking is a leading solution and could prove to be an essential tool in the age of generative AI. Existing approaches embed watermarks at inference and crucially rely on the large language model (LLM) specification and parameters being secret, which makes them inapplicable to the open-source setting. In this work, we introduce the first watermarking scheme for open-source LLMs. Our scheme works by modifying the parameters of the model, but the watermark can be detected from just the outputs of the model. Perhaps surprisingly, we prove that our watermarks are unremovable under certain assumptions about the adversary's knowledge. To demonstrate the behavior of our construction under concrete parameter instantiations, we present experimental results with OPT-6.7B and OPT-1.3B. We demonstrate robustness to both token substitution and perturbation of the model parameters. We find that the stronger of these attacks, the model-perturbation attack, requires deteriorating the quality score to 0 out of 100 in order to bring the detection rate down to 50%.
Thomas den Hollander, Sören Kleine, Marzio Mula, Daniel Slamanig, Sebastian A. Spindler
ePrint Report
Proving knowledge of a secret isogeny has recently been proposed as a means to generate supersingular elliptic curves of unknown endomorphism ring, but is equally important for cryptographic protocol design as well as for real world deployments. Recently, Cong, Lai and Levin (ACNS'23) have investigated the use of general-purpose (non-interactive) zero-knowledge proof systems for proving the knowledge of an isogeny of degree $2^k$ between supersingular elliptic curves. In particular, their approach is to model this relation via a sequence of $k$ successive steps of a walk in the supersingular isogeny graph and to show that the respective $j$-invariants are roots of the second modular polynomial. They then arithmetize this relation and show that this approach, when compared to state-of-the-art tailor-made proofs of knowledge by Basso et al. (EUROCRYPT'23), gives a 3-10$\times$ improvement in proof and verification times, with comparable proof sizes.
In this paper we ask whether we can further improve the modular polynomial-based approach and generalize its application to primes ${\ell>2}$, as used in some recent isogeny-based constructions. We will answer these questions affirmatively, by designing efficient arithmetizations for each ${\ell \in \{2, 3, 5, 7, 13\}}$ that achieve an improvement over Cong, Lai and Levin of up to 48%.
Our main technical tool and source of efficiency gains is to switch from classical modular polynomials to canonical modular polynomials. Adapting the well-known results on the former to the latter polynomials, however, is not straightforward and requires some technical effort. We prove various interesting connections via novel use of resultant theory, and advance the understanding of canonical modular polynomials, which might be of independent interest.
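For reference (taken from standard tables of classical modular polynomials, not from the paper, which works with the canonical variants), the classical modular polynomial of level 2 whose roots link the $j$-invariants of 2-isogenous curves is
$\Phi_2(X,Y) = X^3 + Y^3 - X^2Y^2 + 1488(X^2Y + XY^2) - 162000(X^2 + Y^2) + 40773375\,XY + 8748000000\,(X + Y) - 157464000000000,$
so a length-$k$ walk in the 2-isogeny graph can be encoded as $j$-invariants $j_0,\dots,j_k$ satisfying $\Phi_2(j_i, j_{i+1}) = 0$ for all $i$, which is the relation being arithmetized.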
Aurore Guillevic, Simon Masson
ePrint Report
Based on the CM method for primality testing (ECPP) by Atkin and Morain published in 1993, we present two algorithms: one to generate embedded elliptic curves of SNARK-friendly curves, with a variable discriminant D; and another to generate families (parameterized by polynomials) with a fixed discriminant D. When D = 3 mod 4, it is possible to obtain a prime-order curve, and form a cycle. We apply our technique first to generate more embedded curves like Bandersnatch with BLS12-381 and we propose a plain twist-secure cycle above BLS12-381 with D = 6673027. We also discuss the scarcity of Bandersnatch-like CM curves, and show that with our algorithm, it is only a question of core-hours to find them. Second, we obtain families of prime-order embedded curves of discriminant D = 3 for BLS and KSS18 curves. Our method obtains families of embedded curves above KSS16 and can work for any KSS family. Our work generalizes the work on Bandersnatch (Masson, Sanso, and Zhang, and Sanso and El Housni).
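For background (a standard fact about the CM method, not a statement of the paper's new algorithms): given a target prime $q$ and an imaginary quadratic discriminant of absolute value $D$, one searches for integers $t, v$ satisfying the norm equation
$4q = t^2 + D v^2,$
and the CM method then produces an elliptic curve over $\mathbb{F}_q$ with $q + 1 - t$ (or $q + 1 + t$) points. For an embedded curve, $q$ is taken to be the scalar-field prime of the outer SNARK-friendly curve, so the constructed curve lives directly over that scalar field.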
Sarah Arpin, Tyler Raven Billingsley, Daniel Rayor Hast, Jun Bo Lau, Ray Perlner, Angela Robinson
ePrint Report
We present experimental findings on the decoding failure rate (DFR) of BIKE, a fourth-round candidate in the NIST Post-Quantum Standardization process, at the 20-bit security level using graph-theoretic approaches. We select parameters according to BIKE design principles and conduct a series of experiments using Rust to generate significantly more decoding failure instances than in prior work using SageMath. For each decoding failure, we study the internal state of the decoder at each iteration and find that for 97% of decoding failures at block size $r=587$, the decoder reaches a fixed point within 7 iterations. We then consider the corresponding Tanner graphs of each decoding failure instance to determine whether the decoding failures are due to absorbing sets. We find that 81% of decoding failures at $r=587$ were caused by absorbing sets, and of these the majority were $(d,d)$-near codewords.
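To make the notion of decoder iterations and fixed points concrete, here is a generic bit-flipping decoder in Python (a textbook sketch with parameter names of our choosing; BIKE's actual Black-Gray-Flip decoder and its QC-MDPC structure are considerably more involved):

    import numpy as np

    def bit_flip_decode(H, syndrome, threshold, max_iters=7):
        """Generic bit-flipping decoding: H is an r x n parity-check matrix over GF(2),
        syndrome = H e mod 2 for an unknown error vector e. Returns (e_hat, converged)."""
        s = syndrome.copy()
        e_hat = np.zeros(H.shape[1], dtype=np.int64)
        for _ in range(max_iters):
            if s.sum() == 0:                          # syndrome cleared: decoding succeeded
                return e_hat, True
            upc = H.T @ s                             # unsatisfied parity checks per bit position
            flips = (upc >= threshold).astype(np.int64)
            if not flips.any():                       # fixed point: no counter reaches the threshold
                return e_hat, False
            e_hat ^= flips
            s = (s + H @ flips) % 2
        return e_hat, bool(s.sum() == 0)

    # toy example: a single error in position 2 of a length-6 code
    H = np.array([[1, 1, 0, 1, 0, 0],
                  [0, 1, 1, 0, 1, 0],
                  [1, 0, 1, 0, 0, 1]], dtype=np.int64)
    e = np.array([0, 0, 1, 0, 0, 0], dtype=np.int64)
    e_hat, ok = bit_flip_decode(H, (H @ e) % 2, threshold=2)
    assert ok and np.array_equal(e_hat, e)

A decoding failure of the kind studied above corresponds to the decoder returning with a nonzero syndrome, typically after reaching such a fixed point.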
Jiangshan Long, Changhai Ou, Zhu Wang, Fan Zhang
ePrint Report
Evaluation of cryptographic implementations with respect to side channels is now mandated at high security levels. Typically, the evaluation involves four stages: detection, modeling, certification and secret recovery. In pursuit of the specific goal of each stage, inherently different techniques have traditionally been considered necessary. Since the recent works at EUROCRYPT 2022 and EUROCRYPT 2024, however, linear regression analysis (LRA) has uniquely become a technique that applies well throughout all the stages. In this paper, we concentrate on this silver-bullet technique within the field of side-channel analysis. First, we address the fundamental questions of why and how to use LRA; the discussion of its nominal and binary nature explains its broad applicability. To obtain reliable outcomes, we provide in-depth analyses of the design matrix, regarding the sample distribution of the plaintext and the chosen polynomial degree. We summarize ideal conditions that entirely avoid the multicollinearity problem, and explore the novel evaluator-advantageous property of LRA by means of model diagnosis. We then trace its roots, theoretically elaborating its connections with traditional side-channel techniques, including Correlation Power Analysis (CPA), Distance-of-Means analysis (DoM) and Partition Power Analysis (PPA), in terms of regression coefficients, regression model and coefficient of determination. Finally, we probe into the state-of-the-art combined LRA with the so-called collapse function, demonstrating its relationship with another refined technique, G-DoM. We argue that properly relaxing the definition of bit groups leads to the same conclusions. Experimental results are in line with the theory, confirming its correctness.
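To fix ideas, a toy single-sample-point LRA can be sketched as follows (a degree-1, bitwise design matrix only; the names are ours, and the paper's treatment of higher polynomial degrees, multicollinearity and the collapse function goes well beyond this):

    import numpy as np

    def lra_r2(traces, intermediates, n_bits=8):
        """Regress leakage at one sample point onto the bits of a hypothesized intermediate
        value; return (regression coefficients, coefficient of determination R^2)."""
        intermediates = np.asarray(intermediates, dtype=np.int64)
        X = np.column_stack(
            [np.ones(len(intermediates))] +
            [(intermediates >> b) & 1 for b in range(n_bits)])   # design matrix: constant + bits
        beta, *_ = np.linalg.lstsq(X, traces, rcond=None)
        residual = traces - X @ beta
        r2 = 1.0 - (residual @ residual) / ((traces - traces.mean()) @ (traces - traces.mean()))
        return beta, r2

    # synthetic traces: Hamming-weight leakage of a byte plus Gaussian noise
    rng = np.random.default_rng(0)
    vals = rng.integers(0, 256, 2000)
    traces = np.array([bin(v).count("1") for v in vals]) + rng.normal(0, 1, 2000)
    beta, r2 = lra_r2(traces, vals)

In a key-recovery setting, one would typically compute this $R^2$ for every key hypothesis and rank hypotheses by it, which is where the connections to CPA and DoM discussed in the paper come into play.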
Kung-Wei Hu, Huan-Chih Wang, Ja-Ling Wu
ePrint Report
This paper explores advancements in the Gentry-Sahai-Waters (GSW) fully homomorphic encryption scheme, addressing challenges related to message data range limitations and ciphertext size constraints. We introduce a novel approach utilizing the Chinese Remainder Theorem (CRT) for message decomposition, significantly expanding the allowable message range to the entire plaintext space. This method enables unrestricted message selection and supports parallel homomorphic operations without intermediate decryption. Additionally, we adapt existing ciphertext compression techniques, such as the PVW-like scheme, to reduce memory overhead associated with ciphertexts. Our experimental results demonstrate the effectiveness of the CRT-based decomposition in increasing the upper bound of message values and improving the scheme's capacity for consecutive homomorphic operations. However, compression introduces a trade-off, necessitating a reduced message range due to error accumulation. This research contributes to enhancing the practicality and efficiency of the GSW encryption scheme for complex computational scenarios while managing the balance between expanded message range, computational complexity, and storage requirements.
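The message decomposition itself is just the Chinese Remainder Theorem; a minimal sketch (plain integers, entirely outside GSW, with moduli chosen arbitrarily for illustration) looks like this:

    from math import prod

    def crt_split(m, moduli):
        """Decompose m into residues modulo pairwise-coprime moduli."""
        return [m % q for q in moduli]

    def crt_combine(residues, moduli):
        """Recombine residues into the unique m modulo prod(moduli)."""
        M = prod(moduli)
        # pow(x, -1, q) computes a modular inverse (Python >= 3.8)
        return sum(r * (M // q) * pow(M // q, -1, q) for r, q in zip(residues, moduli)) % M

    moduli = [5, 7, 9]                       # pairwise coprime, message space of size 315
    assert crt_combine(crt_split(123, moduli), moduli) == 123

In the paper's setting, homomorphic operations then act on the residues in parallel, without intermediate decryption, as described above.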
Umberto Cerruti
ePrint Report
This is a survey on the One Time Pad (OTP) and its derivatives, from its origins to modern times. OTP, if used correctly, is (the only) cryptographic code that no computing power, present or future, can break. Naturally, the discussion shifts to the creation of long random sequences, starting from short ones, which can be easily shared. We could call it the Short Key Dream. Many problems inevitably arise, which affect many fields of computer science, mathematics and knowledge in general. This work presents a vast bibliography that includes fundamental classical works and current papers on randomness, pseudorandom number generators, compressibility, unpredictability and more.
Sabrina Kunzweiler, Luciano Maino, Tomoki Moriya, Christophe Petit, Giacomo Pope, Damien Robert, Miha Stopar, Yan Bo Ti
ePrint Report
We provide explicit descriptions for radical 2-isogenies in dimensions one, two and three using theta coordinates. These formulas allow us to efficiently navigate in the corresponding isogeny graphs.
As an application of this, we implement different versions of the CGL hash function. Notably, the three-dimensional version is fastest, which demonstrates yet another potential of using higher dimensional isogeny graphs in cryptography.
Benedikt Bünz, Pratyush Mishra, Wilson Nguyen, William Wang
ePrint Report
Proof-Carrying Data (PCD) is a foundational tool for ensuring the correctness of incremental distributed computations that has found numerous applications in theory and practice. The state-of-the-art PCD constructions are obtained via accumulation or folding schemes. Unfortunately, almost all known constructions of accumulation schemes rely on homomorphic vector commitments (VCs), which results in relatively high computational costs and insecurity in the face of quantum adversaries. A recent work of Bünz, Mishra, Nguyen, and Wang removes the dependence on homomorphic VCs by relying only on the random oracle model, but introduces a bound on the number of consecutive accumulation steps, which in turn bounds the depth of the PCD computation graph and greatly affects prover and verifier efficiency.
In this work, we propose Arc, a novel hash-based accumulation scheme that overcomes this restriction and supports an unbounded number of accumulation steps. The core building block underlying Arc is a new accumulation scheme for claims about proximity of claimed codewords to the Reed--Solomon code. Our approach achieves near-optimal efficiency, requiring a small number of Merkle tree openings relative to the code rate, and avoids the efficiency loss associated with bounded accumulation depth. Unlike prior work, our scheme is also able to accumulate claims up to list-decoding radius, resulting in concrete efficiency improvements.
We use this accumulation scheme to construct two distinct accumulation schemes, again relying solely on random oracles. The first approach accumulates RS proximity claims and can be used as an almost-drop-in replacement in existing PCD deployments based on IOP-based SNARKs. The second approach directly constructs an accumulation scheme for rank-1 constraint systems (and more generally polynomial constraint systems) that is simpler and more efficient than the former and prior approaches.
We introduce the notion of Interactive Oracle Reductions (IORs) to enable a modular and simple security analysis. These extend prior notions of Reductions of Knowledge to the setting of IOPs.
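The flavor of accumulating proximity claims can be conveyed by a toy example (our own simplification: a random linear combination of two claimed codewords over a prime field, with none of the Merkle commitments, out-of-domain sampling or list-decoding machinery of the actual scheme):

    import random

    P = 2**61 - 1    # a prime field for the toy example

    def fold(f0, f1, r):
        """Accumulate two claimed codewords into one: f = f0 + r * f1 (componentwise, mod P).
        If either input word is far from the Reed-Solomon code, the folded word is also far,
        except with small probability over r (the proximity-gap phenomenon such schemes rely on)."""
        return [(a + r * b) % P for a, b in zip(f0, f1)]

    f0 = [random.randrange(P) for _ in range(8)]
    f1 = [random.randrange(P) for _ in range(8)]
    r = random.randrange(P)    # in the real scheme, the challenge comes from the random oracle
    acc = fold(f0, f1, r)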
Aikata Aikata, Sujoy Sinha Roy
ePrint Report
Fully Homomorphic Encryption (FHE) is a promising privacy-enhancing technique that enables secure and private data processing on untrusted servers, such as privacy-preserving neural network (NN) evaluations. However, its practical application presents significant challenges. Limitations in how data is stored within homomorphic ciphertexts and restrictions on the types of operations that can be performed create computational bottlenecks. As a result, a growing body of research focuses on optimizing existing evaluation techniques for efficient execution in the homomorphic domain.
One key operation in this space is matrix multiplication, which forms the foundation of most neural networks. Several studies have even proposed new FHE schemes specifically to accelerate this operation. The optimization of matrix multiplication is also the primary goal of our work. We leverage the Single Instruction Multiple Data (SIMD) capabilities of FHE to increase data packing and significantly reduce the KeySwitch operation count, an expensive low-level routine in homomorphic encryption. By minimizing KeySwitching, we surpass current state-of-the-art solutions, requiring only a minimal multiplicative depth of two.
The best-known complexity for matrix multiplication at this depth is $\mathcal{O}(d)$ for matrices of size $d\times d$. Remarkably, even the leading techniques that require a multiplicative depth of three still incur a KeySwitch complexity of $\mathcal{O}(d)$. In contrast, our method reduces this complexity to $\mathcal{O}(\log{d})$ while maintaining the same level of data packing. Our solution broadly applies to all FHE schemes supporting Single Instruction Multiple Data (SIMD) operations. We further generalize the technique in two directions: allowing arbitrary packing availability and extending it to rectangular matrices. This versatile approach offers significant improvements in matrix multiplication performance and enables faster evaluation of privacy-preserving neural network applications.
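For context, the classical SIMD baseline can be mocked up in plaintext as the diagonal (Halevi-Shoup style) matrix-vector method, where every rotation of the packed vector corresponds to one KeySwitch; the sketch below is ours and is only meant to show why a naive diagonal approach costs $\mathcal{O}(d)$ rotations, not to describe the paper's construction:

    import numpy as np

    def diag_matvec(M, v):
        """Plaintext mock-up of diagonal-packed matrix-vector multiplication:
        M @ v = sum_i diag_i(M) * rot(v, i), with one rotation (~ one KeySwitch) per diagonal."""
        d = len(v)
        acc = np.zeros(d, dtype=np.int64)
        for i in range(d):
            diag_i = np.array([M[j, (j + i) % d] for j in range(d)])  # i-th generalized diagonal
            acc += diag_i * np.roll(v, -i)                            # rotation of the packed slots
        return acc

    M = np.arange(16).reshape(4, 4)
    v = np.array([1, 2, 3, 4])
    assert np.array_equal(diag_matvec(M, v), M @ v)

The contribution described above is to bring this KeySwitch count down to $\mathcal{O}(\log{d})$ at multiplicative depth two while keeping the same packing density.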
Supriya Adhikary, Wai Kong Lee, Angshuman Karmakar, Yongwoo Lee, Seong Oun Hwang, Ramachandra Achar
ePrint Report
Large polynomial multiplication is one of the computational bottlenecks in fully homomorphic encryption implementations. Usually, these multiplications are implemented using the number-theoretic transform (NTT) to speed up the computation. The state-of-the-art GPU-based implementation of fully homomorphic encryption computes the NTT in two different kernels, due to the synchronization between GPU blocks that is necessary to ensure correctness. This can be a serious limitation in embedded systems that only have constrained computational resources to support the time-consuming homomorphic encryption. In this paper, we propose a series of techniques to improve the performance of the NTT targeting homomorphic encryption on a GPU device. First, we propose arranging the polynomials in a transposed manner and skipping the last two levels of radix-4 NTT, allowing us to completely avoid block synchronization in the NTT implementation. This technique improves the performance of homomorphic encryption by 1.37× and 1.34× on an RTX 4060 and a Jetson Orin Nano respectively, compared to the conventional approach that uses a full NTT without skipping any levels. However, such an approach also introduces extra overhead in the subsequent point-wise multiplication, which slows down homomorphic multiplication. To reduce this negative impact, we propose a fast 16 × 16 point-wise multiplication implementation that relies on the heavily optimized Toom-Cook 4-way algorithm. Experimental results show that our proposed homomorphic multiplication achieves latency similar to that of Jung et al. and Yang et al., the best results to date. This shows that the proposed cuTraNTT is able to reduce the latency of homomorphic encryption without sacrificing the performance of homomorphic multiplication.
Yuuki Fujita, Yusuke Sakai, Kyosuke Yamashita, Goichiro Hanaoka
ePrint Report
When signature schemes are used in practice, security beyond unforgeability sometimes needs to be considered. This paper considers security against key substitution attacks of multi-signer signatures (i.e., aggregate signatures and multi-signatures). Intuitively, this security property ensures that a malicious party cannot claim ownership of a signature created by an honest signer. We investigate security against key substitution attacks of a wide range of aggregate signature schemes and multi-signature schemes: the Boneh-Gentry-Lynn-Shacham aggregate signature scheme, the sequential aggregate signature scheme by Lysyanskaya et al., the multi-signature scheme by Bellare and Neven, MuSig2, and the ordered multi-signature scheme by Boldyreva et al. Furthermore, when a scheme does not provide security against key substitution attacks, we modify it so that it becomes secure against them.