International Association for Cryptologic Research

International Association
for Cryptologic Research

IACR News

Here you can see all recent updates to the IACR webpage. These updates are also available:

email icon
via email
RSS symbol icon
via RSS feed

05 July 2024

Joseph Johnston
ePrint Report ePrint Report
Lattice cryptography has many exciting applications, from homomorphic encryption to zero knowledge proofs. We explore the algebra of cyclotomic polynomials underlying many practical lattice cryptography constructions, and we explore algorithms for multiplying cyclotomic polynomials on a GPU.
Expand
Xiaoyang Hou, Jian Liu, Jingyu Li, Jiawen Zhang, Kui Ren
ePrint Report ePrint Report
As large language models (LLMs) continue to gain popularity, concerns about user privacy are amplified, given that the data submitted by users for inference may contain sensitive information. Therefore, running LLMs through secure two-party computation (a.k.a. secure LLM inference) has emerged as a prominent topic. However, many operations in LLMs, such as Softmax and GELU, cannot be computed using conventional gates in secure computation; instead, lookup tables (LUTs) have to be utilized, which makes LUT to be an essential primitive in secure LLM inference.

In this paper, we propose $\mathsf{ROTL}$, a secure two-party protocol for LUT evaluations. Compared with FLUTE (the state-of-the-art LUT presented at Oakland '23), it achieves upto 11.6$\times$ speedup in terms of overall performance and 155$\times$ speedup in terms of online performance. Furthermore, $\mathsf{ROTL}$ can support arithmetic shares (which is required by secure LLM inference), whereas FLUTE can only support boolean shares. At the heart of $\mathsf{ROTL}$ is a novel protocol for secret-shared rotation, which allows two parties to generate additive shares of the rotated table without revealing the rotation offset. We believe this protocol is of independent interest. Based on $\mathsf{ROTL}$, we design a novel secure comparison protocol; compared with the state-of-the-art, it achieves a 2.4$\times$ bandwidth reduction in terms of online performance.

To support boolean shares, we further provide an optimization for FLUTE, by reducing its computational complexity from $O(l\cdot n^2)$ to $O(n\log n+l\cdot n)$ and shifting $O(n\log n)$ computation to the preprocessing phase. As a result, compared with FLUTE, it achieves upto 10.8$\times$ speedup in terms of overall performance and 962$\times$ speedup in terms of online performance.
Expand
Xinyao Li, Xiwen Ren, Ling Ning, Changhai Ou
ePrint Report ePrint Report
In order to challenge the security of cryptographic systems, Side-Channel Attacks exploit data leaks such as power consumption and electromagnetic emissions. Classic Side-Channel Attacks, which mainly focus on mono-channel data, fail to utilize the joint information of multi-channel data. However, previous studies of multi-channel attacks have often been limited in how they process and adapt to dynamic data. Furthermore, the different data types from various channels make it difficult to use them effectively. This study introduces the Fusion Channel Attack with POI Learning Encoder (FCA), which employs a set of POI Learning encoders that learn the inverse base transformation function family and project the data of each channel into a unified fusion latent space. Furthermore, our method introduces an optimal transport theory based metric for evaluating feature space fusion, which is used to assess the differences in feature spaces between channels. This model not only enhances the ability to process and interpret multi-source data, but also significantly improves the accuracy and applicability of SCAs in different environments.
Expand
Rostin Shokri, Charles Gouert, Nektarios Georgios Tsoutsos
ePrint Report ePrint Report
Fully homomorphic encryption (FHE) enables arbitrary computation on encrypted data, but certain applications remain prohibitively expensive in the encrypted domain. As a case in point, comparing two encrypted sets of data is extremely computationally expensive due to the large number of comparison operators required. In this work, we propose a novel methodology for encrypted set similarity inspired by the MinHash algorithm and the CGGI FHE scheme. Doing comparisons in FHE requires comparators and multiplexers or an expensive approximation, which further increases the latency, especially when the goal is to compare two sets of data. The MinHash algorithm can significantly reduce the number of comparisons required by employing a special Carter-Wegman (CW) hash function as a key building block. However, the modulus operation in the CW hash becomes another key bottleneck because the encrypted sub-circuits required to perform the modular reduction are very large and inefficient in an FHE setting. Towards that end, we introduce an efficient bitwise FHE-friendly digest function (FFD) to employ as the cornerstone of our proposed encrypted set-similarity. In a Boolean FHE scheme like CGGI, the bitwise operations can be implemented efficiently with Boolean gates, which allows for faster evaluation times relative to standard Carter-Wegman constructions. Overall, our approach drastically reduces the number of comparisons required relative to the baseline approach of directly computing the Jaccard similarity coefficients, and is inherently parallelizable, allowing for efficient encrypted computation on multi-CPU and GPU-based cloud servers. We validate our approach by performing a privacy-preserving plagiarism detection across encrypted documents.
Expand
Charles Gouert, Nektarios Georgios Tsoutsos
ePrint Report ePrint Report
Homomorphic encryption is a powerful technology that solves key privacy concerns in cloud computing by enabling computation on encrypted data. However, it has not seen widespread adoption due to prohibitively high latencies. In this article, we identify polynomial multiplication as a bottleneck and investigate alternative algorithms to accelerate encrypted computing.
Expand
Charles Gouert, Dimitris Mouris, Nektarios Georgios Tsoutsos
ePrint Report ePrint Report
Fully homomorphic encryption (FHE) has become progressively more viable in the years since its original inception in 2009. At the same time, leveraging state-of-the-art schemes in an efficient way for general computation remains prohibitively difficult for the average programmer. In this work, we introduce a new design for a fully homomorphic processor, dubbed Juliet, to enable faster operations on encrypted data using the state-of-the-art TFHE and cuFHE libraries for both CPU and GPU evaluation. To improve usability, we define an expressive assembly language and instruction set architecture (ISA) judiciously designed for end-to-end encrypted computation. We demonstrate Juliet's capabilities with a broad range of realistic benchmarks including cryptographic algorithms, such as the lightweight ciphers Simon and Speck, as well as logistic regression (LR) inference and matrix multiplication.
Expand
Rostin Shokri, Charles Gouert, Nektarios Georgios Tsoutsos
ePrint Report ePrint Report
As the field of genomics continues to expand and more sequencing data is gathered, genome analysis becomes increasingly relevant for many users. For example, a common scenario entails users trying to determine if their DNA samples are similar to DNA sequences hosted in a larger remote repository. Nevertheless, end users may be reluctant to upload their DNA sequences, while the owners of remote genomics repositories are unwilling to openly share their database. To address this challenge, we propose two distinct approaches based on fully homomorphic encryption to preserve the privacy of the genomic data and enable queries directly on ciphertexts. The first is based on the ubiquitous MinHash algorithm and can determine if similar matches exist in the database, while the second involves a bespoke bloom filter construction for determining exact matches. We validate both approaches across various database sizes using both GPU and CPU-based cloud servers.
Expand
Lars Folkerts, Nektarios Georgios Tsoutsos
ePrint Report ePrint Report
Generative AI, a significant technological disruptor in recent years, has impacted domains like augmented reality, coding assistance, and text generation. However, use of these models requires users to trust the model owners with their sensitive data given as input to the model. Fully Homomorphic Encryption (FHE) offers a promising solution, and many earlier works have investigated the use this technology for machine learning as a service (MLaaS) applications. Still, these efforts do not cater to generative models that operate probabilistically, allowing for diverse and creative outputs. In this work, we introduce three novel probabilistic selection algorithms for autoregressive generative AI: multiplication-scaled cumulative sum, heuristic cumulative sum, and the random-multiplication argmax. Each of these approaches presents distinctive challenges in optimizing the trade-off between precision and timing performance, a balance intricately tied to the specific characteristics of the data under consideration. Our results show that the random multiplication argmax-based method is more scalable than the cumulative sum methods and can accurately mimic the plaintext selection curve.
Expand
Felix Günther, Douglas Stebila, Shannon Veitch
ePrint Report ePrint Report
Censorship circumvention tools enable clients to access endpoints in a network despite the presence of a censor. Censors use a variety of techniques to identify content they wish to block, including filtering traffic patterns that are characteristic of proxy or circumvention protocols and actively probing potential proxy servers. Circumvention practitioners have developed fully encrypted protocols (FEPs), intended to have traffic that appears indistinguishable from random. A FEP is typically composed of a key exchange protocol to establish shared secret keys, and then a secure channel protocol to encrypt application data; both must avoid revealing to observers that an obfuscated protocol is in use.

We formalize the notion of obfuscated key exchange, capturing the requirement that a key exchange protocol's traffic "looks random" and that it resists active probing attacks, in addition to ensuring secure session keys and authentication. We show that the Tor network's obfs4 protocol satisfies this definition. We then show how to extend the obfs4 design to defend against stronger censorship attacks and present a quantum-safe obfuscated key exchange protocol. To instantiate our quantum-safe protocol using the ML-KEM (Kyber) standard, we present Kemeleon, a new mapping between ML-KEM public keys/ciphertexts and uniform byte strings.
Expand
Onur Gunlu
ePrint Report ePrint Report
Randomized distributed function computation refers to remote function computation where transmitters send data to receivers which compute function outputs that are randomized functions of the inputs. We study the applications of semantic communications in randomized distributed function computation to illustrate significant reductions in the communication load, with a particular focus on privacy. The semantic communication framework leverages generalized remote source coding methods, where the remote source is a randomized version of the observed data. Since satisfying security and privacy constraints generally require a randomization step, semantic communication methods can be applied to such function computation problems, where the goal is to remotely simulate a sequence at the receiver such that the transmitter and receiver sequences follow a target probability distribution. Our performance metrics guarantee (local differential) privacy for each input sequence, used in two different distributed function computation problems, which is possible by using strong coordination methods.

This work provides lower bounds on Wyner's common information (WCI), which is one of the two corner points of the coordination-randomness rate region characterizing the ultimate limits of randomized distributed function computation. The WCI corresponds to the case when there is no common randomness shared by the transmitter and receiver. Moreover, numerical methods are proposed to compute the other corner point for continuous-valued random variables, for which an unlimited amount of common randomness is available. Results for two problems of practical interest illustrate that leveraging common randomness can decrease the communication load as compared to the WCI corner point significantly. We also illustrate that semantic communication gains over lossless compression methods are achieved also without common randomness, motivating further research on limited common randomness scenarios.
Expand
Yuandi Cai, Ru Cheng, Yifan Zhou, Shijie Zhang, Jiang Xiao, Hai Jin
ePrint Report ePrint Report
Cross-chain Decentralized Applications (dApps) are increasingly popular for their ability to handle complex tasks across various blockchains, extending beyond simple asset transfers or swaps. However, ensuring all dependent transactions execute correctly together, known as complete atomicity, remains a challenge. Existing works provide financial atomicity, protecting against monetary loss, but lack the ability to ensure correctness for complex tasks. In this paper, we introduce Avalon, a transaction execution framework for cross-chain dApps that guarantees complete atomicity for the first time. Avalon achieves this by introducing multiple state layers above the native one to cache state transitions, allowing for efficient management of these state transitions. Most notably, for concurrent cross-chain transactions, Avalon resolves not only intra-chain conflicts but also addresses potential inconsistencies between blockchains via a novel state synchronization protocol, enabling serializable cross-chain execution. We implement Avalon using smart contracts in Cosmos ecosystem and evaluate its commitment performance, demonstrating acceptable latency and gas consumption even under conflict cases.
Expand
Sangwon Kim, Siwoo Eum, Minho Song, Hwajeong Seo
ePrint Report ePrint Report
Cryptography implementations of block cipher have been written in C language due to its strong features on system-friendly features. However, the C language is prone to memory safety issues, such as buffer overflows and memory leaks. On the other hand, Rust, novel system programming language, provides strict compile-time memory safety guarantees through its ownership model. This paper presents the implementation of LEA block cipher in Rust language, demonstrating features to prevent common memory vulnerabilities while maintaining performance. We compare the Rust implementation with the traditional C language version, showing that while Rust incurs a reasonable memory overhead, it achieves comparable the execution timing of encryption and decryption. Our results highlight Rust’s suitability for secure cryptographic applications, striking the balance between memory safety and execution efficiency.
Expand
Yujin Oh, Kyungbae Jang, Hwajeong Seo
ePrint Report ePrint Report
As quantum computing progresses, the assessment of cryptographic algorithm resilience against quantum attack gains significance interests in the field of cryptanalysis. Consequently, this paper implements the depth-optimized quantum circuit of Korean hash function (i.e., LSH) and estimates its quantum attack cost in quantum circuits. By utilizing an optimized quantum adder and employing parallelization techniques, the proposed quantum circuit achieves a 78.8\% improvement in full depth and a 79.1\% improvement in Toffoli depth compared to previous the-state-of art works. In conclusion, based on the implemented quantum circuit, we estimate the resources required for a Grover collision attack and evaluate the post-quantum security of LSH algorithms.
Expand
Matthieu Rambaud, Christophe Levrat
ePrint Report ePrint Report
In a fully non-interactive multi- , resp. aggregate-, signature scheme (fNIM, resp. fNIA), signatures issued by many signers on the same message, resp. on different messages, can be succinctly ``combined'', resp. ``aggregated''. fNIMs are used in the Ethereum consensus protocol, to produce the certificates of validity of blocks which are to be verified by billions of clients. fNIAs are used in some PBFT-like consensus protocols, such as the production version of Diem by Aptos, to replace the forwarding of many signatures by a new leader. In this work we address three complexity bottlenecks. (i) fNIAs have larger cost than fNIMs, e.g., we observe that the fNIA of BGLS (Eurocrypt'03) over 3000 signatures, takes 100x longer verification time than a batch verification of 3000 Schnorr signatures. (ii) fNIMs impose that each verifier processes the setup published by the group of potential signers. This processing consists either in verifying proofs of possession (PoPs), such as in Pixel (Usenix'20) and in the IETF'22 draft inherited from Ristenpart-Yilek (Eurocrypt'07), which costs a product of pairings over all published keys. Or, it consists in re-randomizing the keys, such as in SMSKR (FC'24). (iii) Existing proven security bounds on efficient fNIMs do not give any guarantee in practical curves with 256bits-large groups, such as BLS12-381 (used in Ethereum) or BLS12-377 (used in Zexe). Thus, computing in much larger curves is required to have provable guarantees.

Our first contribution is a new fNIM called $\mathsf{dms}$, it addresses both (ii) and (iii). It is as simple as adding Schnorr PoPs to the schoolbook pairing-based fNIM of Boldyreva (PKC'03). (ii) For a group of 1000 signers, processing these PoPs is $5+$ times faster than for the previous pairing-based PoPs, and $3+$ times faster than the processing of SMSKR, which had furthermore to be done for every new group member. (iii) In the algebraic group model (AGM), and given the current estimation of roughly 128 bits of security for the discrete logarithm (DL) in both the curves BLS12-381 and BLS12-377, then we prove a probability of forgery of $\mathsf{dms}$ no higher than about $2^{-93}$ for a time $2^{80}$ adversary. This proof of security is our main technical contribution. The only related proof before was for an interactive Schnorr-based multi-signature scheme, using Schnorr PoPs. We observe a gap in its proof, which is that the adversary has not access to a signing oracle before publishing its PoPs, although it should have. On the one hand, the gap can easily be fixed in their context. But in our context of pairing-based multi-signatures, the signing oracle produces a correlated random string which significantly complicates our extraction of the keys of the adversary. We finally provide another application of $\mathsf{dms}$, which is that it can be plugged in recent threshold signatures without setup (presented by Das et al at CCS'23, and Garg et al at SP'24), since these schemes implicitly build on any arbitrary BLS-based fNIM.

Our second contribution addresses (i), it is a very simple compiler: $\mathcal{M}to\mathcal{A}$ (multi-to-aggregate). It turns any fNIM into an fNIA, suitable for aggregation of signatures on messages with a prefix in common, with the restriction that a signer must not sign twice using the same prefix. The obtained fNIA is post-quantum as soon as the fNIM is, such as Chipmunk (CCS'23). We demonstrate the relevance for Diem by applying $\mathcal{M}to\mathcal{A}$ to $\mathsf{dms}$: the resulting fNIA enables to verify 39x faster an aggregate of 129 signatures, over messages with $7$ bits-long variable parts, than BGLS.
Expand
Justin Holmgren, Brent Waters
ePrint Report ePrint Report
Assuming the hardness of LWE and the existence of IO, we construct a public-key encryption scheme that is IND-CCA secure but fails to satisfy even a weak notion of indistinguishability security with respect to selective opening attacks. Prior to our work, such a separation was known only from stronger assumptions such as differing inputs obfuscation (Hofheinz, Rao, and Wichs, PKC 2016).

Central to our separation is a new hash family, which may be of independent interest. Specifically, for any $S(k) = k^{O(1)}$, any $n(k) = k^{O(1)}$, and any $m(k) = k^{\Theta(1)}$, we construct a hash family mapping $n(k)$ bits to $m(k)$ bits that is somewhere statistically correlation intractable (SS-CI) for all relations $R_k \subseteq \{0,1\}^{n(k)} \times \{0,1\}^{m(k)}$ that are enumerable by circuits of size $S(k)$.

We give two constructions of such a hash family. Our first construction uses IO, and generically ``boosts'' any hash family that is SS-CI for the smaller class of functions that are computable by circuits of size $S(k)$. This weaker hash variant can be constructed based solely on LWE (Peikert and Shiehian, CRYPTO 2019). Our second construction is based on the existence of a circular secure FHE scheme, and follows the construction of Canetti et al. (STOC 2019).
Expand
Geoffroy Couteau, Lalita Devadas, Srinivas Devadas, Alexander Koch, Sacha Servan-Schreiber
ePrint Report ePrint Report
Oblivious Transfer (OT) is at the heart of secure computation and is a foundation for many applications in cryptography. Over two decades of work have led to extremely efficient protocols for evaluating OT instances in the preprocessing model, through a paradigm called OT extension. A few OT instances generated in an offline phase can be used to perform many OTs in an online phase efficiently, i.e., with very low communication and computational overheads.

Specifically, traditional OT extension protocols use a small number of “base” OTs, generated using any black-box OT protocol, and convert them into many OT instances using only lightweight symmetric-key primitives. Recently, a new paradigm of OT with a *public-key setup* has emerged, which replaces the base OTs with a non-interactive setup: Using only the public key of the other party, two parties can efficiently compute a virtually unbounded number of OT instances on-the-fly.

In this paper, we put forth a novel framework for OT extension with a public-key setup and concretely efficient instantiations. An implementation of our framework is over 20 times faster when compared to the previous state-of-the-art public-key OT protocols, and remains competitive even when compared to OT protocols that *do not* offer a public-key setup. Additionally, our instantiations result in the first public-key schemes with plausible post-quantum security.

In summary, this paper contributes:

- QuietOT: A framework for OT extension with a public-key setup that uses fast, symmetric-key primitives to generate OT instances following a one-time public-key setup, and offering additional features such as precomputability.

- A public-key setup for QuietOT from the RingLWE assumption, resulting in the first post-quantum construction of OT extension with a public-key setup.

- An optimized, open-source implementation of our construction that can generate up to 1M OT extensions per second on commodity hardware. In contrast, the state-of-the-art public-key OT protocol is limited to at most 65K OTs per second.

- The first formal treatment of the security of OT with a public-key setup in a multi-party setting, which addresses several subtleties that were overlooked in prior work.
Expand
Xingyu Xie, Yifei Li, Wei Zhang, Tuowei Wang, Shizhen Xu, Jun Zhu, Yifan Song
ePrint Report ePrint Report
Proving the security of a Multiparty Computation (MPC) protocol is a difficult task. Under the current simulation-based definition of MPC, a security proof consists of a simulator, which is usually specific to the concrete protocol and requires to be manually constructed, together with a theoretical analysis of the output distribution of the simulator and corrupted parties' views in the real world. This presents an obstacle in verifying the security of a given MPC protocol. Moreover, an instance of a secure MPC protocol can easily lose its security guarantee due to careless implementation, and such a security issue is hard to detect in practice.

In this work, we propose a general automated framework to verify the perfect security of instances of MPC protocols against the semi-honest adversary. Our framework has perfect soundness: any protocol that is proven secure under our framework is also secure under the simulation-based definition of MPC. We demonstrate the completeness of our framework by showing that for any instance of the well-known BGW protocol, our framework can prove its security for every corrupted party set with polynomial time. Unlike prior work that only focuses on black-box privacy which requires the outputs of corrupted parties to contain no information about the inputs of the honest parties, our framework may potentially be used to prove the security of arbitrary MPC protocols.

We implement our framework as a prototype. The evaluation shows that our prototype automatically proves the perfect semi-honest security of BGW protocols and B2A (binary to arithmetic) conversion protocols in reasonable durations.
Expand

02 July 2024

Divyanshu Bhardwaj, Sandhya Saravanan, Nishanth Chandran, Divya Gupta
ePrint Report ePrint Report
Decision trees are an important class of supervised learning algorithms. When multiple entities contribute data to train a decision tree (e.g. for fraud detection in the financial sector), data privacy concerns necessitate the use of a privacy-enhancing technology such as secure multi-party computation (MPC) in order to secure the underlying training data. Prior state-of-the-art (Hamada et al.) construct an MPC protocol for decision tree training with a communication of $\mathcal{O}(hmN\log N)$, when building a decision tree of height $h$ for a training dataset of $N$ samples, each having $m$ attributes.

In this work, we significantly reduce the communication complexity of secure decision tree training. We construct a protocol with communication complexity $\mathcal{O}(mN\log N + hmN + hN\log N)$, thereby achieving an improvement of $\approx \mathsf{min}(h, m, \log N)$ over Hamada et al. At the core of our technique is an improved protocol to regroup sorted private elements further into additional groups (according to a flag vector) while maintaining their relative ordering. We implement our protocol in the MP-SPDZ framework and show that it requires $10\times$ lesser communication and is $9\times$ faster than the state-of-the-art.
Expand
Dag Arne Osvik, David Canright
ePrint Report ePrint Report
We reduce the number of bit operations required to implement AES to a new minimum, and also compute improvements to elements of some other ciphers. Exploring the algebra of AES allows choices of basis and streamlining of the nonlinear parts. We also compute a more efficient implementation of the linear part of each round. Similar computational optimizations apply to other cryptographic matrices and S-boxes. This work may be incorporated into a hardware AES implementation using minimal resources, or potentially in a bit-sliced software implementation to increase speed.
Expand
Daniel Dore
ePrint Report ePrint Report
We present TaSSLE, a new lookup argument for decomposable tables with minimal commitment costs. The construction generalizes techniques introduced in Lasso (Eurocrypt '24) which take advantage of the internal structure present in such tables to avoid the need for any party to need to commit to, or even construct, the entire table. This allows the use of lookups against very large tables, with applications including new design strategies for "zero-knowledge virtual machines". We show that these techniques may be combined in a generic way with any existing lookup argument to achieve similar results. We then give a construction of TaSSLE by applying this observation to a recent lookup argument, introduced in [Papini-Haböck '23], which combines logarithmic derivatives with the GKR protocol to achieve a lookup argument with minimal commitment costs.
Expand
◄ Previous Next ►