International Association for Cryptologic Research

International Association
for Cryptologic Research

IACR News

Updates on the COVID-19 situation are on the Announcement channel.

Here you can see all recent updates to the IACR webpage. These updates are also available:

RSS symbol icon
via RSS feed
Twitter bird icon
via Twitter
Weibo icon
via Weibo
Facebook icon
via Facebook

12 July 2024

Manuel B. Santos, Dimitris Mouris, Mehmet Ugurbil, Stanislaw Jarecki, José Reis, Shubho Sengupta, Miguel de Vega
ePrint Report ePrint Report
Recent advancements in transformers have revolutionized machine learning, forming the core of Large language models (LLMs). However, integrating these systems into everyday applications raises privacy concerns as client queries are exposed to model owners. Secure multiparty computation (MPC) allows parties to evaluate machine learning applications while keeping sensitive user inputs and proprietary models private. Due to inherent MPC costs, recent works introduce model-specific optimizations that hinder widespread adoption by machine learning researchers. CrypTen (NeurIPS'21) aimed to solve this problem by exposing MPC primitives via common machine learning abstractions such as tensors and modular neural networks. Unfortunately, CrypTen and many other MPC frameworks rely on polynomial approximations of the non-linear functions, resulting in high errors and communication complexity.

This paper introduces Curl, an easy-to-use MPC framework that evaluates non-linear functions as lookup tables, resulting in better approximations and significant round and communication reduction. Curl exposes a similar programming model as CrypTen and is highly parallelizable through tensors. At its core, Curl relies on discrete wavelet transformations to reduce the lookup table size without sacrificing accuracy, which results in up to $19\times$ round and communication reduction compared to CrypTen for non-linear functions such as logarithms and reciprocals. We evaluate Curl on a diverse set of LLMs, including BERT, GPT-2, and GPT Neo, and compare against state-of-the-art related works such as Iron (NeurIPS'22) and Bolt (S&P'24) achieving at least $1.9\times$ less communication and latency.

Finally, we resolve a long-standing debate regarding the security of widely used probabilistic truncation protocols by proving their security in the stand-alone model. This is of independent interest as many related works rely on this truncation style.
Expand
Avital Shafran, Eran Malach, Thomas Ristenpart, Gil Segev, Stefano Tessaro
ePrint Report ePrint Report
Given the recent progress in machine learning (ML), the cryptography community has started exploring the applicability of ML methods to the design of new cryptanalytic approaches. While current empirical results show promise, the extent to which such methods may outperform classical cryptanalytic approaches is still somewhat unclear.

In this work, we initiate exploration of the theory of ML-based cryptanalytic techniques, in particular providing new results towards understanding whether they are fundamentally limited compared to traditional approaches. Whereas most classic cryptanalysis crucially relies on directly processing individual samples (e.g., plaintext-ciphertext pairs), modern ML methods thus far only interact with samples via gradient-based computations that average a loss function over all samples. It is, therefore, conceivable that such gradient-based methods are inherently weaker than classical approaches.

We introduce a unifying framework for capturing both ``sample-based'' adversaries that are provided with direct access to individual samples and ``gradient-based'' ones that are restricted to issuing gradient-based queries that are averaged over all given samples via a loss function. Within our framework, we establish a general feasibility result showing that any sample-based adversary can be simulated by a seemingly-weaker gradient-based one. Moreover, the simulation exhibits a nearly optimal overhead in terms of the gradient-based simulator's running time. Finally, we extend and refine our simulation technique to construct a gradient-based simulator that is fully parallelizable (crucial for avoiding an undesirable overhead for parallelizable cryptanalytic tasks), which is then used to construct a gradient-based simulator that executes the particular and highly useful gradient-descent method.

Taken together, although the extent to which ML methods may outperform classical cryptanalytic approaches is still somewhat unclear, our results indicate that such gradient-based methods are not inherently limited by their seemingly restricted access to the provided samples.
Expand
Guillaume Barbu, Laurent Grémy, Roch Lescuyer
ePrint Report ePrint Report
In this work, we use some recent developments in lattice-based cryptanalytic tools to revisit a fault attack on RSA-CRT signatures based on the Partial Approximate Common Divisor (PACD) problem. By reducing the PACD to a Hidden Number Problem (HNP) instance, we decrease the number of required faulted bits from 32 to 7 in the case of a 1024-bit RSA. We successfully apply the attack to RSA instances up to 8192-bit and present an enhanced analysis of the error-tolerance in the Bounded Distance Decoding (BDD) with predicate approach. Finally, evaluating the impact of standard side-channel and fault countermeasures, we show that merely verifying the signature before output is not an adequate protection against this attack. The reduction from PACD to HNP might be of independent interest.
Expand
Maximilian Kroschewski, Anja Lehmann, Cavit Özbay
ePrint Report ePrint Report
Single Sign-On (SSO) allows users to conveniently authenticate to many Relying Parties (RPs) through a central Identity Provider (IdP). SSO supports unlinkable authentication towards the RPs via pairwise pseudonyms, where the IdP assigns the user an RP-specific pseudonym. This feature has been rolled out prominently within Apple's SSO service. While establishing unlinkable identities provides privacy towards RPs, it actually emphasizes the main privacy problem of SSO: with every authentication request, the IdP learns the RP that the user wants to access. Solutions to overcome this limitation exist, but either assume users to behave honestly or require them to manage long-term cryptographic keys.

In this work, we propose the first SSO system that can provide such pseudonymous authentication in an unobservable yet strongly secure and convenient manner. That is, the IdP blindly derives the user's pairwise pseudonym for the targeted RP without learning the RP's identity and without requiring key material handled by the user. We formally define the desired security and privacy properties for such unlinkable, unobservable, and strongly secure SSO. In particular, our model includes the often neglected RP authentication: the IdP typically wants to limit its services to registered RPs only and thus must be able to (blindly) verify that it issues the token and pseudonym to such a registered RP. We propose a simple construction that combines signatures with efficient proofs-of-knowledge with a blind, yet verifiable, evaluation of the Hashed-Diffie-Hellman PRF. We prove the security of our construction and demonstrate its efficiency through a prototypical implementation, which requires a running time of 2-20ms per involved party.
Expand

10 July 2024

Paul Grandamme, Pierre-Antoine Tissot, Lilian Bossuet, Jean-Max Dutertre, Brice Colombier, Vincent Grosso
ePrint Report ePrint Report
Physical attacks, and among them fault injection attacks, are a significant threat to the security of embedded systems. Among the means of fault injection, laser has the significant advantage of being extremely spatially accurate. Numerous state-of-the-art studies have investigated the use of lasers to inject faults into a target at run-time. However, the high precision of laser fault injection comes with requirements on the knowledge of the implementation and exact execution time of the victim code. The main contribution of this work is the demonstration on experimental basis that it is also possible to perform laser fault injection on an unpowered device. Specifically, we targeted the Flash non-volatile memory of a 32-bit microcontroller. The advantage of this new attack path is that it does not require any synchronisation between the victim and the attacker. We provide an experimental characterization of this phenomenon with a description of the fault model from the physical level up to the software level. Finally, we applied these results to carry out a persistent fault analysis on a 128-bit AES with a particularly realistic attacker model which reinforces the interest of the PFA.
Expand
Giacomo Fenzi, Jan Gilcher, Fernando Virdia
ePrint Report ePrint Report
In 2018, Mouha et al. (IEEE Trans. Reliability, 2018) performed a post-mortem investigation of the correctness of reference implementations submitted to the SHA3 competition run by NIST, finding previously unidentified bugs in a significant portion of them, including two of the five finalists. Their innovative approach allowed them to identify the presence of such bugs in a black-box manner, by searching for counterexamples to expected cryptographic properties of the implementations under test. In this work, we extend their approach to key encapsulation mechanisms (KEMs) and digital signature schemes (DSSs). We perform our tests on multiple versions of the LibOQS collection of post-quantum schemes, to capture implementations at different points of the recent Post-Quantum Cryptography Standardization Process run by NIST. We identify multiple bugs, ranging from software bugs (segmentation faults, memory overflows) to cryptographic bugs, such as ciphertext malleability in KEMs claiming IND-CCA security. We also observe various features of KEMs and DSS that do not contradict any security guarantees, but could appear counter-intuitive.
Expand
Onur İşler
ePrint Report ePrint Report
The use of Internet of Things (IoT) devices in embedded systems has become increasingly popular with advancing technologies. These devices become vulnerable to cyber attacks as they gain popularity. The cryptographic operations performed for the purpose of protection against cyber attacks are crucial to yield fast results in open networks and not slow down network traffic. Therefore, to enhance communication security, studies have been conducted in the literature on using asymmetric encryption and symmetric encryption together in IoT devices for activities such as key sharing, encryption, decryption, data signing, and verifying signed data. In this study, we first propose a cryptographic system engaging of IoT devices operated from a server. Then we do performance analysis of our proposal. In particular, we evaluate the elliptic curve Diffie-Hellman key exchange and elliptic curve digital signature algorithms on the Secp256r1 elliptic curve and AES symmetric encryption via the Micro uECC library conducted with the 32-bit STM32F410RB Nucleo development board microprocessor running at 48 MHz.
Expand
Jing Tian, Bo Wu, Lang Feng, Haochen Zhang, Zhongfeng Wang
ePrint Report ePrint Report
This paper proposes a fast and efficient FPGA-based hardware-software co-design for the supersingular isogeny key encapsulation (SIKE) protocol controlled by a custom RISC-V processor. Firstly, we highly optimize the core unit, the polynomial-based field arithmetic logic unit (FALU), with the proposed fast convolution-like multiplier (FCM) to significantly reduce the resource consumption while still maintaining low latency and constant time for all the four SIKE parameters. Secondly, we pack the small isogeny and point operations in hardware, devise a coarse-grained reconfigurable hardware architecture (CGRHA) based on FALU as the co-processor, and apply it to the RISC-V core with customized instructions, effectively avoiding extra time consumption for the data exchange with the software side and meanwhile increasing flexibility. Finally, we code the hardware in SystemVerilog language and the software in C language and run experiments on FPGAs. In the co-processor implementation, the experiment results show that our design for the four SIKE parameters achieves 2.6-4.4x speedup and obtains comparable or better area-time product to or than the state-of-the-art. In the hardware-software co-design experiments, we still have the superiority in speed and only <10\% of extra time is introduced by mutual communication.
Expand
Dario Catalano, Emanuele Giunta, Francesco Migliaro
ePrint Report ePrint Report
The notion of Anamorphic Encryption (Persiano et al. Eurocrypt 2022) aims at establishing private communication against an adversary who can access secret decryption keys and influence the chosen messages. Persiano et al. gave a simple, black-box, rejection sampling-based technique to send anamorphic bits using any IND-CPA secure scheme as underlying PKE.

In this paper however we provide evidence that their solution is not as general as claimed: indeed there exists a (contrived yet secure) PKE which lead to insecure anamorphic instantiations. Actually, our result implies that such stateless black-box realizations of AE are impossible to achieve, unless weaker notions are targeted or extra assumptions are made on the PKE. Even worse, this holds true even if one resorts to powerful non-black-box techniques, such as NIZKs, $ i\mathcal{O} $ or garbling.

From a constructive perspective, we shed light those required assumptions. Specifically, we show that one could bypass (to some extent) our impossibility by either considering a weaker (but meaningful) notion of AE or by assuming the underlying PKE to (always) produce high min-entropy ciphertexts.

Finally, we prove that, for the case of Fully-Asymmetric AE, $ i\mathcal{O}$ can actually be used to overcome existing impossibility barriers. We show how to use $ i\mathcal{O} $ to build Fully-Asymmetric AE (with small anamorphic message space) generically from any IND-CPA secure PKE with sufficiently high min-entropy ciphertexts. Put together our results provide a clearer picture of what black-box constructions can and cannot achieve.
Expand
Poulami Das, Andreas Erwig, Sebastian Faust
ePrint Report ePrint Report
Cryptographic wallets are an essential tool in Blockchain networks to ensure the secure storage and maintenance of an user's cryptographic keys. Broadly, wallets can be divided into three categories, namely custodial, non-custodial, and shared-custodial wallets. The first two are centralized solutions, i.e., the wallet is operated by a single entity, which inherently introduces a single point of failure. Shared-custodial wallets, on the other hand, are maintained by two independent parties, e.g., the wallet user and a service provider, and hence avoid the single point of failure centralized solutions. Unfortunately, current shared-custodial wallets suffer from significant privacy issues.

In our work, we introduce password-authenticated deterministic wallets (PADW), a novel and efficient shared-custodial wallet solution, which exhibits strong security and privacy guarantees. In a nutshell, in a PADW scheme, the secret key of the user is shared between the user and the server. In order to generate a signature, the user first authenticates itself to the server by providing a password and afterwards engages in an interactive signing protocol with the server. Security is guaranteed as long as at most one of the two parties is corrupted. Privacy, on the other hand, guarantees that a corrupted server cannot link a transaction to a particular user. We formally model the notion of PADW schemes and we give an instantiation from blind Schnorr signatures. Our construction allows for deterministic key derivation, a feature that is widely used in practice by existing wallet schemes, and it does not rely on any heavy cryptographic primitives. We prove our scheme secure against adaptive adversaries in the random oracle model and under standard assumptions. That is, our security proof only relies on the assumption that the Schnorr signature scheme is unforgeable and that a public key encryption scheme is CCA-secure.
Expand
Ke Zhong, Sebastian Angel
ePrint Report ePrint Report
This paper proposes Oryx, a system for efficiently detecting cycles in federated graphs where parts of the graph are held by different parties and are private. Cycle detection is an important building block in designing fraud detection algorithms that operate on confidential transaction data held by different financial institutions. Oryx allows detecting cycles of various length while keeping the topology of the graphs secret, and it does so efficiently; Oryx achieves quasilinear computational complexity and scales well with more machines thanks to a parallel design. Our implementation of Oryx running on a single 32-core AWS machine (for each party) can detect cycles of up to length 6 in under 5 hours in a financial transaction graph that consists of tens of millions of nodes and edges. While the costs are high, adding more machines further reduces the completion time. Furthermore, Oryx is, to our knowledge, the first and only system that can handle this task.
Expand
Shen Dong, Hongrui Cui, Kaiyi Zhang, Kang Yang, Yu Yu
ePrint Report ePrint Report
Oblivious transfer (OT) is a fundamental cryptographic protocol that plays a crucial role in secure multi-party computation (MPC). Most practical OT protocols by, e.g., Naor and Pinkas (SODA'01) or Chou and Orlandi (Latincrypt'15), are based on Diffie-Hellman (DH)-like assumptions and not post-quantum secure. In contrast, many other components of MPC protocols, including garbled circuits and secret sharings, are post-quantum secure. The reliance on non-post-quantum OT protocols presents a significant security bottleneck with the advent of quantum computing. In this paper, we address this issue by constructing a simple, efficient OT protocol based on Saber, a Mod-LWR-based key exchange protocol. We implemented our OT protocol and conducted experiments to evaluate its performance. Our results show that our OT protocol significantly outperforms the state-of-the-art Kyber-based post-quantum OT protocol by Masny and Rindal (CCS'19) in terms of both computation and communication costs. Furthermore, the computation speed of our OT protocol is faster than the best-known DH-based OT protocol by Chou and Orlandi (Latincrypt'15), making it competitive to replace DH-based OT in the high-bandwidth network setting.
Expand
Bilel Zaghdoudi, Maria Potop Butucaru
ePrint Report ePrint Report
This paper reports the experimental results related to lineage event storage via smart contracts deployed on private and public blockchain. In our experiments we measure the following three metrics: the cost to deploy the storage smart contract on the blockchain, which measures the initial expenditure, typically in gas units, required to deploy the smart contract that facilitates lineage event storage, then the time and gas costs needed to store a lineage event. We investigated both single and multi-clients scenarios. We considered the following public blockchains: Hedera, Fantom, Harmony Shard0, Polygon Amoy, Ethereum Sepolia, Optimism Sepolia, Klaytn Baobab and Arbitrum Sepolia. Furthermore, we investigate the performances of Hyperledger Besu with different consensus algorithms as private blockchains.
Expand
Shintaro Narisada, Hiroki Okada, Kazuhide Fukushima, Takashi Nishide
ePrint Report ePrint Report
We propose time-memory trade-off algorithms for evaluating look-up table (LUT) in both the leveled homomorphic encryption (LHE) and fully homomorphic encryption (FHE) modes in TFHE. For an arbitrary $n$-bit Boolean function, we reduce evaluation time by a factor of $O(n)$ at the expense of an additional memory of "only" $O(2^n)$ as a trade-off: The total asymptotic memory is also $O(2^n)$, which is the same as that of prior works. Our empirical results demonstrate that a $7.8 \times$ speedup in runtime is obtained with a $3.8 \times$ increase in memory usage for 16-bit Boolean functions in the LHE mode. Additionally, in the FHE mode, we achieve reductions in both runtime and memory usage by factors of $17.9 \times$ and $2.5 \times $, respectively, for 8-bit Boolean functions. The core idea is to decompose the function $f$ into sufficiently small subfunctions and leverage the precomputed results for these subfunctions, thereby achieving significant performance improvements at the cost of additional memory.
Expand
Cecilia Boschini, Darya Kaviani, Russell W. F. Lai, Giulio Malavolta, Akira Takahashi, Mehdi Tibouchi
ePrint Report ePrint Report
A threshold signature scheme splits the signing key among $\ell$ parties, such that any $t$-subset of parties can jointly generate signatures on a given message. Designing concretely efficient post-quantum threshold signatures is a pressing question, as evidenced by NIST's recent call.

In this work, we propose, implement, and evaluate a lattice-based threshold signature scheme, Ringtail, which is the first to achieve a combination of desirable properties: (i) The signing protocol consists of only two rounds, where the first round is message-independent and can thus be preprocessed offline. (ii) The scheme is concretely efficient and scalable to $t \leq 1024$ parties. For $128$-bit security and $t = 1024$ parties, we achieve $13.4$ KB signature size and $10.5$ KB of online communication. (iii) The security is based on the standard learning with errors (LWE) assumption in the random oracle model. This improves upon the state-of-the-art (with comparable efficiency) which either has a three-round signing protocol [Eurocrypt'24] or relies on a new non-standard assumption [Crypto'24].

To substantiate the practicality of our scheme, we conduct the first WAN experiment deploying a lattice-based threshold signature, across 8 countries in 5 continents. We observe that an overwhelming majority of the end-to-end latency is consumed by network latency, underscoring the need for round-optimized schemes.
Expand
Luke Harmon, Gaetan Delavignette, Hanes Oliveira
ePrint Report ePrint Report
In this work we present $\mathsf{HERatio}$, a homomorphic encryption scheme that builds on the scheme of Brakerski, and Fan and Vercauteren. Our scheme naturally accepts Laurent polynomials as inputs, allowing it to work with rationals via their bounded base-$b$ expansions. This eliminates the need for a specialized encoder and streamlines encryption, while maintaining comparable efficiency to BFV. To achieve this, we introduce a new variant of the Polynomial Learning With Errors (PLWE) problem which employs Laurent polynomials instead of the usual ``classic'' polynomials, and provide a reduction to the PLWE problem.
Expand
John Preuß Mattsson
ePrint Report ePrint Report
Advanced Encryption Standard Galois/Counter Mode (AES-GCM) is the most widely used Authenticated Encryption with Associated Data (AEAD) algorithm in the world. In this paper, we analyze the use of GCM with all the Initialization Vector (IV) constructions and lengths approved by NIST SP 800-38D when encrypting multiple plaintexts with the same key. We derive attack complexities in both ciphertext-only and known-plaintext models, with or without nonce hiding, for collision attacks compromising integrity and confidentiality. Our analysis shows that GCM with random IVs provides less than 128 bits of security. When 96-bit IVs are used, as recommended by NIST, the security drops to less than 97 bits. Therefore, we strongly recommend NIST to forbid the use of GCM with 96-bit random nonces.
Expand
Falko Strenzke, Johannes Roth
ePrint Report ePrint Report
This work describes vulnerabilities in the specification of the AEAD packets as introduced in the novel LibrePGP specification that is implemented by the widely used GnuPG application and the AES-based AEAD schemes as well as the Key Wrap Algorithm specified in the Cryptographic Message Syntax (CMS). These new attacks exploit the possibility to downgrade AEAD or AES Key Wrap ciphertexts to valid legacy CFB- or CBC-encrypted related ciphertexts and require that the attacker learns the content of the legacy decryption result. This can happen either due to the human recipient returning the decryption output, which has entirely pseudo-random appearance, to the attacker or due to a programmatic decryption oracle in the receiving system. The attacks effect the decryption of low-entropy plaintext blocks in AEAD ciphertexts and, in the case of LibrePGP, also the manipulation of existing AEAD ciphertexts. For AES Key Wrap in CMS, full key decryption is possible. Some of the attacks require multiple successful oracle queries. The attacks thus demonstrate that CCA2 security is not achieved by the LibrePGP and CMS AEAD or Key Wrap encryption in the presence of a legacy cipher mode decryption oracle. The proper countermeasure to thwart the attacks is a key derivation that ensures the use of unrelated block cipher keys for the different encryption modes.
Expand
Banashri Karmakar, Shyam Murthy, Arpita Patra, Protik Paul
ePrint Report ePrint Report
Online ride-sharing services (RSS) have become very popular owing to increased awareness of environmental concerns and as a response to increased traffic congestion. To request a ride, users submit their locations and route information for ride matching to a service provider (SP), leading to possible privacy concerns caused by leakage of users' location data. We propose QuickPool, an efficient SP-aided RSS solution that can obliviously match multiple riders and drivers simultaneously, without involving any other auxiliary server. End-users, namely, riders and drivers share their route information with SP as encryptions of the ordered set of points-of-interest (PoI) of their route from their start to end locations. SP performs a zone based oblivious matching of drivers and riders, based on partial route overlap as well as proximity of start and end points. QuickPool is in the semi-honest setting, and makes use of secure multi-party computation. We provide security proof of our protocol, perform extensive testing of our implementation and show that our protocol simultaneously matches multiple drivers and riders very efficiently. We compare the performance of QuickPool with state-of-the-art works and observe a speed up of 1.6 - 2$\times$.
Expand

08 July 2024

Matthieu Rambaud
ePrint Report ePrint Report
Blockchain consensus, a.k.a. BFT SMR, are protocols enabling $n$ processes to decide on an ever-growing chain. The fastest known asynchronous one is called 2-chain VABA (PODC'21 and FC'22), and is used as fallback chain in Abraxas* (CCS'23). It has a claimed $9.5\delta$ expected latency when used for a single shot instance, a.k.a. an MVBA. We exhibit attacks breaking it. Hence, the title of the fastest asynchronous MVBA with quadratic messages complexity goes to sMVBA (CCS'22), with $10\delta$ expected latency. Our positive contributions are two new and complementary designs.

$\bullet$ 2PAC (2-phase asynchronous consensus). It has a simpler and lighter chaining than in previous approaches. Instantiated with either quadratic or cubic phases of voting, it yields:

2PAC$^\text{lean}$: $+90\%$ throughput and $9.5\delta$ expected latency, with quadratic ($O(n^2)$) messages complexity. In both 2-chain VABA and sMVBA (as if chained, with pipelining), the quorum-certified transactions which were produced in the worst-case 1/3 of views with a slow leader were dumped, so the work was lost. The simpler design of 2PAC inserts such blocks in straight-line in the chain. Thus, contrary to naive uncle-referencing, this comes with no computational overhead, yielding a net $+50\%$ throughput gain over chained sMVBA. Both the remaining throughput and latency ($-0.5\delta$) gains, come from the lighter interactive construction of proofs of consistency appended to proposed blocks, compared to sMVBA.

2PAC$^\text{BIG}$: the fastest asynchronous blockchain consensus with cubic ($O(n^3)$) messages complexity. Fault-free single shot MVBA runs decide in just $4\delta$, as soon as no message is delivered more than twice faster than others: GradedDAG (SRDS'23) required furthermore no messages reordering.

$\bullet$ Super Fast Pipelined Blocks. This is an upgrade of previous approaches for pipelining: in 2-chain VABA, Cordial Miners (DISC'23) and GradedDAG, a block pipelined by a leader in the middle of the view had almost twice larger latency than the non-pipelined block. Our design provides a fast path deciding the pipelined block with even smaller latency than the non-pipelined block. The fast delay is guaranteed in all executions with a fair scheduler, but remarkably, whatever the behaviors of faulty processes. Consistency is preserved by a lightweight mechanism, of one threshold signature appended per proposal. Instantiated with the previous protocols, it yields: s2PAC$^\text{lean}$, with fast decision of pipelined blocks in $4\delta$; s2PAC$^\text{BIG}$, in $3\delta$; and sGradedDAG, in $3\delta$.
Expand
◄ Previous Next ►