## IACR News

Updates on the COVID-19 situation are on the
Announcement channel.

Here you can see all recent updates to the IACR webpage. These updates are also available:

#### 15 July 2024

###### Changcun Wang, Zhaopeng Dai

ePrint Report
Matrix congruential generators is an important class of pseudorandom number
generators. In this paper we show how to predict a class of Matrix congruential generators matrix
congruential generators with unknown parameters. Given a few truncated digits
of high-order bits output by a matrix congruential generator, we give a method
based on lattice reduction to recover the parameters and the initial state of the
generator.

###### Chen Yang, Jingwei Chen, Wenyuan Wu, Yong Feng

ePrint Report
Clustering is a crucial unsupervised learning method extensively used in the field of data analysis. For analyzing big data, outsourced computation is an effective solution but privacy concerns arise when involving sensitive information. Fully homomorphic encryption (FHE) enables computations on encrypted data, making it ideal for such scenarios. However, existing privacy-preserving clustering based on FHE are often constrained by the high computational overhead incurred from FHE, typically requiring decryption and interactions after only one iteration of the clustering algorithm. In this work, we propose a more efficient approach to evaluate the one-hot vector for the index of the minimum in an array with FHE, which fully exploits the parallelism of single-instruction-multiple-data of FHE schemes. By combining this with FHE bootstrapping, we present a practical FHE-based k-means clustering protocol whose required round of interactions between the data owner and the server is optimal, i.e., accomplishing the entire clustering process on encrypted data in a single round. We implement this protocol using the CKKS FHE scheme. Experiments show that our protocol significantly outperforms the state-of-the-art FHE-based k-means clustering protocols on various public datasets and achieves comparable accuracy to plaintext result. Additionally, We adapt our protocol to support mini-batch k-means for large-scale datasets and report its performance.

###### Christian Majenz, Giulio Malavolta, Michael Walter

ePrint Report
We propose a generalization of Zhandry’s compressed oracle method to random permutations, where an algorithm can query both the permutation and its inverse. We show how to use the resulting oracle simulation to bound the success probability of an algorithm for any predicate on input-output pairs, a key feature of Zhandry’s technique that had hitherto resisted attempts at generalization to random permutations. One key technical ingredient is to use strictly monotone factorizations to represent the permutation in the oracle’s database. As an application of our framework, we show that the one-round sponge construction is unconditionally preimage resistant in the random permutation model. This proves a conjecture by Unruh.

###### Dana Dachman-Soled, Esha Ghosh, Mingyu Liang, Ian Miers, Michael Rosenberg

ePrint Report
Strike-lists are a common technique for rollback and replay prevention in protocols that require that clients remain anonymous or that their current position in a state machine remain confidential. Strike-lists are heavily used in anonymous credentials, e-cash schemes, and trusted execution environments, and are widely deployed on the web in the form of Privacy Pass (PoPETS '18) and Google Private State Tokens.
In such protocols, clients submit pseudorandom tokens associated with each action (e.g., a page view in Privacy Pass) or state transition, and the token is added to a server-side list to prevent reuse.

Unfortunately, the size of a strike-list, and hence the storage required by the server, is proportional to the total number of issued tokens, $N \cdot t$, where $N$ is the number of clients and $t$ is the maximum number of tickets per client. In this work, we ask whether it is possible to realize a strike-list-like functionality, which we call the anonymous tickets functionality, with storage requirements proportional to $N \log(t)$.

For the anonymous tickets functionality we construct a secure protocol from standard assumptions that achieves server storage of $O(N)$ ciphertexts, where each ciphertext encrypts a message of length $O(\log(t))$. We also consider an extension of the strike-list functionality where the server stores an arbitrary state for each client and clients advance their state with some function $s_i\gets f(s_{i-1},\mathsf{auxinput})$, which we call the anonymous outsourced state-keeping functionality. In this setting, malicious clients are prevented from rolling back their state, while honest clients are guaranteed anonymity and confidentiality against a malicious server. We achieve analogous results in this setting for two different classes of functions.

Our results rely on a new technique to preserve client anonymity in the face of selective failure attacks by a malicious server. Specifically, our protocol guarantees that misbehavior of the server either (1) does not prevent the honest client from redeeming a ticket or (2) provides the honest client with an escape hatch that can be used to simulate a redeem in a way that is indistinguishable to the server.

Unfortunately, the size of a strike-list, and hence the storage required by the server, is proportional to the total number of issued tokens, $N \cdot t$, where $N$ is the number of clients and $t$ is the maximum number of tickets per client. In this work, we ask whether it is possible to realize a strike-list-like functionality, which we call the anonymous tickets functionality, with storage requirements proportional to $N \log(t)$.

For the anonymous tickets functionality we construct a secure protocol from standard assumptions that achieves server storage of $O(N)$ ciphertexts, where each ciphertext encrypts a message of length $O(\log(t))$. We also consider an extension of the strike-list functionality where the server stores an arbitrary state for each client and clients advance their state with some function $s_i\gets f(s_{i-1},\mathsf{auxinput})$, which we call the anonymous outsourced state-keeping functionality. In this setting, malicious clients are prevented from rolling back their state, while honest clients are guaranteed anonymity and confidentiality against a malicious server. We achieve analogous results in this setting for two different classes of functions.

Our results rely on a new technique to preserve client anonymity in the face of selective failure attacks by a malicious server. Specifically, our protocol guarantees that misbehavior of the server either (1) does not prevent the honest client from redeeming a ticket or (2) provides the honest client with an escape hatch that can be used to simulate a redeem in a way that is indistinguishable to the server.

###### Nir Bitansky, Prahladh Harsha, Yuval Ishai, Ron D. Rothblum, David J. Wu

ePrint Report
A dot-product proof (DPP) is a simple probabilistic proof system in which the input statement $\mathbf{x}$ and the proof $\boldsymbol{\pi}$ are vectors over a finite field $\mathbb{F}$, and the proof is verified by making a single dot-product query $\langle \mathbf{q},(\mathbf{x} \| \boldsymbol{\pi}) \rangle$ jointly to $\mathbf{x}$ and $\boldsymbol{\pi}$. A DPP can be viewed as a 1-query fully linear PCP. We study the feasibility and efficiency of DPPs, obtaining the following results:

- Small-field DPP. For any finite field $\mathbb{F}$ and Boolean circuit $C$ of size $S$, there is a DPP for proving that there exists $\mathbf{w}$ such that $C(\mathbf{x}, \mathbf{w})=1$ with a proof $\boldsymbol{\pi}$ of length $S\cdot\mathsf{poly}(|\mathbb{F}|)$ and soundness error $\varepsilon=O(1 / \sqrt{|\mathbb{F}|})$. We show this error to be asymptotically optimal. In particular, and in contrast to the best known PCPs, there exist strictly linear-length DPPs over constant-size fields.

- Large-field DPP. If $|\mathbb{F}|\ge\mathsf{poly}(S/\varepsilon)$, there is a similar DPP with soundness error $\varepsilon$ and proof length $O(S)$ (in field elements).

The above results do not rely on the PCP theorem and their proofs are considerably simpler. We apply our DPP constructions toward two kinds of applications.

- Hardness of approximation. We obtain a simple proof for the NP-hardness of approximating MAXLIN (with dense instances) over any finite field $\mathbb{F}$ up to some constant factor $c>1$, independent of $\mathbb{F}$. Unlike previous PCP-based proofs, our proof yields exponential-time hardness under the exponential time hypothesis (ETH).

- Succinct arguments. We improve the concrete efficiency of succinct interactive arguments in the generic group model using input-independent preprocessing. In particular, the communication is comparable to sending two group elements and the verifier's computation is dominated by a single group exponentiation. We also show how to use DPPs together with linear-only encryption to construct succinct commit-and-prove arguments.

- Small-field DPP. For any finite field $\mathbb{F}$ and Boolean circuit $C$ of size $S$, there is a DPP for proving that there exists $\mathbf{w}$ such that $C(\mathbf{x}, \mathbf{w})=1$ with a proof $\boldsymbol{\pi}$ of length $S\cdot\mathsf{poly}(|\mathbb{F}|)$ and soundness error $\varepsilon=O(1 / \sqrt{|\mathbb{F}|})$. We show this error to be asymptotically optimal. In particular, and in contrast to the best known PCPs, there exist strictly linear-length DPPs over constant-size fields.

- Large-field DPP. If $|\mathbb{F}|\ge\mathsf{poly}(S/\varepsilon)$, there is a similar DPP with soundness error $\varepsilon$ and proof length $O(S)$ (in field elements).

The above results do not rely on the PCP theorem and their proofs are considerably simpler. We apply our DPP constructions toward two kinds of applications.

- Hardness of approximation. We obtain a simple proof for the NP-hardness of approximating MAXLIN (with dense instances) over any finite field $\mathbb{F}$ up to some constant factor $c>1$, independent of $\mathbb{F}$. Unlike previous PCP-based proofs, our proof yields exponential-time hardness under the exponential time hypothesis (ETH).

- Succinct arguments. We improve the concrete efficiency of succinct interactive arguments in the generic group model using input-independent preprocessing. In particular, the communication is comparable to sending two group elements and the verifier's computation is dominated by a single group exponentiation. We also show how to use DPPs together with linear-only encryption to construct succinct commit-and-prove arguments.

###### Ludo N. Pulles, Mehdi Tibouchi

ePrint Report
EagleSign is one of the 40 “Round 1 Additional Signatures” that is accepted for consideration in the supplementary round of the Post-Quantum Cryptography standardization process, organized by NIST. Its design is based on structured lattices, and it boasts greater simplicity and performance compared to the two lattice signatures already selected for standardization: Falcon and Dilithium.

In this paper, we show that those claimed advantages come at the cost of security. More precisely, we show that the distribution of EagleSign signatures leaks information about the private key, to the point that only a few hundred signatures on arbitrary known messages suffice for a full key recovery, for all proposed parameters.

A related vulnerability also affects EagleSign-V2, a subsequent version of the scheme specifically designed to thwart the initial attack. Although a larger number of signatures is required for key recovery, the idea of the attack remains largely similar. Both schemes come with proofs of security that we show are flawed.

In this paper, we show that those claimed advantages come at the cost of security. More precisely, we show that the distribution of EagleSign signatures leaks information about the private key, to the point that only a few hundred signatures on arbitrary known messages suffice for a full key recovery, for all proposed parameters.

A related vulnerability also affects EagleSign-V2, a subsequent version of the scheme specifically designed to thwart the initial attack. Although a larger number of signatures is required for key recovery, the idea of the attack remains largely similar. Both schemes come with proofs of security that we show are flawed.

###### Zhongyi Zhang, Chengan Hou, Meicheng Liu

ePrint Report
The SHA-3 standard consists of four cryptographic hash functions, called SHA3-224, SHA3-256, SHA3-384 and SHA3-512, and two extendable-output functions (XOFs), called SHAKE128 and SHAKE256. In this paper, we study the collision resistance of the SHA-3 instances. By analyzing the nonlinear layer, we introduce the concept of maximum difference density subspace, and develop a new target internal difference algorithm by probabilistic linearization. We also exploit new strategies for optimizing the internal differential characteristic. Further more, we figure out the expected size of collision subsets in internal differentials, by analyzing the collision probability of the digests rather than the intermediate states input to the last nonlinear layer. These techniques enhance the analysis of internal differentials, leading to the best collision attacks on four round-reduced variants of the SHA-3 instances. In particular, the number of attacked rounds is extended to 5 from 4 for SHA3-384, and to 6 from 5 for SHAKE256.

###### Christian Badertscher, Maxim Jourenko, Dimitris Karakostas, Mario Larangeira

ePrint Report
Payment channels are one of the most prominent off-chain scaling solutions for blockchain systems. However, regulatory institutions have difficulty embracing them, as the channels lack insights needed for Anti-Money Laundering (AML) auditing purposes. Our work tackles the problem of a formal reliable and controllable inspection of off-ledger payment channels, by offering a novel approach for maintaining and reliably auditing statistics of payment channels. We extend a typical trustless Layer 2 protocol and provide a lightweight and scalable protocol such that:
- every state channel is provably auditable w.r.t. a configurable set of policy queries, such that a regulator can retrieve reliable insights about the channel;
- no information beyond the answers to auditing queries is leaked;
- the cryptographic operations are inexpensive, the setup is simple, and storage complexity is independent of the transaction graph's size.
We present a concrete protocol, based on Hydra Isomorphic State Channels (FC'21), and tie the creation of a state channel to real-world identifiers, both in a plain and privacy-preserving manner. For this, we employ verifiable credentials for decentralized identifiers, specifically verifiable Legal Entity Identifiers (vLEI) that increasingly gain traction for financial service providers and regulated institutions.

###### Thomas Debris-Alazard, Pierre Loisel, Valentin Vasseur

ePrint Report
Enhanced pqsigRM is a code-based hash-and-sign scheme proposed to the second National Institute of Standards and Technology call for post-quantum signatures. The scheme is based on the $(U,U+V)$-construction and it enjoys remarkably small signature lengths, about $1$KBytes for a security level of $128$ bits. Unfortunately we show that signatures leak information about the underlying $(U,U+V)$-structure. It allows to retrieve the private-key with~$100, 000$ signatures.

###### Hossein Arabnezhad, Babak Sadeghiyan

ePrint Report
The aim of an algebraic attack is to find the secret key by solving
a collection of relations that describe the internal structure of a cipher
for observations of plaintext/cipher-text pairs.
Although algebraic attacks are addressed for cryptanalysis of block and
stream ciphers, there is a limited understanding of the impact of algebraic
representation of the cipher on the efficiency of solving the resulting collection of equations.
In this paper, we investigate on how different S-box representations affect
the complexity of algebraic attacks, in an empirical manner.
In the literature some algebraic properties are intuitively proposed to evaluate optimality of an algebraic description of S-boxes for algebraic cryptanalysis.
In this paper, we compare different S-box representation for algebraic
cryptanalysis with doing experiments with SR family of block ciphers.
We also show that the so-called \textit{Forward-Backward} representation which is in contrast with all mentioned criteria for optimal representations criteria, practically gives better results than the compliant representations.
We also compare the representations for both $GF(2)$ and $GF(2^n)$ fields.

###### Tianpei Lu, Bingsheng Zhang, Xiaoyuan Zhang, Kui Ren

ePrint Report
Model quantization has become a common practice in machine learning (ML) to improve efficiency and reduce computational/communicational overhead. However, adopting quantization in privacy-preserving machine learning (PPML) remains challenging due to the complex internal structure of quantized operators, which leads to inefficient protocols under the existing PPML frameworks.

In this work, we propose a new PPML paradigm that is tailor-made for and can benefit from quantized models. Our main observation is that lookup tables can ignore the complex internal constructs of any functions which can be used to simplify the quantized operator evaluation. We view the model inference process as a sequence of quantized operators, and each operator is implemented by a lookup table. We then develop an efficient private lookup table evaluation protocol, and its online communication cost is only $\log n$, where $n$ is the size of the lookup table. On a single CPU core, our protocol can evaluate $2^{15}$ tables with 8-bit input and 8-bit output per second.

The resulting PPML framework for quantized models offers extremely fast online performance. The experimental results demonstrate that our quantization strategy achieves substantial speedups over SOTA PPML solutions, improving the online performance by $40\sim 60 \times$ w.r.t. convolutional neural network (CNN) models, such as AlexNet, VGG16, and ResNet18, and by $10\sim 25 \times$ w.r.t. large language models (LLMs), such as GPT-2, GPT-Neo, and Llama2.

In this work, we propose a new PPML paradigm that is tailor-made for and can benefit from quantized models. Our main observation is that lookup tables can ignore the complex internal constructs of any functions which can be used to simplify the quantized operator evaluation. We view the model inference process as a sequence of quantized operators, and each operator is implemented by a lookup table. We then develop an efficient private lookup table evaluation protocol, and its online communication cost is only $\log n$, where $n$ is the size of the lookup table. On a single CPU core, our protocol can evaluate $2^{15}$ tables with 8-bit input and 8-bit output per second.

The resulting PPML framework for quantized models offers extremely fast online performance. The experimental results demonstrate that our quantization strategy achieves substantial speedups over SOTA PPML solutions, improving the online performance by $40\sim 60 \times$ w.r.t. convolutional neural network (CNN) models, such as AlexNet, VGG16, and ResNet18, and by $10\sim 25 \times$ w.r.t. large language models (LLMs), such as GPT-2, GPT-Neo, and Llama2.

#### 12 July 2024

###### Leuven, Belgium, 11 September - 13 September 2024

Event Calendar
Event date: 11 September to 13 September 2024

###### Hang Su, Qi Yang, Zhenfei Zhang

ePrint Report
The authors of Jolt [AST24] pioneered a unique method for creating zero-knowledge virtual machines, known as the lookup singularity. This technique extensively uses lookup tables to create virtual machine circuits. Despite Jolt’s performance being twice as efficient as the previous state-of-the-art1 , there is potential for further enhancement.

The initial release of Jolt uses Spartan [Set20] and Hyrax [WTs+ 18] as their backend, leading to two constraints. First, Hyrax employs Pedersen commitment to build inner product arguments, which requires elliptic curve operations. Second, the verification of a Hyrax commitment takes square root time $O(\sqrt{N})$ relative to the circuit size $N$ . This makes the recursive verification of a Jolt proof impractical, as the verification circuit would need to execute all the Hyrax verification logic in-circuit. A later version of Jolt includes Zeromorph [KT23] and HyperKZG as their commitment backend, making the system recursion-friendly, as now the recursive verifier only needs to perform $O(\log N)$ operations, but at the expense of a need for a trusted setup.

Our scheme, Jolt-b, addresses these issues by transitioning to the extension field of the Goldilocks and using the Basefold commitment scheme [ZCF23], which has an $O(\log^2 N)$ verifier time. This scheme mirrors the modifications of Plonky2 over the original Plonk [GWC19]: it transitions from EC fields to the Goldilocks field; it replaces the EC-based commitment scheme with an encoding-based commitment scheme.

We implemented Jolt-b, along with an optimized version of the Basefold scheme. Our benchmarks show that at a cost of 2.47x slowdown for the prover, we achieve recursion friendliness for the original Jolt. In comparison with other recursion-friendly Jolt variants, our scheme is 1.24x and 1.52x faster in prover time than the Zeromorph and HyperKZG variants of Jolt, respectively.

The initial release of Jolt uses Spartan [Set20] and Hyrax [WTs+ 18] as their backend, leading to two constraints. First, Hyrax employs Pedersen commitment to build inner product arguments, which requires elliptic curve operations. Second, the verification of a Hyrax commitment takes square root time $O(\sqrt{N})$ relative to the circuit size $N$ . This makes the recursive verification of a Jolt proof impractical, as the verification circuit would need to execute all the Hyrax verification logic in-circuit. A later version of Jolt includes Zeromorph [KT23] and HyperKZG as their commitment backend, making the system recursion-friendly, as now the recursive verifier only needs to perform $O(\log N)$ operations, but at the expense of a need for a trusted setup.

Our scheme, Jolt-b, addresses these issues by transitioning to the extension field of the Goldilocks and using the Basefold commitment scheme [ZCF23], which has an $O(\log^2 N)$ verifier time. This scheme mirrors the modifications of Plonky2 over the original Plonk [GWC19]: it transitions from EC fields to the Goldilocks field; it replaces the EC-based commitment scheme with an encoding-based commitment scheme.

We implemented Jolt-b, along with an optimized version of the Basefold scheme. Our benchmarks show that at a cost of 2.47x slowdown for the prover, we achieve recursion friendliness for the original Jolt. In comparison with other recursion-friendly Jolt variants, our scheme is 1.24x and 1.52x faster in prover time than the Zeromorph and HyperKZG variants of Jolt, respectively.

###### Ahmet Ramazan Ağırtaş, Arda Buğra Özer, Zülfükar Saygı, Oğuz Yayla

ePrint Report
Verifiable Random Functions (VRFs) are cryptographic primitives that generate unpredictable randomness along with proofs that are verifiable, a critical requirement for blockchain applications in decentralized finance, online gaming, and more. Existing VRF constructions often rely on centralized entities, creating security vulnerabilities. Distributed VRFs (DVRFs) offer a decentralized alternative but face challenges like large proof sizes or dependence on computationally expensive bilinear pairings.
In this research, a unique distributed VRF (DVRF) system called DVRFwCP with considerable improvements is proposed. DVRFwCP has constant-size proofs, which means that the size of the proof does not change based on the number of participants. This overcomes a significant drawback of earlier DVRF systems, which saw proof size increase with participant count. Furthermore, DVRFwCP produces more efficient verification than previous systems by eliminating the requirement for bilinear pairings throughout the verification process. These innovations contribute to a more secure and scalable solution for generating verifiable randomness in decentralized environments.
We compare our construction to well-established DVRF instantiations such as DDH-DVRF and GLOW-DVRF while also pointing out the major improvement in the estimated gas cost of these algorithms.

###### Ryuya Hayashi, Yusuke Sakai, Shota Yamada

ePrint Report
Attribute-based signatures (ABS) allow users to simultaneously sign messages and prove their possession of some attributes while hiding the attributes and revealing only the fact that they satisfy a public policy. In this paper, we propose a generic construction of ABS for circuits of unbounded depth and size with optimal parameter size, meaning that the lengths of public parameters, keys, and signatures are all constant. Our generic construction can be instantiated from various standard assumptions including LWE or DLIN. Only previous ABS construction with optimal parameter size necessitates succinct non-interactive argument of knowledge, which can be only constructed from non-standard assumptions. Our generic construction is based on RAM delegations, which intuitively allows us to compress the evaluation of a circuit when inputs are public. In high level, we find a way to compress the computation of the policy circuit on input a user attribute to achieve overall parameter size, while hiding the user policy at the same time.

###### Liron David, Omer Berkman, Avinatan Hassidim, David Lazarov, Yossi Matias, Moti Yung

ePrint Report
We present a novel cryptographic paradigm denoted ``cryptiny:'' Employing a single cryptographic value for several security goals, thus ``compacting'' the communication sent over a space-restricted (narrow) channel, while still proving security. Cryptiny is contrary to the classical cryptographic convention of using a separate cryptographic element for each security goal.

Demonstrating the importance of cryptiny, we employ it for securing a critical IoT configuration in which a broadcasting ``thing'' (called beacon) operates within stringent bandwidth constraints. In this setting, a compact BLE-broadcasting beacon lacking Internet connectivity efficiently directs brief (non fragmented) messages to its remotely pre-paired owner in real-time. Communication transpires through BLE-to-IP gateway devices denoted observers, (typically smartphones in the beacon's vicinity), and subsequently via a cloud app server. The gateway device as well, piggybacks on the transmission a secure and private message to the owner. This configuration is a generic setting for the current and future IoT real-time ecosystems, where billion of owners, beacons, and observers operate.

The configuration instances (analogous to TLS instances over the Internet) imposes high security and privacy demands. We prove that our cryptiny-based protocol for securing the above configuration achieves CCA-secrecy for the beacon's and the observer's messages with backward and forward security for the observer's message, as well simultaneously achieving mutual privacy for beacons and for observers. Achieving backward and forward security is important since beacon devices may be far from their owners for a long duration and may be passively tampered with. In addition, for the backward security proof we develop a new encryption scheme we call ``shifted-DHIES'' (``SDHIES'' for short), which generalizes DHIES. An interesting feature of SDHIES is that encryption is performed with a function of the public key rather than the public key itself.

Demonstrating the importance of cryptiny, we employ it for securing a critical IoT configuration in which a broadcasting ``thing'' (called beacon) operates within stringent bandwidth constraints. In this setting, a compact BLE-broadcasting beacon lacking Internet connectivity efficiently directs brief (non fragmented) messages to its remotely pre-paired owner in real-time. Communication transpires through BLE-to-IP gateway devices denoted observers, (typically smartphones in the beacon's vicinity), and subsequently via a cloud app server. The gateway device as well, piggybacks on the transmission a secure and private message to the owner. This configuration is a generic setting for the current and future IoT real-time ecosystems, where billion of owners, beacons, and observers operate.

The configuration instances (analogous to TLS instances over the Internet) imposes high security and privacy demands. We prove that our cryptiny-based protocol for securing the above configuration achieves CCA-secrecy for the beacon's and the observer's messages with backward and forward security for the observer's message, as well simultaneously achieving mutual privacy for beacons and for observers. Achieving backward and forward security is important since beacon devices may be far from their owners for a long duration and may be passively tampered with. In addition, for the backward security proof we develop a new encryption scheme we call ``shifted-DHIES'' (``SDHIES'' for short), which generalizes DHIES. An interesting feature of SDHIES is that encryption is performed with a function of the public key rather than the public key itself.

###### Manuel B. Santos, Dimitris Mouris, Mehmet Ugurbil, Stanislaw Jarecki, José Reis, Shubho Sengupta, Miguel de Vega

ePrint Report
Recent advancements in transformers have revolutionized machine learning, forming the core of Large language models (LLMs). However, integrating these systems into everyday applications raises privacy concerns as client queries are exposed to model owners. Secure multiparty computation (MPC) allows parties to evaluate machine learning applications while keeping sensitive user inputs and proprietary models private. Due to inherent MPC costs, recent works introduce model-specific optimizations that hinder widespread adoption by machine learning researchers. CrypTen (NeurIPS'21) aimed to solve this problem by exposing MPC primitives via common machine learning abstractions such as tensors and modular neural networks. Unfortunately, CrypTen and many other MPC frameworks rely on polynomial approximations of the non-linear functions, resulting in high errors and communication complexity.

This paper introduces Curl, an easy-to-use MPC framework that evaluates non-linear functions as lookup tables, resulting in better approximations and significant round and communication reduction. Curl exposes a similar programming model as CrypTen and is highly parallelizable through tensors. At its core, Curl relies on discrete wavelet transformations to reduce the lookup table size without sacrificing accuracy, which results in up to $19\times$ round and communication reduction compared to CrypTen for non-linear functions such as logarithms and reciprocals. We evaluate Curl on a diverse set of LLMs, including BERT, GPT-2, and GPT Neo, and compare against state-of-the-art related works such as Iron (NeurIPS'22) and Bolt (S&P'24) achieving at least $1.9\times$ less communication and latency.

Finally, we resolve a long-standing debate regarding the security of widely used probabilistic truncation protocols by proving their security in the stand-alone model. This is of independent interest as many related works rely on this truncation style.

This paper introduces Curl, an easy-to-use MPC framework that evaluates non-linear functions as lookup tables, resulting in better approximations and significant round and communication reduction. Curl exposes a similar programming model as CrypTen and is highly parallelizable through tensors. At its core, Curl relies on discrete wavelet transformations to reduce the lookup table size without sacrificing accuracy, which results in up to $19\times$ round and communication reduction compared to CrypTen for non-linear functions such as logarithms and reciprocals. We evaluate Curl on a diverse set of LLMs, including BERT, GPT-2, and GPT Neo, and compare against state-of-the-art related works such as Iron (NeurIPS'22) and Bolt (S&P'24) achieving at least $1.9\times$ less communication and latency.

Finally, we resolve a long-standing debate regarding the security of widely used probabilistic truncation protocols by proving their security in the stand-alone model. This is of independent interest as many related works rely on this truncation style.

###### Avital Shafran, Eran Malach, Thomas Ristenpart, Gil Segev, Stefano Tessaro

ePrint Report
Given the recent progress in machine learning (ML), the cryptography community has started exploring the applicability of ML methods to the design of new cryptanalytic approaches. While current empirical results show promise, the extent to which such methods may outperform classical cryptanalytic approaches is still somewhat unclear.

In this work, we initiate exploration of the theory of ML-based cryptanalytic techniques, in particular providing new results towards understanding whether they are fundamentally limited compared to traditional approaches. Whereas most classic cryptanalysis crucially relies on directly processing individual samples (e.g., plaintext-ciphertext pairs), modern ML methods thus far only interact with samples via gradient-based computations that average a loss function over all samples. It is, therefore, conceivable that such gradient-based methods are inherently weaker than classical approaches.

We introduce a unifying framework for capturing both ``sample-based'' adversaries that are provided with direct access to individual samples and ``gradient-based'' ones that are restricted to issuing gradient-based queries that are averaged over all given samples via a loss function. Within our framework, we establish a general feasibility result showing that any sample-based adversary can be simulated by a seemingly-weaker gradient-based one. Moreover, the simulation exhibits a nearly optimal overhead in terms of the gradient-based simulator's running time. Finally, we extend and refine our simulation technique to construct a gradient-based simulator that is fully parallelizable (crucial for avoiding an undesirable overhead for parallelizable cryptanalytic tasks), which is then used to construct a gradient-based simulator that executes the particular and highly useful gradient-descent method.

Taken together, although the extent to which ML methods may outperform classical cryptanalytic approaches is still somewhat unclear, our results indicate that such gradient-based methods are not inherently limited by their seemingly restricted access to the provided samples.

In this work, we initiate exploration of the theory of ML-based cryptanalytic techniques, in particular providing new results towards understanding whether they are fundamentally limited compared to traditional approaches. Whereas most classic cryptanalysis crucially relies on directly processing individual samples (e.g., plaintext-ciphertext pairs), modern ML methods thus far only interact with samples via gradient-based computations that average a loss function over all samples. It is, therefore, conceivable that such gradient-based methods are inherently weaker than classical approaches.

We introduce a unifying framework for capturing both ``sample-based'' adversaries that are provided with direct access to individual samples and ``gradient-based'' ones that are restricted to issuing gradient-based queries that are averaged over all given samples via a loss function. Within our framework, we establish a general feasibility result showing that any sample-based adversary can be simulated by a seemingly-weaker gradient-based one. Moreover, the simulation exhibits a nearly optimal overhead in terms of the gradient-based simulator's running time. Finally, we extend and refine our simulation technique to construct a gradient-based simulator that is fully parallelizable (crucial for avoiding an undesirable overhead for parallelizable cryptanalytic tasks), which is then used to construct a gradient-based simulator that executes the particular and highly useful gradient-descent method.

Taken together, although the extent to which ML methods may outperform classical cryptanalytic approaches is still somewhat unclear, our results indicate that such gradient-based methods are not inherently limited by their seemingly restricted access to the provided samples.

###### Guillaume Barbu, Laurent Grémy, Roch Lescuyer

ePrint Report
In this work, we use some recent developments in lattice-based cryptanalytic tools to revisit a fault attack on RSA-CRT signatures based on the Partial Approximate Common Divisor (PACD) problem. By reducing the PACD to a Hidden Number Problem (HNP) instance, we decrease the number of required faulted bits from 32 to 7 in the case of a 1024-bit RSA. We successfully apply the attack to RSA instances up to 8192-bit and present an enhanced analysis of the error-tolerance in the Bounded Distance Decoding (BDD) with predicate approach. Finally, evaluating the impact of standard side-channel and fault countermeasures, we show that merely verifying the signature before output is not an adequate protection against this attack. The reduction from PACD to HNP might be of independent interest.

###### Maximilian Kroschewski, Anja Lehmann, Cavit Özbay

ePrint Report
Single Sign-On (SSO) allows users to conveniently authenticate to many Relying Parties (RPs) through a central Identity Provider (IdP). SSO supports unlinkable authentication towards the RPs via pairwise pseudonyms, where the IdP assigns the user an RP-specific pseudonym. This feature has been rolled out prominently within Apple's SSO service. While establishing unlinkable identities provides privacy towards RPs, it actually emphasizes the main privacy problem of SSO: with every authentication request, the IdP learns the RP that the user wants to access. Solutions to overcome this limitation exist, but either assume users to behave honestly or require them to manage long-term cryptographic keys.

In this work, we propose the first SSO system that can provide such pseudonymous authentication in an unobservable yet strongly secure and convenient manner. That is, the IdP blindly derives the user's pairwise pseudonym for the targeted RP without learning the RP's identity and without requiring key material handled by the user. We formally define the desired security and privacy properties for such unlinkable, unobservable, and strongly secure SSO. In particular, our model includes the often neglected RP authentication: the IdP typically wants to limit its services to registered RPs only and thus must be able to (blindly) verify that it issues the token and pseudonym to such a registered RP. We propose a simple construction that combines signatures with efficient proofs-of-knowledge with a blind, yet verifiable, evaluation of the Hashed-Diffie-Hellman PRF. We prove the security of our construction and demonstrate its efficiency through a prototypical implementation, which requires a running time of 2-20ms per involved party.

In this work, we propose the first SSO system that can provide such pseudonymous authentication in an unobservable yet strongly secure and convenient manner. That is, the IdP blindly derives the user's pairwise pseudonym for the targeted RP without learning the RP's identity and without requiring key material handled by the user. We formally define the desired security and privacy properties for such unlinkable, unobservable, and strongly secure SSO. In particular, our model includes the often neglected RP authentication: the IdP typically wants to limit its services to registered RPs only and thus must be able to (blindly) verify that it issues the token and pseudonym to such a registered RP. We propose a simple construction that combines signatures with efficient proofs-of-knowledge with a blind, yet verifiable, evaluation of the Hashed-Diffie-Hellman PRF. We prove the security of our construction and demonstrate its efficiency through a prototypical implementation, which requires a running time of 2-20ms per involved party.