IACR News

If you have a news item you wish to distribute, they should be sent to the communications secretary. See also the events database for conference announcements.

Here you can see all recent updates to the IACR webpage. These updates are also available:

via email

via RSS feed

30 June 2024

KyberSlash: Exploiting secret-dependent division timings in Kyber implementations

Daniel J. Bernstein, Karthikeyan Bhargavan, Shivam Bhasin, Anupam Chattopadhyay, Tee Kiah Chia, Matthias J. Kannwischer, Franziskus Kiefer, Thales Paiva, Prasanna Ravi, Goutam Tamvada

ePrint Report

This paper presents KyberSlash1 and KyberSlash2 – two timing vulnerabilities in several implementations (including the official reference code) of the Kyber Post-Quantum Key Encapsulation Mechanism, currently undergoing standardization as ML-KEM. We demonstrate the exploitability of both KyberSlash1 and KyberSlash2 on two popular platforms: the Raspberry Pi 2 (Arm Cortex-A7) and the Arm Cortex-M4 microprocessor. Kyber secret keys are reliably recovered within minutes for KyberSlash2 and a few hours for KyberSlash1. We responsibly disclosed these vulnerabilities to maintainers of various libraries and they have swiftly been patched. We present two approaches for detecting and avoiding similar vulnerabilities. First, we patch the dynamic analysis tool Valgrind to allow detection of variable-time instructions operating on secret data, and apply it to more than 1000 implementations of cryptographic primitives in SUPERCOP. We report multiple findings. Second, we propose a more rigid approach to guarantee the absence of variable-time instructions in cryptographic software using formal methods.

Expand

28 June 2024

Distributional Secure Merge

Gayathri Garimella, Srinivasan Raghuramam, Peter Rindal

ePrint Report

Secure merge refers to the problem of merging two sorted lists. The problem appears in different settings where each list is held by one of two parties, or the lists are themselves shared among two or more parties. The output of a secure merge protocol is secret shared. Each variant of the problem offers many useful applications.

The difficulty in designing secure merge protocols vis-a-vis insecure merge protocols (which work in linear time with a single pass over the lists) has to do with operations having to be oblivious or data-independent. In particular, the protocol cannot leak the positions of items of each list in the final merged list. On account of this, sorting-based secure merge protocols have been a common solution to the problem. However, as they introduce (poly)logarithmic overheads, there has been active investigation into the task of building (near) linear time secure merge protocols. Most recently, Hemenway et al. put forth a protocol for secure merge that does achieve linear communication and computation and a round complexity of $O({\log\log n})$, where $n$ is the length of the lists being merged. While this shows the feasibility of a linear time secure merge, it still leaves room for the design of a concretely efficient linear time secure merge.

In this work, we consider a relaxation of the problem where the lists are uniformly random. We show a secure merge protocol for uniformly random lists that achieves $O({n\log\log n})$, i.e., near linear communication and computation and a round complexity of $O({\log\log n})$, where $n$ is the length of the lists being merged. Our protocol design is general and can be instantiated in a variety of settings so long as the building blocks (basic ones such as comparisons and shuffles) can be realized in said settings. Although we do not achieve the same asymptotic guarantees as Hemenway et al., our work is concretely efficient. We implement our protocol and compare it to the state of the art sorting protocols and demonstrate an order of magnitude improvement in running times and communication for lists of size of $2^{20}$.

We also extend our protocol to work for lists sampled from arbitrary distributions. In particular, when the lists are (close to) identically distributed, we achieve the same efficiency as uniform lists. This immediately improve the performance of many crucial applications including PSI & Secure Join, thus illustrating the significance and applicability of our protocol in practice.

Expand

Improved Multi-Party Fixed-Point Multiplication

Saikrishna Badrinarayanan, Eysa Lee, Peihan Miao, Peter Rindal

ePrint Report

Machine learning is widely used for a range of applications and is increasingly offered as a service by major technology companies. However, the required massive data collection raises privacy concerns during both training and inference. Privacy-preserving machine learning aims to solve this problem. In this setting, a collection of servers secret share their data and use secure multi-party computation to train and evaluate models on the joint data. All prior work focused on the scenario where the number of servers is two or three. In this work, we study the problem where there are $N \geq 3$ servers amongst whom the data is secret shared.

A key component of machine learning algorithms is to perform fixed-point multiplication with truncation of secret shared decimal values. In this work, we design new protocols for multi-party secure fixed-point multiplication where each of the $N$ parties have one share each of the two values to be multiplied and receive one share of the product at the end of the protocol. We consider three forms of secret sharing - replicated, Shamir, and additive, and design an efficient protocol secure in the presence of a semi-honest adversary for each of the forms. Our protocols are more communication efficient than all prior work on performing multi-party fixed-point multiplication. Additionally, for replicated secret sharing, we design another efficient protocol that is secure in the presence of a malicious adversary. Finally, we leverage our fixed-point multiplication protocols to design secure multi-party computation (MPC) protocols for arbitrary arithmetic circuits that have addition and fixed-point multiplication with truncation gates. All our protocols are proven secure using a standard simulation based security definition. Our protocols for replicated and Shamir sharing work in the presence of an honest majority of parties while the one for additive sharing can tolerate a dishonest majority as well.

Expand

The Sum-Check Protocol over Fields of Small Characteristic

Suyash Bagad, Yuval Domb, Justin Thaler

ePrint Report

The sum-check protocol of Lund, Fortnow, Karloff, and Nisan underlies SNARKs with the fastest known prover. In many of its applications, the prover can be implemented with a number of field operations that is linear in the number, $n$, of terms being summed.

We describe an optimized prover implementation when the protocol is applied over an extension field of a much smaller base field. The rough idea is to keep most of the prover's multiplications over the base field, at the cost of performing more $\textit{total}$ field multiplications. When the sum-check protocol is applied to a product of polynomials that all output values in the base field, our algorithm reduces the number of extension field operations by multiple orders of magnitude. In other settings, our improvements are more modest but nonetheless meaningful.

In SNARK design, the sum-check protocol is often combined with a polynomial commitment scheme, which are growing faster, especially when the values being committed are small. These improved commitment schemes are likely to render the sum-check prover the overall bottleneck, which our results help to mitigate.

Expand

Efficient Secret Sharing for Large-Scale Applications

Sarvar Patel, Giuseppe Persiano, Joon Young Seo, Kevin Yeo

ePrint Report

Threshold secret sharing enables distributing a message to $n$ parties such that no subset of fewer than $t$ parties can learn the message, whereas any subset of at least $t$ parties can recover the message. Despite being a fundamental primitive, secret sharing still suffers from one significant drawback, where its message reconstruction algorithm is computationally expensive for large privacy thresholds $t$. In this paper, we aim to address this significant drawback.

We study general $(t,c)$-ramp secret sharing schemes where the number of parties c needed to reconstruct the secret may be larger than $t$. We present a ramp secret sharing scheme whose reconstruction time is 2-7.8x faster than prior constructions suitable against adversaries that adaptively corrupt parties. For $t = 2^{20}$, our new protocol has reconstruction time of 5 seconds whereas prior work requires nearly half a minute. We see improvements starting from as small as $t = 256$. Furthermore, we obtain correctness threshold as small as $c \ge 1.05t$. To obtain our construction, we first improve the secret sharing frameworks by Cramer et al. (EUROCRYPT'15) and Applebaum et al. (CRYPTO'23) from erasure codes. Our new framework obtains secret sharing schemes that may be used against adversaries with adaptive corruptions while requiring only weaker correctness guarantees from the underlying erasure code with a distributed generation property. Furthermore, our new framework also maintains the linear homomorphism of the prior works. Afterwards, we present a concretely efficient erasure code from random band matrices that satisfies the distributed generation property.

We show that our secret sharing scheme can improve many real-world applications. In secure aggregation protocols for federated learning, we obtain up to 22% reductions in computational cost by replacing Shamir's scheme with our construction. We extend our protocol to obtain a verifiable ramp secret sharing scheme where each party can verify the consistency of the shares. Our new verifiable ramp secret sharing has 8.2-25.2x faster sharing and 2.7-23.2x faster reconstruction time compared to prior works. Finally, we present an improved distributed verifiable random function that may be used for decentralized randomness beacons.

Expand

Searching for differential addition chains

Daniel J. Bernstein, Jolijn Cottaar, Tanja Lange

ePrint Report

The literature sometimes uses slow algorithms to find minimum-length continued-fraction differential addition chains to speed up subsequent computations of multiples of points on elliptic curves. This paper introduces two faster algorithms to find these chains. The first algorithm prunes more effectively than previous algorithms. The second algorithm uses a meet-in-the-middle approach and appears to have a limiting cost exponent below 1.

Expand

Cryptography in the Common Haar State Model: Feasibility Results and Separations

Prabhanjan Ananth, Aditya Gulati, Yao-Ting Lin

ePrint Report

Common random string model is a popular model in classical cryptography. We study a quantum analogue of this model called the common Haar state (CHS) model. In this model, every party participating in the cryptographic system receives many copies of one or more i.i.d Haar random states.

We study feasibility and limitations of cryptographic primitives in this model and its variants:

- We present a construction of pseudorandom function-like states with security against computationally unbounded adversaries, as long as the adversaries only receive (a priori) bounded number of copies. By suitably instantiating the CHS model, we obtain a new approach to construct pseudorandom function-like states in the plain model.

- We present separations between pseudorandom function-like states (with super-logarithmic length) and quantum cryptographic primitives, such as interactive key agreement and bit commitment, with classical communication. To show these separations, we prove new results on the indistinguishability of identical versus independent Haar states against LOCC (local operations, classical communication) adversaries. \end{itemize}

Expand

Efficient Verifiable Differential Privacy with Input Authenticity in the Local and Shuffle Model

Tariq Bontekoe, Hassan Jameel Asghar, Fatih Turkmen

ePrint Report

Local differential privacy (LDP) is an efficient solution for providing privacy to client's sensitive data while simultaneously releasing aggregate statistics without relying on a trusted central server (aggregator) as in the central model of differential privacy. The shuffle model with LDP provides an additional layer of privacy, by disconnecting the link between clients and the aggregator, further improving the utility of LDP. However, LDP has been shown to be vulnerable to malicious clients who can perform both input and output manipulation attacks, i.e., before and after applying the LDP mechanism, to skew the aggregator's results. In this work, we show how to prevent malicious clients from compromising LDP schemes. Specifically, we give efficient constructions to prevent both input ánd output manipulation attacks from malicious clients for generic LDP algorithms. Our proposed schemes for verifiable LDP (VLDP), completely protect from output manipulation attacks, and prevent input attacks using signed data, requiring only one-time interaction between client and server, unlike existing alternatives [28, 33]. Most importantly, we are the first to provide an efficient scheme for VLDP in the shuffle model. We describe and prove secure, two schemes for VLDP in the regular model, and one in the shuffle model. We show that all schemes are highly practical, with client runtimes of < 2 seconds, and server runtimes of 5-7 milliseconds per client.

Expand

Embedding Integer Lattices as Ideals into Polynomial Rings

Yihang Cheng, Yansong Feng, Yanbin Pan

ePrint Report

Many lattice-based crypstosystems employ ideal lattices for high efficiency. However, the additional algebraic structure of ideal lattices usually makes us worry about the security, and it is widely believed that the algebraic structure will help us solve the hard problems in ideal lattices more efficiently. In this paper, we study the additional algebraic structure of ideal lattices further and find that a given ideal lattice in a polynomial ring can be embedded as an ideal into infinitely many different polynomial rings by the coefficient embedding. We design an algorithm to verify whether a given full-rank lattice in $\mathbb{Z}^n$ is an ideal lattice and output all the polynomial rings that the given lattice can be embedded into as an ideal with bit operations $\mathcal{O}(n^3(\log n + B)^2(\log n)^2)$, where $n$ is the dimension of the lattice and $B$ is the upper bound of the bit length of the entries of the input lattice basis. We would like to point out that Ding and Lindner proposed an algorithm for identifying ideal lattices and outputting a single polynomial ring of which the input lattice can be regarded as an ideal with bit operations $\mathcal{O}(n^5B^2)$ in 2007. However, we find a flaw in Ding and Lindner's algorithm, and it causes some ideal lattices can't be identified by their algorithm.

Expand

PeaceFounder: centralised E2E verifiable evoting via pseudonym braiding and history trees

Janis Erdmanis

ePrint Report

PeaceFounder is a centralised E2E verifiable e-voting system that leverages pseudonym braiding and history trees. The immutability of the bulletin board is maintained replication-free by voter’s client devices with locally stored consistency-proof chains. Meanwhile, pseudonym braiding done via an exponentiation mix before the vote allows anonymisation to be transactional with a single braider at a time. In contrast to existing E2E verifiable e-voting systems, it is much easier to deploy as the system is fully centralised, free from threshold decryption ceremonies, trusted setup phases and bulletin board replication. Furthermore, the body of a vote is signed with a braided pseudonym, enabling unlimited ballot types.

Expand

Reduction from Average-Case M-ISIS to Worst-Case CVP Over Perfect Lattices

Samuel Lavery

ePrint Report

This paper presents a novel reduction from the average-case hardness of the Module Inhomogeneous Short Integer Solution (M-ISIS) problem to the worst-case hardness of the Closest Vector Problem (CVP) by defining and leveraging “perfect” lattices for cryptographic purposes.

Perfect lattices, previously only theoretical constructs, are characterized by their highly regular structure, optimal density, and a central void, which we term the “Origin Cell.” The simplest Origin Cell is a hypercube with edge length 1 centered at the origin, guaranteed to be devoid of any valid lattice points.

By exploiting the unique properties of the Origin Cell, we recalibrate the parameters of the M-ISIS and CVP problems. Our results demonstrate that solving M-ISIS on average over perfect lattices is at least as hard as solving CVP in the worst case, thereby providing a robust hardness guarantee for M-ISIS. Additionally, perfect lattices facilitate exceptionally compact cryptographic variables, enhancing the efficiency of cryptographic schemes.

This significant finding enhances the theoretical foundation of lattice-based cryptographic problems and confirms the potential of perfect lattices in ensuring strong cryptographic security. The Appendix includes SageMath code to demonstrate the reproducibility of the reduction process from M-ISIS to CVP.

Expand

Constraint-Packing and the Sum-Check Protocol over Binary Tower Fields

Quang Dao, Justin Thaler

ePrint Report

SNARKs based on the sum-check protocol often invoke the ``zero-check PIOP''. This reduces the vanishing of many constraints to a single sum-check instance applied to an $n$-variate polynomial of the form $g(x) = \text{eq}(r,x) \cdot p(x)$, where $p$ is a product of multilinear polynomials, $r$ is a random vector, and $\text{eq}$ is the multilinear extension of the equality function. In recent SNARK designs, $p(x)$ is defined over a ``small'' base field, while $r$ is drawn from a large extension field $\mathbb{F}$ for security.

Recent papers (Bagad, Domb, and Thaler 2024; Gruen 2024) have optimized the sum-check protocol prover for this setting. However, these works still require the prover to ``pre-compute'' all evaluations of $\text{eq}(r, x)$ as $x$ ranges over $\{0, 1\}^{n}$, and this computation involves about $n$ multiplications over the extension field $\mathbb{F}$. In this note, we describe a modification to the zero-check PIOP in the case of binary tower fields that reduces this ``pre-computation'' cost by a factor of close to $\log |\mathbb{F}|$, which is $128$ in important applications. We show that our modification is sound, and that it strictly generalizes a (possibly folklore) technique of constraint-packing over field extensions.

Expand

A note on adding zero-knowledge to STARKs

Ulrich Haboeck, Al Kindi

ePrint Report

We discuss zero-knowledge in the context of FRI-based STARKs using techniques desirable in practice: Randomization by polynomials over the basefield, and decomposing the overall quotient into polynomials of smaller degree.

Expand

A note on the G-FFT

Ulrich Haboeck

ePrint Report

For primes $p$ with $p+1$ being smooth, the G-FFT from Li and Xing [LX23] is an algebraic FFT, which at first glance seems equivalent to the circle FFT from [IACR eprint 2024/278]: It also uses the circle curve over $\mathbb F_p$ (in other words the projective line) as underlying domain, and interpolates by low-degree functions with poles over the same set of points. However, their approach to control the degree of the FFT basis is fundamentally different. The G-FFT makes use of punctured Riemann-Roch spaces, and the construction works with the group doubling map only, no projection onto the $x$-axis involved.

In this note we give an elementary description of the G-FFT without using abstract algebra. We describe a variant which uses a simpler, and in our opinion more natural function space, and which treats the exceptional point of the domain (the group identity) differently. In comparison to the circle FFT, the G-FFT (both the original as well as our variant) has the following downsides. Interpolation and domain evaluation costs the double number of multiplications (the twiddle is not an ``odd'' function), and the function space is not invariant under the group action, causing additional overhead when applied in STARKs.

Expand

Reading It like an Open Book: Single-trace Blind Side-channel Attacks on Garbled Circuit Frameworks

Sirui Shen, Chenglu Jin

ePrint Report

Garbled circuits (GC) are a secure multiparty computation protocol that enables two parties to jointly compute a function using their private data without revealing it to each other. While garbled circuits are proven secure at the protocol level, implementations can still be vulnerable to side-channel attacks. Recently, side-channel analysis of GC implementations has garnered significant interest from researchers.

We investigate popular open-source GC frameworks and discover that the AES encryption used in the garbling process follows a secret-dependent sequence. This vulnerability allows private inputs to be exposed through side-channel analysis. Based on this finding, we propose a side-channel attack on garbled circuits to recover the private inputs of both parties. Our attack does not require access to any plaintexts or ciphertexts in the protocol and is single-trace, adhering to the constraint that a garbled circuit can be executed only once. Furthermore, unlike existing attacks that can only target input non-XOR gates, our method applies to both input and internal non-XOR gates. Consequently, the secrets associated with every non-XOR gate are fully exposed as in an open book.

We comprehensively evaluate our attack in various scenarios. First, we perform the attack on single-platform software implementations of standard AES and interleaved AES on a 32-bit ARM processor, achieving a $100\%$ success rate in both cases. Next, we target a hardware implementation on a Xilinx Artix-7 FPGA, where the resolution of power consumption measurements and the number of samples are significantly limited. In this scenario, our attack achieves a success rate of $79.58\%$. Finally, we perform a cross-platform attack on two processors with different microarchitectures representing the two parties. The differing execution cycles and power sensors across the platforms increase the difficulty of side-channel analysis. Despite these challenges, our point-of-interest (POI) selection method allows our attack to achieve a $100\%$ success rate in this scenario as well. We also discuss effective countermeasures that can be readily applied to GC frameworks to mitigate this vulnerability.

Expand

A Practical Protocol for Quantum Oblivious Transfer from One-Way Functions

Eleni Diamanti, Alex B. Grilo, Adriano Innocenzi, Pascal Lefebvre, Verena Yacoub, Álvaro Yángüez

ePrint Report

We present a new simulation-secure quantum oblivious transfer (QOT) protocol based on one-way functions in the plain model. With a focus on practical implementation, our protocol surpasses prior works in efficiency, promising feasible experimental realization. We address potential experimental errors and their correction, offering analytical expressions to facilitate the analysis of the required quantum resources. Technically, we achieve simulation security for QOT through an equivocal and relaxed-extractable quantum bit commitment.

Expand

Adaptively Secure 5 Round Threshold Signatures from MLWE/MSIS and DL with Rewinding

Shuichi Katsumata, Michael Reichle, Kaoru Takemure

ePrint Report

T-out-of-N threshold signatures have recently seen a renewed interest, with various types now available, each offering different tradeoffs. However, one property that has remained elusive is adaptive security. When we target thresholdizing existing efficient signatures schemes based on the Fiat-Shamir paradigm such as Schnorr, the elusive nature becomes clear. This class of signature schemes typically rely on the forking lemma to prove unforgeability. That is, an adversary is rewound and run twice within the security game. Such a proof is at odds with adaptive security, as the reduction must be ready to answer 2(T-1) secret key shares in total, implying that it can reconstruct the full secret key. Indeed, prior works either assumed strong idealized models such as the algebraic group model (AGM) or modified the underlying signature scheme so as not to rely on rewinding based proofs.

In this work, we propose a new proof technique to construct adaptively secure threshold signatures for existing rewinding-based Fiat-Shamir signatures. As a result, we obtain the following: 1. The first adaptively secure 5 round lattice-based threshold signature under the MLWE and MSIS assumptions in the ROM. The resulting signature is a standard signature of Raccoon, a lattice-based signature scheme by del Pino et al., submitted to the additional NIST call for proposals. 2. The first adaptively secure 5 round threshold signature under the DL assumption in the ROM. The resulting signature is a standard Schnorr signature. To the best of our knowledge, this is the first adaptively secure threshold signature based on DL even assuming stronger models like AGM.

Our work is inspired by the recent statically secure lattice-based 3 round threshold signature by del Pino et al. (Eurocrypt~2024) based on Raccoon. While they relied on so-called one-time additive masks to solve lattice specific issues, we notice that these masks can also be a useful tool to achieve adaptive security. At a very high level, we use these masks throughout the signing protocol to carefully control the information the adversary can learn from the signing transcripts. Intuitively, this allows the reduction to return a total of 2(T-1) randomly sampled secret key shares to the adversary consistently and without being detected, resolving the above paradoxical situation. Lastly, by allowing the parties to maintain a simple state, we can compress our 5 round schemes into 4 rounds.

Expand

Threshold OPRF from Threshold Additive HE

Animesh Singh, Sikhar Patranabis, Debdeep Mukhopadhyay

ePrint Report

An oblivious pseudorandom function (OPRF) is a two-party protocol in which a party holds an input and the other party holds the PRF key, such that the party having the input only learns the PRF output and the party having the key would not learn the input. Now, in a threshold oblivious pseudorandom function (TOPRF) protocol, a PRF key K is initially shared among T servers. A client can obtain a PRF value by interacting with t(≤ T) servers but is unable to compute the same with up to (t − 1) servers. In this paper, we present a practically efficient homomorphic encryption (HE)-based post-quantum secure TOPRF protocol. Our proposed approach, which is based on a novel use of threshold HE, is agnostic of the underlying PRF and outperforms existing fully homomorphic encryption (FHE)-based approaches for TOPRF computation by several orders of magnitude in terms of running time. The FHE-based approaches require bootstrapping, a computationally extensive operation, and the primary bottleneck for evaluating large-depth circuits. Whereas, our proposed approach is based on a multi-party computation (MPC) protocol that uses a threshold additive HE scheme based on Regev’s cryptosystem (J’ACM 2009) alternative to FHE-based approaches. Concretely, we show a novel replacement of bootstrapping required in traditional FHE schemes by a threshold additive HE-based interactive protocol that performs masked decryption followed by table look-ups, jointly performed by a group of servers holding secret shares of the HE decryption key. Finally, We present a practical validation of our approach by realizing an AES-based TOPRF with an evaluation time of less than 1 second on consumer-grade server(s).

Expand

SACfe: Secure Access Control in Functional Encryption with Unbounded Data

Uddipana Dowerah, Subhranil Dutta, Frank Hartmann, Aikaterini Mitrokotsa, Sayantan Mukherjee, Tapas Pal

ePrint Report

Privacy is a major concern in large-scale digital applications, such as cloud-computing, machine learning services, and access control. Users want to protect not only their plain data but also their associated attributes (e.g., age, location, etc). Functional encryption (FE) is a cryptographic tool that allows fine-grained access control over encrypted data. However, existing FE fall short as they are either inefficient and far from reality or they leak sensitive user-specific information.

We propose SACfe, a novel attribute-based FE scheme that provides secure, fine-grained access control and hides both the user’s attributes and the function applied to the data, while preserving the data’s confidentiality. Moreover, it enables users to encrypt unbounded-length messages along with an arbitrary number of hidden attributes into ciphertexts. We design SACfe, a protocol for performing linear computation on encrypted data while enforcing access control based on inner product predicates. We show how SACfe can be used for online biometric authentication for privacy-preserving access control. As an additional contribution, we introduce an attribute-based linear FE for unbounded length of messages and functions where access control is realized by monotone span programs. We implement our protocols using the CiFEr cryptographic library and show its efficiency for practical settings.

Expand

GRASP: Accelerating Hash-based PQC Performance on GPU Parallel Architecture

Yijing Ning, Jiankuo Dong, Jingqiang Lin, Fangyu Zheng, Yu Fu, Zhenjiang Dong, Fu Xiao

ePrint Report

$SPHINCS^+$, one of the Post-Quantum Cryptography Digital Signature Algorithms (PQC-DSA) selected by NIST in the third round, features very short public and private key lengths but faces significant performance challenges compared to other post-quantum cryptographic schemes, limiting its suitability for real-world applications. To address these challenges, we propose the GPU-based paRallel Accelerated $SPHINCS^+$ (GRASP), which leverages GPU technology to enhance the efficiency of $SPHINCS^+$ signing and verification processes. We propose an adaptable parallelization strategy for $SPHINCS^+$, analyzing its signing and verification processes to identify critical sections for efficient parallel execution. Utilizing CUDA, we perform bottom-up optimizations, focusing on memory access patterns and hypertree computation, to enhance GPU resource utilization. These efforts, combined with kernel fusion technology, result in significant improvements in throughput and overall performance. Extensive experimentation demonstrates that our optimized CUDA implementation of $SPHINCS^+$ achieves superior performance. Specifically, our GRASP scheme delivers throughput improvements ranging from 1.37× to 5.13× compared to state-of-the-art GPU-based solutions and surpasses the NIST reference implementation by over three orders of magnitude, highlighting a significant performance advantage.

Expand

◄ Previous Next ►

International Association for Cryptologic Research

International Associationfor Cryptologic Research

IACR News

30 June 2024

KyberSlash: Exploiting secret-dependent division timings in Kyber implementations

Daniel J. Bernstein, Karthikeyan Bhargavan, Shivam Bhasin, Anupam Chattopadhyay, Tee Kiah Chia, Matthias J. Kannwischer, Franziskus Kiefer, Thales Paiva, Prasanna Ravi, Goutam Tamvada

28 June 2024

Distributional Secure Merge

Gayathri Garimella, Srinivasan Raghuramam, Peter Rindal

Improved Multi-Party Fixed-Point Multiplication

Saikrishna Badrinarayanan, Eysa Lee, Peihan Miao, Peter Rindal

The Sum-Check Protocol over Fields of Small Characteristic

Suyash Bagad, Yuval Domb, Justin Thaler

Efficient Secret Sharing for Large-Scale Applications

Sarvar Patel, Giuseppe Persiano, Joon Young Seo, Kevin Yeo

Searching for differential addition chains

Daniel J. Bernstein, Jolijn Cottaar, Tanja Lange

Cryptography in the Common Haar State Model: Feasibility Results and Separations

Prabhanjan Ananth, Aditya Gulati, Yao-Ting Lin

Efficient Verifiable Differential Privacy with Input Authenticity in the Local and Shuffle Model

Tariq Bontekoe, Hassan Jameel Asghar, Fatih Turkmen

Embedding Integer Lattices as Ideals into Polynomial Rings

Yihang Cheng, Yansong Feng, Yanbin Pan

PeaceFounder: centralised E2E verifiable evoting via pseudonym braiding and history trees

Janis Erdmanis

Reduction from Average-Case M-ISIS to Worst-Case CVP Over Perfect Lattices

Samuel Lavery

Constraint-Packing and the Sum-Check Protocol over Binary Tower Fields

Quang Dao, Justin Thaler

A note on adding zero-knowledge to STARKs

Ulrich Haboeck, Al Kindi

A note on the G-FFT

Ulrich Haboeck

Reading It like an Open Book: Single-trace Blind Side-channel Attacks on Garbled Circuit Frameworks

Sirui Shen, Chenglu Jin

A Practical Protocol for Quantum Oblivious Transfer from One-Way Functions

Eleni Diamanti, Alex B. Grilo, Adriano Innocenzi, Pascal Lefebvre, Verena Yacoub, Álvaro Yángüez

Adaptively Secure 5 Round Threshold Signatures from MLWE/MSIS and DL with Rewinding

Shuichi Katsumata, Michael Reichle, Kaoru Takemure

Threshold OPRF from Threshold Additive HE

Animesh Singh, Sikhar Patranabis, Debdeep Mukhopadhyay

SACfe: Secure Access Control in Functional Encryption with Unbounded Data

Uddipana Dowerah, Subhranil Dutta, Frank Hartmann, Aikaterini Mitrokotsa, Sayantan Mukherjee, Tapas Pal

GRASP: Accelerating Hash-based PQC Performance on GPU Parallel Architecture

Yijing Ning, Jiankuo Dong, Jingqiang Lin, Fangyu Zheng, Yu Fu, Zhenjiang Dong, Fu Xiao

International Association
for Cryptologic Research