CryptoDB
Recently updated IACR publications
CryptoDB is periodically updated by manual and automatic processes. Whenever a paper is added or modified it will appear in this list, e.g., when a video appears.
A separate history of changes tracks schema and process changes. There is further information about CryptoDB in the documentation.
Year
Venue
Title
2024
TOSC
Symmetric Twin Column Parity Mixers and Their Applications
Abstract
The circulant twin column parity mixer (TCPM) is a type of mixing layer for the round function of cryptographic permutations designed by Hirch et al. at CRYPTO 2023. It has a bitwise differential branch number of 12 and a bitwise linear branch number of 4, which makes it competitive in applications where differential security is required. Hirch et al. gave a concrete instantiation of a permutation using such a mixing layer, named Gaston, and showed the best 3-round differential and linear trails of Gaston have much higher weights than those of Ascon. In this paper, we first prove why the TCPM has linear branch number 4 and then show that Gaston’s linear behavior is worse than Ascon for more than 3 rounds. Motivated by these facts, we aim to enhance the linear security of the TCPM. We show that adding a specific set of row cyclic shifts to the TCPM can make its differential and linear branch numbers both 12. Notably, by setting a special relationship between the row shift parameters of the modified TCPM, we obtain a special kind of mixlayer called the symmetric circulant twin column parity mixer. The symmetric TCPM has a unique design property that its differential and linear branch histograms are the same, which makes the parameter selection process and the security analysis convenient. Using the symmetric TCPM, we present two new 320-bit cryptographic permutations, namely (1) Gaston-S where we replace the mixing layer in Gaston with the symmetric TCPM and (2) SBD which uses a low-latency degree-4 S-box as the non-linear layer and the symmetric TCPM as the mixing layer. We evaluate the security of these permutations considering differential, linear and algebraic analysis, and then provide the performance comparison with Gaston in both hardware and software. Our results indicate that Gaston-S and SBD are competitive with Gaston in both security and performance.
2024
TOSC
A New Practical Cube Attack via Recovering Numerous Superpolys
Abstract
Cube attack is one of the most powerful approaches for recovering keys of stream ciphers. Practical cube attacks generate several superpolys first and solve the system constructed by these superpolys afterward. Unlike previous practical attacks, we propose a new cube attack that transfers the difficulty of generating easy-to-solve superpolys to solving the system built by numerous nonlinear ones. In the offline phase, we recover many nonlinear superpolys by theoretically improving the approach proposed by Delaune et al. at SAC 2022. In the online phase, taking advantage of the sparsity and asymmetry of these numerous superpolys, we present a new testing method to solve the constructed system efficiently. As applications, the latest attack could practically recover the keys for 820- and 832-round Trivium with time complexity no more than $2^{46}$ and $2^{50}$, while the previous highest number of rounds of Trivium that can be attacked practically is 830. We believe the proposed approach can be used to attack more rounds of Trivium and other stream ciphers.
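To make the notion of a superpoly concrete, the following minimal sketch sums a toy output bit over a small cube and checks that the result equals a polynomial in the key bits alone. The function `toy_cipher_bit` and its algebraic form are hypothetical and chosen only for illustration; they are not Trivium and do not reflect the method of the paper.

```python
from itertools import product


def toy_cipher_bit(iv, key):
    """Hypothetical toy output bit over GF(2); NOT Trivium.

    f = x0*x1*(k0 + k1) + x0*k2 + x1 + k0*k1  (all arithmetic mod 2)
    """
    x0, x1 = iv
    k0, k1, k2 = key
    return (x0 & x1 & (k0 ^ k1)) ^ (x0 & k2) ^ x1 ^ (k0 & k1)


def cube_sum(key):
    """XOR the output bit over all assignments of the cube variables x0, x1."""
    acc = 0
    for iv in product((0, 1), repeat=2):
        acc ^= toy_cipher_bit(iv, key)
    return acc


def superpoly_matches():
    """Check that the cube sum equals the superpoly k0 + k1 for every key."""
    return all(
        cube_sum(key) == (key[0] ^ key[1])
        for key in product((0, 1), repeat=3)
    )


if __name__ == "__main__":
    print(superpoly_matches())
```

Every term that does not contain the full cube monomial x0*x1 is added an even number of times and cancels, so only the coefficient of x0*x1 survives. Each recovered superpoly gives one equation in the key bits; here it happens to be linear, whereas the attack described above works with many nonlinear ones.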
2024
TOSC
Chosen-Prefix Collisions on AES-like Hashing
Abstract
The chosen-prefix collision (CPC) attack was first presented by Stevens, Lenstra and de Weger on MD5 at Eurocrypt 2007. A CPC attack finds a collision for any two chosen prefixes, which is a stronger variant of the collision attack. CPCs are naturally harder to construct but have larger practical impact than (identical-prefix) collisions, as seen from the series of previous works on MD5 by Stevens et al. and SHA-1 by Leurent and Peyrin. Despite its significance, the resistance of AES-like hashing against CPC attacks has not been studied. In this work, we explore CPC attacks on AES-like hashing following the framework practiced on MD5 and SHA-1. Instead of the message modification technique developed for the MD-SHA family, we opt for the related-key rebound attack to construct collisions for AES-like hashing in view of its effectiveness. We also note that the CPC attack framework can be exploited to convert a specific class of one-block free-start collisions into two-block collisions, which sheds light on the importance of free-start collisions. As a result, we present the first CPC attacks on reduced Whirlpool, Saturnin-hash and AES-MMO/MP in classic and quantum settings, and extend the collision attack on Saturnin-hash from 5 to 6 rounds in the classic setting. As an independent contribution, we improve the memoryless algorithm for solving the 3-round inbound phase by Hosoyamada and Sasaki at Eurocrypt 2020, which leads to improved quantum attacks on Whirlpool. Notably, we find the first 6-round memoryless quantum collision attack on Whirlpool better than the generic CNS collision finding algorithm when exponential-size qRAM is not available but exponential-size classic memory is available.
2024
TOSC
Opening the Blackbox: Collision Attacks on Round-Reduced Tip5, Tip4, Tip4’ and Monolith
Abstract
A new design strategy for ZK-friendly hash functions has emerged since the proposal of Reinforced Concrete at CCS 2022, which is based on the hybrid use of two types of nonlinear transforms: the composition of some small-scale lookup tables (e.g., 7-bit or 8-bit permutations) and simple power maps over $\mathbb{F}_p$. Following such a design strategy, some new ZK-friendly hash functions have been recently proposed, e.g., Tip5, Tip4, Tip4’, and the Monolith family. All these hash functions have a small number of rounds, i.e., 5 rounds for Tip5, Tip4, and Tip4’, and 6 rounds for Monolith (recently published at ToSC 2024/3). Using the composition of some small-scale lookup tables to build a large-scale permutation over $\mathbb{F}_p$ – which we call S-box – is a main feature in such designs, which can somehow enhance the resistance against the Gröbner basis attack because this large-scale permutation will correspond to a complex and high-degree polynomial representation over $\mathbb{F}_p$. As the first technical contribution, we propose a novel and efficient algorithm to study the differential property of this S-box and to find a conforming input pair for a randomly given input and output difference. For comparison, a trivial method based on the use of the differential distribution table (DDT) for solving this problem will require time complexity $O(p^2)$. For the second contribution, we also propose new frameworks to devise efficient collision attacks on such hash functions. Based on the differential properties of these S-boxes and the new attack frameworks, we propose the first collision attacks on 3-round Tip5, Tip4, and Tip4’, as well as 2-round Monolith-31 and Monolith-64, where the 2-round attacks on Monolith are practical. In the semi-free-start (SFS) collision attack setting, we achieve practical SFS collision attacks on 3-round Tip5, Tip4, and Tip4’. Moreover, the SFS collision attacks can reach up to 4-round Tip4 and 3-round Monolith-64.
As far as we know, this is the first third-party cryptanalysis of these hash functions, which improves the initial analysis given by the designers.
2024
TOSC
Exploring the Six Worlds of Gröbner Basis Cryptanalysis: Application to Anemoi
Abstract
Gröbner basis cryptanalysis of hash functions and ciphers, and their underlying permutations, has seen renewed interest recently. Anemoi (Crypto’23) is a permutation-based hash function that is efficient for a variety of arithmetizations used in zero-knowledge proofs. In this paper, exploring both theoretical bounds as well as experimental validation, we present new complexity estimates for Gröbner basis attacks on the Anemoi permutation over prime fields. We cast our findings in what we call the six worlds of Gröbner basis cryptanalysis. As an example, keeping the same security arguments of the design, we conclude that at least 41 instead of 37 rounds would need to be used for 256-bit security, whereby our suggestion does not yet include a security margin.
2024
TOSC
Committing AE from Sponges: Security Analysis of the NIST LWC Finalists
Abstract
Committing security has gained considerable attention in the field of authenticated encryption (AE). This can be traced back to a line of recent attacks, which entail that AE schemes used in practice should not only provide confidentiality and authenticity, but also committing security. Roughly speaking, a committing AE scheme guarantees that ciphertexts will decrypt only for one key. Despite the recent research effort in this area, the finalists of the NIST lightweight cryptography standardization process have not been put under consideration yet. We close this gap by providing an analysis of these schemes with respect to their committing security. Despite the structural similarities the finalists exhibit, our results are of a quite heterogeneous nature: We break four of the schemes with effectively no costs, while for two schemes our attacks are costlier, yet still efficient. For the remaining three schemes Isap, Ascon, and (a slightly modified version of) Schwaemm, we give formal security proofs. Our analysis reveals that sponges are well-suited for building committing AE schemes. Furthermore, we show several negative results when applying the zero-padding method to the NIST finalists.
2024
TOSC
Permutation-Based Hash Chains with Application to Password Hashing
Abstract
Hash chain based password systems are a useful way to guarantee authentication with one-time passwords. The core idea dates back to Lamport, and is specified in RFC 1760 as S/Key. At CCS 2017, Kogan et al. introduced T/Key, an improved password system where one-time passwords are only valid for a limited time period. They proved security of their construction in the random oracle model under a basic modeling of the adversary. In this work, we make various advances in the analysis and instantiation of hash chain based password systems. Firstly, we describe a slight abstraction called U/Key that allows for more flexibility in the instantiation and analysis, and we develop a security model that refines the adversarial strength into offline and online complexity, that can be used beyond the random oracle model, and that allows to argue multi-user security directly. Secondly, we derive a new security proof of U/Key in the random oracle model, as well as dedicated and tighter security proofs of U/Key instantiated with a sponge construction and a truncated permutation. These dedicated security proofs, in turn, solve a problem of understanding the preimage resistance of a cascaded evaluation of the sponge construction. When applied to T/Key, these results improve significantly over the earlier results: whereas the originally suggested instantiation using SHA-256 uses a compression function that maps 768 bits into 256 bits, with a truncated permutation construction one can generically achieve 128 bits of security already with a permutation of size 256 bits.
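The core hash chain idea attributed to Lamport above can be sketched as follows. This is a minimal illustration only: the use of SHA-256, the helper `hash_chain`, and the class `Verifier` are assumptions made for the example, and the sketch does not follow the RFC 1760 encoding, nor does it model T/Key or U/Key.

```python
import hashlib


def hash_chain(seed, length):
    """Return [seed, H(seed), H(H(seed)), ...] with `length` hash applications."""
    chain = [seed]
    for _ in range(length):
        chain.append(hashlib.sha256(chain[-1]).digest())
    return chain


class Verifier:
    """Hypothetical server side: stores only the most recently accepted value."""

    def __init__(self, anchor):
        self.current = anchor

    def verify(self, otp):
        # A one-time password is valid if hashing it yields the stored value.
        if hashlib.sha256(otp).digest() == self.current:
            self.current = otp  # move one step down the chain
            return True
        return False


if __name__ == "__main__":
    chain = hash_chain(b"client secret seed", 3)
    server = Verifier(chain[-1])        # server is given the end of the chain
    print(server.verify(chain[2]))      # first password: accepted
    print(server.verify(chain[2]))      # replay of the same password: rejected
    print(server.verify(chain[1]))      # next password: accepted
```

The client reveals chain values in reverse order, so an eavesdropper who sees one password would have to invert the hash to obtain the next one. This is why the preimage resistance of the cascaded evaluation is the central quantity in the analysis.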
2024
ASIACRYPT
RoK, Paper, SISsors – Toolkit for Lattice-based Succinct Arguments
Abstract
Lattice-based succinct arguments allow to prove bounded-norm satisfiability of relations, such as $f(\mathbf{s}) = \mathbf{t} \bmod q$ and $\|\mathbf{s}\|\leq \beta$, over specific cyclotomic rings $\mathcal{O}_\mathcal{K}$, with proof size polylogarithmic in the witness size. However, state-of-the-art protocols require either 1) a super-polynomial size modulus $q$ due to a soundness gap in the security argument, or 2) a verifier which runs in time linear in the witness size. Furthermore, construction techniques often rely on specific choices of $\mathcal{K}$ which are not mutually compatible. In this work, we exhibit a diverse toolkit for constructing efficient lattice-based succinct arguments:
\begin{enumerate}
\item We identify new subtractive sets for general cyclotomic fields $\mathcal{K}$ and their maximal real subfields $\mathcal{K}^+$, which are useful as challenge sets, e.g. in arguments for exact norm bounds.
\item We construct modular, verifier-succinct reductions of knowledge for the bounded-norm satisfiability of structured-linear/inner-product relations, without any soundness gap, under the vanishing SIS assumption, over any $\mathcal{K}$ which admits polynomial-size subtractive sets.
\item We propose a framework to use twisted trace maps, i.e. maps of the form $\tau(z) = \frac{1}{N} \cdot \mathsf{Trace}_{\mathcal{K}/\mathbb{Q}}( \alpha \cdot z )$, to embed $\mathcal{R}$-inner-products as $\mathcal{R}$-inner-products for some structured subrings $\mathcal{R} \subseteq \mathcal{O}_\mathcal{K}$ whenever the conductor has a square-free odd part.
\item We present a simple extension of our reductions of knowledge for proving the consistency between the coefficient embedding and the Chinese Remainder Transform (CRT) encoding of $\vec{s}$ over any cyclotomic field $\mathcal{K}$ with a smooth conductor, based on a succinct decomposition of the CRT map into automorphisms, and a new, simple succinct argument for proving automorphism relations.
\end{enumerate}
Combining all techniques, we obtain, for example, verifier-succinct arguments for proving that $\vec{s}$ satisfying $f(\mathbf{s}) = \mathbf{t} \bmod q$ has binary coefficients, without soundness gap and with polynomial-size modulus $q$.
2024
ASIACRYPT
Early Stopping Byzantine Agreement in $(1+\epsilon)\cdot f$ Rounds
Abstract
In this paper, we present two \textit{early stopping} Byzantine agreement protocols in the authenticated setting against a corrupt minority $t < n/2$, where $t$ represents the maximum number of malicious parties. Early stopping protocols ensure termination within a number of rounds determined solely by the actual number of malicious nodes $f$ present during execution, irrespective of $t$.
Our first protocol is deterministic and ensures early stopping termination in $(d+5) \cdot (\lfloor f/d \rfloor +3)$ rounds, where $d$ is a fixed constant. For example, for all $d\ge 6$, our protocol runs in at most $(1+\epsilon)\cdot f$ rounds (where $0<\epsilon<1$), improving (for large $f$) upon the best previous early stopping deterministic broadcast protocol by Perry and Toueg, which terminates in $2f+4$ rounds. Additionally, our second protocol is randomized, ensuring termination in an expected constant number of rounds and achieving early stopping in $(d+9) \cdot (\lfloor f/d \rfloor +2)$ rounds in the worst case. This marks a significant improvement over a similar result by Goldreich and Petrank, which \emph{always} requires an expected constant number of rounds and $O(t)$ rounds in the worst case, i.e., does not have the early stopping property.
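The round bounds quoted above can be compared numerically. The sketch below simply evaluates the three formulas from the abstract; the function names and the sample values of $f$ and $d$ are chosen for illustration.

```python
def deterministic_rounds(f, d):
    """(d + 5) * (floor(f / d) + 3): bound for the deterministic protocol."""
    return (d + 5) * (f // d + 3)


def randomized_worst_case_rounds(f, d):
    """(d + 9) * (floor(f / d) + 2): worst-case bound for the randomized protocol."""
    return (d + 9) * (f // d + 2)


def perry_toueg_rounds(f):
    """2f + 4: bound of the earlier protocol cited in the abstract."""
    return 2 * f + 4


if __name__ == "__main__":
    for f in (60, 600):
        for d in (6, 20):
            print(f, d, deterministic_rounds(f, d), perry_toueg_rounds(f))
```

For $d = 6$ the leading coefficient is $(d+5)/d = 11/6 < 2$, so the deterministic bound wins once $f$ is large: at $f = 600$ it gives 1133 rounds against 1204. At $f = 60$ the additive terms still dominate (143 against 124), which matches the "for large $f$" qualifier.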
2024
ASIACRYPT
More Vulnerabilities of Linear Structure Sbox-Based Ciphers Reveal Their Inability to Resist DFA
Abstract
At Asiacrypt 2021, Baksi et al. introduced DEFAULT, the first block cipher designed to resist differential fault attacks (DFA) at the algorithm level, boasting of 64-bit DFA security. However, during Eurocrypt 2022, Nageler et al. presented a DFA attack that exposed vulnerabilities in the claimed DFA security of DEFAULT, reducing it by up to 20 bits in the case of the simple key schedule and even allowing for unique key recovery in the presence of rotating keys. In this work, we compute deterministic differential trails for up to five rounds, injecting around 5 faults into the simple key schedule for key recovery, recovering equivalent keys with just 36 faults in the DEFAULT-LAYER, and introducing a generic DFA approach suitable for round-independent keys within the DEFAULT cipher. These results represent the most efficient key recovery achieved for the DEFAULT cipher under DFA attacks so far. Additionally, we introduce a novel fault attack called the Statistical-Differential Fault Attack (SDFA), specifically tailored for linear-structured SBox-based ciphers like DEFAULT. This technique is successfully applied to BAKSHEESH, resulting in a nearly unique key recovery. Our findings emphasize the vulnerabilities present in linear-structured SBox-based ciphers and underscore the challenges in establishing robust DFA protection for such cipher designs.
2024
ASIACRYPT
Jackpot: Non-Interactive Aggregatable Lotteries
Abstract
In proof-of-stake blockchains, liveness is ensured by repeatedly selecting random groups of parties as leaders, who are then in charge of proposing new blocks and driving consensus forward.
The lotteries that elect those leaders need to ensure that adversarial parties are not elected disproportionately often and that an adversary can not tell who was elected before those parties decide to speak, as this would potentially allow for denial-of-service attacks.
Whenever an elected party speaks, it needs to provide a winning lottery ticket, which proves that the party did indeed win the lottery.
Current solutions require all published winning tickets to be stored individually on-chain, which introduces undesirable storage overheads.
In this work, we introduce non-interactive aggregatable lotteries and show how these can be constructed efficiently.
Our lotteries provide the same security guarantees as previous lottery constructions, but additionally allow any third party to take a set of published winning tickets and aggregate them into one short digest.
We provide a formal model of our new primitive in the universal composability framework.
As one of our technical contributions, which may be of independent interest, we introduce aggregatable vector commitments with simulation-extractability and present a concretely efficient construction thereof in the algebraic group model in the presence of a random oracle.
We show how these commitments can be used to construct non-interactive aggregatable lotteries.
We have implemented our construction, called Jackpot, and provide benchmarks that underline its concrete efficiency.
2024
ASIACRYPT
Bootstrapping Small Integers With CKKS
Abstract
The native plaintexts of the Cheon-Kim-Kim-Song (CKKS) fully homomorphic encryption scheme are vectors of approximations to complex numbers. Drucker \emph{et al.} [J. Cryptol.'24] have shown how to use CKKS to efficiently perform computations on bits and small bit-length integers, by relying on their canonical embeddings into the complex plane. For small bit-length integers, Chung \emph{et al.} [IACR eprint'24] recently suggested relying instead on an embedding into complex roots of unity, to gain numerical stability and efficiency. Both works use CKKS in a black-box manner.
Inspired by the design by Bae \emph{et al.} [Eurocrypt'24] of a dedicated bootstrapping algorithm for ciphertexts encoding bits, we propose a CKKS bootstrapping algorithm, SI-BTS (small-integer bootstrapping), for ciphertexts encoding small bit-length integers. For this purpose, we build upon the DM/CGGI-to-CKKS conversion algorithm from Boura \emph{et al.} [J. Math. Cryptol.'20], to bootstrap canonically embedded integers to integers embedded as roots of unity. SI-BTS allows functional bootstrapping: it can evaluate an arbitrary function of its input while bootstrapping. It may also be used to batch-(functional-)bootstrap multiple DM/CGGI ciphertexts. For example, its amortized cost for evaluating an 8-bit look-up table on $2^{12}$ DM/CGGI ciphertexts is 3.75ms (single-thread CPU, 128-bit security).
We adapt SI-BTS to simultaneously bootstrap multiple CKKS ciphertexts for bits. The resulting BB-BTS algorithm (batch-bits bootstrapping) decreases the amortized cost of a binary gate evaluation. Compared to Bae \emph{et al.}, it gives a 2.4x speed-up.
2024
ASIACRYPT
Measure-Rewind-Extract: Tighter Proofs of One-Way to Hiding and CCA Security in the Quantum Random Oracle Model
Abstract
The One-Way to Hiding (O2H) theorem, first given by Unruh (J ACM 2015) and then restated by Ambainis et al. (CRYPTO 2019), is a crucial technique for solving the reprogramming problem in the quantum random oracle model (QROM). It provides an upper bound $d\cdot\sqrt{\epsilon}$ for the distinguisher's advantage, where $d$ is the query depth and $\epsilon$ denotes the advantage of a one-wayness attacker. Later, in order to obtain a tighter upper bound, Kuchta et al. (EUROCRYPT 2020) proposed the Measure-Rewind-Measure (MRM) technique and then proved the Measure-Rewind-Measure O2H (MRM-O2H) theorem, which provides the upper bound $d\cdot\epsilon$. They also posed an open question: Can we combine their MRM technique with Ambainis et al.'s semi-classical oracle technique (CRYPTO 2019) or Zhandry's compressed oracle technique (CRYPTO 2019) to prove a new O2H theorem with an upper bound even tighter than $d\cdot\epsilon$?
In this paper, we give an affirmative answer to the above question. We propose a new technique named Measure-Rewind-Extract (MRE) by combining the MRM technique with the semi-classical oracle technique. By using the MRE technique, we prove the Measure-Rewind-Extract O2H (MRE-O2H) theorem, which provides the upper bound $\sqrt{d}\cdot\epsilon$.
As an important application of our MRE-O2H theorem, for the $FO^{\slashed{\bot}}$, $FO_m^{\slashed{\bot}}$, $FO^{\bot}$ and $FO_m^{\bot}$ proposed by Hofheinz et al. (TCC 2017), i.e., the key encapsulation mechanism (KEM) variants of the Fujisaki-Okamoto transformation, we prove the following results in the QROM: Their IND-CCA security can be reduced to the IND-CPA security of the underlying public key encryption (PKE) scheme without the square-root advantage loss. In particular, compared with the IND-CCA proof of $FO^{\slashed{\bot}}$ given by Kuchta et al. (EUROCRYPT 2020), ours removes the injectivity assumption and has a tighter security bound. Under the assumption that the underlying PKE scheme is unique randomness recoverable, we for the first time prove that their IND-CCA security can be reduced to the OW-CPA security of the underlying PKE scheme without the square-root advantage loss.
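The gap between the three O2H bounds quoted above is easiest to see in log scale. The sketch below evaluates $d\cdot\sqrt{\epsilon}$, $d\cdot\epsilon$ and $\sqrt{d}\cdot\epsilon$ as written in the abstract, ignoring any constants or additive terms in the actual theorems; the sample values $d = 2^{20}$ and $\epsilon = 2^{-128}$ are assumptions picked for illustration.

```python
def log2_bounds(log2_d, log2_eps):
    """Return log2 of each advantage bound for query depth d and advantage eps."""
    return {
        "O2H": log2_d + log2_eps / 2,      # d * sqrt(eps)
        "MRM-O2H": log2_d + log2_eps,      # d * eps
        "MRE-O2H": log2_d / 2 + log2_eps,  # sqrt(d) * eps
    }


if __name__ == "__main__":
    for name, value in log2_bounds(20, -128).items():
        print(f"{name}: 2^{value:g}")
```

With these sample values the bounds are $2^{-44}$, $2^{-108}$ and $2^{-118}$ respectively, which shows both the cost of the square-root loss in $\epsilon$ and the further saving from replacing $d$ by $\sqrt{d}$.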
2024
ASIACRYPT
Constrained Pseudorandom Functions for Inner-Product Predicates from Weaker Assumptions
Abstract
In this paper, we build a framework for constructing Constrained Pseudorandom Functions (CPRFs) with inner-product constraint predicates, using ideas from subtractive secret sharing and related-key-attack security.
Our framework can be instantiated using a random oracle or any suitable Related-Key-Attack (RKA) secure pseudorandom function. We provide three instantiations of our framework:
1. an adaptively-secure construction in the random oracle model;
2. a selectively-secure construction under the DDH assumption; and
3. a selectively-secure construction with a polynomial domain under the assumption that one-way functions exist.
All three instantiations are constraint-hiding and support inner-product predicates, leading to the first constructions of such expressive CPRFs under each corresponding assumption. Moreover, while the OWF-based construction is primarily of theoretical interest, the random oracle and DDH-based constructions are concretely efficient, which we show via an implementation.
2024
ASIACRYPT
QuietOT: Lightweight Oblivious Transfer with a Public-Key Setup
Abstract
Oblivious Transfer (OT) is at the heart of secure computation and is a foundation for many applications in cryptography. Over two decades of work have led to extremely efficient protocols for evaluating OT instances in the preprocessing model, through a paradigm called OT extension. A few OT instances generated in an offline phase can be used to perform many OTs in an online phase efficiently, i.e., with very low communication and computational overheads.
Specifically, traditional OT extension uses a small number of “base” OTs, generated using any black-box OT protocol, and converts them into many OT instances using only lightweight symmetric-key primitives. Recently, a new paradigm of OT with a public-key setup has emerged, which replaces the base OTs with a non-interactive setup: Using only the public key of the other party, two parties can efficiently compute a virtually unbounded number of OT instances on-the-fly.
In this paper, we put forth a novel framework for OT extension with a public-key setup (henceforth, “public-key OT”) and concretely efficient instantiations. Implementations of our framework are 30–100× faster when compared to the previous state-of-the-art public-key OT protocols, and remain competitive even when compared to OT protocols that do not offer a public-key setup. Additionally, our instantiations result in the first public-key schemes with plausible post-quantum security.
In summary, this paper contributes:
- QuietOT: A framework for OT extension with public-key setup that uses fast, symmetric-key primitives to generate OT instances following a one-time public-key setup, and offering additional features such as precomputability.
- A public-key setup for QuietOT from the RingLWE assumption, resulting in the first post-quantum construction of OT extension with a public-key setup.
- An optimized, open-source implementation of our construction that can generate up to 1M OT extensions per second on commodity hardware. In contrast, the state-of-the-art public-key OT protocol is limited to at most 20K OTs per second.
- The first formal treatment of the security of OT with a public-key setup in a multi-party setting, which addresses several subtleties that were overlooked in prior work.
2024
ASIACRYPT
Post-Quantum Asynchronous Remote Key Generation for FIDO2
Abstract
The Fast IDentity Online (FIDO) Alliance has developed the widely adopted FIDO2 protocol suite that allows for passwordless online authentication. Cryptographic keys stored on a user’s device (e.g. their smartphone) are used as credentials to authenticate to services by performing a challenge-response protocol. Yet, this approach leaves users unable to access their accounts in case their authenticator is lost.
The device manufacturer Yubico thus proposed a FIDO2-compliant mechanism that allows to easily create backup authenticators. Frymann et al. (CCS 2020) have first analyzed the cryptographic core of this proposal by introducing the new primitive of Asynchronous Remote Key Generation (ARKG) and accompanying security definitions. Later works instantiated ARKG both from classical and post-quantum assumptions (ACNS 2023, EuroS&P 2023).
As we will point out in this paper, the security definitions put forward and used in these papers do not adequately capture the desired security requirements in FIDO2-based authentication and recovery. This issue was also identified in independent and concurrent work by Stebila and Wilson (AsiaCCS 2024), who proposed a new framework for the analysis of account recovery mechanisms, along with a secure post-quantum instantiation from KEMs and key-blinding signature schemes.
In this work, we propose alternative security definitions for the primitive ARKG when used inside an account recovery mechanism in FIDO2. We give a secure instantiation from KEMs and standard signature schemes, which may in particular provide post-quantum security. Our solution strikes a middle ground between the compact, but (for this particular use case) inadequate security notions put forward by Frymann et al., and the secure, but more involved and highly tailored model introduced by Stebila and Wilson.
2024
ASIACRYPT
Signature-based Witness Encryption with Compact Ciphertext
Abstract
Signature-based witness encryption (SWE) is a recently proposed notion that allows to encrypt a message with respect to a tag $T$ and a set of signature verification keys. The resulting ciphertext can only be decrypted by a party who holds at least $k$ different valid signatures w.r.t. $T$ and $k$ different verification keys out of the $n$ keys specified at encryption time. Natural applications of this primitive involve distributed settings (e.g., blockchains), where multiple parties sign predictable messages, such as polling or randomness beacons. However, known SWE schemes without trusted setup have ciphertexts that scale linearly in the number of verification keys. This quickly becomes a major bottleneck as the system gets more distributed and the number of parties increases.
Towards showing the feasibility of SWE with ciphertext size sub-linear in the number of keys, we give a construction based on indistinguishability obfuscation (iO) for Turing machines and a new flavour of puncturable signatures that we call \emph{strongly} puncturable signatures (SPS). SPS allows to generate key pairs which are strongly punctured at a message $T$, meaning that with overwhelming probability no valid signature exists for message $T$ under the punctured key pair. Moreover, punctured keys are indistinguishable from standard non-punctured keys.
2024
ASIACRYPT
Extractable Witness Encryption for KZG Commitments and Efficient Laconic OT
Abstract
We present a concretely efficient and simple extractable witness encryption scheme for KZG polynomial commitments.
It allows to encrypt a message towards a triple $(\mathsf{com}, \alpha, \beta)$, where $\mathsf{com}$ is a KZG commitment for some polynomial $f$.
Anyone with an opening for the commitment attesting $f(\alpha) = \beta$ can decrypt, but without knowledge of a valid opening the message is computationally hidden.
Our construction is simple and highly efficient. The ciphertext is only a single group element. Encryption and decryption both require a single pairing evaluation and a constant number of group operations.
Using our witness encryption scheme, we construct a simple and highly efficient laconic OT protocol, which significantly outperforms the state of the art in most important metrics.
2024
ASIACRYPT
Updatable Privacy-Preserving Blueprints
Abstract
Privacy-preserving blueprint schemes (Kohlweiss et al., EUROCRYPT'23) offer a mechanism for safeguarding users' privacy while allowing for specific legitimate controls by a designated auditor agent. These schemes enable users to create escrows encrypting the result of evaluating a function y=P(t,x), with P being publicly known, t a secret used during the auditor's key generation, and x the user's private input.
Crucially, escrows only disclose the blueprinting result y=P(t,x) to the designated auditor, even in cases where the auditor is fully compromised.
The original definition and construction only support the evaluation of functions P on an input x provided by a single user.
We address this limitation by introducing updatable privacy-preserving blueprint schemes (UPPB), which enhance the original notion with the ability for multiple users to non-interactively update the private user input x while blueprinting.
Moreover, UPPBs contain a proof that y is the result of a sequence of valid updates, while revealing nothing else about the private inputs {x_i} of updates.
As in the case of privacy-preserving blueprints, we first observe that UPPBs can be realized via a generic construction for arbitrary predicates P based on FHE and NIZKs.
Our main result is uBlu, an efficient instantiation for a specific predicate comparing the values x and t, where x is the cumulative sum of users' private inputs and t is a fixed private value provided by the auditor in the setup phase.
This rather specific setting already finds interesting applications such as privacy-preserving anti-money laundering and location tracking, and can be extended to support more generic predicates.
From the technical perspective, we devise a novel technique to keep the escrow size concise, independent of the number of updates, and reasonable for practical applications. We achieve this via a novel characterization of malleability for the algebraic NIZK by Couteau and Hartmann (CRYPTO’20) that allows for an additive update function.
2024
ASIACRYPT
One-More Unforgeability for Multi- and Threshold Signatures
Abstract
This paper initiates the study of one-more unforgeability for multi-signatures and threshold signatures as a stronger security goal, ensuring that $\ell$ executions of a signing protocol cannot result in more than $\ell$ signatures. This notion is widely used in the context of blind signatures, but we argue that it is a convenient way to model strong unforgeability for other types of distributed signing protocols. We provide formal security definitions for one-more unforgeability (OMUF) and show that the HBMS multi-signature scheme does not satisfy this definition, whereas MuSig and MuSig2 do. In the full version of this paper, we also show that mBCJ does not satisfy OMUF, as well as expose a subtle issue with its existential unforgeability. For threshold signatures, FROST satisfies OMUF, but ROAST does not.
2024
ASIACRYPT
Revisiting Pairing-Friendly Curves with Embedding Degrees 10 and 14
Abstract
Since 2015, there has been a significant decrease in the asymptotic complexity of computing discrete logarithms in finite fields. As a result, the key sizes of many mainstream pairing-friendly curves have to be updated to maintain the desired security level. At PKC'20, Guillevic conducted a comprehensive assessment of the security of a series of pairing-friendly curves with embedding degrees ranging from $9$ to $17$. In this paper, we focus on five pairing-friendly curves with embedding degrees 10 and 14 at the 128-bit security level, with BW14-351 emerging as the most competitive candidate. First, we extend the optimized formula for the optimal pairing on BW13-310, a 128-bit secure curve with a 310-bit prime $p$ and embedding degree $13$, to our target curves. This generalization allows us to compute the optimal pairing in approximately $\log r/(2\varphi(k))$ Miller iterations, where $r$ and $k$ are the order of the pairing groups and the embedding degree, respectively. Second, we develop optimized algorithms for cofactor multiplication for $\mathbb{G}_1$ and $\mathbb{G}_2$, as well as subgroup membership testing for $\mathbb{G}_2$ on these curves. Finally, we provide detailed performance comparisons between BW14-351 and other popular curves on a 64-bit platform in terms of pairing computation, hashing to $\mathbb{G}_1$ and $\mathbb{G}_2$, group exponentiations, and subgroup membership testing. Our results demonstrate that BW14-351 is a strong candidate for building pairing-based cryptographic protocols.
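As a rough illustration of the $\log r/(2\varphi(k))$ estimate quoted above, the following sketch evaluates it for embedding degrees 10 and 14. The 256-bit size for $r$ is an assumption chosen for illustration only; it is not taken from the paper, and the function names are hypothetical.

```python
from math import gcd

def euler_phi(k: int) -> int:
    """Euler's totient: how many integers in [1, k] are coprime to k."""
    return sum(1 for i in range(1, k + 1) if gcd(i, k) == 1)

def approx_miller_iterations(log2_r: int, k: int) -> float:
    """Approximate Miller loop length log r / (2 * phi(k))."""
    return log2_r / (2 * euler_phi(k))

# ASSUMPTION: a 256-bit group order r (illustrative, not from the paper).
LOG2_R = 256

for k in (10, 14):
    iterations = approx_miller_iterations(LOG2_R, k)
    print(f"k={k}: phi(k)={euler_phi(k)}, about {iterations:.1f} Miller iterations")
```

Under this assumption the estimate is about 32 iterations for k=10 and about 21 for k=14; actual loop lengths depend on the concrete curve parameters.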
2024
ASIACRYPT
Low Communication Threshold Fully Homomorphic Encryption
Abstract
We study constructions of threshold fully homomorphic encryption with small partial decryption shares. In this context, we discuss in detail the technicalities of achieving full-fledged threshold FHE, and shed light on limitations of prior works, including an attack against the recent construction from Boudgoust and Scholl [ASIACRYPT 2023]. In light of our observations, we generalize the definition of threshold fully homomorphic encryption by adding an algorithm that sanitizes evaluated ciphertexts before they are decrypted by the parties. In this setting, we are able to propose a construction which offers small partial decryption shares and avoids exponential noise flooding during partial decryption. In addition, we propose an alternative protocol based on circuit-private (non-threshold) FHE and threshold public-key encryption for private delegation of computation on joint data, which requires an additional round of communication compared to our threshold FHE construction.
2024
ASIACRYPT
Updatable Private Set Intersection Revisited: Extended Functionalities, Deletion, and Worst-Case Complexity
Abstract
Private set intersection (PSI) allows two mutually distrusting parties, each holding a private set of elements, to learn the intersection of their sets without revealing anything beyond the intersection. Recent work (Badrinarayanan et al., PoPETS'22) initiated the study of updatable PSI (UPSI), which allows the two parties to compute PSI on a regular basis with sets that constantly get updated, where both the computation and communication complexity grow only with the size of the small updates and not with the large entire sets. However, their protocols have several limitations. First, they can only be used to compute the plain PSI functionality and do not support extended functionalities such as PSI-Cardinality and PSI-Sum. Second, they only allow parties to add new elements to their existing set and do not support arbitrary deletion of elements. Finally, their addition-only protocols either require both parties to learn the output or only achieve low complexity in an amortized sense and incur linear worst-case complexity.
In this work, we address all the above limitations. In particular, we study UPSI with semi-honest security in both the addition-only and addition-deletion settings. We present new protocols for both settings that support plain PSI as well as extended functionalities including PSI-Cardinality and PSI-Sum, achieving one-sided output (which implies two-sided output). In the addition-only setting, we also present a protocol for a more general functionality Circuit-PSI that outputs secret shares of the intersection. All of our protocols have worst-case computation and communication complexity that only grow with the set updates instead of the entire sets (except for a polylogarithmic factor). We implement our new UPSI protocols and compare with the state-of-the-art protocols for PSI and extended functionalities. Our protocols compare favorably when the total set sizes are sufficiently large, the new updates are sufficiently small, or in networks with low bandwidth.
2024
ASIACRYPT
Honest Majority GOD MPC with O(depth(C)) Rounds and Low Online Communication
Abstract
In the context of secure multiparty computation (MPC) protocols with guaranteed output delivery (GOD) for the honest majority setting, the state-of-the-art in terms of communication is the work of Goyal et al. (CRYPTO'20), which communicates O(n|C|) field elements, where |C| is the size of the circuit being computed and n is the number of parties. Their round complexity, as usual in secret-sharing based MPC, is proportional to O(depth(C)), but only in the optimistic case where there is no cheating. Under attack, the number of rounds can increase to \Omega(n^2) before honest parties receive output, which is undesirable for shallow circuits with depth(C) << n^2. In contrast, other protocols that only require O(depth(C)) rounds even in the worst case exist, but the state-of-the-art from Choudhury and Patra (Transactions on Information Theory, 2017) still requires \Omega(n^4|C|) communication in the offline phase, and \Omega(n^3|C|) in the online phase (for both point-to-point and broadcast channels). We see there exists a tension between efficient communication and number of rounds. For reference, the recent work of Abraham et al. (EUROCRYPT'23) shows that for perfect security and t<n/3, protocols with both linear communication and O(depth(C)) rounds exist.
We address this state of affairs by presenting a novel honest majority GOD protocol that maintains O(depth(C)) rounds, even under attack, while improving over the communication of the most efficient protocol in this setting by Choudhury and Patra. More precisely, our protocol has point-to-point (P2P) online communication of O(n|C|), accompanied by O(n|C|) broadcast (BC) elements, while the offline phase has O(n^3|C|) P2P communication with O(n^3|C|) BC. This improves over the previous best result, and reduces the tension between communication and round complexity. Our protocol is achieved via a careful use of packed secret-sharing in order to improve the communication of existing verifiable secret-sharing approaches, although at the expense of weakening their robustness guarantees: reconstruction of shared values may fail, but only if the adversary gives away the identities of many corrupt parties. We show that this less powerful notion is still useful for MPC, and we use it as a core building block in our construction. Using this weaker VSS, we adapt the recent secure-with-abort Turbopack protocol (Escudero et al. CCS'22) to the GOD setting without significantly sacrificing efficiency.
2024
ASIACRYPT
Delegatable Anonymous Credentials From Mercurial Signatures With Stronger Privacy
Abstract
Delegatable anonymous credentials (DACs) enable a root issuer to delegate credential-issuing power, allowing a delegatee to take a delegator role. To preserve privacy, credential recipients and verifiers should not learn anything about intermediate issuers in the delegation chain. One particularly efficient approach to constructing DACs is due to Crites and Lysyanskaya (CT-RSA '19). In contrast to previous approaches, it is based on mercurial signatures (a type of equivalence-class signature), offering a conceptually simple design that does not require extensive use of zero-knowledge proofs. Unfortunately, current constructions of "CL-type" DACs only offer a weak form of privacy-preserving delegation: if an adversarial issuer (even an honest-but-curious one) is part of a user's delegation chain, they can detect when the user shows its credential. This is because the underlying mercurial signature schemes allow a signer to identify his public key in a delegation chain.
We propose CL-type DACs that overcome the above limitation based on a new mercurial signature scheme that provides adversarial public key class hiding, which ensures that adversarial signers who participate in a user's delegation chain cannot exploit that fact to trace users. We achieve this by introducing structured public parameters for each delegation level. Since the related setup produces critical trapdoors, we discuss techniques from updatable structured reference strings in zero-knowledge proof systems (Groth et al. CRYPTO '18) to guarantee the required privacy needs. In addition, we propose a simple way to realize revocation for CL-type DACs via the concept of revocation tokens. While we showcase this approach to revocation using our DAC scheme, it is generic and can be applied to any CL-type DAC system. Revocation is a vital feature that is largely unexplored and notoriously hard to achieve for DACs; providing it can thus help make DAC schemes more attractive in practical applications.
2024
ASIACRYPT
Unclonable Secret Sharing
Abstract
Unclonable cryptography utilizes the principles of quantum mechanics to address cryptographic tasks that are impossible classically. We introduce a novel unclonable primitive in the context of secret sharing, called unclonable secret sharing (USS). In a USS scheme, there are multiple shareholders, each holding a share of a classical secret represented as a quantum state. They can recover the secret once all parties (or at least a threshold number of parties) come together with their shares. Importantly, it should be infeasible to copy their own shares and send the copies to two non-communicating parties, enabling both of them to recover the secret.
Our work initiates a formal investigation into the realm of unclonable secret sharing, shedding light on its implications, constructions, and inherent limitations.
- Connections: We explore the connections between USS and other quantum cryptographic primitives such as unclonable encryption and position verification, showing the difficulty of achieving USS in different scenarios.
- Limited Entanglement: In the case where the adversarial shareholders share no entanglement or only limited entanglement, we demonstrate information-theoretic constructions for USS.
- Large Entanglement: If we allow the adversarial shareholders to have unbounded entanglement resources (and unbounded computation), we prove that unclonable secret sharing is impossible. On the other hand, in the quantum random oracle model where the adversary can only make a bounded polynomial number of queries, we show a construction secure even with unbounded entanglement.
Furthermore, even when these adversaries possess only a polynomial amount of entanglement resources, we establish that any unclonable secret sharing scheme with a reconstruction function implementable using Cliffords and logarithmically many T-gates is also unattainable.
2024
ASIACRYPT
Dishonest Majority Constant-Round MPC with Linear Communication from DDH
Abstract
In this work, we study constant round multiparty computation (MPC) for Boolean circuits against a fully malicious adversary who may control up to $n-1$ out of $n$ parties. Without relying on fully homomorphic encryption (FHE), the best-known results in this setting are achieved by Wang et al. (CCS 2017) and Hazay et al. (ASIACRYPT 2017) based on garbled circuits, which require a quadratic communication in the number of parties $O(|C|\cdot n^2)$. In contrast, for non-constant round MPC, the recent result by Rachuri and Scholl (CRYPTO 2022) has achieved linear communication $O(|C|\cdot n)$.
In this work, we present the first concretely efficient constant round MPC protocol in this setting with linear communication in the number of parties $O(|C|\cdot n)$. Our construction can be based on any public-key encryption scheme that is linearly homomorphic for public keys. Our work gives a concrete instantiation from a variant of the ElGamal encryption scheme under the DDH assumption. The analysis shows that when the computational security parameter $\lambda=128$ and statistical security parameter $\kappa=80$, our protocol achieves a smaller communication than Wang et al. (CCS 2017) when there are $16$ parties for the AES circuit, and $8$ parties for general Boolean circuits (where we assume that the numbers of AND gates and XOR gates are the same). When comparing with the recent work by Beck et al. (CCS 2023) that achieves constant communication complexity $O(|C|)$ in the strong honest majority setting ($t<(1/2-\epsilon)n$ where $\epsilon$ is a constant), our protocol is better as long as $n<3500$ (when $t=n/4$ for their work).
2024
ASIACRYPT
Strongly Secure Universal Thresholdizer
Abstract
A universal thresholdizer (UT), constructed from threshold fully homomorphic encryption by Boneh et al. (Crypto 2018), is a general framework for universally thresholdizing many cryptographic schemes. However, their framework is insufficient to construct strongly secure threshold schemes, such as threshold signatures and threshold public-key encryption.
In this paper, we strengthen the security definition for a universal thresholdizer and propose a scheme which satisfies our stronger security notion. Our UT scheme improves Boneh et al.'s construction at the level of threshold fully homomorphic encryption using a key-homomorphic pseudorandom function. We apply our strongly secure UT scheme to construct strongly secure threshold signatures and threshold public-key encryption.
2024
ASIACRYPT
Interval Key-Encapsulation Mechanism
Abstract
Forward-Secure Key-Encapsulation Mechanism (FS-KEM; Canetti et al. Eurocrypt 2003) allows Alice to encapsulate a key k to Bob for some time t such that Bob can decapsulate it at any time t' <= t. Crucially, a corruption of Bob's secret key after time t does not reveal k.
In this work, we generalize and extend this idea by also taking Post-Compromise Security (PCS) into account and call it Interval Key-Encapsulation Mechanism (IKEM). Thus, we do not only protect confidentiality of previous keys against future corruptions but also confidentiality of future keys against past corruptions. For this, Bob can regularly renew his secret key and inform others about the corresponding public key. IKEM enables Bob to decapsulate keys sent to him over an interval of time extending into the past, in case senders have not obtained his latest public key; forward security only needs to hold with respect to keys encapsulated before this interval. This basic IKEM variant can be instantiated based on standard KEM, which we prove to be optimal in terms of assumptions as well as ciphertext and key sizes.
We also extend this notion of IKEM for settings in which Bob decapsulates (much) later than Alice encapsulates (e.g., in high-latency or segmented networks): if a third user Charlie forwards Alice's ciphertext to Bob and, additionally, knows a recently renewed public key of Bob's, Charlie could re-encrypt the ciphertext for better PCS. We call this extended notion IKEMR. Our first IKEMR construction based on trapdoor permutations has (almost) constant sized ciphertexts in the number of re-encryptions; and our second IKEMR construction based on FS-PKE has constant sized public keys in the interval size.
Finally, to bypass our lower bound on the IKEM(R) secret key size, which must be linear in the interval size, we develop a new Interval RAM primitive with which Bob only stores a constant sized part of his secret key locally, while outsourcing the rest to a (possibly adversarial) server.
For all our constructions, we achieve security against active adversaries. For this, we obtain new insights on Replayable CCA security for KEM-type primitives, which might be of independent interest.
2024
ASIACRYPT
Count Corruptions, Not Users: Improved Tightness for Signatures, Encryption and Authenticated Key Exchange
Abstract
In the multi-user with corruptions (muc) setting there are $n\geq 1$ users, and the goal is to prove that, even in the face of an adversary that adaptively corrupts users to expose their keys, un-corrupted users retain security. This can be considered for many primitives including signatures and encryption. Proofs of muc security, while possible, generally suffer a factor $n$ loss in tightness, which can be large. This paper gives new proofs where this factor is reduced to the number $c$ of corruptions, which in practice is much smaller than $n$. We refer to this as corruption-parametrized muc (cp-muc) security. We give a general result showing it for a class of games that we call local. We apply this to get cp-muc security for signature schemes (including ones in standards and in TLS 1.3) and some forms of public-key and symmetric encryption. Then we give dedicated cp-muc security proofs for some important schemes whose underlying games are not local, including the Hashed ElGamal and Fujisaki-Okamoto KEMs and authenticated key exchange. Finally, we give negative results to show optimality of our bounds.
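To make the difference between a factor-$n$ and a factor-$c$ tightness loss concrete, the sketch below converts each factor into bits of security lost. The values of n and c are hypothetical, chosen only for illustration; they do not come from the paper.

```python
from math import log2

def security_loss_bits(loss_factor: int) -> float:
    """Bits of security lost to a multiplicative reduction loss."""
    return log2(loss_factor)

# ASSUMPTION: illustrative numbers, not taken from the paper.
n_users = 2**30        # total number of users
c_corruptions = 2**10  # number of adaptively corrupted users

print("loss with factor n:", security_loss_bits(n_users), "bits")
print("loss with factor c:", security_loss_bits(c_corruptions), "bits")
```

With these numbers, a bound that degrades by $n$ gives up 30 bits, while one that degrades by $c$ gives up 10 bits.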
2024
ASIACRYPT
Modelling Ciphers with Overdefined Systems of Quadratic Equations: Application to Friday, Vision, RAIN and Biscuit
Abstract
Overdefined polynomial systems have the potential to lead to reduced complexity in solving procedures. In this work, we study how to overdefine the system of equations to describe the arithmetic oriented (AO) ciphers Friday, Vision, and RAIN, as well as a special system of quadratic equations over $\mathbb F_{2^{\ell}}$ used in the post-quantum signature scheme Biscuit. Our method is inspired by Courtois-Pieprzyk's and Murphy-Robshaw's methods to model AES with overdefined systems of quadratic equations over $\mathbb F_2$ and $\mathbb F_{2^8}$, respectively. However, our method is more refined and much simplified compared with Murphy-Robshaw's method, since it can take full advantage of the low-degree $\mathbb F_2$-linearized affine polynomials used in Friday and Vision, and the overdefined system of equations over $\mathbb F_{2^{\ell}}$ can be described in a clean way with our method. For RAIN, we instead consider quadratic Boolean equations rather than equations over large finite fields $\mathbb F_{2^{\ell}}$. Specifically, we demonstrate that the special structure of RAIN allows us to set up many more linearly independent quadratic Boolean equations than those obtained only with Courtois-Pieprzyk's method. Moreover, we further demonstrate that the underlying key-recovery problem in Biscuit (NIST PQC Round 1 Additional Signatures) can also be described by solving a highly overdefined system of quadratic equations over $\mathbb F_{2^{\ell}}$. On the downside, the constructed systems of quadratic equations for these ciphers cannot be viewed as semi-regular, which makes it challenging to upper bound the complexity of the Gröbner basis attack. However, such a new modelling method can significantly improve the lower bound of the complexity of the Gröbner basis attacks on these ciphers, i.e., we view the complexity of solving a random system of quadratic equations of the same scale as the lower bound. How to better estimate the upper and lower bounds of the Gröbner basis attacks on these ciphers based on our modelling method is left as an open problem.
2024
ASIACRYPT
FLI: Folding Lookup Instances
Abstract
We introduce two folding schemes for lookup instances: FLI and FLI+SOS. Both use a PIOP to check that a matrix has elementary basis vectors as rows, with FLI+SOS adding a twist based on Lasso’s [STW23] SOS-decomposability.
FLI takes two lookup instances {a_1}, {a_2} ⊆ {t}, and expresses them as matrix equations M_i · t^T = a_i^T for i=1,2, where each matrix M_i ∈ F^{m × N} has rows which are elementary basis vectors in F^N. Matrices that satisfy this condition are said to be in R_{elem}. Then, a folding scheme for R_{elem} into a relaxed relation is used, which combines the matrices M_1, M_2 as M_1 + α M_2 for a random α ∈ F. Finally, the lookup equations are combined as (M_1 + α M_2) · t^T = (a_1 + α a_2)^T. In FLI, only the property that a matrix is in R_{elem} is folded, and this makes the FLI folding step the cheapest among existing solutions. The price to pay is in the cost for proving accumulated instances.
FLI+SOS builds upon FLI to enable folding of large SOS-decomposable [STW23] tables. This is achieved through a variation of Lasso's approach to SOS-decomposability, which fits FLI naturally. For comparison, we describe (for the first time to our knowledge) straightforward variations of Protostar [BC23] and Proofs for Deep Thought [BC24] that also benefit from SOS-decomposability. We see that for many reasonable parameter choices, and especially those arising from lookup-based zkVMs [AST23], FLI+SOS can concretely be the cheapest folding solution.
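The matrix view of lookups used by FLI can be illustrated in the clear. The sketch below only checks the linear-algebra identity (M_1 + α M_2) · t^T = (a_1 + α a_2)^T over a small prime field; it is not the folding scheme itself (no commitments, no PIOP), and the field size, table, and helper names are assumptions made for illustration.

```python
import random

P = 101  # ASSUMPTION: small prime field, for illustration only

def selector_matrix(indices, table_size):
    """Build a matrix whose i-th row is the elementary basis vector e_{indices[i]}."""
    return [[1 if j == idx else 0 for j in range(table_size)] for idx in indices]

def mat_vec(matrix, vector):
    """Matrix-vector product over F_P."""
    return [sum(m * x for m, x in zip(row, vector)) % P for row in matrix]

def fold_matrices(m1, m2, alpha):
    """Entry-wise M1 + alpha * M2 over F_P."""
    return [[(a + alpha * b) % P for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def fold_vectors(a1, a2, alpha):
    """Entry-wise a1 + alpha * a2 over F_P."""
    return [(x + alpha * y) % P for x, y in zip(a1, a2)]

table = [3, 1, 4, 1, 5, 9, 2, 6]              # the lookup table t
M1 = selector_matrix([0, 2, 5], len(table))   # first lookup instance
M2 = selector_matrix([7, 7, 1], len(table))   # second lookup instance
a1 = mat_vec(M1, table)                       # looked-up values: [3, 4, 9]
a2 = mat_vec(M2, table)                       # looked-up values: [6, 6, 1]

alpha = random.randrange(1, P)                # random folding challenge
folded_ok = mat_vec(fold_matrices(M1, M2, alpha), table) == fold_vectors(a1, a2, alpha)
print("folded lookup equation holds:", folded_ok)
```

Note that the folded matrix generally no longer has elementary basis vectors as rows, which is consistent with the abstract's remark that R_{elem} is folded into a relaxed relation.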
2024
ASIACRYPT
Password-Protected Threshold Signatures
Abstract
We are witnessing an increase in applications like cryptocurrency wallets, which involve users issuing signatures using private keys. To protect these keys from loss or compromise, users commonly outsource them to a custodial server. This creates a new point of failure, because compromise of such a server leaks the user's key, and if user authentication is implemented with a password then this password becomes open to an offline dictionary attack (ODA). A better solution is to secret-share the key among a set of servers, possibly including the user's own device(s), and implement password authentication and signature computation using threshold cryptography.
We propose a notion of augmented password-protected threshold signature scheme (aptSIG) which captures the best possible security level for this setting. Using standard threshold cryptography techniques, i.e. threshold password authentication and threshold signatures, one can guarantee that compromising up to t out of n servers reveals no information on either the key or the password. However, we extend this with a novel property, namely that compromising even all n servers also does not leak any information, except via an unavoidable ODA, which reveals the key (and the password) only if the attacker guesses the password.
We define aptSIG in the Universally Composable (UC) framework and show that it can be constructed very efficiently, using a black-box composition of any UC threshold signature [12] and a UC augmented Password-Protected Secret Sharing (aPPSS), which we define as an extension of the prior notion of PPSS [26]. As concrete instantiations we obtain secure aptSIG schemes for ECDSA and BLS signatures with very small overhead over the respective threshold signature.
2024
ASIACRYPT
HARTS: High-Threshold, Adaptively Secure, and Robust Threshold Schnorr Signatures
Abstract
Threshold variants of the Schnorr signature scheme have recently been at the center of attention due to their applications to Bitcoin, Ethereum, and other cryptocurrencies.
However, existing constructions for threshold Schnorr signatures among a set of n parties with corruption threshold t_c suffer from at least one of the following drawbacks: (i) security only against static (i.e., non-adaptive) adversaries, (ii) cubic or higher communication cost to generate a single signature, (iii) strong synchrony assumptions on the network, or (iv) t_c+1 parties are sufficient to generate a signature, i.e., the corruption threshold of the scheme equals its reconstruction threshold. Especially (iv) turns out to be a severe limitation for many asynchronous real-world applications where t_c < n/3 is necessary to maintain liveness, but a higher signing threshold of n-t_c is needed. A recent scheme, ROAST, proposed by Ruffing et al. (ACM CCS '22) addresses (iii) and (iv), but still falls short of obtaining subcubic complexity and adaptive security.
In this work, we present HARTS, the first threshold Schnorr signature scheme to incorporate all these desiderata. More concretely:
- HARTS is adaptively secure and remains fully secure and operational even under asynchronous network conditions in the presence of up to t_c < n/3 malicious parties. This is optimal.
- HARTS outputs a Schnorr signature of size lambda with a near-optimal amortized communication cost of O(lambda n^2 log n) bits and a single online round per signature.
- HARTS is a high-threshold scheme: no fewer than t_r+1 signature shares can be combined to yield a full signature, where any t_r in [t_c,n-t_c) is supported. This especially covers the case t_r >= 2n/3 > 2t_c. This is optimal.
We prove our result in a modular fashion in the algebraic group model. At the core of our construction, we design a new simple and adaptively secure high-threshold AVSS scheme which may be of independent interest.
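The parameter constraints stated in the bullets above (t_c < n/3 and t_r in [t_c, n-t_c)) can be enumerated for a given n. The following is a minimal sketch; the function names are hypothetical and it models only the arithmetic of the thresholds, not the protocol.

```python
def max_corruption_threshold(n: int) -> int:
    """Largest integer t_c satisfying t_c < n/3."""
    return (n - 1) // 3

def reconstruction_thresholds(n: int, t_c: int) -> range:
    """All t_r in the half-open interval [t_c, n - t_c)."""
    if not 3 * t_c < n:
        raise ValueError("corruption threshold must satisfy t_c < n/3")
    return range(t_c, n - t_c)

n = 10
t_c = max_corruption_threshold(n)
for t_r in reconstruction_thresholds(n, t_c):
    # t_r + 1 signature shares are needed to produce a full signature.
    print(f"n={n}, t_c={t_c}, t_r={t_r}, shares needed={t_r + 1}")
```

For n=10 this gives t_c=3 and t_r ranging over 3, 4, 5, 6, so between 4 and 7 shares are needed depending on the chosen reconstruction threshold.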
2024
ASIACRYPT
Anamorphic Authenticated Key Exchange: Double Key Distribution under Surveillance
Abstract
Anamorphic encryptions and anamorphic signatures assume a double key pre-shared between two parties so as to enable the transmission of covert messages. How to securely and efficiently distribute a double key under the dictator's surveillance is a central problem for anamorphic cryptography, especially when the users are forced to surrender their long-term secret keys or even the randomness used in the algorithms to the dictator.
In this paper, we propose Anamorphic Authenticated Key Exchange (AM-AKE) to solve the problem. Similar to anamorphic encryption, AM-AKE contains a set of anamorphic algorithms besides the normal algorithms. With the help of the anamorphic algorithms in AM-AKE, the initiator and the responder are able to exchange not only a session key but also a double key. We define robustness and security notions for AM-AKE, and also prove some impossibility results on plain AM-AKE whose anamorphic key generation algorithm only outputs a key-pair. To bypass the impossibility results, we work on two sides.
-- On the one side, for plain AM-AKE, the security notions have to be relaxed to resist only passive attacks from the dictator. Under this setting, we propose a generic construction of two-pass plain AM-AKE from a two-pass AKE with partially randomness-recoverable algorithms.
-- On the other side, we consider (non-plain) AM-AKE whose key generation algorithm also outputs an auxiliary trapdoor besides the key-pairs. We require new properties of the AKE: its key generation algorithm has secret extractability and its other algorithms have separability. Based on such a two-pass AKE, we propose a generic construction of two-pass (non-plain) AM-AKE. The resulting AM-AKE enjoys not only robustness but also strong security against any dictator who knows both users' secret keys and even the internal randomness of the AKE algorithms, and who mounts active attacks.
Finally, we present concrete AM-AKE schemes from the popular SIG+KEM paradigm and three-KEM paradigm for constructing AKE.
2024
ASIACRYPT
Evasive LWE Assumptions: Definitions, Classes, and Counterexamples
Abstract
The evasive LWE assumption, proposed by Wee [Eurocrypt'22 Wee] for constructing a lattice-based optimal broadcast encryption, has proven to be a powerful assumption, adopted by subsequent works to construct advanced primitives ranging from ABE variants to obfuscation for null circuits. However, a closer look reveals significant differences among the precise assumption statements involved in different works, leading to the fundamental question of how these assumptions compare to each other. In this work, we initiate a more systematic study on evasive LWE assumptions:
(i) Based on the standard LWE assumption, we construct simple counterexamples against three private-coin evasive LWE variants, used in [Crypto'22 Tsabary, Asiacrypt'22 VWW, Crypto'23 ARYY] respectively, showing that these assumptions are unlikely to hold.
(ii) Based on existing evasive LWE variants and our counterexamples, we propose and define three classes of plausible evasive LWE assumptions, suitably capturing all existing variants for which we are not aware of non-obfuscation-based counterexamples.
(iii) We show that under our assumption formulations, the security proofs of [Asiacrypt'22 VWW] and [Crypto'23 ARYY] can be recovered, and we reason why the security proof of [Crypto'22 Tsabary] is also plausibly repairable using an appropriate evasive LWE assumption.
2024
ASIACRYPT
Cryptanalysis of Rank-2 Module-LIP with Symplectic Automorphisms
Abstract
At Eurocrypt'24, Mureau et al. formally defined the Lattice Isomorphism Problem for module lattices (module-LIP) in a number field $\mathbb{K}$, and proposed a heuristic randomized algorithm solving module-LIP for modules of rank 2 in $\mathbb{K}^2$ with a totally real number field $\mathbb{K}$, which runs in classical polynomial time for a large class of modules and a large class of totally real number fields under some reasonable number theoretic assumptions. In this paper, by introducing a (pseudo) symplectic automorphism of the module, we successfully reduce the problem of solving module-LIP over a CM number field to the problem of finding a certain symplectic automorphism. Furthermore, we show that a weak (pseudo) symplectic automorphism can be computed efficiently, which immediately turns out to be the desired automorphism when the module is in a totally real number field. This directly results in a provable deterministic polynomial-time algorithm solving module-LIP for rank-2 modules in $\mathbb{K}^2$ where $\mathbb{K}$ is a totally real number field, without any assumptions or restrictions on the modules and the totally real number fields. Moreover, the weak symplectic automorphism can also be utilized to invalidate the omSVP assumption employed in HAWK's forgery security analysis, although it does not yield any actual attacks against HAWK itself.
2024
ASIACRYPT
Mild Asymmetric Message Franking: Illegal-Messages-Only and Retrospective Content Moderation
Abstract
In recent years, many messaging platforms have integrated end-to-end (E2E) encryption into their services. This widespread adoption of E2E encryption has triggered a technical tension between user privacy and illegal content moderation. The existing solutions either support only one of unframeability and deniability, or they are prone to abuse (the moderator can perform content moderation for all messages, whether illegal or not), or they lack mechanisms for retrospective content moderation.
To address the above issues, we introduce a new primitive called \emph{mild asymmetric message franking} (MAMF) to establish illegal-messages-only and retrospective content moderation for messaging systems, supporting unframeability and deniability simultaneously. We provide a framework to construct MAMF, leveraging two new building blocks, which might be of independent interest.
2024
ASIACRYPT
Dense and smooth lattices in any genus
Abstract
The Lattice Isomorphism Problem (LIP) was recently introduced as a new hardness assumption for post-quantum cryptography. The strongest known efficiently computable invariant for LIP is the genus of a lattice. To instantiate LIP-based schemes one often requires the existence of a lattice that (1) lies in some fixed genus, and (2) has some good geometric properties such as a high packing density or small smoothness parameter.
In this work we show that such lattices exist. In particular, building upon classical results by Siegel (1935), we show that essentially any genus contains a lattice with a close to optimal packing density, smoothing parameter and covering radius. We present both how to efficiently compute concrete existence bounds for any genus, and asymptotically tight bounds under weak conditions on the genus.
2024
ASIACRYPT
Mind the Bad Norms: Revisiting Compressed Oracle-based Quantum Indistinguishability Proofs
Abstract
In this work, we revisit the Hosoyamada-Iwata (HI) proof for the quantum CPA security of the 4-round Luby-Rackoff construction and identify a gap that appears to undermine the security proof. We emphasize that this is not an attack, and the construction may still achieve the claimed security level. However, this gap raises concerns about the feasibility of establishing a formal security proof for the 4-round Luby-Rackoff construction. In fact, the issue persists even if the number of rounds is increased arbitrarily. On a positive note, we restore the security of the 4-round Luby-Rackoff construction in the non-adaptive setting, achieving security up to $2^{n/6}$ superposition queries. Furthermore, we establish the quantum CPA security of the 4-round MistyR and 5-round MistyL constructions, up to $2^{n/5}$ and $2^{n/7}$ superposition queries, respectively, where $n$ denotes the size of the underlying permutation.
2024
ASIACRYPT
Improved Quantum Lifting by Coherent Measure-and-Reprogram
Abstract
We give a tighter lifting theorem for security games in the quantum random oracle model. At the core of our main result lies a novel measure-and-reprogram framework that we call coherent reprogramming. This framework gives a tighter lifting theorem for query complexity problems, that only requires purely classical reasoning. As direct applications of our lifting theorem, we first provide a quantum direct product theorem in the average case --- i.e., an enabling tool to determine the hardness of solving multi-instance security games. This allows us to derive in a straightforward manner the hardness of various security games, for example (i) the non-uniform hardness of salted games, (ii) the hardness of specific cryptographic tasks such as the multiple instance version of one-wayness and collision-resistance, and (iii) uniform or non-uniform hardness of many other games.
2024
ASIACRYPT
Generalized Hybrid Search with Applications to Blockchain and Hash Function Security
Abstract
In this work we first examine the hardness of solving various search problems by hybrid quantum-classical strategies, namely, by algorithms that have both quantum and classical capabilities. We then construct a hybrid quantum-classical search algorithm and analyze its success probability.
Regarding the former, for search problems that are allowed to have multiple solutions and in which the input is sampled according to arbitrary distributions we establish their hybrid quantum-classical query complexities—i.e., given a fixed number of classical and quantum queries, determine what is the probability of solving the search task. At a technical level, our results generalize the framework for hybrid quantum-classical search algorithms recently proposed by Rosmanis. Namely, for an arbitrary distribution D on Boolean functions, the probability that an algorithm equipped with t_c classical queries and t_q quantum queries succeeds in finding a preimage of 1 for a function sampled from D is at most v_D(2sqrt(t_c) + 2t_q + 1)^2, where v_D captures the average (over D) fraction of preimages of 1.
Regarding our second contribution, we design a hybrid algorithm which first spends all of its classical queries and in the second stage runs a “modified Grover” in which the initial state depends on the target distribution D. We then show how to analyze its success probability for arbitrary target distributions and, importantly, its optimality for the uniform and the Bernoulli distribution cases.
As applications of our hardness results, we first revisit and generalize the formal security treatment of the Bitcoin protocol called the Bitcoin backbone [Eurocrypt 2015], to a setting where the adversary has both quantum and classical capabilities, presenting a new hybrid honest majority condition necessary for the protocol to properly operate. Secondly, we re-examine the generic security of hash functions [PKC 2016] against quantum-classical hybrid adversaries.
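The hybrid query bound quoted in this abstract, v_D(2sqrt(t_c) + 2t_q + 1)^2, is easy to evaluate numerically. The following is a minimal sketch only: the function name and the sample parameters are illustrative assumptions, not taken from the paper, and the result is capped at 1 because it is a probability bound.

```python
import math


def hybrid_search_bound(v_d: float, t_c: int, t_q: int) -> float:
    """Evaluate v_D * (2*sqrt(t_c) + 2*t_q + 1)^2, capped at 1.

    v_d : average fraction of preimages of 1 under the distribution D
    t_c : number of classical queries
    t_q : number of quantum queries
    """
    bound = v_d * (2 * math.sqrt(t_c) + 2 * t_q + 1) ** 2
    return min(1.0, bound)


# Illustrative (assumed) parameters: a preimage fraction of 2^-64 and a
# budget of 2^20 queries spent either classically or quantumly.
v_d = 2.0 ** -64
classical_only = hybrid_search_bound(v_d, t_c=2 ** 20, t_q=0)
quantum_only = hybrid_search_bound(v_d, t_c=0, t_q=2 ** 20)
```

In the bound, classical queries enter through a square root while quantum queries enter linearly, so the same budget yields a much larger bound when spent on quantum queries.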
2024
ASIACRYPT
MuxProofs: Succinct Arguments for Machine Computation from Vector Lookups
Abstract
Proofs for machine computation prove the correct execution of arbitrary programs that operate over fixed instruction sets (e.g., RISC-V, EVM, Wasm).
A standard approach for proving machine computation is to prove a universal set of constraints that encode the full instruction set at each step of the program execution.
This approach incurs a proving cost per execution step on the order of the total sum of instruction constraints for all of the instructions in the set, despite each step of the program only executing a single instruction.
Existing proving approaches that avoid this universal cost per step (and incur only the cost of a single instruction's constraints per step) either fail to provide zero-knowledge or rely on recursive proof composition for which security relies on the heuristic instantiation of the random oracle.
We present new protocols for proving machine execution that resolve these limitations, enabling prover efficiency on the order of only the executed instructions while achieving zero-knowledge and avoiding recursive proofs.
Our core technical contribution is a new primitive that we call a succinct vector lookup argument which enables a prover to build up a machine execution ``on-the-fly''.
We propose succinct vector lookups for both univariate polynomial and multivariate polynomial commitments in which vectors are encoded on cosets of a multiplicative subgroup and on subcubes of the boolean hypercube, respectively.
We instantiate our proofs for machine computation by integrating our vector lookups with existing efficient, succinct non-interactive proof systems for NP.
2024
ASIACRYPT
Multiple-Tweak Differential Attack Against SCARF
Abstract
In this paper, we present the first third-party cryptanalysis of SCARF, a tweakable low-latency block cipher designed to thwart contention-based cache attacks through cache randomization. We focus on multiple-tweak differential attacks, exploiting biases across multiple tweaks. We establish a theoretical framework explaining biases for any number of rounds and verify this framework experimentally. Then, we use these properties to develop a key recovery attack on 7-round SCARF with a time complexity of 2^76, achieving a 98.9% success rate in recovering the 240-bit secret key. Additionally, we introduce a distinguishing attack on the full 8-round SCARF in a multi-key setting, with a complexity of c x 2^67.55, demonstrating that SCARF does not provide 80-bit security under these conditions. We also explore whether our approach could be extended to the single-key model and discuss the implications of different S-box choices on the attack success.
2024
ASIACRYPT
Registered FE beyond Predicates: (Attribute-Based) Linear Functions and more
Abstract
This paper introduces the first registered functional encryption (RFE) scheme tailored for linear functions. Distinctly different from classical functional encryption (FE), RFE addresses the key-escrow issue and negates the master key exfiltration attack. Instead of relying on a centralized trusted authority, it introduces a “key curator” - a fully transparent entity that does not retain secrets. In an RFE framework, users independently generate secret keys and subsequently register their respective public keys, along with their authorized functions, with the key curator. This curator consolidates public keys from various users into a unified, concise master public key. For decryption, users occasionally secure helper decryption keys from the key curator, which they use in conjunction with their private keys. It is imperative that the aggregate public key, helper decryption keys, ciphertexts, and the times for encryption/decryption are polylogarithmic in the number of registered users.
All existing RFE designs were confined to predicates where given the correct credentials a user can retrieve the entire payload from a ciphertext or gain no information about it otherwise. Contrarily, our RFE scheme facilitates the computation of linear functions on encrypted content and extraction of only the computation results. Recognizing potential leaks from linear functions, we further enhance our RFE by incorporating an attribute-based access control mechanism. The outcome is the first registered attribute-based linear FE (RABIPFE), which supports access policies depicted as linear secret sharing schemes (LSSS). Our proposed schemes are realized in the common reference string (CRS) model as introduced by Hohenberger et al. [EUROCRYPT 2023], employ simple tools and black-box methods. Specifically, our constructions operate in asymmetric prime-order bilinear group setting and are proven secure in the generic bilinear group model. Aligning with all pre-existing black-box RFE designs within the CRS model, our schemes cater to a predetermined maximum user count. A notable variant of our RABIPFE scheme also yields the first efficient registered ABE (RABE) system for LSSS access policies in asymmetric prime-order bilinear groups. Conclusively, demonstrating feasibility, we formulated an RFE blueprint that supports general functionalities and an infinite user base, leveraging indistinguishability obfuscation and one-way functions.
2024
ASIACRYPT
Tighter Proofs for PKE-to-KEM Transformation in the Quantum Random Oracle Model
Abstract
In this work, we provide new, tighter proofs for the $T_{RH}$-transformation by Jiang et al. (ASIACRYPT 2023), which converts OW-CPA secure PKEs into KEMs with IND-1CCA security, a variant of typical IND-CCA security where only a single decapsulation query is allowed. Such KEMs are efficient and have been shown sufficient for real-world applications by Huguenin-Dumittan and Vaudenay at EUROCRYPT 2022. We reprove Jiang et al.'s $T_{RH}$-transformation in both the random oracle model (ROM) and the quantum random oracle model (QROM), for the case where the underlying PKE is rigid deterministic. In both ROM and QROM models, our reductions achieve security loss factors of $O(1)$, significantly improving Jiang et al.'s results which have security loss factors of $O(q)$ in the ROM and $O(q^2)$ in the QROM respectively. Notably, central to our tight QROM reduction is a new tool called ``reprogram-after-measure'', which overcomes the reduction loss posed by oracle reprogramming in QROM proofs. This technique may be of independent interest and useful for achieving tight QROM proofs for other post-quantum cryptographic schemes. We remark that our results also improve the reduction tightness of the $T_{H}$-transformation (which also converts PKEs to KEMs) by Huguenin-Dumittan and Vaudenay (EUROCRYPT 2022), as Jiang et al. provided a tight reduction from $T_H$-transformation to $T_{RH}$-transformation (ASIACRYPT 2023).
2024
ASIACRYPT
Leakage-Resilient Incompressible Cryptography: Constructions and Barriers
Abstract
We introduce Leakage-Resilient Incompressible cryptography, which simultaneously addresses two variants of side-channel attacks that have been tackled in theoretical cryptography. Leakage-resilience seeks to provide security against an adversary who learns a part of the secret-key and the entire ciphertext or signature; conversely, incompressible cryptography provides security against an adversary who learns the entire secret-key, but only a part of the ciphertext or signature. However, constructions in either of these security models can fail against an attack in the other model. In this work, we define a new model of security that subsumes both leakage-resilient cryptography and incompressible cryptography, and we present several non-trivial positive and negative results.
On the positive side, first we present a transformation from incompressible symmetric-key encryption (SKE) to leakage-resilient incompressible SKE in the information-theoretic setting. Next, as one of our main results, we construct a leakage-resilient incompressible public-key encryption (PKE), combining an incompressible SKE and a new primitive that we call leakage-resilient non-committing key encapsulation mechanism (LR-NC-KEM). While an incompressible SKE suitable for use in both these constructions already exists in the literature (Dziembowski, CRYPTO 2006), we present a new construction with better parameters, using an appropriate notion of invertible extractors; this leads to corresponding improvements in the final parameters we obtain in these constructions. We also design a leakage-resilient incompressible signature scheme.
On the negative side, we show barriers to significantly improving the parameters we obtain, by showing impossibility of basing the security of such improved schemes on blackbox reductions.
Apart from the general framework and the specific results we obtain, some of the intermediate tools that we define and instantiate, like LR-NC-KEM and invertible extractors, may be of independent interest.
2024
ASIACRYPT
Toward Full n-bit Security and Nonce Misuse Resistance of Block Cipher-based MACs
Abstract
In this paper, we study the security of MAC constructions among those classified by Chen et al. in ASIACRYPT '21. Precisely, $F^{\text{EDM}}_{B_2}$ (or EWCDM as named by Cogliati and Seurin in CRYPTO '16), $F^{\text{EDM}}_{B_3}$, $F^{\text{SoP}}_{B_2}$, $F^{\text{SoP}}_{B_3}$ (all as named by Chen et al.) are proved to be fully secure up to $2^n$ MAC queries in the nonce-respecting setting, improving the previous bound of $\frac{3n}{4}$-bit security. In particular, $F^{\text{SoP}}_{B_2}$ and $F^{\text{SoP}}_{B_3}$ enjoy graceful degradation as the number of queries with repeated nonces grows (when the underlying universal hash function satisfies a certain property called \emph{multi-xor-collision resistance}). To do this, we develop a new tool, namely extended Mirror theory based on two independent permutations to a wide range of $\xi_{\max}$ including inequalities.
We also present matching attacks on $F^{\text{EDM}}_{B_4}$ and $F^{\text{EDM}}_{B_5}$ using $O(2^{3n/4})$ MAC queries and $O(1)$ verification query without using repeated nonces.
2024
ASIACRYPT
Deletions and Dishonesty: Probabilistic Data Structures in Adversarial Settings
Abstract
Probabilistic data structures (PDS) are compact representations of high-volume data that provide approximate answers to queries about the data. They are commonplace in today's computing systems, finding use in databases, networking and more. While PDS are designed to perform well under benign inputs, they are frequently used in applications where inputs may be adversarially chosen. This may lead to a violation of their expected behaviour, for example an increase in false positive rate.
In this work, we focus on PDS that handle approximate membership queries (AMQ). We consider adversarial users with the capability of making adaptive insertions, deletions and membership queries to AMQ-PDS, and analyse the performance of AMQ-PDS under such adversarial inputs.
We argue that deletions significantly empower adversaries, presenting a challenge to enforcing honest behaviour when compared to insertion-only AMQ-PDS. To address this, we introduce a new concept of an honest setting for AMQ-PDS with deletions. By leveraging simulation-based security definitions, we then quantify how much harm can be caused by adversarial users to the functionality of AMQ-PDS. Our resulting bounds only require calculating the maximal false positive probability and insertion failure probability achievable in our novel honest setting.
We apply our results to Cuckoo filters and Counting filters. We show how to protect these AMQ-PDS at low cost, by replacing or composing the hash functions with keyed pseudorandom functions in their construction. This strategy involves establishing practical bounds for the probabilities mentioned above. Using our new techniques, we demonstrate that achieving security against adversarial users making both insertions *and* deletions remains practical.
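The protection strategy described in this abstract, replacing the hash functions of an AMQ-PDS with keyed pseudorandom functions, can be sketched for a Counting filter as follows. This is an illustrative sketch under stated assumptions, not the paper's construction: the class name, the parameters, and the choice of HMAC-SHA256 as the PRF are all assumptions made here.

```python
import hashlib
import hmac
import os
from collections import Counter


class KeyedCountingFilter:
    """Counting filter whose index functions are HMAC-SHA256 under a secret key."""

    def __init__(self, num_counters=1024, num_hashes=4, key=None):
        self.m = num_counters
        self.k = num_hashes
        # The key must stay hidden from users who choose the inputs.
        self.key = key if key is not None else os.urandom(32)
        self.counters = [0] * self.m

    def _indices(self, item: bytes):
        # One PRF call per index, domain-separated by a one-byte counter.
        out = []
        for i in range(self.k):
            tag = hmac.new(self.key, bytes([i]) + item, hashlib.sha256).digest()
            out.append(int.from_bytes(tag[:8], "big") % self.m)
        return out

    def insert(self, item: bytes) -> None:
        for idx in self._indices(item):
            self.counters[idx] += 1

    def query(self, item: bytes) -> bool:
        return all(self.counters[idx] > 0 for idx in self._indices(item))

    def delete(self, item: bytes) -> bool:
        # Refuse the deletion if any counter would drop below zero.
        need = Counter(self._indices(item))
        if any(self.counters[idx] < n for idx, n in need.items()):
            return False
        for idx, n in need.items():
            self.counters[idx] -= n
        return True
```

Without the key, a user cannot predict which counters an item touches, which is the intuition behind keying the index functions. Note that `delete` can still succeed on a false positive, which is one reason deletions are harder to handle than insertions.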
2024
ASIACRYPT
Partially Non-Interactive Two-Round Lattice-Based Threshold Signatures
Abstract
This paper gives the first lattice-based two-round threshold signature based on standard lattice assumptions for which the first message is independent of the message being signed without relying on fully-homomorphic encryption, and our construction supports arbitrary thresholds.
Our construction provides a careful instantiation of a generic threshold signature construction by Tessaro and Zhu (EUROCRYPT '23) based on specific linear hash functions, which in turn can be seen as a generalization of the FROST scheme by Komlo and Goldberg (SAC '20). Our reduction techniques are new in the context of lattice-based cryptography. Also, our scheme does not use any heavy tools, such as NIZKs or homomorphic trapdoor commitments.
2024
ASIACRYPT
Non-interactive Blind Signatures: Post-quantum and Stronger Security
Abstract
Blind signatures enable a receiver to obtain signatures on messages of its choice without revealing any message to the signer. Round-optimal blind signatures are designed as a two-round interactive protocol between a signer and receiver. Coincidentally, the choice of message is not important in many applications, and is routinely set as a random (unstructured) message by a receiver.
With the goal of designing more efficient blind signatures for such applications, Hanzlik (Eurocrypt '23) introduced a new variant called non-interactive blind signatures (NIBS). These allow a signer to asynchronously generate partial signatures for any recipient such that only the intended recipient can extract a blinded signature for a random message. This bypasses the two-round barrier for traditional blind signatures, yet enables many known applications. Hanzlik provided new practical designs for NIBS from bilinear pairings.
In this work, we propose new enhanced security properties for NIBS as well as provide multiple constructions with varying levels of security and concrete efficiency. We propose a new generic paradigm for NIBS from circuit-private leveled homomorphic encryption achieving optimal-sized signatures (i.e., same as any non-blind signature) at the cost of large public keys. We also investigate concretely efficient NIBS with post-quantum security, satisfying a weaker level of privacy as proposed by Hanzlik.
2024
ASIACRYPT
Compute, but Verify: Efficient Multiparty Computation over Authenticated Inputs
Abstract
Traditional notions of secure multiparty computation (MPC) allow mutually distrusting parties to jointly compute a function over their private inputs, but typically do not specify how these inputs are chosen. Motivated by real-world applications where corrupt inputs could adversely impact privacy and operational legitimacy, we consider a notion of authenticated MPC where the inputs are authenticated (for instance, signed using a digital signature) by some certification authority. We propose a generic and efficient compiler that transforms any linear secret sharing based honest-majority MPC protocol into one with input authentication.
Our compiler achieves an ideal notion of authenticated MPC equipped with stronger and more desirable security guarantees than those considered in prior works, while incurring significantly lower computational costs and competitive communication overheads when compared to existing solutions. In particular, we entirely avoid the (potentially expensive) protocol-specific techniques and pre-processing requirements that are inherent to these solutions. For certain corruption thresholds, our compiler additionally preserves the stronger identifiable abort security of the underlying MPC protocol. No existing solution for authenticated MPC achieves this regardless of the corruption threshold.
Along the way, we make several technical contributions that are of independent interest. This includes the notion of distributed proofs of knowledge and concrete realizations of the same for several relations of interest, such as proving knowledge of many popularly used digital signature schemes, and proving knowledge of opening of a Pedersen commitment.
2024
ASIACRYPT
Verifiable Oblivious Pseudorandom Functions from Lattices: Practical-ish and Thresholdisable
Abstract
We revisit the lattice-based verifiable oblivious PRF construction from PKC’21 and remove or mitigate its three central sources of inefficiency. First, applying Rényi divergence arguments, we eliminate one superpolynomial factor from the ciphertext modulus q, allowing us to reduce the overall bandwidth consumed by RLWE samples by about a factor of four. This necessitates us introducing intermediate unpredictability notions to argue PRF security of the final output in the Random Oracle model. Second, we remove the reliance on the 1D-SIS assumption, which reduces another superpolynomial factor, albeit to a factor that is still superpolynomial. Third, by applying the state-of-the-art in zero-knowledge proofs for lattice statements, we achieve a reduction in bandwidth of several orders of magnitude for this material. Finally, we give a t-out-of-n threshold variant of the VOPRF for constant t and with trusted setup, based on an n-out-of-n distributed variant of the VOPRF (and without trusted setup).
2024
ASIACRYPT
Robust AE With Committing Security
Abstract
There has been recent interest in developing and standardizing Robust Authenticated Encryption schemes. NIST, for example, is considering an Accordion mode for (wideblock) tweakable blockciphers, with Robust AE as a primary application. At the same time, recent attacks and applications suggest that encryption context needs to be committed. Indeed, committing security is also a design consideration in Accordion mode.
In this work, we give a modular solution for this problem. We first show how to transform any wideblock tweakable blockcipher TE to a Robust AE scheme SE that commits just the key.
The overhead is cheap, just a few finite-field multiplications and blockcipher calls. If one wants to commit the entire encryption context, one can simply hash the context to derive a 256-bit subkey, and use SE on that subkey. The use of a 256-bit key on SE only means that it has to rely on AES-256 but doesn't require TE to have a 256-bit key.
Our approach frees the Accordion designs from consideration of committing security. Moreover, it gives a big saving for several key-committing applications that don't want to pay the inherent hashing cost of full committing.
2024
ASIACRYPT
Homomorphic sign evaluation with a RNS representation of integers
Abstract
In the context of fully homomorphic encryption, we consider the representation of large integers by their decomposition over a product of rings (through the Chinese Remainder Theorem) and introduce a new algorithm for the determination of the sign solely through the knowledge of ring-components. Our implementation with 128 bits of security delivers a correct result with probability higher than $1 - 10^{-9}$ in less than 100 milliseconds for 32-bit integers on a laptop.
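To make the setting of this abstract concrete, the sketch below shows a residue number system (RNS) representation on plaintext integers and why the sign is not visible in any single residue. It is not the paper's homomorphic algorithm: the moduli, the function names, and the sign test via full CRT reconstruction are illustrative assumptions. Reconstruction is the step that is expensive on encrypted data.

```python
from math import prod

# Assumed pairwise-coprime moduli; their product M exceeds 2^32.
MODULI = (251, 253, 255, 256, 257)
M = prod(MODULI)


def to_rns(x: int) -> tuple:
    """Represent a signed integer x in [-M/2, M/2) by its residues."""
    return tuple(x % m for m in MODULI)


def add_rns(a: tuple, b: tuple) -> tuple:
    """Addition acts independently on each residue."""
    return tuple((u + v) % m for u, v, m in zip(a, b, MODULI))


def from_rns(residues: tuple) -> int:
    """Chinese Remainder Theorem reconstruction into [0, M)."""
    total = 0
    for r, m in zip(residues, MODULI):
        m_i = M // m
        total += r * m_i * pow(m_i, -1, m)  # modular inverse (Python 3.8+)
    return total % M


def sign_rns(residues: tuple) -> int:
    """Sign via reconstruction: values at or above M/2 encode negatives."""
    x = from_rns(residues)
    if x == 0:
        return 0
    return 1 if x < M // 2 else -1
```

Addition and multiplication work residue by residue, but comparison with M/2 depends on all residues at once, which is why sign evaluation needs a dedicated algorithm in this representation.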
2024
ASIACRYPT
Perfectly-Secure Multiparty Computation with Linear Communication Complexity over Any Modulus
Abstract
Consider the task of secure multiparty computation (MPC) among n parties with perfect security and guaranteed output delivery, supporting t < n/3 active corruptions. Suppose the arithmetic circuit C to be computed is defined over a finite ring Z/qZ, for an arbitrary q ∈ Z. It is known that this type of MPC over such ring is possible, with communication that scales as O(n|C|), assuming that q scales as Ω(n). However, for constant-size rings Z/qZ where q = O(1), the communication is actually O(n log n|C|) due to the need for so-called ring extensions. In most natural settings, the number of parties is variable but the “datatypes” used for the computation are fixed (e.g. 64-bit integers). In this regime, no protocol with linear communication exists. In this work we provide an MPC protocol in this setting: perfect security, G.O.D. and t < n/3 active corruptions, that enjoys linear communication O(n|C|), even for constant-size rings Z/qZ. This includes as important particular cases small fields such as F_2, and also the ring Z/2^k Z. The main difficulty in achieving this result is that widely used techniques such as linear secret-sharing cannot work over constant-size rings, and instead, one must make use of ring extensions that add Ω(log n) overhead, while packing Ω(log n) ring elements in each extension element in order to amortize this cost. We make use of reverse multiplication-friendly embeddings (RMFEs) for this packing, and adapt recent techniques in network routing (Goyal et al. CRYPTO’22) to ensure this can be efficiently used for non-SIMD circuits. Unfortunately, doing this naively results in a restriction on the minimum width of the circuit, which leads to an extra additive term in communication of poly(n) · depth(C). One of our biggest technical contributions lies in designing novel techniques to overcome this limitation by packing elements that are distributed across different layers.
To the best of our knowledge, all works that have a notion of packing (e.g. RMFE or packed secret-sharing) group gates across the same layer, and not doing so, as in our work, leads to a unique set of challenges and complications.
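One way to see why Shamir-style linear secret sharing runs into trouble over a constant-size ring Z/qZ is that interpolation needs evaluation points whose pairwise differences are invertible. The sketch below checks that condition directly. The function names are illustrative assumptions, and the example only illustrates the obstacle, not the protocol of the paper.

```python
from itertools import combinations
from math import gcd


def is_exceptional(points, q: int) -> bool:
    """True if every pairwise difference of the points is a unit in Z/qZ."""
    return all(gcd((a - b) % q, q) == 1 for a, b in combinations(points, 2))


def smallest_prime_factor(q: int) -> int:
    """Smallest prime factor of q >= 2, by trial division."""
    d = 2
    while d * d <= q:
        if q % d == 0:
            return d
        d += 1
    return q


# Over Z/2^64 Z any two points of equal parity differ by an even number,
# so no set of three evaluation points is usable.
two_points_ok = is_exceptional([0, 1], 2 ** 64)
three_points_ok = is_exceptional([0, 1, 2], 2 ** 64)
```

More generally, such a set has at most p elements, where p is the smallest prime factor of q, since the points must stay distinct modulo p. With n parties and a fixed q, this is what pushes protocols toward ring extensions.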
2024
ASIACRYPT
An Algorithmic Approach to $(2,2)$-isogenies in the Theta Model and Applications to Isogeny-based Cryptography
Abstract
In this paper, we describe an algorithm to compute chains of $(2,2)$-isogenies between products of elliptic curves in the theta model. The description of the algorithm is split into various subroutines to allow for a precise field operation counting.
We present a constant time implementation of our algorithm in Rust and an alternative implementation in SageMath. Our work in SageMath runs ten times faster than a comparable implementation of an isogeny chain using the Richelot correspondence. The Rust implementation runs up to forty times faster than the equivalent isogeny in SageMath and has been designed to be portable for future research in higher-dimensional isogeny-based cryptography.
2024
ASIACRYPT
Generic Differential Key Recovery Attacks and Beyond
Abstract
At Asiacrypt 2022, a holistic key guessing strategy was proposed to yield the most efficient key recovery for the rectangle attack. Recently, at Crypto 2023, a new cryptanalysis technique--the differential meet-in-the-middle (MITM) attack--was introduced. Inspired by these two previous works, we present three generic key recovery attacks in this paper. First, we extend the holistic key guessing strategy from the rectangle to the differential attack, proposing the generic classical differential attack (GCDA). Next, we combine the holistic key guessing strategy with the differential MITM attack, resulting in the generalized differential MITM attack (GDMA). Finally, we apply the MITM technique to the rectangle attack, creating the generic rectangle MITM attack (GRMA). In terms of applications, we improve 12/13-round attacks on AES-256. For 12-round AES-256, by using the GDMA, we reduce the time complexity by a factor of 2^{62}; by employing the GCDA, we reduce both the time and memory complexities by factors of 2^{61} and 2^{56}, respectively. For 13-round AES-256, we present a new differential attack with data and time complexities of 2^{89} and 2^{240}, where the data complexity is 2^{37} times lower than previously published results. These are currently the best attacks on AES-256 using only two related keys. For KATAN-32, we increase the number of rounds covered by the differential attack from 115 to 151 in the single-key setting using the basic differential MITM attack (BDMA) and GDMA. Furthermore, we achieve the first 38-round rectangle attack on SKINNYe-64-256 v2 by using the GRMA.
2024
ASIACRYPT
Proofs for Deep Thought: Accumulation for large memories and deterministic computations
Abstract
An important part in proving machine computation is to prove the correctness of the read and write operations performed from the memory, which we term memory-proving. Previous methodologies required proving Merkle Tree openings or multi-set hashes, resulting in relatively large proof circuits. We construct an efficient memory-proving Incrementally Verifiable Computation (IVC) scheme from accumulation, which is particularly useful for machine computations with large memories and deterministic steps. In our scheme, the IVC prover PIVC has cost entirely independent of the memory size T and only needs to commit to approximately 15 field elements per read/write operation, marking a more than 100X improvement over prior work. We further reduce this cost by employing a modified, accumulation-friendly version of the GKR protocol. In the optimized version, PIVC only needs to commit to 6 small memory-table elements per read/write. If the table stores 32-bit values, then this is equivalent to committing to less than one single field element per read and write. Our modified GKR protocol is also valuable for proving other deterministic computations within the context of IVC. Our memory-proving protocol can be extended to support key-value stores.
2024
ASIACRYPT
Key Collisions on AES and Its Applications
Abstract
In this paper, we explore a new type of key collisions called target-plaintext key collisions of AES, which emerge as an open problem in the key committing security and are directly converted into single-block collision attacks on Davies-Meyer (DM) hashing mode.
For this key collision, a ciphertext collision is uniquely observed when a specific plaintext is encrypted under two distinct keys. We introduce an efficient automatic search tool designed to find target-plaintext key collisions.
This tool exploits bit-wise behaviors of differential characteristics and dependencies among operations and internal variables of both data processing and key scheduling parts.
This allows us to hierarchically perform rebound-type attacks to identify key collisions.
As a result, we demonstrate single-block collision attacks on 2/5/6-round AES-128/192/256-DM and semi-free-start collision attacks on 5/7/9-round AES-128/192/256-DM, respectively. To validate our attacks, we provide an example of fixed-target-plaintext key collision/semi-free-start collisions on 9-round AES-256-DM.
Furthermore, by exploiting a specific class of free-start collisions with our tool, we present two-block collision attacks on 3/9-round AES-128/256-DM, respectively.
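The link between target-plaintext key collisions and single-block Davies-Meyer collisions stated in this abstract can be illustrated with a toy cipher. Everything in the sketch is an assumption made for illustration: the 8-bit block cipher with a 16-bit key is hypothetical and trivially weak, and the brute-force search stands in for the paper's automatic search tool, which it does not resemble.

```python
def rotl8(x: int, r: int) -> int:
    """Rotate an 8-bit value left by r bits."""
    return ((x << r) | (x >> (8 - r))) & 0xFF


def toy_encrypt(key: int, block: int) -> int:
    """Hypothetical 4-round toy cipher: 16-bit key, 8-bit block.

    Each step (XOR, odd-multiplier affine map, rotation) is a bijection,
    so the cipher is a permutation for every key.
    """
    round_keys = [(key >> 8) & 0xFF, key & 0xFF] * 2
    x = block & 0xFF
    for rk in round_keys:
        x ^= rk
        x = (x * 5 + 1) & 0xFF
        x = rotl8(x, 3)
    return x


def davies_meyer(chaining: int, message: int) -> int:
    """DM compression: the message block is the key, H' = E_m(H) xor H."""
    return toy_encrypt(message, chaining) ^ chaining


def find_key_collision(plaintext: int):
    """Brute-force two distinct keys that encrypt the target plaintext alike."""
    seen = {}
    for key in range(1 << 16):
        c = toy_encrypt(key, plaintext)
        if c in seen:
            return seen[c], key
        seen[c] = key
    raise RuntimeError("no collision found")


TARGET = 0x5A  # assumed fixed plaintext, playing the role of the IV
k1, k2 = find_key_collision(TARGET)
```

Because both keys map the same plaintext to the same ciphertext, feeding them as message blocks into Davies-Meyer with that plaintext as chaining value gives identical outputs, which is a single-block hash collision.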
2024
ASIACRYPT
Extending class group action attacks via sesquilinear pairings
Abstract
We introduce a new tool for the study of isogeny-based cryptography, namely pairings which are sesquilinear (conjugate linear) with respect to the O-module structure of an elliptic curve with CM by an imaginary quadratic field O. We use these pairings to study the security of problems based on the class group action on collections of oriented ordinary or supersingular elliptic curves. This extends work of [CHM+23] and [FFP24].
2024
ASIACRYPT
Threshold PAKE with Security against Compromise of all Servers
Abstract
We revisit the notion of Threshold Password-Authenticated Key Exchange (tPAKE), and we extend it to augmented tPAKE (atPAKE), which protects password information even in case of compromise of all servers, except for allowing an (inevitable) offline dictionary attack. Compared to prior notions of tPAKE this is analogous to replacing symmetric PAKE, where the server stores the user’s password, with an augmented (or asymmetric) PAKE, like OPAQUE [39], where the server stores a password hash, which can be used only as a target in an offline dictionary search for the password. An atPAKE scheme also strictly improves on security of an aPAKE, by secret-sharing the password hash among a set of servers. Indeed, our atPAKE protocol is a natural realization of threshold OPAQUE.
We formalize atPAKE in the framework of Universal Composability (UC), and show practical ways to realize it. All our schemes are generic compositions which interface to any aPAKE used as a sub-protocol, making them easier to adopt. Our main scheme relies on threshold Oblivious Pseudorandom Function (tOPRF), and our independent contribution fixes a flaw in the UC tOPRF notion of [36] and upgrades the tOPRF scheme therein to achieve the fixed definition while preserving its minimal cost and round complexity. The technique we use enforces implicit agreement on arbitrary context information within threshold computation, and it is of general interest.
2024
ASIACRYPT
HELIOPOLIS: Verifiable Computation over Homomorphically Encrypted Data from Interactive Oracle Proofs is Practical
Abstract
Homomorphic encryption (HE) enables computation on encrypted data, which in turn facilitates the outsourcing of computation on private data. However, HE offers no guarantee that the returned result was honestly computed by the cloud. In order to have such guarantee, it is necessary to add verifiable computation (VC) into the system.
The most efficient recent works in VC over HE focus on verifying operations on the ciphertext space of the HE scheme, which usually lacks the algebraic structure that would make it compatible with existing VC systems. For example, multiplication of ciphertexts in the current most efficient HE schemes requires non-algebraic operations such as real division and rounding. Therefore, existing works for VC over HE have to either give up on those efficient HE schemes, or incur a large overhead (an amount of constraints proportional to the ciphertext ring's size) in order to emulate these non-algebraic operations.
In this work, we move away from that paradigm by placing the verification checks in the plaintext space of HE, all while the prover remains computing on ciphertexts. We achieve this by introducing a general transformation for Interactive Oracle Proofs (IOPs) to work over HE, whose result we denote as HE-IOPs. We apply this same transformation to the FRI [Ben-Sasson et al., ICALP 2018] IOP of proximity and we show how to compile HE-Reed Solomon-encoded IOPs and HE-$\delta$-correlated-IOPs with HE-FRI into HE-IOPs. Furthermore, our construction is compatible with a prover that provides input in zero-knowledge, and only relies on building blocks that are plausibly quantum-safe.
Aligning the security parameters of HE and FRI is a difficult task for which we introduce several optimizations. We demonstrate their efficiency with a proof-of-concept implementation and show that we can run FRI's commit phase for 4096 encrypted Reed Solomon codewords with degree bound $2^{11}$ in just 5.4 seconds (using 32 threads) on a c6i.metal instance using less than 4GB of memory. Verification takes just 12.3 milliseconds (single-threaded) for the same parameter set and can be reduced to just 5.6ms with parameters optimized for the verifier.
2024
ASIACRYPT
Ultrametric integral cryptanalysis
Abstract
A systematic method to analyze divisibility properties is proposed. In integral cryptanalysis, divisibility properties interpolate between bits that sum to zero (divisibility by two) and saturated bits (divisibility by $2^{n - 1}$ for $2^n$ inputs). From a theoretical point of view, we construct a new cryptanalytic technique that is a non-Archimedean multiplicative analogue of linear cryptanalysis. It lifts integral cryptanalysis to characteristic zero in the sense that, if all quantities are reduced modulo two, then one recovers the algebraic theory of integral cryptanalysis. The new technique leads to a theory of trails. We develop a tool based on off-the-shelf solvers that automates the analysis of these trails and use it to show that many integral distinguishers on Present and Simon are stronger than expected.
2024
ASIACRYPT
Quantum Unpredictability
Abstract
Unpredictable functions (UPFs) play essential roles in classical cryptography, including message authentication codes (MACs) and digital signatures. In this paper, we introduce a quantum analog of UPFs, which we call unpredictable state generators (UPSGs). UPSGs are implied by pseudorandom function-like states generators (PRFSs), which are a quantum analog of pseudorandom functions (PRFs), and therefore UPSGs could exist even if one-way functions do not exist, similar to other recently introduced primitives like pseudorandom state generators (PRSGs), one-way state generators (OWSGs), and EFIs. In classical cryptography, UPFs are equivalent to PRFs, but in the quantum case, the equivalence is not clear, and UPSGs could be weaker than PRFSs. Despite this, we demonstrate that all known applications of PRFSs are also achievable with UPSGs. They include IND-CPA-secure secret-key encryption and EUF-CMA-secure MACs with unclonable tags. Our findings suggest that, for many applications, quantum unpredictability, rather than quantum pseudorandomness, is sufficient.
2024
ASIACRYPT
Tiresias: Large Scale, UC-Secure Threshold Paillier
Abstract
In the threshold version of Paillier's encryption scheme, a set of parties collectively holds the secret decryption key through a secret sharing scheme.
Whenever a ciphertext is to be decrypted, the parties send their decryption shares, which are then verified for correctness and combined into the plaintext.
The scheme has been widely adopted in various applications, from secure voting to general purpose MPC protocols.
However, among the handful of existing proposals for a maliciously secure scheme, one must choose between an efficient implementation that relies on non-standard assumptions or a computationally expensive implementation that relies on widely acceptable assumptions.
In this work, we show that one can enjoy the benefits of both worlds.
Specifically, we adjust a scheme by Damgard et al. (Int. J. Inf. Secur. 2010) to get a practical distributed key generation (DKG). While the original scheme was only known to be secure under ad-hoc non-standard assumptions, we prove that the adjusted scheme is in fact secure under the decisional composite residuosity (DCR) assumption alone, required for the semantic security of the Paillier encryption scheme itself.
This is possible thanks to a novel reduction technique, from computing and proving a false decryption share, to the factoring problem. Specifically, while there may exist false decryption shares for which the zk-proof verifies with non-negligible probability, they are computationally hard to find.
Furthermore, we use similar ideas to prove that batching techniques by Aditya et al. (ACNS 2004), which allow a prover to batch several statements into a single proof, can be applied to our adjusted scheme. This enables a batched threshold Paillier decryption in the fully distributed setting for the first time.
Until now, verifying that a decryption share is correct was the bottleneck of threshold Paillier schemes and hindered real world deployments (unless one is willing to rely on a trusted dealer).
Our work amounts to shifting the bottleneck back to the plaintext reconstruction, just like in the semi-honest setting, and renders threshold Paillier practical for the first time, supporting large scale deployments.
We exemplify this shift by implementing the scheme and report our evaluation with up to 1000 parties, in the dishonest majority setting.
Over an EC2 c6i machine, we get a throughput of about 50 and 3.6 decryptions per second, when run over a network of 100 and 1000 parties, respectively.
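As background for the abstract above, the following is a minimal sketch of plain, single-party Paillier encryption, not the threshold scheme or the DKG of the paper. The primes are tiny toy values chosen for illustration, and all names are assumptions of this sketch. It shows the decryption map that threshold variants distribute among parties, and the additive homomorphism that makes the scheme useful in voting and MPC.

```python
import math

# Toy parameters for illustration only; real deployments use very large primes.
P, Q = 293, 433
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)                               # lambda^{-1} mod n (Python 3.8+)


def encrypt(m: int, r: int) -> int:
    """Enc(m; r) = (1 + n)^m * r^n mod n^2, with r coprime to n."""
    assert math.gcd(r, N) == 1
    return (pow(1 + N, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Dec(c) = L(c^lambda mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    u = pow(c, LAM, N2)
    return ((u - 1) // N * MU) % N


c1 = encrypt(41, 17)
c2 = encrypt(1, 23)
# Multiplying ciphertexts adds the underlying plaintexts modulo n.
print(decrypt(c1), decrypt((c1 * c2) % N2))
```

In a threshold deployment no single party holds the decryption exponent; each party instead contributes a decryption share together with a proof of correctness, which is the step the abstract identifies as the former bottleneck.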
2024
ASIACRYPT
Practical Blind Signatures in Pairing-Free Groups
Abstract
Blind signatures have garnered significant attention in recent years, with several efficient constructions in the random oracle model relying on well-understood assumptions. However, this progress does not apply to pairing-free cyclic groups: fully secure constructions over cyclic groups rely on pairings, remain inefficient, or depend on the algebraic group model or strong interactive assumptions. To address this gap, Chairattana-Apirom, Tessaro, and Zhu (CTZ, Crypto 2024) proposed a new scheme based on the CDH assumption. Unfortunately, their construction results in large signatures and high communication complexity.
In this work, we propose a new blind signature construction in the random oracle model that significantly improves upon the CTZ scheme. Compared to CTZ, our scheme reduces communication complexity by a factor of more than 10 and decreases the signature size by a factor of more than 45, achieving a compact signature size of only 224 bytes. The security of our scheme is based on the DDH assumption over pairing-free cyclic groups, and we show how to generalize it to the partially blind setting.
2024
ASIACRYPT
ZKFault: Fault attack analysis on zero-knowledge-based post-quantum digital signature schemes
Abstract
Computationally hard problems based on coding theory, such as the syndrome decoding problem, have been used for constructing secure cryptographic schemes for a long time. Schemes based on these problems are also assumed to be secure against quantum computers. However, these schemes are often considered impractical for real-world deployment due to large key sizes and inefficient computation time. In the recent call for standardization of additional post-quantum digital signatures by the National Institute of Standards and Technology, several code-based candidates have been proposed, including LESS, CROSS, and MEDS. These schemes are designed on the relatively new zero-knowledge framework. Although several works analyze the hardness of these schemes, there is hardly any work that examines the security of these schemes in the presence of physical attacks.
In this work, we analyze these signature schemes from the perspective of fault attacks. All these schemes use a similar tree-based construction to compress the signature size. We attack this component of these schemes. Therefore, our attack is applicable to all of these schemes. In this work, we first analyze the LESS signature scheme and devise our attack. Furthermore, we show how this attack can be extended to the CROSS signature scheme. Our attacks are built on very simple fault assumptions. Our results show that we can recover the entire secret key of LESS and CROSS using as little as a single fault. Finally, we propose various countermeasures to prevent these kinds of attacks and discuss their efficiency and shortcomings.
2024
ASIACRYPT
Tightly-Secure Group Key Exchange with Perfect Forward Secrecy
Abstract
In this work, we present a new paradigm for constructing Group Authenticated Key Exchange (GAKE). This result is the first tightly secure GAKE scheme in a strong security model that allows maximum exposure attacks (MEX) where the attacker is allowed to either reveal the secret session state or the long-term secret of all communication partners. Moreover, our protocol features the strong and realistic notion of (full) perfect forward secrecy (PFS), that allows the attacker to actively modify messages before corrupting parties. We obtain our results via a series of tightly secure transformations. Our first transformation is from weakly secure KEMs to unilateral authenticated key exchange (UAKE) with weak forward secrecy (WFS). Next, we show how to turn this into a UAKE with PFS in the random oracle model. Finally, and as one of our major novel conceptual contributions, we describe how to build GAKE protocols from UAKE protocols, also in the random oracle model. We apply our transformations to obtain two practical GAKE protocols with tight security. The first is based on the DDH assumption and features low message complexity. Our second result is based on the LWE assumption. In this way, we obtain the first GAKE protocol from a post-quantum assumption that is tightly secure in a strong model of security allowing MEX attacks.
2024
ASIACRYPT
CPA-secure KEMs are also sufficient for Post-Quantum TLS 1.3
Abstract
In the post-quantum migration of TLS 1.3, an ephemeral Diffie-Hellman must be replaced with a post-quantum key encapsulation mechanism (KEM). At EUROCRYPT 2022, Huguenin-Dumittan and Vaudenay [HV22] demonstrated that KEMs with standard CPA security are sufficient for the security of the TLS 1.3 handshake. However, their result is only proven in the random oracle model (ROM), and as the authors comment, their reduction is very much non-tight and not sufficient to guarantee security in practice due to the $O(q^6)$-loss, where $q$ is the number of the adversary’s queries to random oracles. Moreover, in order to analyze the post-quantum security of the TLS 1.3 handshake with a KEM, it is necessary to consider the security in the quantum ROM (QROM). Therefore, they leave the tightness improvement of their ROM proof and the QROM proof of such a result as an interesting open question.
In this paper, we resolve this problem. We improve the ROM proof in [HV22] from an $O(q^6)$-loss to an $O(q)$-loss with standard CPA-secure KEMs which can be directly obtained from the underlying public-key encryption (PKE) scheme in CRYSTALS-Kyber. Moreover, we show that if the KEMs are constructed from rigid deterministic public-key encryption (PKE) schemes such as the ones in Classic McEliece and NTRU, this $O(q)$-loss can be further improved to an $O(1)$-loss. Hence, our reductions are sufficient to guarantee security in practice. According to our results, a CPA-secure KEM (which is more concise and efficient than the currently used CCA/1CCA-secure KEM) can be directly employed to construct a post-quantum TLS 1.3. Furthermore, we lift our ROM result into the QROM and give the first proof that CPA-secure KEMs are also sufficient for the post-quantum TLS 1.3 handshake. In particular, the techniques introduced to improve reduction tightness in this paper may be of independent interest.
2024
TCHES
Optimized Software Implementation of Keccak, Kyber, and Dilithium on RV{32,64}IM{B}{V}
Abstract
With the standardization of NIST post-quantum cryptographic (PQC) schemes, optimizing these PQC schemes across various platforms presents significant research value. While most existing software implementation efforts have concentrated on ARM platforms, research on PQC implementations utilizing various RISC-V instruction set architectures (ISAs) remains limited. In light of this gap, this paper proposes comprehensive and efficient optimizations of Keccak, Kyber, and Dilithium on RV{32,64}IM{B}{V}. We thoroughly optimize these implementations for dual-issue CPUs, believing that our work on various RISC-V ISAs will provide valuable insights for future PQC deployments.
Specifically, for Keccak, we revisit a range of optimization techniques, including bit interleaving, lane complementing, in-place processing, and hybrid vector/scalar implementations. We construct an optimal combination of methods aimed at achieving peak performance on dual-issue CPUs for various RISC-V ISAs. For the NTT implementations of Kyber and Dilithium, we deliver optimized solutions based on Plantard and Montgomery arithmetic for diverse RISC-V ISAs, incorporating extensive dual-issue enhancements. Additionally, we improve the signed Plantard multiplication algorithm proposed by Akoi et al. Ultimately, our testing demonstrates that our implementations of Keccak and NTT across various ISAs achieve new performance records. More importantly, they significantly enrich the PQC software ecosystem for RISC-V.
2024
TCHES
Efficient and Composable Masked AES S-Box Designs Using Optimized Inverters
Abstract
Hardware implementations of cryptographic algorithms are susceptible to power analysis attacks, allowing attackers to break the otherwise strong security guarantees. A theoretically sound countermeasure against such attacks is masking, where all key- and data-dependent intermediate values in the computation are split into so-called shares, requiring an attacker to learn all of them before recovering the secret key. Masking a cryptographic hardware design against power analysis attacks incurs large area and latency overheads due to their nonlinear components, especially when implemented using composable masking schemes.
These overheads disproportionately affect ciphers with highly nonlinear monolithic S-Boxes like the Advanced Encryption Standard (AES). The masking of the AES S-Box is well studied, and most implementations use Canright’s $\mathbb{F}_{2^8}$ inverter design that decomposes operations in a larger field into a combination of multiplications, additions and inversions in a smaller field. While remarkable, Canright’s inverter design has a sub-optimal multiplicative depth, and can thus not take full advantage of recent developments in low-latency composable masking schemes.
In this paper, we present an $\mathbb{F}_{2^8}$ inverter that achieves the optimal multiplicative depth of three, and use it to construct a more efficient trivially composable masked implementation of the AES S-Box. Moreover, we present HPC3.1, a better low-latency multiplication gadget that works in all finite fields $\mathbb{F}_{p^n}$, and a randomness reuse strategy for both HPC1 and HPC3.1 gadgets that preserves side-channel security. Orthogonally, we also propose an improved bit-level implementation of the $\mathbb{F}_{2^4}$ inverter for more efficient masked S-Box designs based on Canright’s original $\mathbb{F}_{2^8}$ inverter.
We develop, functionally test, and formally verify the trivially composable side-channel security of all masked AES S-Box designs. Our evaluation shows that the designs outperform or match the state-of-the-art in terms of latency, randomness use and area cost.
2024
TCHES
Random Probing Security with Precomputation
Abstract
At Eurocrypt 2014, Duc, Dziembowski and Faust proposed the random probing model to bridge the gap between the probing model proposed at Crypto 2003 and the noisy model proposed at Eurocrypt 2013. Compared with the probing model whose noise in the leakages should (linearly) increase with the number of shares, the random probing model allows each variable to leak its value with a probability p, which reflects the physical reality of side channels much better. At Crypto 2020, Belaïd et al. proposed the Random Probing Expandability (RPE) security ensuring the random probing security for arbitrary order masking algorithms with constant leakage probability. However, the complexity of existing RPE algorithms is much higher than that of the probing secure algorithms, which limits their practical use. In this paper, we investigate the random probing security with precomputation, where a masked cryptographic implementation can be divided into two phases. The first phase, called preprocessing, takes random bits and returns a number of precomputed values. The second phase, called online computation, takes input (e.g., plaintext and shares of secret) and precomputed values to calculate output (e.g., ciphertext) efficiently. We describe a random probing secure precomputable scheme, which transforms an arbitrary circuit compiler with tolerant leakage probability p into a precomputable one by adding a public (but random) share that is calculated in the online phase and the tolerant leakage probability of the new compiler is $\min\{p, 2^{-5.01}\}$. Then, we apply the new scheme to the bitsliced AES. Notably, the implementation under ARM Cortex M architecture shows that the performance of the online phase is significantly improved and even comparable to masking schemes only secure in the probing model.
2024
TCHES
Trace Copilot: Automatically Locating Cryptographic Operations in Side-Channel Traces by Firmware Binary Instrumenting
Abstract
A common assumption in side-channel analysis is that the attacker knows the cryptographic algorithm implementation of the victim. However, many lab-setting studies implicitly extend this assumption to the knowledge of the source code, by inserting triggers to measure, locate or align the Cryptographic Operations (CO) in the trace. For real-world attacks, the source code is typically unavailable, which poses a challenge for locating the COs thus reducing the effectiveness of many methods. In contrast, obtaining the (partial) binary firmware is more prevalent in practical attacks on embedded devices. While binary code theoretically encapsulates necessary information for side-channel attacks on software-implemented cryptographic algorithms, there is no systematic study on leveraging this information to facilitate side-channel analysis. This paper introduces a novel and general framework that utilizes binary information for the automated locating of COs on side-channel traces. We first present a mechanism that maps the execution flow of binary instructions onto the corresponding side-channel trace through a tailored static binary instrumentation process, thereby transforming the challenge of locating COs into one of tracing cryptographic code execution within the binary. For the latter, we propose a method to retrieve binary instruction addresses that are equivalent to the segmenting boundaries of the COs within side-channel traces. By identifying the mapping points of these instructions on the trace, we can obtain accurate segmentation labeling for the side-channel data. Further, by employing the well-labeled side-channel segments obtained on a profiling device, we can readily identify the locations of COs within traces collected from uncontrollable target devices. We evaluate our approach on various devices and cryptographic software, including a real-world secure boot program. The results demonstrate the effectiveness of our method, which can automatically locate typical COs, such as AES or ECDSA, in raw traces using only the binary firmware and a profiling device. Comparison experiments indicate that our method outperforms existing techniques in handling noisy or jittery traces and scales better to complex COs. Performance evaluation confirms that the runtime and storage overheads of the proposed approach are practical for real-world deployment.
2024
TCHES
Multiplying Polynomials without Powerful Multiplication Instructions
Abstract
We improve the performance of the lattice-based cryptosystem Dilithium on Cortex-M3, where multiplications are expensive. Our contribution is two-fold: (i) We generalize Barrett multiplication and show that the resulting shape-independent modular multiplication performs comparably to long multiplication on some platforms without special hardware when precomputation is free. We call a modular multiplication “shape-independent” if its correctness and efficiency depend only on the magnitude of moduli and not the shapes of the moduli. This was unknown in the literature even though modular multiplication has been studied for more than 40 years. In the literature, shape-independent modular multiplications often perform several times slower than long multiplications even if we ignore the cost of the precomputation. (ii) We show that polynomial multiplications based on Nussbaumer fast Fourier transform and Toom–Cook over $\mathbb{Z}_{2^k}$ perform the best when modular multiplications are expensive and k is not very close to the arithmetic precision.
For practical evaluation, we implement assembly programs for the polynomial arithmetic used in the digital signature Dilithium on Cortex-M3. For the modular multiplications in Dilithium, our generalized Barrett multiplications are 1.92 times faster than the state-of-the-art assembly-optimized Montgomery multiplications, leading to 1.38−1.51 times faster Dilithium NTT/iNTT. Along with the improvement in accumulating products, the core polynomial arithmetic matrix-vector multiplications are 1.71−1.77 times faster. We further apply the FFT-based polynomial multiplications over $\mathbb{Z}_{2^k}$ to the challenge polynomial multiplication $ct_0$, leading to 1.31 times faster computation for $ct_0$.
We additionally apply the ideas to Saber on Cortex-M3 and demonstrate their improvement to Dilithium and Saber on our 8-bit AVR environment. For Saber on Cortex-M3, we show that matrix-vector multiplications with FFT-based polynomial multiplications over $\mathbb{Z}_{2^k}$ are 1.42−1.46 times faster than the ones with NTT-based polynomial multiplications over NTT-friendly coefficient rings. When moving to a platform with smaller arithmetic precision, such as 8-bit AVR, we improve the matrix-vector multiplication of Dilithium with our Barrett-based NTT/iNTT by a factor of 1.87−1.89. As for Saber on our 8-bit AVR environment, we show that matrix-vector multiplications with NTT-based polynomial multiplications over NTT-friendly coefficient rings are faster than polynomial multiplications over $\mathbb{Z}_{2^k}$ due to the large k in Saber.
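For readers unfamiliar with the starting point of the abstract above, here is a minimal sketch of classic Barrett reduction for the Dilithium modulus q = 8380417. It is the textbook technique, not the generalized Barrett multiplication of the paper, and the function names are assumptions of this sketch. The point it shows is "shape independence": only the size of q enters the precomputation, not any special form of q. Python integers are arbitrary precision, so the sketch says nothing about cost on Cortex-M3 or AVR.

```python
Q = 8380417                 # Dilithium modulus, 2^23 - 2^13 + 1
K = Q.bit_length()          # 23, so Q < 2^K
M = (1 << (2 * K)) // Q     # precomputed floor(4^K / Q)


def barrett_reduce(x: int) -> int:
    """Reduce 0 <= x < 4^K modulo Q using a multiply and a shift."""
    t = (x * M) >> (2 * K)  # estimate of floor(x / Q), too small by at most 1
    r = x - t * Q           # hence 0 <= r < 2Q
    return r - Q if r >= Q else r


def barrett_mul(a: int, b: int) -> int:
    """Modular product of a, b in [0, Q); a * b < Q^2 < 4^K is in range."""
    return barrett_reduce(a * b)


print(barrett_mul(Q - 1, Q - 1))  # (-1) * (-1) mod Q
```

The single conditional subtraction suffices because the quotient estimate `t` never exceeds the true quotient and undershoots it by at most one when x < 4^K.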
2024
TCHES
FalconSign: An Efficient and High-Throughput Hardware Architecture for Falcon Signature Generation
Abstract
Falcon is a lattice-based quantum-resistant digital signature scheme renowned for its high signature generation/verification speed and compact signature size. The scheme has been selected to be drafted in the third round of the post-quantum cryptography (PQC) standardization process due to its unique attributes and robust security features. Despite its strengths, there has been a lack of research on hardware acceleration, primarily due to its complex calculation flow and floating-point operations, which hinders its widespread adoption. To address this issue, we propose FalconSign, a high-performance, configurable crypto-processor designed to accelerate Falcon signature generation on FPGA/ASIC through algorithm-hardware co-design. Our approach involves a new scheduling flow and architecture for Fast-Fourier Sampling to enhance computing unit reuse and reduce processing time. Additionally, we introduce several optimized modules, including configurable randomness generation units, parallel floating-point processing units, and an optimized SamplerZ module, to improve execution efficiency. Furthermore, this paper presents a finely optimized hardware accelerator for the Falcon scheme. Our FPGA implementation results demonstrate a throughput improvement of approximately 5.1x compared to state-of-the-art designs, with 2.8x/4.5x/4.2x/3.2x fewer in the area (LUTs/FFs/DSPs/BRAMs)-time product, for NIST security level V. The crypto-processor occupies an area of 0.71 mm² and achieves 5.2k OPS at throughput on the TSMC 28nm process for NIST security level I.
2024
TCHES
Call Rewinding: Efficient Backward Edge Protection
Abstract
The prevalence of memory-unsafe software prompts significant efforts by the research community to mitigate memory corruption bugs. This endeavor is crucial for safeguarding critical systems against security threats. Specifically, there is a focus on protecting against code-reuse attacks by enforcing control-flow integrity (CFI). This paper introduces call rewinding, a novel microarchitecture-level mechanism for protection of return addresses. It is based on a property of the calling convention that is common to major architectures such as x86, ARM and RISC-V, which states that all return instructions transfer control to a valid call site. Call rewinding consists of jumping to the instruction preceding the return target for each return instruction and checking if the instruction at this address is a call or not. On systems equipped with return address prediction, a commonly employed optimization, the security check is performed only on mispredicted return addresses. The proposed protection mechanism demonstrates negligible impact on both area and performance. We implement call rewinding on the CV64A6, a RISC-V CPU with consequent branch prediction support. Our evaluation validates the effectiveness of call rewinding, both in bare-metal and in a Linux operating system (OS) environment. It triggers no false positives in bare-metal and is functional with the OS extended with a custom exception handler. Furthermore, our findings indicate that call rewinding successfully detects unauthorized return addresses, highlighting its potential as a reliable and efficient security mechanism.
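The check described in the abstract above can be modelled in a few lines. The sketch below is a software toy, not the paper's microarchitectural implementation: it assumes fixed-width 4-byte instructions (real RISC-V cores with compressed instructions need more care), represents a program as a hypothetical address-to-mnemonic dictionary, and uses illustrative function names.

```python
INSN_SIZE = 4  # assumption: fixed-width instructions, no compressed encodings

# Hypothetical program image: address -> mnemonic.
PROGRAM = {
    0x100: "call foo",
    0x104: "add a0, a0, a1",   # legitimate return site (follows a call)
    0x108: "ret",
    0x200: "li a0, 1",
    0x204: "ecall",            # gadget-like target, not preceded by a call
}


def is_valid_return_target(program: dict, target: int) -> bool:
    """Rewind one instruction from the return target and test for a call."""
    prev = program.get(target - INSN_SIZE)
    return prev is not None and prev.startswith("call")


def check_return(program: dict, predicted: int, actual: int) -> bool:
    """Run the rewinding check only when the return prediction was wrong."""
    if predicted == actual:
        return True  # correctly predicted return: no extra check performed
    return is_valid_return_target(program, actual)


print(check_return(PROGRAM, 0x104, 0x104))  # predicted return, accepted
print(check_return(PROGRAM, 0x104, 0x204))  # hijacked return, rejected
```

Skipping the check on correct predictions mirrors the statement in the abstract that the security check is performed only on mispredicted return addresses.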
2024
TCHES
OT-PCA: New Key-Recovery Plaintext-Checking Oracle Based Side-Channel Attacks on HQC with Offline Templates
Abstract
In this paper, we introduce OT-PCA, a novel approach for conducting Plaintext-Checking (PC) oracle based side-channel attacks, specifically designed for Hamming Quasi-Cyclic (HQC). By calling the publicly accessible HQC decoder, we build offline templates that enable efficient extraction of soft information for hundreds of secret positions with just a single PC oracle call. Our method addresses critical challenges in optimizing key-related information extraction, including maximizing decryption output entropy and ensuring error pattern independence, through the use of genetic-style algorithms.
Extensive simulations demonstrate that our new attack method significantly reduces the required number of oracle calls, achieving a 2.4-fold decrease for hqc-128 and even greater reductions for hqc-192 and hqc-256 compared to current state-of-the-art methods. Notably, the attack shows strong resilience against inaccuracy in the PC oracle: when the oracle accuracy decreases to 95%, the reduction factor in oracle call requirements increases to 7.6 for hqc-128.
Lastly, a real-world evaluation conducted using power analysis on a platform with an ARM Cortex-M4 microcontroller validates the practical applicability and effectiveness of our approach.
2024
TCHES
A High-performance NTT/MSM Accelerator for Zero-knowledge Proof Using Load-balanced Fully-pipelined Montgomery Multiplier
Abstract
Zero-knowledge proof (ZKP) is an attractive cryptographic paradigm that allows a party to prove the correctness of a given statement without revealing any additional information. It offers both computation integrity and privacy, witnessing many celebrated deployments, such as computation outsourcing and cryptocurrencies. Recent general-purpose ZKP schemes, e.g., zero-knowledge succinct non-interactive argument of knowledge (zk-SNARK), suffer from time-consuming proof generation, which is mainly bottlenecked by the large-scale number theoretic transformation (NTT) and multi-scalar point multiplication (MSM). To boost its wide application, great interest has been shown in expediting the proof generation on various platforms like GPU, FPGA and ASIC.So far as we know, current works on the hardware designs for ZKP employ two separated data-paths for NTT and MSM, overlooking the potential of resource reusage. In this work, we particularly explore the feasibility and profit of implementing both NTT and MSM with a unified and high-performance hardware architecture. For the crucial operator design, we propose a dual-precision, load-balanced and fully-pipelined Montgomery multiplier (LBFP MM) by introducing the new mixed-radix technique and improving the prior quotient-decoupled strategy. Collectively, we also integrate orthogonal ideas to further enhance the performance of LBFP MM, including the customized constant multiplication, truncated LSB/MSB multiplication/addition and Karatsuba technique. On top of that, we present the unified, scalable and highperformance hardware architecture that conducts both NTT and MSM in a versatile pipelined execution mechanism, intensively sharing the common computation and memory resource. 
The proposed accelerator manages to overlap the on-chip memory computation with off-chip memory access, considerably reducing the overall cycle counts for NTT and MSM. We showcase the implementation of the modular multiplier and the overall architecture on the BLS12-381 elliptic curve for zk-SNARK. Extensive experiments are carried out under TSMC 28nm synthesis and a similar simulation setup, which demonstrate impressive improvements: (1) the proposed LBFP MM obtains 1.8x speed-up and 1.3x less area cost versus the state-of-the-art design; (2) the unified accelerator achieves 12.1x and 5.8x acceleration for NTT and MSM while also consuming 4.3x lower overall on-chip area overhead, when compared to the most related and advanced work PipeZK.
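For readers unfamiliar with the operator this abstract builds on, the following is a minimal software sketch of textbook Montgomery multiplication (REDC). It is not the LBFP MM design from the paper: the mixed-radix, quotient-decoupled and pipelining techniques are hardware-level refinements not shown here. The function names, the word size `k` and the example modulus are illustrative assumptions.

```python
# Textbook Montgomery multiplication; all names here are hypothetical.
# Requires Python 3.8+ for pow(n, -1, r).

def montgomery_setup(n: int, k: int):
    """Precompute constants for an odd modulus n < 2**k, with R = 2**k."""
    assert n % 2 == 1 and 0 < n < (1 << k)
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r   # n * n_prime == -1 (mod R)
    r2 = (r * r) % n                 # R^2 mod n, used to enter Montgomery form
    return r, n_prime, r2

def redc(t: int, n: int, k: int, n_prime: int) -> int:
    """Return t * R^{-1} mod n, for 0 <= t < n * R, using only shifts and masks by R."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask  # m = (t mod R) * n' mod R
    u = (t + m * n) >> k               # t + m*n is divisible by R by construction
    return u - n if u >= n else u      # u < 2n, so one conditional subtraction suffices

def to_mont(a: int, n: int, k: int, n_prime: int, r2: int) -> int:
    """Map a (with 0 <= a < n) to its Montgomery form a*R mod n."""
    return redc(a * r2, n, k, n_prime)

def from_mont(a_bar: int, n: int, k: int, n_prime: int) -> int:
    """Map a Montgomery-form value back to the ordinary residue."""
    return redc(a_bar, n, k, n_prime)

def mont_mul(a_bar: int, b_bar: int, n: int, k: int, n_prime: int) -> int:
    """Multiply two Montgomery-form values; the result stays in Montgomery form."""
    return redc(a_bar * b_bar, n, k, n_prime)

if __name__ == "__main__":
    n, k = (1 << 61) - 1, 64           # illustrative odd modulus, not BLS12-381
    r, n_prime, r2 = montgomery_setup(n, k)
    x = to_mont(123456789, n, k, n_prime, r2)
    y = to_mont(987654321, n, k, n_prime, r2)
    print(from_mont(mont_mul(x, y, n, k, n_prime), n, k, n_prime))
```

The point of the representation is that reduction modulo `n` is replaced by masking and shifting by a power of two, which is what makes the operation attractive for pipelined hardware.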
2024
TCHES
GPU Acceleration for FHEW/TFHE Bootstrapping
Abstract
Fully Homomorphic Encryption (FHE) allows computations to be performed directly on encrypted data without decryption. Despite its great theoretical potential, the computational overhead remains a major obstacle for practical applications. To address this challenge, hardware acceleration has emerged as a promising approach, aiming to achieve real-time computation across a wider range of scenarios. In line with this, our research focuses on designing and implementing a Graphics Processing Unit (GPU)-based accelerator for the third generation FHEW/TFHE bootstrapping scheme, which features smaller parameters and bootstrapping keys particularly suitable for GPU architectures compared to the other generations. In summary, our accelerator offers improved efficiency, scalability, and flexibility for extensions, e.g., functional bootstrapping (Liu et al., Asiacrypt 2022), compared to current state-of-the-art solutions. We evaluate our implementation and demonstrate substantial speedup: in the single-GPU setting, our bootstrapping achieves an 18x - 20x speedup compared to a 64-thread server-class CPU; by using 8 GPUs, the throughput can be further improved by 7x compared to the single-GPU implementation, confirming the scalability of our design. Furthermore, compared to the SoTA GPU solution TFHE-rs, we achieve a maximum speedup of 1.69x in AND gate evaluation.
Finally, we benchmark several private machine learning applications, showing real-time solutions for (1) encrypted neural network inference for MNIST in 0.04 seconds per image, which is the fastest implementation to our knowledge, and (2) private decision trees in 0.38 seconds for the Iris dataset, whereas a prior 16-core CPU implementation (Lu et al., IEEE S&P 2021) required 1.87 seconds. These results highlight the effectiveness and efficiency of our GPU acceleration in real-world applications. As a technical highlight, we design a novel parallelization strategy tailored for FHEW/TFHE bootstrapping, allowing an automated optimization that partitions bootstrapping into multiple GPU thread blocks. This is necessary for FHEW/TFHE bootstrapping with scalable parameters, where the whole bootstrapping process may not fit into a single thread block. With this, our accelerator can support a broader range of parameters, making it ideal for upcoming privacy-preserving applications.
2024
TCHES
Extending Randomness-Free First-Order Masking Schemes and Applications to Masking-Friendly S-boxes
Abstract
Masking has emerged as a widely adopted countermeasure against side-channel attacks. However, the implementation of masking schemes faces several challenges, including hardware area, latency and the overhead associated with fresh randomness generation. To eliminate the implementation cost caused by fresh randomness, Shahmirzadi et al. introduced a methodology for constructing 2-share first-order masking schemes without randomness at CHES 2021. In this work, we extend Shahmirzadi et al.’s method to find masked implementations for more S-boxes and further reduce the hardware overhead. We propose the concept of a non-linear compression layer, a comprehensive share assignment strategy based on a linear compression layer, and corresponding optimization techniques. Based on these techniques, we construct the first randomness-free first-order masking schemes for the PRINCE S-box and its inverse, reduce the hardware overhead of masking schemes for multiple S-boxes, and design new masking-friendly S-boxes. Particularly for the SKINNY S-box, the reduction is 21% and 15% in area and power consumption, respectively. To validate the security of masked implementations, we not only employ the automated tools SILVER and PROLEAD but also conduct FPGA-based experiments.
2024
TCHES
Full Key-Recovery Cubic-Time Template Attack on Classic McEliece Decapsulation
Abstract
Classic McEliece is one of the three code-based candidates in the fourth round of the NIST post-quantum cryptography standardization process in the Key Encapsulation Mechanism category. As such, its decapsulation algorithm is used to recover the session key associated with a ciphertext using the private key. In this article, we propose a new side-channel attack on the syndrome computation in the decapsulation algorithm that recovers the private key, which consists of the private Goppa polynomial g and the permuted support L. The attack relies on both practical aspects and theoretical contributions, namely that the side-channel distinguisher can accurately discriminate elements of the permuted support L, while relying only on a standard noisy Hamming weight leakage assumption and that there exists a cubic-time algorithm that uses this information to recover the private Goppa polynomial g. Compared with previous work targeting the Classic McEliece private key, this drastically improves both on the assumptions made in the attacker model and on the overall efficiency of the key-recovery algorithm. We have carried out the attack in practice on a microcontroller target running the reference implementation of Classic McEliece, and make the full attack source code available.
2024
TCHES
PhaseSCA: Exploiting Phase-Modulated Emanations in Side Channels
Abstract
In recent years, the limits of electromagnetic side-channel attacks have been significantly expanded. However, while there is a growing literature on increasing attack distance or performance, the discovery of new phenomena about compromising electromagnetic emanations remains limited. In this work, we identify a novel form of modulation produced by unintentional electromagnetic emanations: phase-modulated emanations. This observation allows us to extract a side-channel leakage that can be exploited to reveal secret cryptographic material. We introduce a technique allowing us to exploit this side-channel in order to perform a full AES key recovery, using cheap and common hardware equipment like a software-defined radio (SDR). Moreover, we demonstrate that the exploitation of this new phase leakage can be combined with traditional amplitude leakage to significantly increase attack performance. While investigating the underlying phenomenon causing this unintentional modulation, we identified several prior works that have approached similar exploitation, without being aware of each other. Creating a bridge between older and recent work, we unveil the relationship between digital jitter and signal phase shift in the context of side-channel attacks and fill the gap between prior works from various research fields.
2024
TCHES
Dash: Accelerating Distributed Private Convolutional Neural Network Inference with Arithmetic Garbled Circuits
Abstract
The adoption of machine learning solutions is rapidly increasing across all parts of society. As the models grow larger, both training and inference of machine learning models are increasingly outsourced, e.g. to cloud service providers. This means that potentially sensitive data is processed on untrusted platforms, which bears inherent data security and privacy risks. In this work, we investigate how to protect distributed machine learning systems, focusing on deep convolutional neural networks. The most common and best-performing mixed MPC approaches are based on HE, secret sharing, and garbled circuits. They commonly suffer from large performance overheads, big accuracy losses, and communication overheads that grow linearly in the depth of the neural network. To improve on these problems, we present Dash, a fast and distributed private convolutional neural network inference scheme secure against malicious attackers. Building on arithmetic garbling gadgets [BMR16] and fancy-garbling [BCM+19], Dash is based purely on arithmetic garbled circuits. We introduce LabelTensors that allow us to leverage the massive parallelism of modern GPUs. Combined with state-of-the-art garbling optimizations, Dash outperforms previous garbling approaches by up to a factor of about 100. Furthermore, we introduce an efficient scaling operation over the residues of the Chinese remainder theorem representation to arithmetic garbled circuits, which allows us to garble larger networks and achieve much higher accuracy than previous approaches. Finally, Dash requires only a single communication round per inference step, regardless of the depth of the neural network, and a very small constant online communication volume.
2024
TCHES
Perceived Information Revisited II: Information-Theoretical Analysis of Deep-Learning Based Side-Channel Attacks
Abstract
Previous studies on deep-learning-based side-channel attacks (DL-SCAs) have shown that traditional performance evaluation metrics commonly used in DL, like accuracy and F1 score, are not effective in evaluating DL-SCA performance. Therefore, some previous studies have proposed new alternative metrics for evaluating the performance of DL-SCAs. Notably, perceived information (PI) and effective perceived information (EPI) are major metrics based on information theory. While it has been experimentally confirmed that these metrics can give the attack success rate (SR) for DL-SCAs, their theoretical validity remains unclear.
In this paper, we propose a new theoretically valid performance evaluation metric called latent perceived information (LPI), which serves as an alternative to the existing metrics. LPI is defined as the mutual information between the output of the feature extractor of a neural network (NN) model and the intermediate value, representing the potential attack performance of the trained model. First, we prove that LPI provides an upper bound on the SR of a DL-SCA by modeling and formulating DL-SCA as a communication channel. Additionally, we clarify the conditions under which PI and EPI theoretically provide an upper bound on the SR from the perspective of LPI. For practical computation of LPI, we present two methods. One utilizes the Kraskov (KSG) estimator, a common mutual information estimator, and the other is based on logistic regression. While the KSG estimator is computationally intensive, it yields accurate LPI values. In contrast, the logistic regression is faster but provides a lower bound for LPI. Through experimental attacks on AES software and hardware implementations with masking countermeasures, we demonstrate that the LPI values estimated by these two methods are significantly similar, indicating the reliability and soundness of our proposed estimation techniques. Furthermore, we show that, by using the logistic regression as a classifier, we can significantly improve the attack performance of the trained model when the difference between the SR upper bound by the LPI and its actual SR is large. This indicates that LPI represents the potential for performance improvement in the trained model. Therefore, our study contributes to optimizing the distinguisher for attack performance using the trained model.
2024
TCHES
Blind-Folded: Simple Power Analysis Attacks using Data with a Single Trace and no Training
Abstract
Side-Channel Attacks target the recovery of key material in cryptographic implementations by measuring physical quantities such as power consumption during the execution of a program. Simple Power Attacks consist in deducing secret information from a trace using a single or a few samples, as opposed to differential attacks which require many traces. Software cryptographic implementations usually contain a data-independent execution path, but often do not consider variations in power consumption associated with data. In this work, we show that a technique commonly used to select a value from different possible values in a control-independent way leads to significant power differences depending on the value selected. This difference is actually so important that a single sample can be considered for attacking one condition, and no training on other traces is required. We exploit this finding to propose a single-trace attack without any knowledge gained on previous executions, using trace folding. We target the two modular exponentiation implementations in Libgcrypt, getting respectively 100% and 99.998% correct bits on average on 30 executions using 2,048-bit exponents. We also use this technique to attack the scalar multiplication in ECDSA, successfully recovering all secret nonces on 1,000 executions. Finally, the insights we gained from this work allow us to show that a proposed countermeasure from the literature for performing the safe loading of precomputed operands in the context of windowed implementations can be attacked as well.
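The abstract does not spell out the selection technique, so the sketch below assumes the widespread mask-based branchless select as an illustration. It shows why such code can be control-independent yet data-dependent: the selection mask is either all zeros or all ones, so its Hamming weight differs by the full word width. The 64-bit width and all function names are assumptions, and Python integers are only a model of what a CPU register would hold.

```python
# Illustrative model of a branchless (mask-based) select on 64-bit words.
# This is an assumed example of the kind of technique the abstract refers to.

MASK64 = (1 << 64) - 1

def select_mask(bit: int) -> int:
    """Expand a secret bit into a full-width mask: 0 -> 0x00..00, 1 -> 0xFF..FF."""
    return (-bit) & MASK64

def ct_select(bit: int, a: int, b: int) -> int:
    """Return a if bit == 1 else b, with no branch on the secret bit."""
    mask = select_mask(bit)
    return (a & mask) | (b & ~mask & MASK64)

def hamming_weight(x: int) -> int:
    """Number of set bits, a common first-order model of power leakage."""
    return bin(x).count("1")

if __name__ == "__main__":
    for bit in (0, 1):
        # The control flow is identical for both bits, but the mask weight is 0 or 64.
        print(bit, hex(ct_select(bit, 0xAAAA, 0x5555)), hamming_weight(select_mask(bit)))
```

Under a Hamming weight leakage model, the two mask values sit at opposite extremes, which is consistent with the abstract's claim that a single sample per condition can suffice.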
2024
TCHES
Optimized One-Dimensional SQIsign Verification on Intel and Cortex-M4
Abstract
SQIsign is a well-known post-quantum signature scheme due to its small combined signature and public-key size. However, SQIsign suffers from notably long signing times, and verification times are not short either. To improve this, recent research has explored both one-dimensional and two-dimensional variants of SQIsign, each with distinct characteristics. In particular, SQIsign2D’s efficient signing and verification times have made it a focal point of recent research. However, the absence of an optimized one-dimensional verification implementation hampers a thorough comparison between these different variants. This work bridges this gap in the literature: we provide a state-of-the-art implementation of one-dimensional SQIsign verification, including novel optimizations. We report a record-breaking one-dimensional SQIsign verification time of 8.55 Mcycles on a Raptor Lake Intel processor, closely matching SQIsign2D on the same processor. For uncompressed signatures, the signature size doubles and we verify in only 5.6 Mcycles. Taking advantage of the inherent parallelism available in isogeny computations, we present 5-core variants that can go as low as 1.3 Mcycles. Furthermore, we present the first implementation that supports both 32-bit and 64-bit processors. It includes optimized assembly code for the Cortex-M4 and has been integrated with the pqm4 project. Our results motivate further research into one-dimensional SQIsign, as it boasts unique features among isogeny-based schemes.
2024
TCHES
Prover - Toward More Efficient Formal Verification of Masking in Probing Model
Abstract
In recent years, formal verification has emerged as a crucial method for assessing security against Side-Channel attacks of masked implementations, owing to its remarkable versatility and high degree of automation. However, formal verification still faces technical bottlenecks in balancing accuracy and efficiency, thereby limiting its scalability. Former tools like maskVerif and CocoAlma are very efficient but they face accuracy issues when verifying schemes that utilize properties of Boolean functions. Later, SILVER addressed the accuracy issue, albeit at the cost of significantly reduced speed and scalability compared to maskVerif. Consequently, there is a pressing need to develop formal verification tools that are both efficient and accurate for designing secure schemes and evaluating implementations. This paper’s primary contribution lies in proposing several approaches to develop a more efficient and scalable formal verification tool called Prover, which is built upon SILVER. Firstly, inspired by the auxiliary data structures proposed by Eldib et al. and the optimistic sampling rule of maskVerif, we introduce two reduction rules aimed at diminishing the size of observable sets and secret sets in statistical independence checks. These rules substantially decrease, or even eliminate, the need for repeated computation of probability distributions using Reduced Ordered Binary Decision Diagrams (ROBDDs), a time-intensive procedure in verification. Subsequently, we integrate one of these reduction rules into the uniformity check to mitigate its complexity. Secondly, we identify that variable ordering significantly impacts efficiency and optimize it for constructing ROBDDs, resulting in much smaller representations of investigated functions. Lastly, we present the algorithm of Prover, which efficiently verifies the security and uniformity of masked implementations in the probing model with or without the presence of glitches.
Experimental results demonstrate that our proposed tool Prover offers a better balance between efficiency and accuracy compared to other state-of-the-art tools (IronMask, CocoAlma, maskVerif, and SILVER). In our experiments, we also found an S-box that can only be verified by Prover, as IronMask cannot verify S-boxes, and both CocoAlma and maskVerif suffer from false positive issues. Additionally, SILVER runs out of time during verification.
2024
TCHES
A Framework for Generating S-Box Circuits with Boyer–Peralta Algorithm-Based Heuristics, and Its Applications to AES, SNOW3G, and Saturnin
Abstract
In many lightweight cryptography applications, low area and latency are required for efficient implementation. The gate count in the cipher and the circuit depth must be low to minimize these two metrics. Many optimization strategies have been developed for the linear layer, led by the Boyer–Peralta (BP) algorithm. The Advanced Encryption Standard (AES) has been a focus of extensive research in this area. However, while the linear layer uses only XOR gates, the S-box, which is an essential nonlinear component in symmetric cryptography, uses various gate types, making optimization challenging, particularly as the bit size increases. In this paper, we propose a new framework for a heuristic search to optimize the circuit depth or XOR gate count of S-box circuits. Existing S-box circuit optimization studies have divided the nonlinear and linear layers of the S-box, optimizing each separately, but limitations still exist in optimizing large S-box circuits. To extend the optimization target from individual internal components to the entire S-box circuit, we extract the XOR information of each node in the target circuit and reconstruct the nodes based on nonlinear gates. Next, we extend the BP algorithm-based heuristics to address nonlinear gates and incorporate this into the framework. It is noteworthy that the effects of our framework occur while maintaining the AND gate count and AND depth without any increase. To demonstrate the effectiveness of the proposed framework, we apply it to the AES, SNOW3G, and Saturnin S-box circuits. Our results include depth improvements by about 40% and 11% compared to the existing AES S-box [BP10] and Saturnin super S-box [CDL+20] circuits, respectively. We implement a new circuit for the SNOW3G S-box, which has not previously been developed, and apply our framework to reduce its depth. We expect the proposed framework to contribute to the design and implementation of various symmetric-key cryptography solutions.
2024
TCHES
FANNG-MPC: Framework for Artificial Neural Networks and Generic MPC
Abstract
In this work, we introduce FANNG-MPC, a versatile secure multi-party computation framework capable of offering active security for privacy-preserving machine learning as a service (MLaaS). Derived from the now deprecated SCALE-MAMBA, FANNG is a data-oriented fork, featuring a novel set of libraries and instructions for realizing private neural networks, effectively reviving the popular framework. To the best of our knowledge, FANNG is the first MPC framework to offer actively secure MLaaS in the dishonest majority setting. FANNG goes beyond SCALE-MAMBA by decoupling offline and online phases and materializing the dealer model in software, enabling a separate set of entities to produce offline material. The framework incorporates database support, a new instruction set for pre-processed material, including garbled circuits and convolutional and matrix multiplication triples. FANNG also implements novel private comparison protocols and an optimized library supporting Neural Network functionality. All our theoretical claims are substantiated by an extensive evaluation using an open-sourced implementation, including the private inference of popular neural networks like LeNet and VGG16.
2024
TCHES
Trojan Insertion versus Layout Defenses for Modern ICs: Red-versus-Blue Teaming in a Competitive Community Effort
Abstract
Hardware Trojans (HTs) are a longstanding threat to secure computation. Among different threat models, it is the fabrication-time insertion of additional malicious logic directly into the layout of integrated circuits (ICs) that constitutes the most versatile, yet challenging scenario, for both attackers and defenders. Here, we present a large-scale, first-of-its-kind community effort through red-versus-blue teaming that thoroughly explores this threat. Four independently competing blue teams of 23 IC designers in total had to analyze and fix vulnerabilities of representative IC layouts at the pre-silicon stage, whereas a red team of 3 experts in hardware security and IC design continuously pushed the boundaries of these defense efforts through different HTs and novel insertion techniques. Importantly, we find that, despite the blue teams’ commendable design efforts, even highly-optimized layouts retained at least some exploitable vulnerabilities. Our effort follows a real-world setting for a modern 7nm technology node and industry-grade tooling for IC design, all embedded into a fully-automated and extensible benchmarking framework. To ensure the relevance of this work, strict rules that adhere to real-world requirements for IC design and manufacturing were postulated by the organizers. For example, not a single violation for timing and design-rule checks was allowed for defense techniques. Besides, in an advancement over prior art, neither red nor blue teams were allowed to use any so-called fillers and spares for trivial attack or defense approaches. Finally, we release all methods and artifacts: the representative IC layouts and HTs, the devised attack and defense techniques, the evaluation metrics and setup, the technology setup and commercial-grade reference flow for IC design, the encompassing benchmarking framework, and all best results.
This full release enables the community to continue exploring this important challenge for hardware security, in particular to focus on the urgent need for further advancements in defense strategies.
2024
TCHES
TPUXtract: An Exhaustive Hyperparameter Extraction Framework
Abstract
Model stealing attacks on AI/ML devices undermine intellectual property rights, compromise the competitive advantage of the original model developers, and potentially expose sensitive data embedded in the model’s behavior to unauthorized parties. While previous research works have demonstrated successful side-channel-based model recovery in embedded microcontrollers and FPGA-based accelerators, attacks on commercial ML accelerators remain largely unexplored. Moreover, prior side-channel attacks fail when they encounter previously unknown models. This paper demonstrates the first successful model extraction attack on the Google Edge Tensor Processing Unit (TPU), an off-the-shelf ML accelerator. Specifically, we show a hyperparameter stealing attack that can extract all layer configurations including the layer type, number of nodes, kernel/filter sizes, number of filters, strides, padding, and activation function. Most notably, our attack is the first comprehensive attack that can extract previously unseen models. This is achieved through an online template-building approach instead of a pre-trained ML-based approach used in prior works. Our results on a black-box Google Edge TPU evaluation show that, through obtained electromagnetic traces, our proposed framework can achieve 99.91% accuracy, making it the most accurate one to date. Our findings indicate that attackers can successfully extract various types of models on a black-box commercial TPU with utmost detail and call for countermeasures.
2024
TCHES
Bit t-SNI Secure Multiplication Gadget for Inner Product Masking
Abstract
Masking is a sound countermeasure to protect against differential power analysis. Since the work by Balasch et al. in ASIACRYPT 2012, inner product masking has been explored as an alternative to the well-known Boolean masking. In CARDIS 2017, Poussier et al. showed that inner product masking achieves higher-order security versus Boolean masking, for the same shared size, in the bit-probing model. Wang et al. in TCHES 2020 verified the inner product masking’s security order amplification in practice and proposed new gadgets for inner product masking. Finally, Wu et al. in TCHES 2022 showed that this security amplification comes from the bit-probing model, but that Wang et al.’s gadgets are not higher-order bit-probing secure, reducing the computation’s practical security. The authors concluded their work with the open question of providing an inner product multiplication gadget which maintains the masking’s bit-probing security, and conjectured that such a gadget maintains the practical security order amplification of the masking during its computation. In this paper, we answer Wu et al.’s open problems positively. We are the first to present a multiplication gadget for inner product masking which is proven secure in the bit-level probing model using the t-Strong Non-Interference (SNI) property. Moreover, we provide practical evidence that the gadget indeed maintains the security amplification of its masking. This is done via an evaluation of an assembly implementation of the gadget on an ARM Cortex-M4 core. We used this implementation to take leakage measurements and show that no leakage happens for orders below the gadget’s bit-probing security level in either univariate or multivariate analysis.
2024
ASIACRYPT
On the Spinor Genus and the Distinguishing Lattice Isomorphism Problem
Abstract
This paper addresses the spinor genus, a previously unrecognized classification of quadratic forms in the context of cryptography, related to the lattice isomorphism problem (LIP). The spinor genus lies between the genus and equivalence class, thus refining the concept of genus. We present algorithms to determine whether two quadratic forms belong to the same spinor genus. If they do not, it provides a negative answer to the distinguishing variant of LIP. However, these algorithms have very high complexity, and we show that the proportion of genera splitting into multiple spinor genera is vanishing (assuming rank n ≥ 3). For the special case of anisotropic integral binary forms (n = 2) over number fields with class number 1, we offer an efficient quantum algorithm to test if two forms lie in the same spinor genus. Our algorithm does not apply to the HAWK protocol, which uses integral binary Hermitian forms over number fields with class number greater than 1.
2024
ASIACRYPT
Key Exchange in the Post-Snowden Era: Universally Composable Subversion-Resilient PAKE
Abstract
Password-Authenticated Key Exchange (PAKE) allows two parties to establish a common high-entropy secret from a possibly low-entropy pre-shared secret such as a password. In this work, we provide the first PAKE protocol with subversion resilience in the framework of universal composability (UC), where the latter roughly means that UC security still holds even if one of the two parties is malicious and the honest party's code has been subverted (in an undetectable manner).
We achieve this result by sanitizing the PAKE protocol from oblivious transfer (OT) due to Canetti et al. (PKC'12) via cryptographic reverse firewalls in the UC framework (Chakraborty et al., EUROCRYPT'22). This requires new techniques, which help us uncover new cryptographic primitives with sanitation-friendly properties along the way (such as OT, dual-mode cryptosystems, and signature schemes).
As an additional contribution, we delve deeper into the backbone of communication required in the subversion-resilient UC framework, extending it to the {\em unauthenticated} setting, in line with the work of Barak et al. (CRYPTO'05).
2024
ASIACRYPT
Crooked Indifferentiability of the Feistel Construction
Abstract
The Feistel construction is a fundamental technique for building pseudorandom permutations and block ciphers. This paper shows that a simple adaptation of the construction is resistant, even to algorithm substitution attacks---that is, adversarial subversion---of the component round functions. Specifically, we establish that a Feistel-based construction with more than $337n/\log(1/\epsilon)$ rounds can transform a subverted random function---which disagrees with the original one at a small fraction (denoted by $\epsilon$) of inputs---into an object that is \emph{crooked-indifferentiable} from a random permutation, even if the adversary is aware of all the randomness used in the transformation. Here, $n$ denotes the length of both the input and output of the round functions that underlie the Feistel cipher. We also provide a lower bound showing that the construction cannot use fewer than $2n/\log(1/\epsilon)$ rounds to achieve crooked-indifferentiable security.
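As background for the construction this abstract starts from, here is a minimal sketch of a classical balanced Feistel network. It is not the adapted construction analysed in the paper, and it makes no claim about subversion resistance. The round function built from SHA-256, the round count and all names are stand-in assumptions chosen only to make the example runnable.

```python
import hashlib

def round_fn(i: int, x: bytes, out_len: int) -> bytes:
    """Hypothetical round function: SHA-256 of (round index || input), truncated.
    Stands in for an independent random function per round; out_len must be <= 32."""
    return hashlib.sha256(i.to_bytes(4, "big") + x).digest()[:out_len]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length strings."""
    return bytes(p ^ q for p, q in zip(a, b))

def feistel_encrypt(left: bytes, right: bytes, rounds: int):
    """Apply `rounds` Feistel rounds: (L, R) -> (R, L xor F_i(R))."""
    for i in range(rounds):
        left, right = right, xor_bytes(left, round_fn(i, right, len(left)))
    return left, right

def feistel_decrypt(left: bytes, right: bytes, rounds: int):
    """Undo the rounds in reverse order: (L', R') -> (R' xor F_i(L'), L')."""
    for i in reversed(range(rounds)):
        left, right = xor_bytes(right, round_fn(i, left, len(right))), left
    return left, right

if __name__ == "__main__":
    l0, r0 = b"\x00" * 8, b"\x01" * 8
    c = feistel_encrypt(l0, r0, rounds=6)
    print(c, feistel_decrypt(*c, rounds=6) == (l0, r0))
```

The structure is a permutation for any choice of round functions, since each round is undone by recomputing the same function value and XORing it out. That is why the number of rounds, rather than invertibility of the round functions, is the parameter the abstract's bounds concern.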
2024
ASIACRYPT
Adaptive Hardcore Bit and Quantum Key Leasing over Classical Channel from LWE with Polynomial Modulus
Abstract
Quantum key leasing, also known as public key encryption with secure key leasing (PKE-SKL), allows a user to lease a (quantum) secret key to a server for decryption purposes, with the capability of revoking the key afterwards. In the pioneering work by Chardouvelis et al. (arXiv:2310.14328), a PKE-SKL scheme utilizing classical channels was successfully built upon the noisy trapdoor claw-free (NTCF) family. This approach, however, relies on the superpolynomial hardness of the learning with errors (LWE) problem, which could affect both efficiency and security of the scheme.
In our work, we demonstrate that the reliance on superpolynomial hardness is unnecessary, and that LWE with polynomial-size modulus is sufficient to achieve the same goal. Our approach enhances both efficiency and security, thereby improving the practical feasibility of the scheme on near-term quantum devices. To accomplish this, we first construct a noticeable NTCF (NNTCF) family with the adaptive hardcore bit property, based on LWE with polynomial-size modulus. To the best of our knowledge, this is the first demonstration of the adaptive hardcore bit property based on LWE with polynomial-size modulus, which may be of independent interest. Building on this foundation, we address additional challenges in prior work to construct the first PKE-SKL scheme satisfying the following properties: (i) the entire protocol utilizes only classical communication, and can also be lifted to support homomorphism; (ii) the security is solely based on the LWE assumption with polynomial-size modulus.
As a demonstration of the versatility of our noticeable NTCF, we show that an efficient proof of quantumness protocol can be built upon it. Specifically, our protocol enables a classical verifier to test the quantumness while relying exclusively on the LWE assumption with polynomial-size modulus.
2024
ASIACRYPT
Reducing the Number of Qubits in Quantum Information Set Decoding
Abstract
This paper presents an optimization of the memory cost of the quantum Information Set Decoding (ISD) algorithm proposed by Bernstein (PQCrypto 2010), obtained by combining Prange's ISD with Grover's quantum search.
When the code has constant rate and length $n$, this algorithm essentially performs a quantum search which, at each iteration, solves a linear system of dimension $\mathcal{O}(n)$. The typical code lengths used in post-quantum public-key cryptosystems range from $10^3$ to $10^5$. Gaussian elimination, which was used in previous works, needs $\mathcal{O}(n^2)$ space to represent the matrix, resulting in millions or billions of (logical) qubits for these schemes.
In this paper, we propose instead to use the algorithm for sparse matrix inversion of Wiedemann (IEEE Trans. Inf. Theory 1986). The appeal of Wiedemann's method is that it relies only on the implementation of a matrix-vector product, where the matrix can be represented in an implicit way. This is the case here.
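To illustrate why a matrix-vector product is enough, here is a minimal classical sketch of Wiedemann's method over GF(2). It is not the quantum circuit from the paper. The matrix rule `entry`, the helper names, and the verify-and-retry loop are all assumptions made for illustration; the matrix is unit upper triangular purely so that it is guaranteed invertible. Apart from the black-box product, the solver only stores a few length-$\mathcal{O}(n)$ bit vectors.

```python
import random

def parity(x):
    """Parity of the set bits of a non-negative int."""
    return bin(x).count("1") & 1

def entry(i, j):
    """Hypothetical rule for A[i][j]; unit upper triangular, hence invertible."""
    return 1 if i == j else int(j > i and (i * j + i + j) % 3 == 0)

def make_matvec(n):
    """Black box v -> A v over GF(2); A is never stored, only recomputed."""
    def matvec(v):
        out = 0
        for i in range(n):
            bit = 0
            for j in range(n):
                bit ^= entry(i, j) & (v >> j) & 1
            out |= bit << i
        return out
    return matvec

def berlekamp_massey(s):
    """Return (c, L) with XOR_{i=0..L} c[i] & s[N-i] == 0 for all L <= N < len(s)."""
    c = [1] + [0] * len(s)
    b = c[:]
    L, m = 0, -1
    for N in range(len(s)):
        d = s[N]
        for i in range(1, L + 1):
            d ^= c[i] & s[N - i]
        if d:
            t = c[:]
            shift = N - m
            for i in range(len(b) - shift):
                c[i + shift] ^= b[i]
            if 2 * L <= N:
                L, m, b = N + 1 - L, N, t
    return c[:L + 1], L

def wiedemann_solve(matvec, b, n, rng, attempts=64):
    """Try to solve A x = b over GF(2) using only calls to matvec.

    Vectors are ints used as bit masks. Returns None if every attempt fails.
    """
    for _attempt in range(attempts):
        u = rng.getrandbits(n)            # random projection vector
        seq, v = [], b
        for _step in range(2 * n):        # scalars u^T A^k b, k = 0..2n-1
            seq.append(parity(u & v))
            v = matvec(v)
        c, L = berlekamp_massey(seq)
        # Candidate solution x = sum_{k<L} c[L-1-k] * A^k b.
        x, v = 0, b
        for k in range(L):
            if c[L - 1 - k]:
                x ^= v
            v = matvec(v)
        if matvec(x) == b:                # accept only a verified solution
            return x
    return None

n = 16
A = make_matvec(n)
b = 0b1011001110001101
x = wiedemann_solve(A, b, n, random.Random(2024))
print(x is not None and A(x) == b)
```

A random projection `u` may reveal only a proper factor of the minimal polynomial, which is why the sketch checks each candidate and retries with a fresh `u`.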
We propose two main trade-offs, which we have fully implemented, tested on small instances, and benchmarked for larger instances. The first one is a quantum circuit using $\mathcal{O}(n)$ qubits, $\mathcal{O}(n^3)$ Toffoli gates (as for Gaussian elimination), and depth $\mathcal{O}(n^2 \log n)$. The second one is a quantum circuit using $\mathcal{O}(n \log^2 n)$ qubits, $\mathcal{O}(n^3)$ gates in total but only $\mathcal{O}(n^2 \log^2 n)$ Toffoli gates, which relies on a different representation of the search space.
As an example, for the smallest Classic McEliece parameters, we estimate that the quantum Prange's algorithm can run with 18098 qubits, while previous works would have required at least half a million qubits.
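The asymptotic space costs quoted above can be compared with a rough calculation. The sketch below sets every hidden constant in the $\mathcal{O}(\cdot)$ bounds to 1 and uses base-2 logarithms. Both choices are assumptions, so the numbers indicate orders of magnitude only and are not the paper's qubit estimates.

```python
import math

def space_orders(n):
    """Order-of-magnitude space for code length n, hidden constants taken as 1 (assumption)."""
    log2n = math.log2(n)
    return {
        "dense matrix, O(n^2)": n * n,
        "first trade-off, O(n)": n,
        "second trade-off, O(n log^2 n)": round(n * log2n ** 2),
    }

# Code lengths spanning the range mentioned in the abstract.
for n in (10**3, 10**4, 10**5):
    print(n, space_orders(n))
```

Under these assumptions the dense representation grows from $10^6$ to $10^{10}$ bits across this range, which matches the "millions or billions" figure in the abstract.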
2024
ASIACRYPT
Tightly Secure Non-Interactive BLS Multi-Signatures
Abstract
Due to their simplicity, compactness, and algebraic structure, BLS signatures are among the most widely used signatures in practice.
For example, used as multi-signatures, they are integral in Ethereum's proof-of-stake consensus.
From the perspective of concrete security, however, BLS (multi-)signatures suffer from a security loss linear in the number of signing queries. It is well known that this loss cannot be avoided using current proof techniques.
In this paper, we introduce a new variant of BLS multi-signatures that achieves tight security while remaining fully compatible with regular BLS. In particular, our signatures can be seamlessly combined with regular BLS signatures, resulting in regular BLS signatures.
Moreover, it can easily be implemented using existing BLS implementations in a black-box way.
Our scheme is also one of the most efficient non-interactive multi-signatures, and in particular more efficient than previous tightly secure schemes.
We demonstrate the practical applicability of our scheme by showing how proof-of-stake protocols that currently use BLS can adopt our variant for fully compatible opt-in tight security.