27 February 2024

Pierre Briaud, Maxime Bros, Ray Perlner, Daniel Smith-Tone
DME is a multivariate scheme submitted to the call for additional signatures recently launched by NIST. Its performance is one of the best among all the candidates. The public key is constructed from the alternation of very structured linear and non-linear components that constitute the private key, the latter being defined over an extension field. We exploit these structures by proposing an algebraic attack which is practical on all DME parameters.
Yuval Ishai, Yifan Song
A leakage-resilient circuit for $f:\{0,1\}^n\to\{0,1\}^m$ is a randomized Boolean circuit $C$ mapping a randomized encoding of an input $x$ to an encoding of $y=f(x)$, such that applying any leakage function $L\in \cal L$ to the wires of $C$ reveals essentially nothing about $x$. A leakage-tolerant circuit achieves the stronger guarantee that even when $x$ and $y$ are not protected by any encoding, the output of $L$ can be simulated by applying some $L'\in \cal L$ to $x$ and $y$ alone. Thus, $C$ is as secure as an ideal hardware implementation of $f$ with respect to leakage from $\cal L$.

Leakage-resilient circuits were constructed for low-complexity classes $\cal L$, including (length-$t$ output) $\mathcal{AC}^0$ functions, parities, and functions with bounded communication complexity. In contrast, leakage-tolerant circuits were only known for the simple case of probing leakage, where $L$ outputs the values of $t$ wires in $C$.

We initiate a systematic study of leakage-tolerant circuits for natural classes $\cal L$ of global leakage functions, obtaining the following main results.

Leakage-tolerant circuits for depth-1 leakage. Every circuit $C_f$ for $f$ can be efficiently compiled into an $\cal L$-tolerant circuit $C$ for $f$, where $\cal L$ includes all leakage functions $L$ that output either $t$ parities or $t$ disjunctions (alternatively, conjunctions) of any number of wires or their negations. In the case of parities, our simulator runs in $2^{O(t)}$ time. We provide partial evidence that this may be inherent.

Application to stateful leakage-resilient circuits. Using a general transformation from leakage-tolerant circuits, we obtain the first construction of stateful $t$-leakage-resilient circuits that tolerate a continuous parity leakage, and the first such construction for disjunction/conjunction leakage in which the circuit size grows sub-quadratically with $t$. Interestingly, here we can obtain $\mathtt{poly}(t)$-time simulation even in the case of parities.
Maryam Bahrani, Pranav Garimidi, Tim Roughgarden
The incentive-compatibility properties of blockchain transaction fee mechanisms have been investigated with passive block producers that are motivated purely by the net rewards earned at the consensus layer. This paper introduces a model of active block producers that have their own private valuations for blocks (representing, for example, additional value derived from the application layer). The block producer surplus in our model can be interpreted as one of the more common colloquial meanings of the phrase ``maximal extractable value (MEV).''

We first prove that transaction fee mechanism design is fundamentally more difficult with active block producers than with passive ones: With active block producers, no non-trivial or approximately welfare maximizing transaction fee mechanism can be incentive-compatible for both users and block producers. These impossibility results can be interpreted as a mathematical justification for augmenting transaction fee mechanisms with additional components such as orderflow auctions, block producer competition, trusted hardware, or cryptographic techniques.

We then proceed to a more fine-grained model of block production that is inspired by current practice, in which we distinguish the roles of ``searchers'' (who actively identify opportunities for value extraction from the application layer and compete for the right to take advantage of them) and ``proposers'' (who participate directly in the blockchain protocol and make the final choice of the published block). Searchers can effectively act as an ``MEV oracle'' for a transaction fee mechanism, thereby enlarging the design space. Here, we first consider a transaction fee mechanism that resembles how searchers have traditionally been incorporated into the block production process, with each transaction effectively sold off to a searcher through a first-price auction. We then explore the design space with searchers more generally, and design a mechanism that circumvents our impossibility results for mechanisms without searchers. Our mechanism (the ``SAKA'' mechanism) is deterministic, incentive-compatible (for users, searchers, and the block producer), and sybil-proof, and it guarantees roughly 50% of the maximum-possible welfare when transaction sizes are small relative to block sizes. We conclude with a matching negative result: even when transactions are small relative to blocks, no incentive-compatible, sybil-proof, and deterministic transaction fee mechanism can guarantee more than 50% of the maximum-possible welfare.
Aron van Baarsen, Sihang Pu
Traditional private set intersection (PSI) involves a receiver and a sender holding sets $X$ and $Y$, respectively, with the receiver learning only the intersection $X\cap Y$. We turn our attention to its fuzzy variant, where the receiver holds \(|X|\) hyperballs of radius \(\delta\) in a metric space and the sender has $|Y|$ points. Representing the hyperballs by their center, the receiver learns the points $x\in X$ for which there exists $y\in Y$ such that $\mathsf{dist}(x,y)\leq \delta$ with respect to some distance metric. Previous approaches either require general-purpose multi-party computation (MPC) techniques like garbled circuits or fully homomorphic encryption (FHE), leak details about the sender’s precise inputs, support limited distance metrics, or scale poorly with the hyperballs' volume.

This work presents the first black-box construction for fuzzy PSI (including other variants such as PSI cardinality, labeled PSI, and circuit PSI), which can handle polynomially large radius and dimension (i.e., a potentially exponentially large volume) in two interaction messages, supporting general \(L_{p\in[1,\infty]}\) distance, without relying on garbled circuits or FHE. The protocol excels in both asymptotic and concrete efficiency compared to existing works. For security, we rely solely on the Decisional Diffie-Hellman (DDH) assumption in the random oracle model.
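
To make the target functionality concrete, here is a minimal plaintext sketch of what the receiver is meant to learn. It offers no privacy whatsoever and is not the protocol of the paper; the names `within` and `fuzzy_psi_ideal` and the use of integer coordinates are illustrative assumptions.

```python
import math

def within(x, y, delta, p):
    """True if the L_p distance between integer points x and y is at most delta."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == math.inf:
        return max(diffs) <= delta
    # Compare p-th powers so that no floating-point root is needed.
    return sum(d ** p for d in diffs) <= delta ** p

def fuzzy_psi_ideal(X, Y, delta, p=math.inf):
    """Ideal output: the centers x in X that have some y in Y within distance delta."""
    return [x for x in X if any(within(x, y, delta, p) for y in Y)]

if __name__ == "__main__":
    X = [(0, 0), (10, 10)]
    Y = [(1, 1), (20, 20)]
    print(fuzzy_psi_ideal(X, Y, delta=1))        # L_infinity balls
    print(fuzzy_psi_ideal(X, Y, delta=1, p=1))   # L_1 balls
```

Comparing $p$-th powers keeps the check exact for integer inputs; a real protocol would compute the same predicate without revealing $Y$ to the receiver.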
Houda Ferradi
This paper introduces \textsl{signature validation}, a primitive allowing any \underline{t}hird party $T$ (\underline{T}héodore) to verify that a \underline{v}erifier $V$ (\underline{V}adim) computationally verified a signature $s$ on a message $m$ issued by a \underline{s}igner $S$ (\underline{S}arah).

A naive solution consists in Sarah sending $x=\{m,\sigma_s\}$, where $\sigma_s$ is Sarah's signature on $m$, and having Vadim confirm reception with a signature $\sigma_v$ on $x$.

Unfortunately, this only attests \textsl{proper reception} by Vadim, i.e. that Vadim \textsl{could have checked} $x$ and not that Vadim \textsl{actually verified} $x$. By ``actually verifying'' we mean providing a proof or a convincing argument that a program running on Vadim's machine checked the correctness of $x$.

This paper proposes several solutions for doing so, thereby providing a useful building-block in numerous commercial and legal interactions for proving informed consent.
Cécile Delerablée, Lénaïck Gouriou, David Pointcheval
Attribute-based cryptography allows fine-grained control on the use of the private key. In particular, attribute-based signature (ABS) specifies the capabilities of the signer, who can only sign messages associated to a policy that is authorized by his set of attributes. Furthermore, we can expect the signature not to leak any information about the identity of the signer. ABS is a useful tool for identity-preserving authentication processes that require granular access control, and can furthermore be enhanced with additional properties, for example delegation, where users are able to manage a set of keys derived from their original one.

In this paper, we address delegation of signing keys. Our first delegation works for any subset of the original attributes, which is the intuitive approach of delegation. Furthermore, we also provide another kind of delegation where the delegator can choose a policy at delegation time to produce keys that can sign any message under this specific policy. This last approach to delegation is a direct application of a new version of the indexing technique, which was first introduced by Okamoto and Takashima in order to prove adaptive security in ABS and its counterpart for encryption, ABE. On top of that, we prove that our scheme is compatible with a well-studied feature of ABS, traceability, by using an approach based on Linearly-Homomorphic signatures. All our schemes also guarantee the anonymity of the real signer. The unforgeability of our schemes is proven using the SXDH assumption, and our constructions use the Dual Pairing Vector Spaces (DPVS) framework developed by Okamoto and Takashima, which has been widely used for all kinds of attribute and functional cryptography mechanisms.
Ziqi Zhu, Jiangtao Li, Kai Zhang, Junqing Gong, Haifeng Qian
This work initiates the study of concrete registered functional encryption (Reg-FE) beyond ``all-or-nothing'' functionalities:

- We build the first Reg-FE for linear function or inner-product evaluation (Reg-IPFE) from pairings. The scheme achieves adaptive IND-security under the $k$-Lin assumption in the prime-order bilinear group. A minor modification yields the first Registered Inner-Product Encryption (Reg-IPE) scheme from the $k$-Lin assumption. Prior work achieves the same security in the generic group model.

- We build the first Reg-FE for quadratic function (Reg-QFE) from pairings. The scheme achieves very selective simulation-based security (SIM-security) under the bilateral $k$-Lin assumption in the prime-order bilinear group. Here, ``very selective'' means that the adversary claims challenge messages, all quadratic functions to be registered and all corrupted users at the beginning.

Besides focusing on the compactness of the master public key and helper keys, we also aim for compact ciphertexts in Reg-FE. Let $L$ be the number of slots and $n$ be the input size. Our first Reg-IPFE has weakly compact ciphertexts of size $O(n\cdot\log L)$ while our second Reg-QFE has compact ciphertexts of size $O(n+\log L)$. Technically, for our first Reg-IPFE, we employ the nested dual-system method within the context of Reg-IPFE; for our second Reg-QFE, we follow Wee's ``IPFE-to-QFE'' transformation [TCC'20] but devise a set of new techniques that make our pairing-based Reg-IPFE compatible. Along the way, we introduce a new notion named Pre-Constrained Registered IPFE, which generalizes slotted Reg-IPFE by constraining the form of functions that can be registered.
Nicolas Alhaddad, Mayank Varia, Ziling Yang
Asynchronous complete secret sharing (ACSS) is a foundational primitive in the design of distributed algorithms and cryptosystems that require secrecy. Dual-threshold ACSS permits a dealer to distribute a secret to a collection of $n$ servers so that everyone holds shares of a polynomial containing the dealer's secret.

This work contributes a new ACSS protocol, called Haven++, that uses packing and batching to make asymptotic and concrete advances in the design and application of ACSS for large secrets. Haven++ allows the dealer to pack multiple secrets in a single sharing phase, and to reconstruct either one or all of them later. For even larger secrets, we contribute a batching technique to amortize the cost of proof generation and verification across multiple invocations of our protocol.

The result is an asymptotic improvement in amortized communication and computation complexity, both for ACSS itself and for its application to asynchronous distributed key generation. We implement Haven++ and find that it improves performance over the hbACSS protocol of Yurek et al. by a factor of 3-10$\times$ or more across a wide range of parameters for the number of parties and batch size.
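
The idea of packing several secrets into one sharing can be illustrated with textbook packed Shamir sharing. The sketch below is only that: a plain, synchronous, non-verifiable toy over a prime field, not Haven++ itself (which additionally needs commitments, proofs and asynchronous delivery). The field modulus, the evaluation points and all function names are assumptions chosen for illustration.

```python
import secrets

P = 2**61 - 1  # a Mersenne prime used as the field modulus (assumption)

def lagrange_eval(points, x0):
    """Evaluate at x0 the unique polynomial through `points` [(x, y), ...] mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def packed_share(secret_values, n, degree):
    """Hide k secrets at x = -1, ..., -k of one degree-`degree` polynomial;
    party i (1 <= i <= n) receives the evaluation at x = i."""
    k = len(secret_values)
    assert k <= degree + 1 <= n
    anchors = [((-(i + 1)) % P, s % P) for i, s in enumerate(secret_values)]
    # The remaining degrees of freedom are filled with fresh randomness.
    anchors += [((-(k + 1 + i)) % P, secrets.randbelow(P))
                for i in range(degree + 1 - k)]
    return [(x, lagrange_eval(anchors, x)) for x in range(1, n + 1)]

def reconstruct(shares, slot):
    """Recover the secret in position `slot` from at least degree + 1 shares."""
    return lagrange_eval(shares, (-(slot + 1)) % P)

if __name__ == "__main__":
    shares = packed_share([11, 22, 33], n=7, degree=4)
    print([reconstruct(shares[:5], slot) for slot in range(3)])
```

With $k$ secrets packed into a degree-$d$ polynomial, any $d+1$ shares reconstruct every slot, while privacy only holds against $d+1-k$ shares; this gap is the usual price of packing.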

26 February 2024

Benedikt Bünz, Jessica Chen
We construct two new accumulation schemes. The first one is for checking that $\ell$ read and write operations were performed correctly from a memory of size $T$. Unlike all prior work, the prover time is entirely independent of $T$ and only depends on $\ell$. The second one is for deterministic computations. It does not require committing to the intermediate wires of the computation but only the input and output. This is achieved by building an accumulation scheme for a modified version of the famous GKR protocol. We show that these schemes are highly compatible and that the accumulation for GKR can further reduce the cost of the memory-checking scheme. Using the BCLMS (Crypto 21) compiler, these protocols yield an efficient incrementally verifiable computation (IVC) scheme that is particularly useful for machine computations with large memories and deterministic steps.
Jake Januzelli, Lawrence Roy, Jiayu Xu
A Password-Authenticated Key Exchange (PAKE) protocol allows two parties to agree upon a cryptographic key, in the setting where the only secret shared in advance is a low-entropy password. The standard security notion for PAKE is in the Universal Composability (UC) framework. In recent years there have been a large number of works analyzing the UC-security of Encrypted Key Exchange (EKE), the very first PAKE protocol, and its One-encryption variant (OEKE), both of which compile an unauthenticated Key Agreement (KA) protocol into a PAKE.

In this work, we present a comprehensive and thorough study of the UC-security of both EKE and OEKE in the most general setting and using the most efficient building blocks:

1. We show that among the seven existing results on the UC-security of (O)EKE, six are flawed;

2. We show that for (O)EKE to be UC-secure, the underlying KA protocol needs to satisfy the properties of strong pseudorandomness, pseudorandom non-malleability, and collision resistance, all of which are missing in existing works;

3. We give UC-security proofs for EKE and OEKE using Programmable-Once Random Function (POPF), which is the most efficient instantiation to date and is around 4 times faster than the standard instantiation using Ideal Cipher (IC).

Our results in particular allow for PAKE constructions from post-quantum KA protocols such as Kyber. We also give a security analysis of POPF in a new composition framework called almost UC, which we believe is interesting in its own right.
Ruida Wang, Yundi Wen, Zhihao Li, Xianhui Lu, Benqiang Wei, Kun Liu, Kunpeng Wang
We present a novel circuit bootstrapping algorithm that outperforms the state-of-the-art TFHE method with 9.9× speedup and 15.6× key size reduction. These improvements can be attributed to two technical contributions. Firstly, we redesigned the circuit bootstrapping workflow to operate exclusively under the ring ciphertext type, which eliminates the need for conversion between LWE and RLWE ciphertexts. Secondly, we improve the LMKC+ blind rotation algorithm by reducing the number of automorphisms, then propose the first automorphism-type multi-value functional bootstrapping. These automorphism-based techniques lead to further key size optimization, and are of independent interest beyond circuit bootstrapping. Based on our new circuit bootstrapping, we can evaluate AES-128 in 26.2s (single thread), achieving 10.3× speedup compared with the state-of-the-art TFHE-based approach.
Weixi Zheng, Liu Zhang, Zilong Wang
At CRYPTO 2019, Gohr demonstrated that differential-neural distinguishers (DNDs) for Speck32/64 can learn more features than classical cryptanalysis's differential distribution tables (DDT). Furthermore, a non-classical key recovery procedure is devised by combining the Upper Confidence Bound (UCB) strategy and the BayesianKeySearch algorithm. Consequently, the time complexity of 11-round key recovery attacks on Speck32/64 is significantly reduced compared with the state-of-the-art results in classical cryptanalysis. This advancement in deep learning-assisted cryptanalysis has opened up new possibilities. However, the specific encryption features exploited by DNDs remain unclear.

In this paper, we begin by analyzing the features learned by DND based on the probability distribution of a ciphertext pair. Our analysis reveals that DND not only learns the differential features of the ciphertext pair but also captures the XOR information of the left and right branches of the ciphertext pair. This explains why the performance of DND can outperform DDT in certain cases. For other ciphers, we can also predict whether deep learning methods can achieve superior results to classical methods based on the probability distribution of the ciphertext pair. Next, we modify the input data format and network structure based on the specific features that can be learned to train DND specifically. With these modifications, it is possible to reduce the size of their parameters to only 1/16 of their previous networks while maintaining high precision. Additionally, the training time for the DNDs is significantly reduced. Finally, to improve the efficiency of deep learning-assisted cryptanalysis, we introduce Bayes-UCB to select promising ciphertext structures more efficiently. We also introduce an improved BayesianKeySearch algorithm to retain guessed keys with the highest scores in key guessing. We use both methods to launch 11-round, 12-round, and 13-round key recovery attacks on Speck32/64. The results show that under the same conditions, the success rate of 11-round key recovery attacks has increased from Gohr's 36.1% to 52.8%, the success rate of 12-round key recovery attacks has increased from Gohr's 39% to 50%, and the success rate of 13-round key recovery attacks has increased from Zhang et al.'s 21% to 24%. In addition, the time complexity of these experiments is also significantly reduced.
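
As a rough illustration of the two kinds of features mentioned above, the sketch below implements a single Speck32/64 round (16-bit words, rotation amounts 7 and 2) and extracts, for a ciphertext pair, both the plain difference and the XOR of the left and right branches of each ciphertext. The helper `pair_features` is hypothetical, and the exact combination of words it returns is one possible reading of "XOR information of the left and right branches", not the input format used in the paper.

```python
MASK = 0xFFFF  # Speck32/64 works on 16-bit words

def rol(x, r):
    return ((x << r) | (x >> (16 - r))) & MASK

def ror(x, r):
    return ((x >> r) | (x << (16 - r))) & MASK

def enc_round(x, y, k):
    """One Speck32/64 round on the (left, right) = (x, y) state with round key k."""
    x = ((ror(x, 7) + y) & MASK) ^ k
    y = rol(y, 2) ^ x
    return x, y

def dec_round(x, y, k):
    """Inverse of enc_round."""
    y = ror(y ^ x, 2)
    x = rol(((x ^ k) - y) & MASK, 7)
    return x, y

def pair_features(c0, c1):
    """Hypothetical feature extraction for a ciphertext pair (assumption)."""
    (x0, y0), (x1, y1) = c0, c1
    return {
        "difference": (x0 ^ x1, y0 ^ y1),   # classical differential feature
        "branch_xor": (x0 ^ y0, x1 ^ y1),   # left XOR right, per ciphertext
    }

if __name__ == "__main__":
    c0 = enc_round(0x1234, 0x5678, 0x0F0F)
    c1 = enc_round(0x1234 ^ 0x0040, 0x5678, 0x0F0F)
    print(pair_features(c0, c1))
```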
Vincent Hwang
We show that there is a discrepancy between the emulated floating-point multiplications in the submission package of Falcon and the claimed behavior. In particular, we show that floating-point products whose absolute value equals the smallest normal positive floating-point number are incorrectly zeroized. However, we show that the discrepancy doesn’t affect the complex fast Fourier transform by modeling the floating-point addition, subtraction, and multiplication in CryptoLine. We later implement our own floating-point multiplications in Armv7-M assembly and Jasmin and prove their equivalence with our model, demonstrating the possibility of transferring the challenging verification task (verifying highly-optimized assembly) to the presumably more readable code base (Jasmin).
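
For reference, the snippet below shows what IEEE-754 binary64 arithmetic does on the boundary case described above: a product that lands exactly on the smallest normal positive number is representable and must not become zero. It uses the platform's native doubles (via Python floats) and says nothing about Falcon's emulation code; the choice of operands is an illustrative assumption.

```python
import sys

smallest_normal = sys.float_info.min   # 2**-1022 for IEEE-754 binary64
a = 2.0 ** -511                        # exactly representable
product = a * a                        # mathematically exactly 2**-1022

# Native IEEE-754 multiplication keeps the smallest normal number intact.
print(product == smallest_normal)
print(product != 0.0)
print((-a) * a == -smallest_normal)
```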
Hanjun Li, Sela Navot, Stefano Tessaro
This paper proposes POPSTAR, a new lightweight protocol for the private computation of heavy hitters, also known as a private threshold reporting system. In such a protocol, the users provide input measurements, and a report server learns which measurements appear more than a pre-specified threshold. POPSTAR follows the same architecture as STAR (Davidson et al, CCS 2022) by relying on a helper randomness server in addition to a main server computing the aggregate heavy hitter statistics. While STAR is extremely lightweight, it leaks a substantial amount of information, consisting of an entire histogram of the provided measurements (but only reveals the actual measurements that appear beyond the threshold). POPSTAR shows that this leakage can be reduced at a modest cost ($\sim$7$\times$ longer aggregation time). Our leakage is closer to that of Poplar (Boneh et al, S&P 2021), which relies however on distributed point functions and a different model which requires interactions of two non-colluding servers (with equal workloads) to compute the heavy hitters.
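
The difference between the intended output and the extra leakage described above can be sketched in the clear. The code below is a non-private toy: `heavy_hitters` returns what the report server should learn, and `unlabeled_histogram` returns the multiset of counts that, per the description above, a STAR-style system additionally reveals. Both function names are illustrative assumptions, and "more than the threshold" is read as a strict inequality.

```python
from collections import Counter

def heavy_hitters(measurements, threshold):
    """Measurements that occur strictly more than `threshold` times."""
    counts = Counter(measurements)
    return {m for m, c in counts.items() if c > threshold}

def unlabeled_histogram(measurements):
    """Counts of all measurements without their labels, largest first."""
    return sorted(Counter(measurements).values(), reverse=True)

if __name__ == "__main__":
    data = ["a", "a", "a", "b", "b", "c"]
    print(heavy_hitters(data, threshold=2))
    print(unlabeled_histogram(data))
```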
Vasyl Ustimenko
We suggest a family of ciphers s^E^n, n=2,3,..., with the space of plaintexts (Z*_{2^s})^n, s>1, such that the encryption map is a composition of the kind G=G_1A_1G_2A_2, where the A_i are affine transformations from AGL_n(Z_{2^s}) preserving the variety (Z*_{2^s})^n, and the Eulerian endomorphisms G_i, i=1,2, of K[x_1, x_2, ..., x_n] move x_i to a monomial term μ(x_1)^{d(1)}(x_2)^{d(2)}...(x_n)^{d(n)}, μ ∈ Z*_{2^s}, and act on (Z*_{2^s})^n as bijective transformations. The cipher is converted to a protocol-supported cryptosystem. Protocols of Noncommutative Cryptography implemented on the platform of Eulerian endomorphisms are used for the delivery of G_i and A_i from Alice to Bob. One can use twisted Diffie-Hellman protocols, whose security rests on the complexity of the Conjugacy Power problem, or the hidden tame homomorphism protocol, whose security rests on the word decomposition problem. Instead of the delivery of G_i, Alice and Bob can elaborate these transformations via the inverse twisted Diffie-Hellman protocol implemented on the platform of tame Eulerian transformations of (Z*_{2^s})^n. The cost of a single protocol is O(n^3), and the cost of computing the preimage under the nonlinear map used is O(n^2). So the verification of n^t, t≥1, signatures takes time O(n^{t+2}). Instead of the inverse twisted Diffie-Hellman protocol, correspondents can use the inverse hidden tame homomorphism protocol, which rests on the complexity of word decomposition for tame Eulerian transformations. We use natural bijections between Z*_{2^s} and Z_{2^{s-1}}, Z*_{2^s} and the finite field F_{2^{s-1}}, and Z*_{2^s} and the Boolean ring B_{s-1} of order 2^{s-1} to modify the family of ciphers or cryptosystems by replacing AGL_n(Z_{2^s}) with AGL_n(K), where K is one of the rings Z_{2^{s-1}}, F_{2^{s-1}} and B_{s-1}. The new ciphers are defined via the multiplications of two different commutative rings, Z_{2^s} and K. This does not allow one to treat them as stream ciphers of multivariate cryptography and use the corresponding cryptanalytic techniques. An adversary is not able to use known cryptanalytic methods such as linearisation attacks. We discuss the option of replacing the above-mentioned elements of AGL_n(Z_{2^s}) or AGL_n(K) with a nonlinear multivariate transformation F of (Z_{2^s})^n or K^n with a symmetric trapdoor accelerator T, i.e. a piece of information such that knowledge of T allows one to compute the value F(p) at an arbitrarily chosen p ∈ P in time O(n^2) and to solve an equation of the kind F(x)=c for each c from C in time O(n^2).
Alexander Hoover, Sarvar Patel, Giuseppe Persiano, Kevin Yeo
We study single-server private information retrieval (PIR) where a client wishes to privately retrieve the $x$-th entry from a database held by a server without revealing the index $x$. In our work, we focus on PIR with client pre-processing where the client may compute hints during an offline phase. The hints are then leveraged during queries to obtain sub-linear online time. We present Plinko, the first single-server PIR with client pre-processing that obtains optimal trade-offs between client storage and query time for all parameters. Our scheme uses $t = \tilde{O}(n/r)$ query time for any client storage size $r$. This matches known lower bounds of $r \cdot t = \Omega(n)$ up to logarithmic factors for all parameterizations, whereas prior works could only match the lower bound when $r = \tilde{O}(\sqrt{n})$. Moreover, Plinko is also the first updateable PIR scheme where an entry can be updated in worst-case $\tilde{O}(1)$ time.

As our main technical tool, we define the notion of an invertible pseudorandom function (iPRF) that generalizes standard PRFs to be equipped with an efficient inversion algorithm. We present a construction of an iPRF from one-way functions where forward evaluation runs in $\tilde{O}(1)$ time and inversion runs in time linear in the inverse set (output) size. Furthermore, our iPRF construction is the first that remains efficient and secure for arbitrary domain and range sizes (including small domains and ranges). In the context of single-server PIR, we show that iPRFs may be used to construct the first hint set representation where finding a hint containing an entry $x$ may be done in $\tilde{O}(1)$ time.
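
To see what the storage/time trade-off stated above means at a few sample points, one can simply substitute storage sizes into the claimed bound. The choices of $r$ below are illustrative, not parameters taken from the paper.

$$
t = \tilde{O}(n/r) \;\Longrightarrow\; r \cdot t = \tilde{O}(n), \qquad \text{which meets the lower bound } r \cdot t = \Omega(n) \text{ up to logarithmic factors.}
$$

$$
r = n^{1/4} \Rightarrow t = \tilde{O}(n^{3/4}), \qquad
r = n^{1/2} \Rightarrow t = \tilde{O}(n^{1/2}), \qquad
r = n^{2/3} \Rightarrow t = \tilde{O}(n^{1/3}).
$$

Only the middle point, $r = \tilde{O}(\sqrt{n})$, was matched by prior work according to the abstract; the other two show the regimes where the trade-off is new.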
Giovanni Deligios, Mose Mizrahi Erbes
In the consensus problem, $n$ parties want to agree on a common value, even if some of them are corrupt and arbitrarily misbehave. If the parties have a common input $m$, then they must agree on $m$. Protocols solving consensus assume either a synchronous communication network, where messages are delivered within a known time, or an asynchronous network with arbitrary delays. Asynchronous protocols only tolerate $t_a < n/3$ corrupt parties. Synchronous ones can tolerate $t_s < n/2$ corruptions with setup, but their security completely breaks down if the synchrony assumptions are violated. Network-agnostic consensus protocols, as introduced by Blum, Katz, and Loss [TCC'19], are secure regardless of network conditions, tolerating up to $t_s$ corruptions with synchrony and $t_a$ without, under provably optimal assumptions $t_a \leq t_s$ and $2t_s + t_a < n$. Despite efforts to improve their efficiency, all known network-agnostic protocols fall short of the asymptotic complexity of state-of-the-art purely synchronous protocols. In this work, we introduce a novel technique to compile any synchronous and any asynchronous consensus protocols into a network-agnostic one. This process only incurs a small constant number of overhead rounds, so that the compiled protocol matches the optimal round complexity for synchronous protocols. Under a variety of assumptions, our compiler also preserves the asymptotic communication complexity of state-of-the-art synchronous and asynchronous protocols. Hence, it closes the current efficiency gap between synchronous and network-agnostic consensus.

As a plus, our protocols support $\ell$-bit inputs, and can be extended to achieve communication complexity $O(n^2\kappa + \ell n)$ under the assumptions for which this is known to be possible for purely synchronous protocols.
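
The resilience conditions $t_a \leq t_s$ and $2t_s + t_a < n$ quoted above are easy to explore numerically. The following sketch only checks these two inequalities; the function names are illustrative assumptions and nothing here models the protocol itself.

```python
def network_agnostic_feasible(n, t_s, t_a):
    """Check the conditions t_a <= t_s and 2*t_s + t_a < n."""
    return 0 <= t_a <= t_s and 2 * t_s + t_a < n

def max_async_threshold(n, t_s):
    """Largest t_a compatible with a given t_s, or None if no t_a works."""
    best = None
    for t_a in range(t_s + 1):
        if network_agnostic_feasible(n, t_s, t_a):
            best = t_a
    return best

if __name__ == "__main__":
    n = 10
    for t_s in range(6):
        print(t_s, max_async_threshold(n, t_s))
```

Note that the two conditions already imply the classical bounds: $3t_a \leq 2t_s + t_a < n$ gives $t_a < n/3$, and $2t_s \leq 2t_s + t_a < n$ gives $t_s < n/2$.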
Schuyler Rosefield, abhi shelat, LaKyah Tyner
The folklore approach to designing a threshold variant of symmetric cryptographic algorithms involves applying generic MPC methods to secret sharing techniques: the MPC first combines participant input shares using the secret sharing scheme, and then evaluates the cryptographic function on the reconstructed key. Hardening this to be secure against n − 1 malicious parties requires some mechanism to ensure input consistency, e.g., adding MACs to inputs, which consequently increases the number of inputs and gates to the MPC. In many cases, this extra overhead is substantially more than the underlying cost of evaluating the symmetric cryptographic algorithm. We present a scheme that can convert any suitable maliciously secure dishonest majority boolean-circuit FMPC into a threshold scheme Fthresh with almost no overhead. Specifically, we present an SUC-secure scheme that allows for reactive threshold t-of-n boolean circuit evaluation amongst a group of n parties P, for any t ≤ n, against a malicious adversary that corrupts any number of parties less than the threshold t. Moreover, multiple circuits can be evaluated sequentially with the secret-shared authenticated outputs of a circuit to be used subsequently as inputs for a new circuit by any S ⊆ P of size |S| ≥ t. Building upon the works of Wang et al., Hazay et al., and Yang et al. [WRK17, HSSV17, YWZ20] for dishonest majority FMPC, our key insight is to create threshold versions of the “authenticated bits” used to handle input in these recent n-party garbled circuits protocols. The resulting design incurs a small overhead to produce the reusable “threshold authenticated bits” during preprocessing, and adds no extra communication to evaluate with the authenticated input during the online phase. Using our methods, thresholdizing a boolean circuit has essentially no performance overhead.
For example, to compute HMAC, a full Setup+Eval execution of the (n − 2)-out-of-n thresholdized version is approximately 4% more expensive than the state-of-the-art n-party MPC. In contrast, using the folklore method is approximately 100% more expensive. This is especially true for small circuits such as AES, which has 6800 gates and thus incurs the most overhead for thresholdizing. Simply considering the online Eval cost, our approach can evaluate AES blocks at 2.3 blocks/s with 16 parties, exceeding the baseline MPC cost without preprocessing, and surpassing the folklore method that only achieves 0.33 blocks/s. Ultimately, this result makes threshold boolean circuit MPC as feasible as any MPC application.
Christina Boura, Patrick Derbez, Margot Funk
The AES block cipher is today the most important and most analyzed symmetric algorithm. While all versions of the AES are known to be secure in the single-key setting, this is not the case in the related-key scenario. In this article we try to answer the question of whether the AES would better resist differential-like related-key attacks if the key schedule was different. For this, we search for alternative permutation-based key schedules by extending the work of Khoo et al. at ToSC 2017 and Derbez et al. at SAC 2018. We first show that the model of Derbez et al. was flawed. Then, we develop different approaches together with MILP-based tools to find good permutations that could be used as the key schedule for AES-128, AES-192 and AES-256. Our methods allowed us to find permutations that outperform the permutation exhibited by Khoo et al. for AES-128. Moreover, our new approach based on two MILP models that call one another allowed us to handle a larger search space and thus to search for alternative key schedules for the two bigger versions of AES. This method permitted us to find permutations for AES-192 and AES-256 that provide better resistance to related-key differential attacks. Most importantly, we showed that these variants can resist full-round boomerang attacks.
Andrey Kim, Ahmet Can Mert, Anisha Mukherjee, Aikata Aikata, Maxim Deryabin, Sunmin Kwon, HyungChul Kang, Sujoy Sinha Roy
Recognizing the importance of fast and resource-efficient polynomial multiplication in homomorphic encryption, in this paper, we introduce a novel method that enables integer multiplier-less Number Theoretic Transform (NTT) for computing polynomial multiplication. First, we use a Fermat number as an auxiliary modulus of NTT. However, this approach of using Fermat number scales poorly with the degree of polynomial. Hence, we propose a transformation of a large-degree univariate polynomial into small-degree multi-variable polynomials. After that, we compute these NTTs on small-degree polynomials with Fermat number as modulus. We design an accelerator architecture customized for the novel multivariate NTT and use it for benchmarking practical homomorphic encryption applications. The accelerator can achieve a 1,200× speed-up compared to software implementations. We further discuss the potential and limitations of the proposed polynomial multiplication method in the context of homomorphic encryption.
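
The reason a Fermat-number modulus removes multipliers from the transform can be seen in a toy case. Modulo $F = 2^8 + 1 = 257$ the element 2 has order 16 (since $2^8 \equiv -1$), so a length-16 NTT can use powers of 2 as twiddle factors and every twiddle multiplication becomes a bit shift. The sketch below is a naive quadratic-time transform with parameters chosen for illustration; it is not the multivariate construction or the accelerator from the paper, and the pointwise products between the two transforms still use ordinary multiplications.

```python
F = 257   # Fermat number 2**8 + 1 (toy modulus, assumption)
N = 16    # 2 has multiplicative order 16 modulo 257

def mul_pow2(x, e):
    """Multiply x by 2**e modulo F using a shift followed by a reduction."""
    return (x << (e % N)) % F

def ntt(a, inverse=False):
    """Naive length-N transform with root of unity 2; twiddles are shifts only."""
    sign = -1 if inverse else 1
    out = []
    for k in range(N):
        acc = 0
        for j, aj in enumerate(a):
            acc += mul_pow2(aj, sign * j * k)
        out.append(acc % F)
    if inverse:
        # 1/16 = 2**-4 = 2**12 modulo 257, so the scaling is a shift as well.
        out = [mul_pow2(v, N - 4) for v in out]
    return out

def poly_mul(f, g):
    """Cyclic product of f and g modulo (x**N - 1, F); inputs are coefficient lists."""
    fa = ntt(f + [0] * (N - len(f)))
    ga = ntt(g + [0] * (N - len(g)))
    pointwise = [(x * y) % F for x, y in zip(fa, ga)]  # ordinary multiplications
    return ntt(pointwise, inverse=True)

if __name__ == "__main__":
    # (1 + 2x + 3x^2) * (4 + 5x) = 4 + 13x + 22x^2 + 15x^3
    print(poly_mul([1, 2, 3], [4, 5])[:4])
```

The same toy also hints at the scaling problem mentioned above: a longer transform with shift-only twiddles needs a larger Fermat number, which is the motivation given for splitting a large-degree polynomial into small-degree multivariate pieces.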
