IACR News
If you have a news item you wish to distribute, they should be sent to the communications secretary. See also the events database for conference announcements.
Here you can see all recent updates to the IACR webpage. These updates are also available:
04 May 2023
Luxemburg, Luxemburg, 3 October - 6 October 2023
      Event date: 3 October to 6 October 2023
Submission deadline: 15 May 2023
          
  Submission deadline: 15 May 2023
Berlin, Germany, 23 July - 25 July 2023
      Event date: 23 July to 25 July 2023
Submission deadline: 12 May 2023
Notification: 1 June 2023
  Submission deadline: 12 May 2023
Notification: 1 June 2023
Srivilliputtur, India, 2 January - 7 January 2024
      Event date: 2 January to 7 January 2024
Submission deadline: 30 June 2023
Notification: 30 September 2023
  Submission deadline: 30 June 2023
Notification: 30 September 2023
03 May 2023
Anubhab Baksi, Sylvain Guilley, Ritu-Ranjan Shrivastwa, Sofiane Takarabt
      With the escalating demand for lightweight ciphers as well as side channel protected implementation of those ciphers in recent times, this work focuses on two aspects. First, we present a tool for automating the task of finding a Threshold Implementation (TI) of a given Substitution Box (SBox). Our tool returns `with decomposition' and `without decomposition' based TI. The `with decomposition' based implementation returns a combinational SBox; whereas we get a sequential SBox from the `without decomposition' based implementation. Despite being high in demand, it appears that this kind of tool has been missing so far. Second, we show an algorithmic approach where a given cipher implementation can be tweaked (without altering the cipher specification) so that its TI cost can be significantly reduced. We take the PRESENT cipher as our case study (our methodology can be applied to other ciphers as well). Indeed, we show over 31 percent reduction in area and over 52 percent reduction in depth compared to the basic threshold implementation.
          
  Jung Hee Cheon, Minsik Kang, Taeseong Kim, Junyoung Jung, Yongdong Yeo
      Secure Machine Learning as a Service is a viable
solution where clients seek secure delegation of
the ML computation while protecting their sensi-
tive data. We propose an efficient method to se-
curely evaluate deep standard convolutional neu-
ral networks based on CKKS fully homomorphic
encryption, in the manner of batch inference. In
this paper, we introduce a packing method called
Channel-by-Channel Packing that maximizes the
slot compactness and single-instruction-multiple-
data capabilities in ciphertexts. Along with fur-
ther optimizations such as lazy rescaling, lazy
Baby-Step Giant-Step, and ciphertext level man-
agement, we could significantly reduce the com-
putational cost of standard ResNet inference. Sim-
ulation results show that our work has improve-
ments in amortized time by 5.04× (from 79.46s
to 15.76s) and 5.20×(from 455.56s to 87.60s) for
ResNet-20 and ResNet-110, compared to the pre-
vious best results, resp. We also got a dramatic
reduction in memory usage for rotation keys from
several hundred GBs to 6.91GB, which is about
38× smaller than the previous result.
          
  KeYi LIU, Chungen Xu, Bennian Dou
      Homomorphic encryption can perform calculations on encrypted data, which can protect the privacy of data during the usage of data. Functional Bootstraps algorithm proposed by I. Chillotti et al. can compute arbitrary functions represented as lookup table whilst bootstrapping, but the computational efficiency of F unctional Bootstraps with large lookup table or highly precise functions is not high enough. To tackle this issue, we propose a new Tree-BML algorithm. Our Tree-BML algorithm accelerates the computation of F unctional Bootstraps with large LUT by compressing more LWE ciphertexts to a TRLWE ciphertext and utilizing the PBSmanyLUT algorithm which was proposed by I. Chillotti et al. in 2021. The Tree-BML algorithm reduces the running time of LUT computation by 72.09% compared to Tree-based method(Antonio Guimarães et al., 2021). Additionally, we introduce a new TLWE-to-TRLWE Packing Key Switching algorithm which reduces the storage space required and communication overhead of homomorphic encryption algorithm by only generating one of those key-switching key ciphertexts of polynomials with the same non-zero coefficient values but only those values located in different slots. Our algorithm reduces the key-switching key size by 75% compared to Base-aware TLWE-to-TRLWE Key Switching algorithm. Finally, we obtained that our algorithms does not introduce new output error through theoretical and experiment result.
          
  Benjamin E. Diamond, Jim Posen
      A fundamental result dating to Ligero (CCS '17) establishes that each fixed linear block code exhibits proximity gaps with respect to the collection of affine subspaces, in the sense that each given subspace either resides entirely close to the code, or else contains only a small portion which resides close to the code. In particular, any given subspace's failure to reside entirely close to the code is necessarily witnessed, with high probability, by a uniformly randomly sampled element of that subspace. We investigate a variant of this phenomenon in which the witness is not sampled uniformly from the subspace, but rather from a much smaller subset of it. We show that a logarithmic number of random field elements (in the dimension of the subspace) suffice to effect an analogous proximity test, with moreover only a logarithmic (multiplicative) loss in the possible prevalence of false witnesses. We discuss applications to recent noninteractive proofs based on linear codes, including Brakedown and Orion (CRYPTO '22).
          
  Vlasis Koutsos, Dimitrios Papadopoulos
      We introduce the notion of publicly auditable functional encryption (PAFE).
Compared to standard functional encryption, PAFE operates in an extended setting that includes an entity called auditor, besides key-generating authority, encryptor, and decryptor. 
The auditor requests function outputs from the decryptor and wishes to check their correctness with respect to the ciphertexts produced by the encryptor, without having access to the functional secret key that is used for decryption. 
This is in contrast with previous approaches for result verifiability and consistency in functional encryption that aim to ensure decryptors about the legitimacy of the results they decrypt. We propose four different flavors of public auditability with respect to different sets of adversarially controlled parties (only decryptor, encryptor-decryptor, authority-decryptor, and authority-encryptor-decryptor) and provide constructions for building corresponding secure PAFE schemes from standard functional encryption, commitment schemes, and non-interactive witness-indistinguishable proof systems. 
At the core of our constructions lies the notion of a functional public key, that works as the public analog of the functional secret key of functional encryption and is used for verification purposes by the auditor. 
Crucially, in order to ensure that these new keys cannot be used to infer additional information about plaintext values (besides the requested decryptions by the auditor), we propose a new indistinguishability-based security definition for PAFE to accommodate not only functional secret key queries (as in standard functional encryption) but also functional public key and decryption queries.
Finally, we propose a publicly auditable multi-input functional encryption scheme (MIFE)  that supports inner-product functionalities and is secure against adversarial decryptors. Instantiated with existing MIFE using ``El Gamal''-like ciphertexts and Σ-protocols, this gives a lightweight publicly auditable scheme.
          
  Debadrita Talapatra, Nimish Mishra, Arnab Bag, Sikhar Patranabis, Debdeep Mukhopadhyay
      Encrypted computation has over the past thirty years, turned into one of the holy grails of modern cryptography especially with the advent of cloud computing. Modern cryptographic techniques like Fully Homomorphic Encryption (FHE) allow arbitrary Boolean circuit evaluation with encrypted inputs. However, the prohibitively high computation and storage overhead coupled with high communication bandwidth of
FHE severely limit its scalability in practical applications like real-time analytics or machine learning inference. In summary, the current cryptographic literature lacks robust and scalable methods for efficient encrypted computation in practical outsourced applications.
In this work, we introduce a new approach for encrypted computation called SEC (Symmetric Encryption-based Computation) which offers fast Boolean circuit evaluation with optimal storage and communication overhead while scaling smoothly to real applications. SEC relies on an efficient Searchable Symmetric Encryption (SSE) construction to leverage the power of encrypted lookups in Boolean circuit evaluation. SEC is specifically suited for client-server systems, and the server, honest-but-curious receives the client’s encrypted inputs and outputs the encrypted evaluation result while leaking only benign information to the server. SEC essentially extends the capabilities of SSE schemes from searching over encrypted databases to arbitrary function evaluation over encrypted inputs. SEC supports Boolean function composition, allowing it to evaluate complex functions efficiently without blowing up storage overhead. SEC outperforms the state-of-the-art FHE, namely, Torus FHE (TFHE) scheme with an average 103× speed-up in basic Boolean gate evaluations. We present a prototype implementation of SEC and experimentally validate its practical efficiency. Our experiments show that SEC executes arbitrary depth Boolean circuit in a single round of communication between client and server with a significant improvement in performance than the fastest TFHE backends. We exemplify the applicability of our scheme by implementing one byte AES SBox using SEC and comparing the results with TFHE.
          
  Benny Applebaum, Eliran Kachlon
      Let $C$ be an error-correcting code over a large alphabet $q$ of block length $n$, and assume that, a possibly corrupted, codeword $c$ is distributively stored among $n$ servers where the $i$th entry is being held by the $i$th server. Suppose that every pair of servers publicly announce whether the corresponding coordinates are ``consistent'' with some legal codeword or ``conflicted''. What type of information about $c$ can be inferred from this consistency graph? Can we check whether errors occurred and if so, can we find the error locations and effectively decode? We initiate the study of conflict-checkable and conflict-decodable codes and prove the following main results:
(1) (Almost-MDS conflict-checkable codes:) For every distance $d\leq n$, there exists a code that supports conflict-based error-detection whose dimension $k$ almost achieves the singleton bound, i.e., $k\geq n-d+0.99$. Interestingly, the code is non-linear, and we give some evidence that suggests that this is inherent. Combinatorially, this yields an $n$-partite graph over $[q]^n$ that contains $q^k$ cliques of size $n$ whose pair-wise intersection is at most $n-d\leq k-0.99$ vertices, generalizing a construction of Alon (Random Struct. Algorithms, '02) that achieves a similar result for the special case of triangles ($n=3$).
(2) (Conflict Decodable Codes below half-distance:) For every distance $d\leq n$ there exists a linear code that supports conflict-based error-decoding up to half of the distance. The code's dimension $k$ ``half-meets'' the singleton bound, i.e., $k=(n-d+2)/2$, and we prove that this bound is tight for a natural class of such codes. The construction is based on symmetric bivariate polynomials and is rooted in the literature on verifiable secret sharing (Ben-Or, Goldwasser and Wigderson, STOC '88; Cramer, Damgård, and Maurer, EUROCRYPT '00).
(3) (Robust Conflict Decodable Codes:) We show that the above construction also satisfies a non-trivial notion of robust decoding/detection even when the number of errors is unbounded and up to $d/2$ of the servers are Byzantine and may lie about their conflicts. The resulting conflict-decoder runs in exponential time in this case, and we present an alternative construction that achieves quasipolynomial complexity at the expense of degrading the dimension to $k=(n-d+3)/3$. Our construction is based on trilinear polynomials, and the algorithmic result follows by showing that the induced conflict graph is structured enough to allow efficient recovery of a maximal vertex cover.
As an application of the last result, we present the first polynomial-time statistical two-round Verifiable Secret Sharing (resp., three-round general MPC protocol) that remains secure in the presence of an active adversary that corrupts up to $t
  (1) (Almost-MDS conflict-checkable codes:) For every distance $d\leq n$, there exists a code that supports conflict-based error-detection whose dimension $k$ almost achieves the singleton bound, i.e., $k\geq n-d+0.99$. Interestingly, the code is non-linear, and we give some evidence that suggests that this is inherent. Combinatorially, this yields an $n$-partite graph over $[q]^n$ that contains $q^k$ cliques of size $n$ whose pair-wise intersection is at most $n-d\leq k-0.99$ vertices, generalizing a construction of Alon (Random Struct. Algorithms, '02) that achieves a similar result for the special case of triangles ($n=3$).
(2) (Conflict Decodable Codes below half-distance:) For every distance $d\leq n$ there exists a linear code that supports conflict-based error-decoding up to half of the distance. The code's dimension $k$ ``half-meets'' the singleton bound, i.e., $k=(n-d+2)/2$, and we prove that this bound is tight for a natural class of such codes. The construction is based on symmetric bivariate polynomials and is rooted in the literature on verifiable secret sharing (Ben-Or, Goldwasser and Wigderson, STOC '88; Cramer, Damgård, and Maurer, EUROCRYPT '00).
(3) (Robust Conflict Decodable Codes:) We show that the above construction also satisfies a non-trivial notion of robust decoding/detection even when the number of errors is unbounded and up to $d/2$ of the servers are Byzantine and may lie about their conflicts. The resulting conflict-decoder runs in exponential time in this case, and we present an alternative construction that achieves quasipolynomial complexity at the expense of degrading the dimension to $k=(n-d+3)/3$. Our construction is based on trilinear polynomials, and the algorithmic result follows by showing that the induced conflict graph is structured enough to allow efficient recovery of a maximal vertex cover.
As an application of the last result, we present the first polynomial-time statistical two-round Verifiable Secret Sharing (resp., three-round general MPC protocol) that remains secure in the presence of an active adversary that corrupts up to $t
Michael Mirkin, Lulu Zhou, Ittay Eyal, Fan Zhang
      Cryptocurrencies and decentralized platforms are rapidly gaining traction since Nakamoto's discovery of Bitcoin's blockchain protocol.
These systems use Proof of Work (PoW) to achieve unprecedented security for digital assets.
However, the significant power consumption and ecological impact of PoW are leading policymakers to consider stark measures against them and prominent systems to explore alternatives.
But these alternatives imply stepping away from key security aspects of PoW.
We present Sprints, a blockchain protocol that achieves almost the same security guarantees as PoW blockchains, but with an order-of-magnitude lower ecological impact, taking into account both power and hardware. To achieve this, Sprints forces miners to mine intermittently. It interleaves Proof of Delay (PoD, e.g., using a Verifiable Delay Function) and PoW, where only the latter bares a significant resource expenditure. We prove that in Sprints the attacker's success probability is the same as that of legacy PoW. To evaluate practical performance, we analyze the effect of shortened PoW duration, showing a minor reduction in resilience (49% instead of 50%). We confirm the results with a full implementation using 100 patched Bitcoin clients in an emulated network.
  We present Sprints, a blockchain protocol that achieves almost the same security guarantees as PoW blockchains, but with an order-of-magnitude lower ecological impact, taking into account both power and hardware. To achieve this, Sprints forces miners to mine intermittently. It interleaves Proof of Delay (PoD, e.g., using a Verifiable Delay Function) and PoW, where only the latter bares a significant resource expenditure. We prove that in Sprints the attacker's success probability is the same as that of legacy PoW. To evaluate practical performance, we analyze the effect of shortened PoW duration, showing a minor reduction in resilience (49% instead of 50%). We confirm the results with a full implementation using 100 patched Bitcoin clients in an emulated network.
Junru Li, Pengzhen Ke, Liang Feng Zhang
      An $n$-server information-theoretic \textit{Distributed Point Function} (DPF) allows a client to secret-share a  point function $f_{\alpha,\beta}(x)$ with domain $[N]$ and output group $\mathbb{G}$ among $n$ servers such that each server learns no information about the function from its share (called a key) but can compute an additive share of   $f_{\alpha,\beta}(x)$ for any $x$. DPFs with small key sizes and general output groups  are preferred. In this paper, we propose a new transformation from share conversions to information-theoretic DPFs.  By applying it to share conversions from Efremenko's PIR and Dvir-Gopi PIR,  we obtain both an  8-server  DPF with key size $O(2^{10\sqrt{\log N\log\log N}}+\log p)$ and output group $\mathbb{Z}_p$ and a 4-server DPF with key size $O(\tau \cdot 2^{6\sqrt{\log N\log\log N}})$ and output group $\mathbb{Z}_{2^\tau}$. The former allows us to partially answer an open question by  Boyle,  Gilboa,  Ishai, and  Kolobov (ITC 2022) and the latter allows us to build the first DPFs that may take any finite Abelian groups as output groups. We also discuss how to further reduce the key sizes by using different PIR, how to reduce  the number of servers by resorting to statistical security or using nice integers, and how to obtain DPFs with $t$-security. We  show the applications of the new DPFs by constructing new efficient PIR protocols with result verification.
          
  Jung Hee Cheon, Hyeongmin Choe, Julien Devevey, Tim Güneysu, Dongyeon Hong, Markus Krausz, Georg Land, Marc Möller, Damien Stehlé, MinJune Yi
      We present HAETAE(Hyperball bimodAl modulE rejecTion signAture schemE), a new lattice-based signature scheme, which we submitted to the Korean Post-Quantum Cryptography Competition for standardization. Like the NIST-selected Dilithium signature scheme, HAETAE is based on the Fiat-Shamir with Aborts paradigm,but our design choices target an improved complexity/compactness compromise that is highly relevant for many space-limited application scenarios. We primarily focus on reducing signature and verification key sizes so that signatures fit into one TCP or UDP datagram while preserving a high level of security against a variety of attacks. As a result, our scheme has signature and verification key sizes up to 40% and 25% smaller, respectively, compared than Dilithium. Moreover, we describe how to efficiently protect HAETAE against implementation attacks such as side-channel analysis, making it an attractive candidate for use in IoT and other embedded systems.
          
  01 May 2023
Duhyeong Kim, Dongwon Lee, Jinyeong Seo, Yongsoo Song
      In the last decade, zero-knowledge proof of knowledge protocols have been extensively studied to achieve active security of various cryptographic protocols. However, the existing solutions simply seek zero-knowledge for both message and randomness, which is an overkill in many applications since protocols may remain secure even if some information about randomness is leaked to the adversary.
We develop this idea to improve the state-of-the-art proof of knowledge protocols for RLWE-based public-key encryption and BDLOP commitment schemes. In a nutshell, we present new proof of knowledge protocols without using noise flooding or rejection sampling which are provably secure under a computational hardness assumption, called Hint-MLWE. We also show an efficient reduction from Hint-MLWE to the standard MLWE assumption.
Our approach enjoys the best of two worlds because it has no computational overhead from repetition (abort) and achieves a polynomial overhead between the honest and proven languages. We prove this claim by demonstrating concrete parameters and compare with previous results. Finally, we explain how our idea can be further applied to other proof of knowledge providing advanced functionality.
  We develop this idea to improve the state-of-the-art proof of knowledge protocols for RLWE-based public-key encryption and BDLOP commitment schemes. In a nutshell, we present new proof of knowledge protocols without using noise flooding or rejection sampling which are provably secure under a computational hardness assumption, called Hint-MLWE. We also show an efficient reduction from Hint-MLWE to the standard MLWE assumption.
Our approach enjoys the best of two worlds because it has no computational overhead from repetition (abort) and achieves a polynomial overhead between the honest and proven languages. We prove this claim by demonstrating concrete parameters and compare with previous results. Finally, we explain how our idea can be further applied to other proof of knowledge providing advanced functionality.
Emanuele Bellini, David Gerault, Juan Grados, Yun Ju Huang, Mohamed Rachidi, Sharwan Tiwari
      This paper introduces CLAASP, a Cryptographic Library for the Automated Analysis of Symmetric Primitives.  The library is designed to be modular, extendable, easy to use, generic, efficient and \emph{fully} automated. It is an extensive toolbox gathering state-of-the-art techniques aimed at simplifying the manual tasks of symmetric primitive designers and analysts. CLAASP is built on top of Sagemath and is open-source under the GPLv3 license. The central input of CLAASP is the description of a cryptographic primitive as a list of connected components in the form of a directed acyclic graph. From this representation, the library can automatically:  (1) generate the Python or C code of the primitive evaluation function, (2) execute a wide range of statistical and avalanche tests on the primitive, (3) generate SAT, SMT, CP and MILP models to search, for example, differential and linear trails, (4) measure algebraic properties of the primitive, (5) test neural-based distinguishers. In this work, we also present a comprehensive survey and comparison of other software libraries aiming at similar goals as CLAASP.
          
  Claude Carlet
      The graphs ${\cal G}_F=\{(x,F(x)); x\in \mathbb{F}_2^n\}$ of those $(n,n)$-functions $F:\mathbb{F}_2^n\mapsto \mathbb{F}_2^n$ that are almost perfect nonlinear (in brief, APN; an important notion in symmetric cryptography) are, equivalently to their original definition by K. Nyberg, those Sidon sets  (an important notion in combinatorics) $S$ in $({\Bbb F}_2^n\times {\Bbb F}_2^n,+)$ such that, for every $x\in {\Bbb F}_2^n$, there exists a unique $y\in {\Bbb F}_2^n$ such that $(x,y)\in S$. Any subset of a Sidon set being a Sidon set, an important question is to determine which Sidon sets are maximal relatively to the order of inclusion. In this paper, we study whether the graphs of APN functions are maximal (that is, optimal) Sidon sets. We show that this question is related to the problem of the existence / non-existence of pairs of APN functions lying at distance 1 from each others, and to the related problem of the existence of APN functions of algebraic degree $n$. We revisit the conjectures that have been made on these latter problems.
          
  Benedikt Bünz, Binyi Chen
      We provide a generic, efficient accumulation (or folding) scheme for any $(2k-1)$-move special sound protocol with a verifier that checks $\ell$ degree-$d$ equations. The accumulation verifier only performs $k+d-1$ elliptic curve multiplications and $k+1$ field operations. Alternatively the accumulation verifier performs just $k$ EC multiplications at the cost of $O(\log\ell)$ additional hashes. Using the compiler from BCLMS21 (Crypto 21), this enables building efficient IVC schemes where the recursive circuit only depends on the number of rounds and the verifier degree of the underlying special-sound protocol but not the proof size or the verifier time. We use our generic accumulation compiler to build ProtoStar.  ProtoStar is a non-uniform IVC scheme for Plonk that supports high-degree gates and (vector) lookups. The recursive circuit is dominated by the lesser of either $d^*+1$ group scalar multiplications or $\log(n)+2$ hashes, where $d^*$ is the degree of the highest gate and $n$ is the number of gates in a supported circuit. The scheme does not require a trusted setup or pairings, and the prover does not need to compute any FFTs. The prover in each accumulation/IVC step is also only logarithmic in the number of supported circuits and independent of the table size in the lookup.
          
  Hiroki Furue, Tsuyoshi Takagi
      The enumeration of all outputs of a given multivariate polynomial is a fundamental mathematical problem and is incorporated in some algebraic attacks on multivariate public key cryptosystems. For a degree-$d$ polynomial in $n$ variables over the finite field with $q$ elements, solving the enumeration problem classically requires $O\left(\tbinom{n+d}{d} \cdot q^n\right)$ operations. At CHES 2010, Bouillaguet et al. proposed a fast enumeration algorithm over the binary field $\mathbb{F}_2$. Their proposed algorithm covers all the inputs of a given polynomial following the order of Gray codes and is completed by $O(d\cdot2^n)$ bit operations. However, to the best of our knowledge, a result achieving the equivalent efficiency in general finite fields is yet to be proposed.
In this study, we propose a novel algorithm that enumerates all the outputs of a degree-$d$ polynomial in $n$ variables over $\mathbb{F}_q$ with a prime number $q$ by $O(d\cdot q^n)$ operations. The proposed algorithm is constructed by using a lexicographic order instead of Gray codes to cover all the inputs. This result can be seen as an extension of the result of Bouillaguet et al. to general finite fields and is almost optimal in terms of time complexity. We can naturally apply the proposed algorithm to the case where $q$ is a prime power. Notably, our enumeration algorithm differs from the algorithm by Bouillaguet et al. even in the case of $q=2$.
          
  Jonas Bertels, Michiel Van Beirendonck, Furkan Turan, Ingrid Verbauwhede
      The magic of Fully Homomorphic Encryption (FHE) is that it allows operations on encrypted data without decryption. Unfortunately, the slow computation time limits their adoption. The slow computation time results from the vast memory requirements (64Kbits per ciphertext), a bootstrapping key of 1.3 GB, and sizeable computational overhead (10240 NTTs, each NTT requiring 5120 32-bit multiplications). We accelerate the FHEW bootstrapping in hardware on a high-end U280 FPGA.
To reduce the computational complexity, we propose a fast hardware NTT architecture modified from with support for negatively wrapped convolution. The IP module includes large I/O ports to the NTT accelerator and an index bit-reversal block. The total architecture requires less than 225000 LUTs and 1280 DSPs.
Assuming that a fast interface to the FHEW bootstrapping key is available, the execution speed of FHEW bootstrapping can increase by at least 7.5 times.
  To reduce the computational complexity, we propose a fast hardware NTT architecture modified from with support for negatively wrapped convolution. The IP module includes large I/O ports to the NTT accelerator and an index bit-reversal block. The total architecture requires less than 225000 LUTs and 1280 DSPs.
Assuming that a fast interface to the FHEW bootstrapping key is available, the execution speed of FHEW bootstrapping can increase by at least 7.5 times.
Soham Roy, Anubhab Baksi, Anupam Chattopadhyay
      In this paper, we show an in-place implementation of the ASCON linear layer. An in-place implementation is important in the context of quantum computing, we expect our work will be useful in quantum implementation of ASCON. In order to get the implementation, we first write the ASCON linear layer as a binary matrix; then apply two legacy algorithms (Gauss-Jordan elimination and PLU factorization) as well as our modified version of Xiang et al.'s algorithm/source-code (published in ToSC/FSE'20). Our in-place implementation takes 1595 CNOT gates and 119 quantum depth; and this is the first in-place implementation of the ASCON linear layer, to the best of our knowledge.
          
  