IACR News
Here you can see all recent updates to the IACR webpage. These updates are also available:
20 April 2020
Kristian L. McDonald
ePrint Report
Pointcheval-Sanders (PS) signatures are well studied in the literature and have found use within, e.g., threshold credential schemes and redactable anonymous credential schemes. The present work leverages a mapping between PS signatures and a related class of polynomial-based signatures to construct multiple new signature/credential schemes. Specifically, new protocols for multi-message signatures, sequential aggregate signatures, signatures for message commitments, redactable signatures, and unlinkable redactable signatures are presented. A redactable anonymous credential scheme is also constructed. All of the new protocols employ constant-sized secret keys rather than keys of size linear in the number of messages/attributes. Security properties of the new protocols are analysed, and a general discussion of security properties for both PS signatures and the new schemes is provided.
Kristian L. McDonald
ePrint Report
Variant secret sharing schemes deriving from Shamir's threshold secret sharing protocol are presented. Results include multi-secret sharing protocols using shares with $O(1)$ elements, independent of the number of secrets. The new schemes achieve a weaker notion of security (they are secure rather than strongly secure) but maintain a property called $K$-privacy (inspired by $k$-anonymity). $K$-privacy ensures that all secrets remain private with respect to a subset of the secret space, though the particular subset providing privacy may vary among adversaries that acquire distinct sub-threshold sets of shares. Depending on the number of secrets and the protocol details, secure $K$-private multi-secret sharing schemes may be ``almost'' strongly secure or may remain merely secure and $K$-private, a difference captured by the notion of $K$-security. Novel applications of the multi-secret sharing schemes are presented, realising a primitive called a switched threshold signature. Switched threshold signatures have the quirky property that aggregating a threshold number of signatures of one type (e.g. Pointcheval-Sanders signatures) ``switches'' the signatures into a master signature of a different type. Collectively, these results may permit efficiencies within, e.g., threshold credential issuance protocols.
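For context, the following is a minimal sketch of the classical Shamir $(t, n)$ threshold scheme that these variant schemes build on, using a toy prime field; it does not implement the $K$-private multi-secret constructions described above, and the modulus and parameters are illustrative assumptions only.

```python
import random

P = 2**61 - 1  # toy prime field modulus (illustrative assumption, not from the paper)

def shamir_share(secret, t, n):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    # Share i is the degree-(t-1) polynomial evaluated at x = i, for i = 1..n.
    return [(i, sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P)
            for i in range(1, n + 1)]

def shamir_reconstruct(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = shamir_share(secret=123456789, t=3, n=5)
assert shamir_reconstruct(shares[:3]) == 123456789
```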
Amir Jafari, Shahram Khazaei
ePrint Report
The information ratio of an access structure is an important parameter for quantifying the efficiency of the best secret sharing scheme (SSS) realizing it. The most common security notion is perfect security. The following relaxations, in increasing level of security, have been presented in the literature: quasi-perfect, almost-perfect and statistical. Understanding the power of relaxing the correctness and privacy requirements in the efficiency of SSSs is a long-standing open problem.
In this article, we introduce and study an extremely relaxed security notion, called partial security, for which it is only required that any qualified set gains strictly more information about the secret than any unqualified one. To compensate for this extreme imperfection, we quantify the efficiency of such schemes using a parameter called the partial information ratio. Despite this compensation, partial security turns out to be weaker than the weakest non-perfect security notion mentioned above, i.e., quasi-perfect security.
We present three main results in this paper. First, we prove that partial and perfect information ratios coincide for the class of linear SSSs. Consequently, for this class, information ratio is invariant with respect to all security notions. Second, by viewing a partial SSS as a wiretap channel, we prove that for the general (i.e., non-linear) class of SSSs, partial and statistical information ratios are equal. Consequently, for this class, information ratio is invariant with respect to all non-perfect security notions. Third, we show that partial and almost-perfect information ratios do not coincide for the class of mixed-linear schemes (i.e., schemes constructed by combining linear schemes with different underlying finite fields).
Our first result strengthens the previous decomposition theorems for constructing perfect linear schemes. Our second result leads to a very strong decomposition theorem for constructing general (i.e., non-linear) statistical schemes. Our third result provides a rare example of the effect of imperfection on the efficiency of SSSs for a certain class of schemes.
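For reference, the efficiency measure referred to above is standardly defined as follows (this is the usual perfect-security formulation; the paper's partial information ratio adapts it, and the exact normalization used there is not reproduced here): for a scheme $\Pi$ with secret $S$ and share random variables $S_p$, $p \in P$, realizing an access structure $\Gamma$, $$\sigma(\Pi) = \max_{p \in P} \frac{H(S_p)}{H(S)}, \qquad \sigma(\Gamma) = \inf_{\Pi \text{ realizes } \Gamma} \sigma(\Pi).$$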
Asma Aloufi, Peizhao Hu, Yongsoo Song, and Kristin Lauter
ePrint Report
New cryptographic techniques such as homomorphic encryption (HE) allow computations to be outsourced to and evaluated blindfolded by a resource-rich cloud. These computations often involve private data owned by multiple participants who engage in the joint evaluation of some function. For example, Genome-Wide Association Studies (GWAS) are becoming feasible thanks to the recent proliferation of genome sequencing technology. Due to the sensitivity of genomic data, data belonging to different owners should be encrypted under different keys. However, supporting computation on ciphertexts encrypted under multiple keys is a non-trivial task. In this paper, we present a comprehensive survey of the state-of-the-art cryptographic techniques and schemes commonly used for this purpose. We review techniques and schemes including Attribute-Based Encryption (ABE), Proxy Re-Encryption (PRE), Threshold Homomorphic Encryption (ThHE), and Multi-Key Homomorphic Encryption (MKHE). We analyze them under different system and security models and examine their complexities. We share lessons learned and draw observations for designing better schemes with reduced overheads.
19 April 2020
Tim Fritzmann, Georg Sigl, Johanna Sepúlveda
ePrint Report
Empowering electronic devices to support Post-Quantum Cryptography (PQC) is a challenging task. Compared with traditional cryptography, PQC introduces new mathematical elements and operations which are usually not easy to implement on standard CPU architectures. Especially for low-cost and resource-constrained devices, hardware acceleration is absolutely required. In addition, as the standardization process of PQC is still ongoing, a focus on maintaining crypto-agility is mandatory. To cope with such requirements, Hardware/Software Co-Design techniques have recently been used for developing complex and highly customized PQC solutions. However, while most previous works have developed loosely coupled PQC accelerators, the design of tightly coupled accelerators and Instruction Set Architecture (ISA) extensions for PQC has barely been explored. To this end, we present RISQ-V, an enhanced RISC-V architecture that integrates a set of powerful tightly coupled accelerators to speed up lattice-based PQC. RISQ-V efficiently reuses processor resources and reduces the number of memory accesses. This significantly increases the performance while keeping the silicon area overhead low. We present three contributions. First, we propose a set of powerful hardware accelerators deeply integrated into the RISC-V pipeline. Second, we extend the RISC-V ISA with 28 new instructions to efficiently perform operations for lattice-based cryptography. Third, we implement RISQ-V in ASIC technology and on an FPGA. We evaluate the performance of NewHope, Kyber, and Saber on RISQ-V. Compared to the pure software implementation on RISC-V, our Co-Design implementations show speedup factors of up to 10.5 for NewHope, 9.6 for Kyber, and 2.7 for Saber. For the ASIC implementation, the energy consumption was reduced by factors of up to 8.8 for NewHope, 7.7 for Kyber, and 2.1 for Saber. The cell count of the CPU increased by a factor of 1.6 compared to the original RISC-V design, which can be considered a moderate increase for the achieved performance gain.
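As a rough illustration of the kind of kernel such tightly coupled accelerators and ISA extensions target, the following toy Python sketch multiplies polynomials via a number-theoretic transform written out by the DFT definition. The parameters (Q = 257, N = 16) and the plain cyclic convolution are illustrative assumptions, not the negacyclic NTTs or moduli actually used by NewHope, Kyber, or Saber.

```python
# Toy number-theoretic transform (NTT), to illustrate the modular
# multiply-accumulate workload that lattice accelerators speed up.
Q, N = 257, 16                     # 257 is prime and N divides Q - 1
W = pow(3, (Q - 1) // N, Q)        # primitive N-th root of unity mod Q

def ntt(a, w=W):
    """Forward transform by the DFT definition (O(N^2), for clarity only)."""
    return [sum(a[i] * pow(w, i * j, Q) for i in range(N)) % Q for j in range(N)]

def intt(a_hat):
    """Inverse transform: use the inverse root and scale by N^{-1} mod Q."""
    n_inv = pow(N, Q - 2, Q)
    return [x * n_inv % Q for x in ntt(a_hat, pow(W, Q - 2, Q))]

def cyclic_mul(a, b):
    """Schoolbook multiplication modulo x^N - 1 (reference implementation)."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            c[(i + j) % N] = (c[(i + j) % N] + a[i] * b[j]) % Q
    return c

a = [i % Q for i in range(N)]
b = [(3 * i + 1) % Q for i in range(N)]
pointwise = [x * y % Q for x, y in zip(ntt(a), ntt(b))]
assert intt(pointwise) == cyclic_mul(a, b)
```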
Thomas Agrikola, Geoffroy Couteau, Yuval Ishai, Stanislaw Jarecki, Amit Sahai
ePrint Report
We initiate a systematic study of pseudorandom encodings: efficiently computable and decodable encoding functions that map messages from a given distribution to a random-looking distribution. For instance, every distribution that can be perfectly compressed admits such a pseudorandom encoding. Pseudorandom encodings are motivated by a variety of cryptographic applications, including password-authenticated key exchange, "honey encryption" and steganography.
The main question we ask is whether every efficiently samplable distribution admits a pseudorandom encoding. Under different cryptographic assumptions, we obtain positive and negative answers for different flavors of pseudorandom encodings and relate this question to problems in other areas of cryptography. In particular, by establishing a two-way relation between pseudorandom encoding schemes and efficient invertible sampling algorithms, we reveal a connection between adaptively secure multi-party computation and questions in the domain of steganography.
Satō Shinichi
ePrint Report
This paper revisits the Abe--Okamoto signature scheme and presents a version augmented with modern best practices, with major influences taken from EdDSA. Implementation guidance is given on how to reuse existing Ed25519 code.
Okan Seker, Thomas Eisenbarth, Maciej Liskiewicz
ePrint Report
White-box cryptography attempts to protect cryptographic secrets in pure software implementations. Due to their high utility, white-box cryptosystems (WBC) are deployed even though their secure construction is not well understood. A major breakthrough in the generic cryptanalysis of WBC was Differential Computation Analysis (DCA), which requires minimal knowledge of the underlying white-box protection and also thwarts many obfuscation methods. To avert DCA, classic masking countermeasures, originally intended to protect against the closely related side-channel attacks, have been proposed for use in WBC. However, due to the controlled environment of WBCs, new algebraic attacks able to break all classic masking schemes have quickly been found. These algebraic DCA attacks break classic masking countermeasures efficiently, as they are independent of the masking order.
In this work, we propose a novel generic masking scheme that can resist both DCA and algebraic attacks. The proposed scheme extends the seminal work by Ishai et al., which is probing secure and thus resists DCA, to also resist algebraic attacks. To prove the security of our scheme, we demonstrate the connection between two main security notions in white-box cryptography: Side Channel Analysis (SCA) security and prediction security. Resistance of our masking scheme to DCA is proven for an arbitrary order of protection. Our masking scheme also resists algebraic attacks, which we show concretely for first- and second-order algebraic protection, and we show how it can be generalized to any order. Moreover, we present an extensive performance analysis and quantify the overhead of our scheme for a proof-of-concept protection of an AES implementation.
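For background, here is a minimal sketch of Boolean (XOR) sharing and the Ishai-Sahai-Wagner (ISW) AND gadget that the abstract cites as the probing-secure starting point; the algebraically secure extension proposed in the paper is not reproduced, and the share count below is an arbitrary example.

```python
import secrets
from functools import reduce

def share(bit, n):
    """Split one bit into n XOR shares."""
    s = [secrets.randbits(1) for _ in range(n - 1)]
    return s + [reduce(lambda a, b: a ^ b, s, bit)]

def unshare(shares):
    return reduce(lambda a, b: a ^ b, shares)

def isw_and(a, b):
    """ISW multiplication gadget: returns shares of (unshare(a) AND unshare(b))."""
    n = len(a)
    c = [a[i] & b[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbits(1)
            c[i] ^= r
            c[j] ^= r ^ (a[i] & b[j]) ^ (a[j] & b[i])
    return c

a, b = share(1, 3), share(1, 3)      # 3 shares, i.e., second-order masking
assert unshare(isw_and(a, b)) == (unshare(a) & unshare(b))
```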
Alon Rosen
ePrint Report
Fine-grained cryptography is concerned with adversaries that are only moderately more powerful than the honest parties. We will survey recent results in this relatively underdeveloped area of study and examine whether the time is ripe for further advances in it.
Yonglin Hao, Gregor Leander, Willi Meier, Yosuke Todo, Qingju Wang
ePrint Report
The division property is a generic tool for finding integral distinguishers, and automatic tools such as MILP or SAT/SMT solvers allow us to evaluate its propagation efficiently.
In the application to stream ciphers, it enables us to estimate the security of cube attacks theoretically, and it leads to the best key-recovery attacks against well-known stream ciphers.
However, it was reported that some of the key-recovery attacks based on the division property degenerate to distinguishing attacks due to the inaccuracy of the division property.
The three-subset division property (without unknown subset) is a promising method to solve this inaccuracy problem, and a new algorithm using automatic tools for the three-subset division property was recently proposed at Asiacrypt 2019.
In this paper, we first show that this state-of-the-art algorithm is not always efficient and does not allow us to improve the existing key-recovery attacks. We then focus on the feature of the three-subset division property without unknown subset and propose another new efficient algorithm using automatic tools. Our algorithm is more efficient than existing algorithms, and it can improve existing key-recovery attacks. In the application to Trivium, we show an 842-round key-recovery attack. We also show that an 855-round key-recovery attack, which was proposed at CRYPTO 2018, has a critical flaw and does not work. As a result, our 842-round attack becomes the best key-recovery attack. In the application to Grain-128AEAD, we show that the known 184-round key-recovery attack degenerates to a distinguishing attack. We then improve the distinguishing attacks up to 189 rounds and also show the best key-recovery attack, against 190 rounds. In the application to ACORN, we prove that the 772-round key-recovery attack at ISC 2019 is in fact a constant-sum distinguisher. We then give new key-recovery attacks on 773- and 774-round ACORN. We also verify the current best key-recovery attack on 892-round Kreyvium and recover the exact superpoly.
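To make the cube-attack setting concrete, the toy sketch below sums a made-up Boolean output function over a chosen cube of public variables, which isolates the superpoly in the secret bits; determining such superpolys for real ciphers like Trivium is exactly what the (three-subset) division property models. The function `f`, the cube, and the variable indices are all hypothetical.

```python
from itertools import product

def f(x, k):
    """Toy output bit: public bits x[0..2], secret bits k[0..1]."""
    return (x[0] & x[1] & k[0]) ^ (x[0] & x[2]) ^ (x[1] & k[1]) ^ k[0] ^ 1

def cube_sum(cube, fixed, key):
    """XOR-sum f over all assignments of the cube variables."""
    acc = 0
    for bits in product([0, 1], repeat=len(cube)):
        x = dict(fixed)
        x.update(zip(cube, bits))
        acc ^= f([x[0], x[1], x[2]], key)
    return acc

# Cube {x0, x1} with x2 fixed to 0: the cube sum equals the superpoly k0 for every key.
for key in product([0, 1], repeat=2):
    assert cube_sum(cube=[0, 1], fixed={2: 0}, key=list(key)) == key[0]
```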
Hao Chen
ePrint Report
Since 2010, Ring-LWE has been a central hard computational problem underlying lattice cryptographic constructions. Its hardness has been based on the conjectured hardness of approximating ideal-SIVP or ideal-SVP. Though both are now widely conjectured to be hard in the classical and quantum computation models, no sufficient attacks have been proposed and considered. In this paper we propose sublattice attacks on Ring-LWE over an arbitrary number field from sublattice pairs. We give a sequence of number fields of degrees going to infinity such that decision Ring-LWE with very wide error distributions over their rings of integers can be solved by a polynomial-time algorithm based on our sublattice attack. The widths of the error distributions in our attack are within the range of the hardness reduction results. Hence we also prove that approximating ideal-SIVP with some polynomial factor for ideal lattices in these number fields can be solved by a polynomial-time quantum algorithm.
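For readers unfamiliar with the setting, the sketch below generates toy decision Ring-LWE samples in the simplest power-of-two cyclotomic ring $\mathbb{Z}_q[x]/(x^N+1)$; the parameters are arbitrary small values, and the sketch does not capture the general number fields or the sublattice structure exploited by the attack.

```python
import random

N, Q, BOUND = 8, 12289, 2   # toy parameters (assumptions, not from the paper)

def poly_mul(a, b):
    """Multiplication in Z_q[x] / (x^N + 1) (negacyclic convolution)."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            if k < N:
                c[k] = (c[k] + a[i] * b[j]) % Q
            else:
                c[k - N] = (c[k - N] - a[i] * b[j]) % Q
    return c

def small_poly():
    """Narrow secret/error polynomial (uniform in a small interval here)."""
    return [random.randint(-BOUND, BOUND) % Q for _ in range(N)]

def rlwe_sample(s):
    """One Ring-LWE sample (a, b = a*s + e); decision RLWE asks to tell it from uniform."""
    a = [random.randrange(Q) for _ in range(N)]
    e = small_poly()
    b = [(x + y) % Q for x, y in zip(poly_mul(a, s), e)]
    return a, b

s = small_poly()
a, b = rlwe_sample(s)
```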
Hiroshi Onuki, Yusuke Aikawa, Tsuyoshi Takagi
ePrint Report
In this paper, we consider the structure of isogeny graphs in SIDH, an isogeny-based key-exchange protocol. SIDH is the underlying protocol of SIKE, one of the candidates in the NIST post-quantum cryptography standardization process. Since the security of SIDH is based on the hardness of the path-finding problem in isogeny graphs, it is important to study their structure. The existence of cycles in an isogeny graph is related to the path-finding problem, so we investigate cycles in the graphs used in SIKE. In particular, we focus on SIKEp434 and SIKEp503, the parameter sets of SIKE claimed to satisfy NIST security levels 1 and 2, respectively. We show that there are two cycles in the 3-isogeny graph of SIKEp434, and that there are no cycles in the other graphs of SIKEp434 and SIKEp503.
Yin Li, Yu Zhang, Wei He
ePrint Report
We continue the study of the Mastrovito form of Karatsuba multipliers under the shifted polynomial basis (SPB), recently introduced by Li et al. (IEEE TC, 2017).
A Mastrovito-Karatsuba (MK) multiplier utilizes the Karatsuba algorithm (KA) to optimize polynomial multiplication and the Mastrovito approach to combine it with the modular reduction. The authors developed an MK multiplier for all trinomials, which obtains a better space-time trade-off compared with previous non-recursive Karatsuba counterparts.
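For intuition, the sketch below shows the plain Karatsuba split over GF(2)[x] (polynomials packed into Python integers, with XOR as addition) that the KA component of an MK multiplier applies; the Mastrovito-style combined modular reduction for trinomials and pentanomials is not shown, and the operand width and base case are arbitrary choices.

```python
def clmul(a, b):
    """Schoolbook carry-less (GF(2)[x]) multiplication of bit-packed polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def karatsuba_gf2(a, b, w):
    """One Karatsuba recursion level: split w-bit operands into halves of w//2 bits."""
    if w <= 8:                       # small base case handled by schoolbook
        return clmul(a, b)
    h = w // 2
    mask = (1 << h) - 1
    a0, a1 = a & mask, a >> h
    b0, b1 = b & mask, b >> h
    lo = karatsuba_gf2(a0, b0, h)
    hi = karatsuba_gf2(a1, b1, h)
    mid = karatsuba_gf2(a0 ^ a1, b0 ^ b1, h) ^ lo ^ hi
    return lo ^ (mid << h) ^ (hi << (2 * h))

a, b = 0b1011011101, 0b1100101       # toy operands of degree < 16
assert karatsuba_gf2(a, b, 16) == clmul(a, b)
```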
Based on this work, we make two types of contributions in our paper.
FORMULATION. We derive a new modular reduction formulation for constructing the Mastrovito matrix associated with Type II pentanomials. This formula can also be applied to other special types of pentanomials, e.g. Type I and Type C.1 pentanomials. Through the related formulations, we demonstrate that the Type I pentanomial is less efficient than the Type II one because of a more complicated modular reduction under the same SPB; conversely, the Type C.1 pentanomial is as good as the Type II pentanomial under an alternative generalized polynomial basis (GPB).
EXTENSION. We introduce a new MK multiplier for Type II pentanomials. It is shown that our proposal is only one $T_X$ slower than the fastest bit-parallel multipliers for Type II pentanomials, but its space complexity is roughly 3/4 that of those schemes, where $T_X$ is the delay of one 2-input XOR gate. To the best of our knowledge, this is the first time a hybrid multiplier has achieved such a time-delay bound.
Mike Hamburg
ePrint Report
The Montgomery ladder and Joye double-add ladder are well-known algorithms for elliptic curve scalar multiplication with a regular structure. The Montgomery ladder is best known for its implementation on Montgomery curves, which requires $5\mathbf{M} + 4\mathbf{S} + 1\mathbf{m} + 8\mathbf{A}$ per bit and 6 field registers. Here $(\mathbf{M},\mathbf{S},\mathbf{m},\mathbf{A})$ represent respectively field multiplications, squarings, multiplications by a curve constant, and additions or subtractions. This ladder is also complete, meaning that it works on all input points and all scalars.
Many protocols do not use Montgomery curves, but instead use prime-order curves in short Weierstrass form. As of 2011, the fastest formulas for the Montgomery and Joye ladders on these curves each required $9\mathbf{M}+5\mathbf{S}+18\mathbf{A}$ per bit. In 2017, Kim et al. improved this for the Montgomery ladder only to $8\mathbf{M}+4\mathbf{S}+12\mathbf{A}+1\mathbf{H}$ per bit using 9 registers, where the $\mathbf{H}$ represents a halving. Hamburg simplified Kim et al.'s formulas to $8\mathbf{M}+4\mathbf{S}+8\mathbf{A}+1\mathbf{H}$ using 6 registers.
Here we present improved formulas which compute the Montgomery ladder on short Weierstrass curves using $8\mathbf{M}+3\mathbf{S}+7\mathbf{A}$ per bit, and requiring 6 registers. We also give formulas for the Joye ladder that use $9\mathbf{M}+3\mathbf{S}+7\mathbf{A}$ per bit, requiring 5 registers.
We also show a novel technique to make these ladders complete when the curve order is not divisible by 2 or 3, at a modest increase in cost.
Finally, we discuss curve invariants, exceptional points, side-channel protection and how to set up and finish these ladder operations.
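For reference, the following toy Python sketch implements the classical Montgomery-curve ladder mentioned at the start of this abstract, in RFC 7748 / Curve25519 style; each loop iteration performs the ladder step whose cost of $5\mathbf{M} + 4\mathbf{S} + 1\mathbf{m} + 8\mathbf{A}$ is quoted above. It is not constant-time and does not reproduce the paper's improved short-Weierstrass formulas.

```python
P = 2**255 - 19        # Curve25519 field prime
A24 = 121665           # (A - 2) / 4 for Curve25519 (the "curve constant" m multiplies)

def cswap(swap, a, b):
    return (b, a) if swap else (a, b)

def montgomery_ladder(k, x1):
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in reversed(range(255)):
        bit = (k >> t) & 1
        swap ^= bit
        x2, x3 = cswap(swap, x2, x3)
        z2, z3 = cswap(swap, z2, z3)
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P              # 2A
        aa, bb = a * a % P, b * b % P                    # 2S
        e = (aa - bb) % P                                # 1A
        c, d = (x3 + z3) % P, (x3 - z3) % P              # 2A
        da, cb = d * a % P, c * b % P                    # 2M
        x3 = (da + cb) % P; x3 = x3 * x3 % P             # 1A + 1S
        z3 = (da - cb) % P; z3 = x1 * (z3 * z3 % P) % P  # 1A + 1S + 1M
        x2 = aa * bb % P                                 # 1M
        z2 = e * ((aa + A24 * e) % P) % P                # 1A + 1m + 1M
    x2, x3 = cswap(swap, x2, x3)
    z2, z3 = cswap(swap, z2, z3)
    return x2 * pow(z2, P - 2, P) % P                    # affine x-coordinate

k = (1 << 254) + 8 * 1234567         # an X25519-clamped toy scalar
print(hex(montgomery_ladder(k, 9)))  # scalar multiple of the base point x = 9
```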
Houssem Maghrebi
ePrint Report
Deep Learning based Side-Channel Attacks (DL-SCA) are an emerging security assessment method increasingly adopted by certification schemes and certification bodies to assess the resistance of cryptographic implementations. The related published investigations have demonstrated that DL-SCA are very efficient when targeting cryptographic designs protected with the common side-channel countermeasures. Furthermore, these attacks streamline the evaluation process, as pre-processing of the traces (\emph{e.g.} alignment, dimensionality reduction, \dots) is no longer mandatory. In practice, DL-SCA are applied following a divide-and-conquer strategy such that the target, for both the training and the attack phases, depends on at most $8$ key bits (to avoid high time complexity, especially during training). The same process is then repeated to recover the remaining bits of the key. To mitigate this practical issue, we propose in this work a new profiling methodology for DL-SCA based on so-called multi-label classification. We argue that our new profiling methodology allows applying DL-SCA to target a bigger chunk of the key (typically $16$ bits) without introducing a learning-time overhead and while guaranteeing a similar attack efficiency compared to the commonly used training strategy.
As a side benefit, we demonstrate that our learning strategy can also be applied to train several intermediate operations at once. Interestingly, we show that, in this context, our methodology is even faster than the classical training and leads to a more efficient key-recovery phase. We validated the soundness of our proposal on simulated traces and experimental data-sets, among them publicly available side-channel databases. The obtained results prove that our profiling methodology is of great practical interest, especially in the context of penetration tests with high attack potential (\emph{e.g.} Common Criteria, EMVCo), where the time required to perform the attack has an impact on its final rating.
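The sketch below illustrates the multi-label idea on hypothetical data: one sigmoid output per predicted bit trained with binary cross-entropy, instead of one 256-way softmax per key byte. The arrays `traces` and `labels`, the trace length, the number of label bits, and the MLP architecture are all placeholder assumptions, not the exact setup from the paper.

```python
import torch
import torch.nn as nn

N_SAMPLES, N_LABELS = 700, 16          # trace length and number of label bits (assumed)
traces = torch.randn(1000, N_SAMPLES)  # placeholder profiling traces
labels = torch.randint(0, 2, (1000, N_LABELS)).float()  # placeholder bit labels

# Multi-label head: one sigmoid output per bit, trained with binary cross-entropy.
model = nn.Sequential(
    nn.Linear(N_SAMPLES, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, N_LABELS),
)
loss_fn = nn.BCEWithLogitsLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):
    opt.zero_grad()
    loss = loss_fn(model(traces), labels)
    loss.backward()
    opt.step()

# Attack phase (sketch): per-bit probabilities for a new trace.
probs = torch.sigmoid(model(traces[:1]))
```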
17 April 2020
Durham, USA, 16 November - 19 November 2020
TCC
Event date: 16 November to 19 November 2020
Submission deadline: 19 May 2020
Notification: 4 September 2020
CHES
Due to the coronavirus pandemic, its consequences and the uncertainty regarding when it will be under control and eventually stopped, the CHES Steering Committee, in coordination with the local organizers for 2020 and 2021 as well as with the IACR Board of Directors, decided today to turn CHES 2020 into a virtual event and to postpone the schedule for physical conferences by one year. That means:
- CHES 2020 online,
- CHES 2021 Beijing.
Please note that the TCHES publication cycle is not affected by this decision.
We apologize for any inconvenience caused and hope that we can all enjoy CHES 2021 in Beijing under normal conditions. In the meantime, please stay tuned for updates concerning the first virtual CHES 2020 in the coming weeks!
The CHES Steering Committee
15 April 2020
Riham AlTawy, Guang Gong, Kalikinkar Mandal, Raghvendra Rohit
ePrint Report
This paper presents WAGE, a new lightweight sponge-based authenticated cipher whose underlying permutation is based on a 37-stage Galois NLFSR over $\mathbb{F}_{2^7}$. At its core, the round function of the permutation consists of the well-analyzed Welch-Gong permutation (WGP), a primitive feedback polynomial, a newly designed 7-bit S-box SB, and partial word-wise XORs. The construction of the permutation is carried out such that the design of the individual components is highly coupled with cryptanalysis and hardware efficiency. As such, we analyze the security of WAGE against differential, linear, algebraic and meet/miss-in-the-middle attacks. For 128-bit authenticated encryption security, WAGE achieves a throughput of 535 Mbps with a hardware area of 2540 GE in the ST Micro 90 nm ASIC standard cell library. Additionally, WAGE is designed with a twist: its underlying permutation can be efficiently turned into a pseudorandom bit generator based on the WG transformation (WG-PRBG) whose output bits have theoretically proven randomness properties.
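For orientation, the sketch below shows the generic duplex-sponge encryption pattern that a sponge-based AEAD such as WAGE instantiates, with a placeholder permutation (SHA-256 here) standing in for the actual WAGE permutation; the rate, capacity, key and nonce sizes are illustrative assumptions, and associated data and decryption are omitted.

```python
import hashlib

RATE, STATE = 8, 32          # toy rate/state split in bytes (not WAGE's real sizes)

def permute(state):
    """Placeholder permutation (SHA-256 of the state) standing in for WAGE's."""
    return hashlib.sha256(state).digest()

def duplex_encrypt(key, nonce, msg):
    # Initialize the state by absorbing key and nonce, then permute.
    state = permute(key + nonce + b'\x00' * (STATE - len(key) - len(nonce)))
    ct = bytearray()
    for i in range(0, len(msg), RATE):
        block = msg[i:i + RATE]
        keystream = state[:len(block)]
        c = bytes(m ^ k for m, k in zip(block, keystream))
        ct += c
        # Absorb the ciphertext block into the rate part, keep the capacity part.
        state = permute(c + state[len(c):])
    tag = permute(state)[:16]
    return bytes(ct), tag

key, nonce = b'k' * 16, b'n' * 16
ciphertext, tag = duplex_encrypt(key, nonce, b'hello, sponge AEAD!')
```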
Sujoy Sinha Roy, Andrea Basso
ePrint Report
In this paper, we present an instruction set coprocessor architecture for the module lattice-based post-quantum key encapsulation mechanism (KEM) Saber. To achieve fast computation time, the architecture is full-hardware, i.e., all the building blocks (including the CCA transformations) are implemented in hardware. Since polynomial multiplication plays a performance-critical role in module and ideal lattice-based public-key cryptography, a parallel polynomial multiplier architecture is proposed that overcomes memory access bottlenecks and results in a highly parallel yet simple and easy-to-scale design. Besides optimizing polynomial multiplication, we make important design decisions and perform architectural optimizations to reduce the overall cycle count as well as improve resource utilization.
For module dimension 3 (security comparable to AES-192), the coprocessor computes CCA key generation, encapsulation, and decapsulation in only 5,453, 6,618 and 8,034 cycles, respectively. On a Xilinx UltraScale+ XCZU9EG-2FFVB1156 FPGA, the entire instruction set coprocessor architecture runs at a 250 MHz clock frequency and consumes 23,708 LUTs, 9,764 FFs, and 2 BRAM tiles (including 5,124 LUTs and 3,070 FFs for the Keccak core).
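To give a feel for the workload the parallel multiplier accelerates, the toy sketch below computes a rank-3 module matrix-vector product of polynomials in $\mathbb{Z}_q[x]/(x^N+1)$ by schoolbook negacyclic multiplication; the ring degree is shrunk to 16 and Saber's rounding, secret sampling and CCA transformations are omitted, so only Q = 8192 and the rank of 3 mirror the dimension-3 parameter set.

```python
import random

N, Q, RANK = 16, 8192, 3   # toy ring degree; Q and RANK mirror Saber's dimension-3 set

def negacyclic_mul(a, b):
    """Schoolbook product in Z_q[x]/(x^N + 1); this hot loop is what hardware parallelizes."""
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k, s = (i + j) % N, -1 if i + j >= N else 1
            c[k] = (c[k] + s * ai * bj) % Q
    return c

def matvec(A, s):
    """b = A * s over the module R_q^RANK (coefficient-wise sum of ring products)."""
    return [[sum(col) % Q for col in
             zip(*[negacyclic_mul(A[r][c], s[c]) for c in range(RANK)])]
            for r in range(RANK)]

A = [[[random.randrange(Q) for _ in range(N)] for _ in range(RANK)] for _ in range(RANK)]
s = [[random.randrange(-4, 5) % Q for _ in range(N)] for _ in range(RANK)]
b = matvec(A, s)
```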
Martin Westerkamp, Jacob Eberhardt
ePrint Report
We facilitate trusted cross-blockchain state proofs by implementing a chain-relay that validates block headers from proof-of-work blockchains. While current approaches require proof sizes linear in the number of blocks the state was built on, trusted intermediaries, or economic assumptions, we propose the use of off-chain computations through zkSNARKs to provide a cryptographically secure and highly scalable sidechain mechanism. Multiple block headers are included in batches and verified off-chain, while preserving light-client support. Only the validity of the off-chain computation is verified on-chain, creating a sidechain mechanism that requires constant verification costs and relieves the target ledger from processing and storing every single block header of the source blockchain. We provide a prototypical implementation that verifies 504 Bitcoin headers in a single proof on Ethereum using the ZoKrates framework. Thereby, verification costs are reduced by a factor of 187 compared to current approaches such as BTC Relay.
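For illustration, here is the kind of statement such a batched proof attests to, written in plain Python rather than as a zkSNARK circuit: every header in a batch hashes below its encoded target and links to its predecessor. The field offsets follow the standard 80-byte Bitcoin header layout; difficulty-retargeting checks and the actual ZoKrates circuit are omitted.

```python
import hashlib

def dbl_sha256(b):
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def target_from_bits(bits):
    """Expand the 4-byte compact 'bits' field into the 256-bit proof-of-work target."""
    exponent, mantissa = bits >> 24, bits & 0xFFFFFF
    return mantissa << (8 * (exponent - 3))

def verify_header_chain(headers, prev_hash):
    """Check a batch of 80-byte headers: proof-of-work and prev-hash linkage."""
    for h in headers:
        assert len(h) == 80
        assert h[4:36] == prev_hash                 # link to the previous block
        bits = int.from_bytes(h[72:76], 'little')
        digest = dbl_sha256(h)
        assert int.from_bytes(digest, 'little') <= target_from_bits(bits)
        prev_hash = digest
    return prev_hash                                # new chain tip

# Usage sketch: verify_header_chain(batch_of_504_headers, current_tip_hash)
```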