IACR News
Here you can see all recent updates to the IACR webpage. These updates are also available:
29 November 2024
Giulia Gaggero, Elisa Gorla
ePrint Report
In this paper, we discuss what it means for a polynomial system to be random and how hard it is to solve a random polynomial system. We propose an algebraic definition of randomness, that we call algebraic randomness. Using a conjecture from commutative algebra, we produce a sharp upper bound for the degree of regularity, hence the complexity of solving an algebraically random polynomial system by Groebner bases methods. As a proof of concept, we apply our result to Rainbow and GeMSS and show that these systems are far from being algebraically random.
Jianjun HU
ePrint Report
In 2016,Petit et al. first studied the implementation of the index calculus method on elliptic curves in prime finite fields, and in 2018, Momonari and Kudo et al. improved algorithm of Petit et al. This paper analyzes the research results of Petit, Momonari and Kudo, and points out the existing problems of the algorithm. Therefore, with the help of sum polynomial function and index calculus, a pseudo-index calculus algorithm for elliptic curves discrete logarithm problem over prime finite fields is proposed, and its correctness is analyzed and verified. It is pointed out that there is no subexponential time method for solving discrete logarithms on elliptic curves in the finite fields of prime numbers, or at least in the present research background, there is no method for solving discrete logarithms in subexponential time.
Brandon "Cryptskii" Ramsay
ePrint Report
Presenting a formal analysis of the Overpass protocol's hierarchical state channel architecture, focusing on its unique approach to state synchronization and tamper detection through cryptographic primitives. The protocol achieves global state consistency without traditional consensus mechanisms by leveraging Sparse Merkle Trees (SMTs), zero-knowledge proofs, and a deterministic hierarchical structure. We provide mathematical proofs of security properties and analyze the protocol's efficiency in terms of computational and communication complexity.
Antonina Bondarchuk, Olive Chakraborty, Geoffroy Couteau, Renaud Sirdey
ePrint Report
This paper focuses on the issue of reducing the bandwidth requirement for FHE ciphertext transmission. While this issue has been extensively studied from the uplink viewpoint (transmission of encrypted inputs towards a FHE calculation) where several approaches exist to essentially cancel FHE ciphertext expansion, the downlink case (transmission of encrypted results towards an end-user) has been the object of much less attention. In this paper, we address this latter issue with a particular focus on the TFHE scheme for which we investigate a number of methods including several approaches for switching to more compact linearly homomorphic schemes, reducing the precision of T(R)LWE coefficients (while maintaining acceptable probabilities of decryption errors) and others. We also investigate how to use these methods in combination, depending on the number of FHE results to transmit. We further perform extensive experiments demonstrating that the downlink FHE ciphertext expansion factor can be practically reduced to values below 10, depending on the setup, with little additional computational burden.
Cas Cremers, Aleksi Peltonen, Mang Zhao
ePrint Report
Despite decades of work on threshold signature schemes, there is still limited agreement on their desired properties and threat models. In this work we significantly extend and repair previous work to give a unified syntax for threshold signature schemes and a new hierarchy of security notions for them. Moreover, our new hierarchy allows us to develop an automated analysis approach for protocols that use threshold signatures, which can discover attacks on protocols that exploit the details of the security notion offered by the used scheme, which can help choose the correct security notion (and scheme that fulfills it) that is required for a specific protocol.
Unlike prior work, our syntax for threshold signatures covers both non-interactive and interactive signature schemes with any number of key generation and signing rounds, and our hierarchy of security notions additionally includes elements such as various types of corruption and malicious key generation. We show the applicability of our hierarchy by selecting representative threshold signature schemes from the literature, extracting their core security features, and categorizing them according to our hierarchy. As a side effect of our work, we show through a counterexample that a previous attempt at building a unified hierarchy of unforgeability notions does not meet its claimed ordering, and show how to repair it without further restricting the scope of the definitions.
Based on our syntax and hierarchy, we develop the first systematic, automated analysis method for higher-level protocols that use threshold signatures. We use a symbolic analysis framework to abstractly model threshold signature schemes that meet security notions in our hierarchy, and implement this in the Tamarin prover. Given a higher-level protocol that uses threshold signatures, and a security notion from our hierarchy, our automated approach can find attacks on such protocols that exploit the subtle differences between elements of our hierarchy. Our approach can be used to formally analyze the security implications of implementing different threshold signature schemes in higher-level protocols.
Unlike prior work, our syntax for threshold signatures covers both non-interactive and interactive signature schemes with any number of key generation and signing rounds, and our hierarchy of security notions additionally includes elements such as various types of corruption and malicious key generation. We show the applicability of our hierarchy by selecting representative threshold signature schemes from the literature, extracting their core security features, and categorizing them according to our hierarchy. As a side effect of our work, we show through a counterexample that a previous attempt at building a unified hierarchy of unforgeability notions does not meet its claimed ordering, and show how to repair it without further restricting the scope of the definitions.
Based on our syntax and hierarchy, we develop the first systematic, automated analysis method for higher-level protocols that use threshold signatures. We use a symbolic analysis framework to abstractly model threshold signature schemes that meet security notions in our hierarchy, and implement this in the Tamarin prover. Given a higher-level protocol that uses threshold signatures, and a security notion from our hierarchy, our automated approach can find attacks on such protocols that exploit the subtle differences between elements of our hierarchy. Our approach can be used to formally analyze the security implications of implementing different threshold signature schemes in higher-level protocols.
Aikata Aikata, Daniel Sanz Sobrino, Sujoy Sinha Roy
ePrint Report
Fully Homomorphic Encryption (FHE) enables privacy-preserving computation but imposes significant computational and communication overhead on the client for the public-key encryption. To alleviate this burden, previous works have introduced the Hybrid Homomorphic Encryption (HHE) paradigm, which combines symmetric encryption with homomorphic decryption to enhance performance for the FHE client. While early HHE schemes focused on binary data, modern versions now support integer prime fields, improving their efficiency for practical applications such as secure machine learning.
Despite several HHE schemes proposed in the literature, there has been no comprehensive study evaluating their performance or area advantages over FHE for encryption tasks. This paper addresses this gap by presenting the first implementation of an HHE scheme- PASTA. It is a symmetric encryption scheme over integers designed to facilitate fast client encryption and homomorphic symmetric decryption on the server. We provide performance results for both FPGA and ASIC platforms, including a RISC-V System-on-Chip (SoC) implementation on a low-end 130nm ASIC technology, which achieves a 43–171$\times$ speedup compared to a CPU. Additionally, on high-end 7nm and 28nm ASIC platforms, our design demonstrates a 97$\times$ speedup over prior public-key client accelerators for FHE. We have made our design public and benchmarked an application to support future research.
Despite several HHE schemes proposed in the literature, there has been no comprehensive study evaluating their performance or area advantages over FHE for encryption tasks. This paper addresses this gap by presenting the first implementation of an HHE scheme- PASTA. It is a symmetric encryption scheme over integers designed to facilitate fast client encryption and homomorphic symmetric decryption on the server. We provide performance results for both FPGA and ASIC platforms, including a RISC-V System-on-Chip (SoC) implementation on a low-end 130nm ASIC technology, which achieves a 43–171$\times$ speedup compared to a CPU. Additionally, on high-end 7nm and 28nm ASIC platforms, our design demonstrates a 97$\times$ speedup over prior public-key client accelerators for FHE. We have made our design public and benchmarked an application to support future research.
Florian Hirner, Florian Krieger, Constantin Piber, Sujoy Sinha Roy
ePrint Report
Zero-knowledge proofs (ZKPs) are cryptographic protocols that enable one party to prove the validity of a statement without revealing the underlying data. Such proofs have applications in privacy-preserving technologies and verifiable computations. However, slow proof generation poses a significant challenge in the wide-scale adoption of ZKP. Orion is a recent ZKP scheme with linear prover time. It leverages coding theory, expander graphs, and Merkle hash trees to improve computational efficiency. However, the polynomial commitment phase in Orion is yet a primary performance bottleneck due to the memory-intensive nature of expander graph-based encoding and the data-heavy hashing required for Merkle Tree generation.
This work introduces several algorithmic and hardware-level optimizations aimed at accelerating Orion’s commitment phase. We replace the recursive encoding construction with an iterative approach and propose novel expander graph strategies optimized for hardware to enable more parallelism and reduce off-chip memory access. Additionally, we implement an on-the-fly expander graph generation technique, reducing memory usage by gigabytes. Further optimizations in Merkle Tree generation reduce the cost of SHA3 hashing, resulting in significant speedups of the polynomial commitment phase. Our FPGA implementation heavily optimizes access to the off-chip high-bandwidth memory (HBM) utilizing memory-efficient computational strategies. The accelerator demonstrates speedups of up to 381$\times$ for linear encoding and up to 2,390$\times$ for the hashing operations over a software implementation on a high-end CPU. In the context of real-world applications, such as zero-knowledge proof-of-training of deep neural networks (DNNs), our techniques show up to 241$\times$ speed up for the polynomial commitment.
This work introduces several algorithmic and hardware-level optimizations aimed at accelerating Orion’s commitment phase. We replace the recursive encoding construction with an iterative approach and propose novel expander graph strategies optimized for hardware to enable more parallelism and reduce off-chip memory access. Additionally, we implement an on-the-fly expander graph generation technique, reducing memory usage by gigabytes. Further optimizations in Merkle Tree generation reduce the cost of SHA3 hashing, resulting in significant speedups of the polynomial commitment phase. Our FPGA implementation heavily optimizes access to the off-chip high-bandwidth memory (HBM) utilizing memory-efficient computational strategies. The accelerator demonstrates speedups of up to 381$\times$ for linear encoding and up to 2,390$\times$ for the hashing operations over a software implementation on a high-end CPU. In the context of real-world applications, such as zero-knowledge proof-of-training of deep neural networks (DNNs), our techniques show up to 241$\times$ speed up for the polynomial commitment.
Gurgen Arakelov, Sergey Gomenyuk, Hovsep Papoyan
ePrint Report
The concept of a decentralized computer is a powerful and transformative idea that has proven its significance in enabling trustless, distributed computations. However, its application has been severely constrained by an inability to handle private data due to the inherent transparency of blockchain systems. This limitation restricts the scope of use cases, particularly in domains where confidentiality is critical.
In this work, we introduce a model for a Fully Homomorphic Encryption (FHE) decentralized computer. Our approach leverages recent advancements in FHE technology to enable secure computations on encrypted data while preserving privacy. By integrating this model into the decentralized ecosystem, we address the long-standing challenge of privacy in public blockchain environments. The proposed FHE computer supports a wide range of use cases, is scalable, and offers a robust framework for incentivizing developer contributions.
In this work, we introduce a model for a Fully Homomorphic Encryption (FHE) decentralized computer. Our approach leverages recent advancements in FHE technology to enable secure computations on encrypted data while preserving privacy. By integrating this model into the decentralized ecosystem, we address the long-standing challenge of privacy in public blockchain environments. The proposed FHE computer supports a wide range of use cases, is scalable, and offers a robust framework for incentivizing developer contributions.
Seunghwan Lee, Dohyuk Kim, Dong-Joon Shin
ePrint Report
This paper proposes a fast, compact key-size, and hardware-friendly bootstrapping using only 16-bit integer arithmetic and fully homomorphic encryption FHE16, which enables gate operations on ciphertexts using only 16-bit integer arithmetic. The proposed bootstrapping consists of unit operations on ciphertexts, such as (incomplete) number theoretic transform (NTT), inverse NTT, polynomial multiplication, gadget decomposition, and automorphism, under a composite modulus constructed from 16-bit primes. Since our bootstrapping does not use any floating-point operations, extra floating-point errors do not occur so that FHE16 can pack more message bits into a single ciphertext than TFHE-rs which utilizes floating-point operations. Furthermore, we propose multiple instruction multiple ciphertext(MIMC) method to accelerate the simultaneous execution of different homomorphic operations across multiple ciphertexts. Finally, experimental results show that the bootstrapping operation completes in 2.89 milliseconds for ciphertext dimension of 512.
Thomas Prévost, Olivier Alibart, Anne Marin, Marc Kaplan
ePrint Report
We introduce MULTISS, a new distributed storage proto-
col over multiple remote Quantum Key Distribution (QKD) networks
that ensures long-term data confidentiality. Our protocol extends LIN-
COS, a secure storage protocol that uses Shamir secret sharing
to distribute data in a single QKD network. Instead MULTISS uses a
hierarchical secret scheme that makes certain shares mandatory for the
reconstruction of the original secret. We prove that MULTISS ensures
that the stored data remain secure even if an eavesdropper (1) gets full
access to all storage servers of some of the QKD networks or (2) stores
and breaks later all the classical communication between the QKD net-
works. We demonstrate that this is strictly more secure than LINCOS
which is broken as soon as one QKD network is compromised.
Our protocol, like LINCOS, has a procedure to update the shares stored
in each QKD network without reconstructing the original data. In addi-
tion, we provide a procedure to recover from a full compromission of one
of the QKD network. In particular, we introduce a version of the protocol
that can only be implemented over a restricted network topologies, but
minimizes the communication required in the recovery procedure.
In practice, the MULTISS protocol is designed for the case of several
QKD networks at the metropolitan scale connected to each other through
channels secured by classical cryptography. Hence, MULTISS offers a
secure distributed storage solution in a scenario that is compatible with
the current deployment of quantum networks.
Zhuo Wu, Shi Qi, Xinxuan Zhang, Yi Deng
ePrint Report
This work introduces a novel technique to enhance the efficiency of proving composite statements. We present the \textit{Hash-and-Prove} framework to construct zkSNARKs for proving satisfiability of arithmetic circuits with additional \textit{Algebraic Gate}. These algebraic gates serve as building blocks for forming more generalized relations in algebra. Unlike Pedersen-committed \textit{Commit-and-Prove} SNARKs, which suffer from increased proof size and verification overhead when proving composite statements, our solution significantly improves both proof size and verification time while maintaining competitive and practical prover efficiency.
In the application of proof of solvency where we need to prove knowledge of $x$ such that SHA$256(g^x)=y$, our approach achieves a 100$\times$ reduction in proof size and a 500$\times$ reduction in verification time, along with a 2$\times$ speedup in proving time compared to the work of Agrawal et al.(CRYPTO 2018). For proving ECDSA signatures verification, we achieve a proof time of 2.1 seconds, which is a 70$\times$ speedup compared to using Groth16, and a proof size of 4.81 kb, which is a 160$\times$ reduction compared to Field Agnostic SNARKs(Block et al., CRYPTO 2024).
In the application of proof of solvency where we need to prove knowledge of $x$ such that SHA$256(g^x)=y$, our approach achieves a 100$\times$ reduction in proof size and a 500$\times$ reduction in verification time, along with a 2$\times$ speedup in proving time compared to the work of Agrawal et al.(CRYPTO 2018). For proving ECDSA signatures verification, we achieve a proof time of 2.1 seconds, which is a 70$\times$ speedup compared to using Groth16, and a proof size of 4.81 kb, which is a 160$\times$ reduction compared to Field Agnostic SNARKs(Block et al., CRYPTO 2024).
Yipeng Shi
ePrint Report
White-box cryptography is a software implementation technique based on lookup tables, with effective resistance against key extraction and code lifting attacks being a primary focus of its research. Space hardness is a widely used property for evaluating the resistance of white-box ciphers against code lifting attacks. However, none of the existing ciphers can provide strong space hardness under adaptively chosen-space attack model.
We propose a new scheme based on the lookup table pool and key guidance implementation as a more efficient approach to utilizing lookup tables to provide better security and practicality. Specifically, we introduce a new white-box cipher, RubikStone, which offers a range of variants from tens of kilobytes to infinite size. For the first time, we prove that all variants of RubikStone can provide strong space hardness under an adaptively chosen-space attack model. Additionally, we present a specific key guidance application for cloud-based DRM scenarios. Based on our proposed RubikStone variants, the key guidance applications can achieve at least overall $(0.950T, 128)$-space hardness.
Furthermore, we introduce a novel property, table consumption rate, for evaluating the durability of a specific white-box cryptographic implementation. In our evaluation, all the instantiations of RubikStone exhibit the lowest table consumption rate in algorithms with equally sized lookup tables. Besides, we conduct a comprehensive statistical analysis of the operations in all existing white-box ciphers. Our findings indicate that RubikStone remains highly competitive in terms of computational efficiency despite offering unprecedented levels of security.
Shaoqiang Wu, Ding Wang
ePrint Report
The password-hardening service (PH) is a crypto service that armors canonical password authentication with an external key against offline password guessing in case the password file is somehow compromised/leaked. The game-based formal treatment of PH was brought by Everspaugh et al. at USENIX Security'15. Their work is followed by efficiency-enhancing PO-COM (CCS'16), security-patching Phoenix (USENIX Security'17), and functionality-refining PW-Hero (SRDS'22). However, the issue of single points of failure (SPF) inherently impairs the availability of these PH schemes. More specifically, the failure of a single PH server responsible for crypto computation services will suspend password authentication for all users.
We propose the notion of reliable PH, which improves the availability of PH by eliminating SPF. We present a modular PH construction, TF-PH, essentially a generic compiler that can transform any PH protocol into a reliable one without SPF via introducing threshold failover. Particularly, we propose a concrete reliable PH protocol, called TF-RePhoenix, a simple and efficient construction with RePhoenix (which improves over Phoenix at USENIX Security'17) as the PH module. Security is proven within the universally composable (UC) security framework and the random oracle model (ROM), where we, for the first time, formalize the ideal UC functionalities of PH and reliable PH. We comparatively evaluate the efficiency of our TF-PH with the canonical threshold method (taken as an example, the threshold solution introduced by Brost et al. at CCS'20 in a PH-derived domain -- password-hardened encryption). Results show that our threshold failover-based solution to SPF provides optimal performance and achieves failover in a millisecond.
We propose the notion of reliable PH, which improves the availability of PH by eliminating SPF. We present a modular PH construction, TF-PH, essentially a generic compiler that can transform any PH protocol into a reliable one without SPF via introducing threshold failover. Particularly, we propose a concrete reliable PH protocol, called TF-RePhoenix, a simple and efficient construction with RePhoenix (which improves over Phoenix at USENIX Security'17) as the PH module. Security is proven within the universally composable (UC) security framework and the random oracle model (ROM), where we, for the first time, formalize the ideal UC functionalities of PH and reliable PH. We comparatively evaluate the efficiency of our TF-PH with the canonical threshold method (taken as an example, the threshold solution introduced by Brost et al. at CCS'20 in a PH-derived domain -- password-hardened encryption). Results show that our threshold failover-based solution to SPF provides optimal performance and achieves failover in a millisecond.
25 November 2024
Mia Filić, Keran Kocher, Ella Kummer, Anupama Unnikrishnan
ePrint Report
Probabilistic data structures (PDS) are compact representations of high-volume data that provide approximate answers to queries about the data. They are commonplace in today's computing systems, finding use in databases, networking and more. While PDS are designed to perform well under benign inputs, they are frequently used in applications where inputs may be adversarially chosen. This may lead to a violation of their expected behaviour, for example an increase in false positive rate.
In this work, we focus on PDS that handle approximate membership queries (AMQ). We consider adversarial users with the capability of making adaptive insertions, deletions and membership queries to AMQ-PDS, and analyse the performance of AMQ-PDS under such adversarial inputs.
We argue that deletions significantly empower adversaries, presenting a challenge to enforcing honest behaviour when compared to insertion-only AMQ-PDS. To address this, we introduce a new concept of an honest setting for AMQ-PDS with deletions. By leveraging simulation-based security definitions, we then quantify how much harm can be caused by adversarial users to the functionality of AMQ-PDS. Our resulting bounds only require calculating the maximal false positive probability and insertion failure probability achievable in our novel honest setting.
We apply our results to Cuckoo filters and Counting filters. We show how to protect these AMQ-PDS at low cost, by replacing or composing the hash functions with keyed pseudorandom functions in their construction. This strategy involves establishing practical bounds for the probabilities mentioned above. Using our new techniques, we demonstrate that achieving security against adversarial users making both insertions and deletions remains practical.
In this work, we focus on PDS that handle approximate membership queries (AMQ). We consider adversarial users with the capability of making adaptive insertions, deletions and membership queries to AMQ-PDS, and analyse the performance of AMQ-PDS under such adversarial inputs.
We argue that deletions significantly empower adversaries, presenting a challenge to enforcing honest behaviour when compared to insertion-only AMQ-PDS. To address this, we introduce a new concept of an honest setting for AMQ-PDS with deletions. By leveraging simulation-based security definitions, we then quantify how much harm can be caused by adversarial users to the functionality of AMQ-PDS. Our resulting bounds only require calculating the maximal false positive probability and insertion failure probability achievable in our novel honest setting.
We apply our results to Cuckoo filters and Counting filters. We show how to protect these AMQ-PDS at low cost, by replacing or composing the hash functions with keyed pseudorandom functions in their construction. This strategy involves establishing practical bounds for the probabilities mentioned above. Using our new techniques, we demonstrate that achieving security against adversarial users making both insertions and deletions remains practical.
Gal Cohen, Itamar Levy
ePrint Report
Interconnected devices enhance daily life but introduce security
vulnerabilities, new technologies enable malicious activities
such as information theft. This article combines radio frequency (RF) side-channel attacks with software Trojans to create a hard-to-detect, stealthy method for extracting kilobytes of secret information per millisecond over record distances with a single measurement in the RF spectrum. The technique exploits Trojan-induced electrical disturbances in RF components originating from peripherals, buses, memories and CPUs to achieve high SNR data leakage schemes. Experimental results show negligible acquisition time and stealth. The research introduces optimized modulation, demodulation schemes, and specialized synchronization symbols to minimize error rates and maximize data rates. It highlights the need for advanced detection and defense mechanisms to ensure the security and privacy of interconnected devices.
Pengfei Lu, Hongyuan Qu
ePrint Report
Private information retrieval (PIR) is a key component of many privacy-preserving systems. Although numerous PIR protocols have been proposed, designing a PIR scheme with communication overhead independent of the database size $N$ and computational cost practical for real-world applications remains a challenge. In this paper, we propose the NewtonPIR protocol, a communication efficient single-server PIR scheme. NewtonPIR can directly generate query values for the entire index without splitting the index and sending multiple query ciphertexts. Specifically, NewtonPIR achieves communication overhead that is 7.5$\times$ better than the state-of-the-art PIR protocol and 35.9$\sim$75$\times$ better than the other protocols. In experiments, when the database size and entry size increase, the communication overhead of NewtonPIR remains stable. By utilizing the single-ciphertext fully homomorphic encryption (FHE) scheme and the simple Newton interpolation polynomial, along with precomputing coefficients in the offline phase, we reduce the computational overhead of NewtonPIR from hours in previous schemes to seconds. To the best of our knowledge, NewtonPIR is the first protocol to achieve communication cost independent of $N$ along with computational overhead comparable to ring learning with errors (RLWE)-based PIR schemes. Additionally, we extend and introduce a private set intersection (PSI) protocol that balances computational and communication overhead more effectively.
Ling Song, Qinggan Fu, Qianqian Yang, Yin Lv, Lei Hu
ePrint Report
Impossible differential cryptanalysis is a crucial cryptanalytical method for symmetric ciphers. Given an impossible differential, the key recovery attack typically proceeds in two steps: generating pairs of data and then identifying wrong keys using the guess-and-filtering method. At CRYPTO 2023, Boura \etal first proposed a new key recovery technique - the differential meet-in-the-middle attack, which recovers the key in a meet-in-the-middle manner. Inspired by this technique, we incorporate the meet-in-the-middle technique into impossible cryptanalysis and propose a generic impossible differential meet-in-the-middle attack (\idma) framework. We apply \idma to block ciphers \skinny, \skinnye-v2, and \forkskinny and achieve remarkably efficient attacks. We improve the impossible differential attack on \skinny-$n$-$3n$ by 2 rounds in the single-tweakey setting and 1 round in the related-tweakey setting. For \skinnye-v2, the impossible differential attacks now can cover 2 more rounds in the related-tweakey setting and the first 23/24/25-round attacks in the single-tweakey model are given. For \forkskinny-$n$-$3n$, we improve the attacks by 2 rounds in the limited setting specified by the designers and 1 round in relaxed settings.
These results confirm that the meet-in-the-middle technique can result in more efficient key recovery, reaching beyond what traditional methods can achieve on certain ciphers.
Ruiyang Li, Chun Guo, Xiao Wang
ePrint Report
State-of-the-art garbling schemes for boolean circuits roughly consist of two families, i.e., ideal model garbling that combines linear operations and ideal blockciphers (aiming at maximizing performance), and PRF-based garbling that insists on using theoretically sound assumptions. In the linear garbling framework introduced by Zahur, Rosulek, and Evans (Eurocrypt 2015), it was established that garbling an AND gate requires at least $2(\kappa +1)$ bits of ciphertext, with $\kappa$ as the security parameter. Recent contributions from Lei Fan et al. and Chunghun Baek et al. have provided a detailed model showing that, under the free-XOR setting, which relies on a non-standard assumption, garbling an AND gate requires at least $1.5\kappa + O(1)$ bits. In contrast, regarding PRF-based garbling, the general model and efficiency bounds remain open questions.
In this paper, we present a comprehensive model for PRF-based garbled circuits and establish both the communication and computation lower bound. Specifically, we demonstrate that garbling an AND gate requires at least $2\kappa + 2$ bits communication and 6 PRF calls, while an XOR gate requires a minimum of $\kappa$ bits communication and 4 PRF calls. Notably, the state-of-the-art garbling scheme (GLNP scheme) under the PRF assumption, introduced by Shay, Yehuda, Ariel, and Benny (JOC 2018), uses $2\kappa + 4$ bits and 8 PRF calls for an AND gate, which exceeds our established lower bound. We finally introduce an optimal garbling scheme showing that our communication and computation lower bounds are tight.
In this paper, we present a comprehensive model for PRF-based garbled circuits and establish both the communication and computation lower bound. Specifically, we demonstrate that garbling an AND gate requires at least $2\kappa + 2$ bits communication and 6 PRF calls, while an XOR gate requires a minimum of $\kappa$ bits communication and 4 PRF calls. Notably, the state-of-the-art garbling scheme (GLNP scheme) under the PRF assumption, introduced by Shay, Yehuda, Ariel, and Benny (JOC 2018), uses $2\kappa + 4$ bits and 8 PRF calls for an AND gate, which exceeds our established lower bound. We finally introduce an optimal garbling scheme showing that our communication and computation lower bounds are tight.
Guangwu Xu, Ke Han, Yunxiao Tian
ePrint Report
The family of Koblitz curves $E_b: y^2=x^3+b/\mathbb{F}_p$ over primes fields has close connections to the ring $\mathbb{Z}[\omega]$ of Eisenstein integers. Utilizing nice facts from the theory of cubic residues, this paper derives an efficient formula for a (complex) scalar multiplication by $\tau=1-\omega$. This enables us to develop a window $\tau$-NAF method for Koblitz curves over prime fields. This probably is the first window $\tau$-NAF method to be designed for curves over fields with large characteristic. Besides its theoretical interest, a higher performance is also achieved due to the facts that (1) the operation $\tau^2$ can be done more efficiently that makes the average cost of $\tau$ to be close to $2.5\mathbf{S}+3\mathbf{M}$ ( $\mathbf{S}$ and $\mathbf{M}$ denote the costs for field squaring and multiplication, respectively); (2) the pre-computation for the window $\tau$-NAF method is surprisingly simple in that only one-third of the coefficients need to be processed. The overall improvement over the best current method is more than $11\%$. The paper also suggests a simplified modular reduction for Eisenstein integers where the division operations are eliminated. The efficient formula of $\tau P$ can be further used to speed up the computation of $3P$, compared to $10\mathbf{S}+5\mathbf{M}$ , our new formula just costs $4\mathbf{S}+6\mathbf{M}$. As a main ingredient for double base chain method for scalar multiplication, the $3P$ formula will contribute to a greater efficiency.
OPL4GPT: An Application Space Exploration of Optimal Programming Language for Hardware Design by LLM
Kimia Tasnia, Sazadur Rahman
ePrint Report
Despite the emergence of Large Language Models (LLMs) as potential tools for automating hardware design, the optimal programming language to describe hardware functions remains unknown. Prior works extensively explored optimizing Verilog-based HDL design, which often overlooked the potential capabilities of alternative programming languages for hardware designs. This paper investigates the efficacy of C++ and Verilog as input languages in extensive application space exploration, tasking an LLM to generate implementations for various System-on-chip functional blocks. We proposed an automated Optimal Programming Language (OPL) framework that leverages OpenAI's GPT-4o LLM to translate natural language specifications into hardware descriptions using both high-level and low-level programming paradigms. The OPL4GPT demonstration initially employs a novel prompt engineering approach that decomposes design specifications into manageable submodules, presented to the LLM to generate code in both C++ and Verilog. A closed-loop feedback mechanism automatically incorporates error logs from the LLM's outputs, encompassing both syntax and functionality. Finally, functionally correct outputs are synthesized using either RTL (Register-Transfer Level) for Verilog or High-Level Synthesis for C++ to assess area, power, and performance. Our findings illuminate the strengths and weaknesses of each language across various application domains, empowering hardware designers to select the most effective approach.