International Association for Cryptologic Research

International Association
for Cryptologic Research

IACR News

Updates on the COVID-19 situation are on the Announcement channel.

Here you can see all recent updates to the IACR webpage. These updates are also available:

RSS symbol icon
via RSS feed
Twitter bird icon
via Twitter
Weibo icon
via Weibo
Facebook icon
via Facebook

28 June 2024

Ulrich Haboeck, Al Kindi
ePrint Report ePrint Report
We discuss zero-knowledge in the context of FRI-based STARKs using techniques desirable in practice: Randomization by polynomials over the basefield, and decomposing the overall quotient into polynomials of smaller degree.
Expand
Ulrich Haboeck
ePrint Report ePrint Report
For primes $p$ with $p+1$ being smooth, the G-FFT from Li and Xing [LX23] is an algebraic FFT, which at first glance seems equivalent to the circle FFT from [IACR eprint 2024/278]: It also uses the circle curve over $\mathbb F_p$ (in other words the projective line) as underlying domain, and interpolates by low-degree functions with poles over the same set of points. However, their approach to control the degree of the FFT basis is fundamentally different. The G-FFT makes use of punctured Riemann-Roch spaces, and the construction works with the group doubling map only, no projection onto the $x$-axis involved.

In this note we give an elementary description of the G-FFT without using abstract algebra. We describe a variant which uses a simpler, and in our opinion more natural function space, and which treats the exceptional point of the domain (the group identity) differently. In comparison to the circle FFT, the G-FFT (both the original as well as our variant) has the following downsides. Interpolation and domain evaluation costs the double number of multiplications (the twiddle is not an ``odd'' function), and the function space is not invariant under the group action, causing additional overhead when applied in STARKs.
Expand
Sirui Shen, Chenglu Jin
ePrint Report ePrint Report
Garbled circuits (GC) are a secure multiparty computation protocol that enables two parties to jointly compute a function using their private data without revealing it to each other. While garbled circuits are proven secure at the protocol level, implementations can still be vulnerable to side-channel attacks. Recently, side-channel analysis of GC implementations has garnered significant interest from researchers.

We investigate popular open-source GC frameworks and discover that the AES encryption used in the garbling process follows a secret-dependent sequence. This vulnerability allows private inputs to be exposed through side-channel analysis. Based on this finding, we propose a side-channel attack on garbled circuits to recover the private inputs of both parties. Our attack does not require access to any plaintexts or ciphertexts in the protocol and is single-trace, adhering to the constraint that a garbled circuit can be executed only once. Furthermore, unlike existing attacks that can only target input non-XOR gates, our method applies to both input and internal non-XOR gates. Consequently, the secrets associated with every non-XOR gate are fully exposed as in an open book.

We comprehensively evaluate our attack in various scenarios. First, we perform the attack on single-platform software implementations of standard AES and interleaved AES on a 32-bit ARM processor, achieving a $100\%$ success rate in both cases. Next, we target a hardware implementation on a Xilinx Artix-7 FPGA, where the resolution of power consumption measurements and the number of samples are significantly limited. In this scenario, our attack achieves a success rate of $79.58\%$. Finally, we perform a cross-platform attack on two processors with different microarchitectures representing the two parties. The differing execution cycles and power sensors across the platforms increase the difficulty of side-channel analysis. Despite these challenges, our point-of-interest (POI) selection method allows our attack to achieve a $100\%$ success rate in this scenario as well. We also discuss effective countermeasures that can be readily applied to GC frameworks to mitigate this vulnerability.
Expand
Eleni Diamanti, Alex B. Grilo, Adriano Innocenzi, Pascal Lefebvre, Verena Yacoub, Álvaro Yángüez
ePrint Report ePrint Report
We present a new simulation-secure quantum oblivious transfer (QOT) protocol based on one-way functions in the plain model. With a focus on practical implementation, our protocol surpasses prior works in efficiency, promising feasible experimental realization. We address potential experimental errors and their correction, offering analytical expressions to facilitate the analysis of the required quantum resources. Technically, we achieve simulation security for QOT through an equivocal and relaxed-extractable quantum bit commitment.
Expand
Shuichi Katsumata, Michael Reichle, Kaoru Takemure
ePrint Report ePrint Report
T-out-of-N threshold signatures have recently seen a renewed interest, with various types now available, each offering different tradeoffs. However, one property that has remained elusive is adaptive security. When we target thresholdizing existing efficient signatures schemes based on the Fiat-Shamir paradigm such as Schnorr, the elusive nature becomes clear. This class of signature schemes typically rely on the forking lemma to prove unforgeability. That is, an adversary is rewound and run twice within the security game. Such a proof is at odds with adaptive security, as the reduction must be ready to answer 2(T-1) secret key shares in total, implying that it can reconstruct the full secret key. Indeed, prior works either assumed strong idealized models such as the algebraic group model (AGM) or modified the underlying signature scheme so as not to rely on rewinding based proofs.

In this work, we propose a new proof technique to construct adaptively secure threshold signatures for existing rewinding-based Fiat-Shamir signatures. As a result, we obtain the following: 1. The first adaptively secure 5 round lattice-based threshold signature under the MLWE and MSIS assumptions in the ROM. The resulting signature is a standard signature of Raccoon, a lattice-based signature scheme by del Pino et al., submitted to the additional NIST call for proposals. 2. The first adaptively secure 5 round threshold signature under the DL assumption in the ROM. The resulting signature is a standard Schnorr signature. To the best of our knowledge, this is the first adaptively secure threshold signature based on DL even assuming stronger models like AGM.

Our work is inspired by the recent statically secure lattice-based 3 round threshold signature by del Pino et al. (Eurocrypt~2024) based on Raccoon. While they relied on so-called one-time additive masks to solve lattice specific issues, we notice that these masks can also be a useful tool to achieve adaptive security. At a very high level, we use these masks throughout the signing protocol to carefully control the information the adversary can learn from the signing transcripts. Intuitively, this allows the reduction to return a total of 2(T-1) randomly sampled secret key shares to the adversary consistently and without being detected, resolving the above paradoxical situation. Lastly, by allowing the parties to maintain a simple state, we can compress our 5 round schemes into 4 rounds.
Expand
Animesh Singh, Sikhar Patranabis, Debdeep Mukhopadhyay
ePrint Report ePrint Report
An oblivious pseudorandom function (OPRF) is a two-party protocol in which a party holds an input and the other party holds the PRF key, such that the party having the input only learns the PRF output and the party having the key would not learn the input. Now, in a threshold oblivious pseudorandom function (TOPRF) protocol, a PRF key K is initially shared among T servers. A client can obtain a PRF value by interacting with t(≤ T) servers but is unable to compute the same with up to (t − 1) servers. In this paper, we present a practically efficient homomorphic encryption (HE)-based post-quantum secure TOPRF protocol. Our proposed approach, which is based on a novel use of threshold HE, is agnostic of the underlying PRF and outperforms existing fully homomorphic encryption (FHE)-based approaches for TOPRF computation by several orders of magnitude in terms of running time. The FHE-based approaches require bootstrapping, a computationally extensive operation, and the primary bottleneck for evaluating large-depth circuits. Whereas, our proposed approach is based on a multi-party computation (MPC) protocol that uses a threshold additive HE scheme based on Regev’s cryptosystem (J’ACM 2009) alternative to FHE-based approaches. Concretely, we show a novel replacement of bootstrapping required in traditional FHE schemes by a threshold additive HE-based interactive protocol that performs masked decryption followed by table look-ups, jointly performed by a group of servers holding secret shares of the HE decryption key. Finally, We present a practical validation of our approach by realizing an AES-based TOPRF with an evaluation time of less than 1 second on consumer-grade server(s).
Expand
Uddipana Dowerah, Subhranil Dutta, Frank Hartmann, Aikaterini Mitrokotsa, Sayantan Mukherjee, Tapas Pal
ePrint Report ePrint Report
Privacy is a major concern in large-scale digital applications, such as cloud-computing, machine learning services, and access control. Users want to protect not only their plain data but also their associated attributes (e.g., age, location, etc). Functional encryption (FE) is a cryptographic tool that allows fine-grained access control over encrypted data. However, existing FE fall short as they are either inefficient and far from reality or they leak sensitive user-specific information.

We propose SACfe, a novel attribute-based FE scheme that provides secure, fine-grained access control and hides both the user’s attributes and the function applied to the data, while preserving the data’s confidentiality. Moreover, it enables users to encrypt unbounded-length messages along with an arbitrary number of hidden attributes into ciphertexts. We design SACfe, a protocol for performing linear computation on encrypted data while enforcing access control based on inner product predicates. We show how SACfe can be used for online biometric authentication for privacy-preserving access control. As an additional contribution, we introduce an attribute-based linear FE for unbounded length of messages and functions where access control is realized by monotone span programs. We implement our protocols using the CiFEr cryptographic library and show its efficiency for practical settings.
Expand
Yijing Ning, Jiankuo Dong, Jingqiang Lin, Fangyu Zheng, Yu Fu, Zhenjiang Dong, Fu Xiao
ePrint Report ePrint Report
$SPHINCS^+$, one of the Post-Quantum Cryptography Digital Signature Algorithms (PQC-DSA) selected by NIST in the third round, features very short public and private key lengths but faces significant performance challenges compared to other post-quantum cryptographic schemes, limiting its suitability for real-world applications. To address these challenges, we propose the GPU-based paRallel Accelerated $SPHINCS^+$ (GRASP), which leverages GPU technology to enhance the efficiency of $SPHINCS^+$ signing and verification processes. We propose an adaptable parallelization strategy for $SPHINCS^+$, analyzing its signing and verification processes to identify critical sections for efficient parallel execution. Utilizing CUDA, we perform bottom-up optimizations, focusing on memory access patterns and hypertree computation, to enhance GPU resource utilization. These efforts, combined with kernel fusion technology, result in significant improvements in throughput and overall performance. Extensive experimentation demonstrates that our optimized CUDA implementation of $SPHINCS^+$ achieves superior performance. Specifically, our GRASP scheme delivers throughput improvements ranging from 1.37× to 5.13× compared to state-of-the-art GPU-based solutions and surpasses the NIST reference implementation by over three orders of magnitude, highlighting a significant performance advantage.
Expand
Ananya Appan, David Heath, Ling Ren
ePrint Report ePrint Report
Oblivious RAM (ORAM) allows a client to securely outsource memory storage to an untrusted server. It has been shown that no ORAM can simultaneously achieve small bandwidth blow-up, small client storage, and a single roundtrip of latency.

We consider a weakening of the RAM model, which we call the Single Access Machine (SAM) model. In the SAM model, each memory slot can be written to at most once and read from at most once. We adapt existing tree-based ORAM to obtain an oblivious SAM (OSAM) that has $O(\log n)$ bandwidth blow-up (which we show is optimal), small client storage, and a single roundtrip.

OSAM unlocks improvements to oblivious data structures/algorithms. For instance, we achieve oblivious unbalanced binary trees (e.g. tries, splay trees). By leveraging splay trees, we obtain a notion of caching ORAM, where an access in the worst case incurs amortized $O(\log^2 n)$ bandwidth blow-up and $O(\log n)$ roundtrips, but in many common cases (e.g. sequential scans) incurs only amortized $O(\log n)$ bandwidth blow-up and $O(1)$ roundtrips. We also give new oblivious graph algorithms, including computing minimum spanning trees and single source shortest paths, in which the OSAM client reads/writes $O(|E| \cdot \log |E|)$ words using $O(|E|)$ roundtrips, where $|E|$ is the number of edges. This improves over prior custom solutions by a log factor.

At a higher level, OSAM provides a general model for oblivious computation. We construct a programming interface around OSAM that supports arbitrary pointer-manipulating programs such that dereferencing a pointer to an object incurs $O(\log d \log n)$ bandwidth blowup and $O(\log d)$ roundtrips, where $d$ is the number of pointers to that object. This new interface captures a wide variety of data structures and algorithms (e.g., trees, tries, doubly-linked lists) while matching or exceeding prior best asymptotic results. It both unifies much of our understanding of oblivious computation and allows the programmer to write oblivious algorithms combining various common data structures/algorithms and beyond.
Expand
Ferhat Karakoç, Betül Güvenç Paltun, Leyli Karaçay, Ömer Tuna, Ramin Fuladi, Utku Gülen
ePrint Report ePrint Report
Enhancing privacy in federal learning (FL) without considering robustness can create an open door for attacks such as poisoning attacks on the FL process. Thus, addressing both the privacy and security aspects simultaneously becomes vital. Although, there are a few solutions addressing both privacy and security in the literature in recent years, they have some drawbacks such as requiring two non-colluding servers, heavy cryptographic operations, or peer-to-peer communication topology. In this paper, we introduce a novel framework that allows the server to run some analysis for detection and mitigation of attacks towards the FL process, while satisfying the confidentiality requirements for the training data against the server. We evaluate the effectiveness of the framework in terms of security and privacy by performing experiments on some concrete examples. We also provide two instantiations of the framework with two different secure aggregation protocols to give a more concrete view how the framework works and we analyse the computation and communication overhead of the framework.
Expand
Dung Bui, Geoffroy Couteau, Nikolas Melissaris
ePrint Report ePrint Report
In this note, we introduce structured-seed local pseudorandom generators, a relaxation of local pseudorandom generators. We provide constructions of this primitive under the sparse-LPN assumption, and explore its implications.
Expand
Martin Zbudila, Erik Pohle, Aysajan Abidin, Bart Preneel
ePrint Report ePrint Report
Secure multi-party computation (MPC) in a three-party, honest majority scenario is currently the state-of-the-art for running machine learning algorithms in a privacy-preserving manner. For efficiency reasons, fixed-point arithmetic is widely used to approximate computation over decimal numbers. After multiplication in fixed-point arithmetic, truncation is required to keep the result's precision. In this paper, we present an efficient three-party truncation protocol secure in the presence of an active adversary without pre-processing and improve on the current state-of-the-art in MPC over rings using replicated secret sharing (RSS). By adding an efficient consistency check, we lift the efficient but only passively secure three-party truncation protocol from the ABY3 framework by Mohassel and Rindal into the malicious setting without pre-processed data. Our benchmark indicates performance improvements of an order of magnitude in the offline phase for a single batch training. Finally, we apply our protocol to a real-world application for diagnostic prediction based on publicly available ECG heartbeat data. We achieve an improvement by a factor of two in the total throughput for both LAN and WAN settings.
Expand
Paula Arnold, Sebastian Berndt, Thomas Eisenbarth, Maximilian Orlt
ePrint Report ePrint Report
While passive side-channel attacks and active fault attacks have been studied intensively in the last few decades, strong attackers combining these attacks have only been studied relatively recently. Due to its simplicity, most countermeasures against passive attacks are based on additive sharing. Unfortunately, extending these countermeasures against faults often leads to quite a significant performance penalty, either due to the use of expensive cryptographic operations or a large number of shares due to massive duplication. Just recently, Berndt, Eisenbarth, Gourjon, Faust, Orlt, and Seker thus proposed to use polynomial sharing against combined attackers (CRYPTO 2023). While they construct gadgets secure against combined attackers using only a linear number of shares, the overhead introduced might still be too large for practical scenarios.

In this work, we show how the overhead of nearly all known constructions using polynomial sharing can be reduced by nearly half by embedding two secrets in the coefficients of one polynomial at the expense of increasing the degree of the polynomial by one. We present a very general framework that allows adapting these constructions to this new sharing scheme and prove the security of this approach against purely passive side-channel attacks, purely active fault attacks, and combined attacks. Furthermore, we present new gadgets allowing us to operate upon the different secrets in a number of useful ways.
Expand
Reyhaneh Rabaninejad, Behzad Abdolmaleki, Sebastian Ramacher, Daniel Slamanig, Antonis Michalas
ePrint Report ePrint Report
Self-sovereign identity (SSI) systems empower users to (anonymously) establish and verify their identity when accessing both digital and real-world resources, emerging as a promising privacy-preserving solution for user-centric identity management. Recent work by Maram et al. proposes the privacy-preserving Sybil-resistant decentralized SSI system CanDID (IEEE S&P 2021). While this is an important step, notable shortcomings undermine its efficacy. The two most significant among them being the following: First, unlinkability breaks in the presence of a single malicious issuer. Second, it introduces interactiveness, as the users are required to communicate each time with issuers to collect credentials intended for use in interactions with applications. This contradicts the goal of SSI, whose aim is to give users full control over their identities. This paper first introduces the concept of publicly verifiable attribute-based threshold anonymous counting tokens (tACT). Unlike recent approaches confined to centralized settings (Benhamouda et al., ASIACRYPT 2023), tACT operates in a distributed-trust environment. Accompanied by a formal security model and a provably secure instantiation, tACT introduces a novel dimension to token issuance, which, we believe, holds independent interest. Next, the paper leverages the proposed tACT scheme to construct an efficient Sybil-resistant SSI system. This system supports various functionalities, including threshold issuance, unlinkable multi-show selective disclosure, and non-interactive, non-transferable credentials that offer constant-size credentials. Finally, our benchmark results show an efficiency improvement in our construction when compared to CanDID all while accommodating a greater number of issuers and additionally reducing to a one-round protocol that can be run in parallel with all issuers.
Expand
Feixiang Zhao, Huaxiong Wang, Jian Weng
ePrint Report ePrint Report
Proxy re-encryption is a cryptosystem that achieves efficient encrypted data sharing by allowing a proxy to transform a ciphertext encrypted under one key into another ciphertext under a different key. Homomorphic proxy re-encryption (HPRE) extends this concept by integrating homomorphic encryption, allowing not only the sharing of encrypted data but also the homomorphic computations on such data. The existing HPRE schemes, however, are limited to a single or bounded number of hops of ciphertext re-encryptions. To address this limitation, this paper introduces a novel lattice-based, unbounded multi-hop fully homomorphic proxy re-encryption (FHPRE) scheme, with constant-size ciphertexts. Our FHPRE scheme supports an unbounded number of reencryption operations and enables arbitrary homomorphic computations over original, re-encrypted, and evaluated ciphertexts. Additionally, we propose a potential application of our FHPRE scheme in the form of a non-interactive, constant-size multi-user computation system for cloud computing environments.
Expand
Ghada Almashaqbeh, Sixia Chen, Alexander Russell
ePrint Report ePrint Report
Layer-two blockchain protocols emerged to address scalability issues related to fees, storage cost, and confirmation delay of on-chain transactions. They aggregate off-chain transactions into a fewer on-chain ones, thus offering immediate settlement and reduced transaction fees. To preserve security of the underlying ledger, layer-two protocols often work in a collateralized model; resources are committed on-chain to backup off-chain activities. A fundamental challenge that arises in this setup is determining a policy for establishing, committing, and replenishing the collateral in a way that maximizes the value of settled transactions.

In this paper, we study this problem under two settings that model collateralized layer-two protocols. The first is a general model in which a party has an on-chain collateral $C$ with a policy to decide on whether to settle or discard each incoming transaction. The policy also specifies when to replenish $C$ based on the remaining collateral value. The second model considers a discrete setup in which $C$ is divided among $k$ wallets, each of which is of size $C/k$, such that when a wallet is full, and so cannot settle any incoming transactions, it will be replenished. We devise several online policies for these models, and show how competitive they are compared to optimal (offline) policies that have full knowledge of the incoming transaction stream. To the best of our knowledge, we are the first to study and formulate online competitive policies for collateral and wallet management in the blockchain setting.
Expand
Nicholas Michel, Mohamed E. Najd, Ghada Almashaqbeh
ePrint Report ePrint Report
Automated market makers (AMMs) are a form of decentralized cryptocurrency exchanges and considered a prime example of Decentralized Finance (DeFi) applications. Their popularity and high trading activity have resulted in millions of on-chain transactions leading to serious scalability issues. In this paper, we address the on-chain storage overhead problem of AMMs by utilizing a new sidechain architecture as a layer 2 solution, building a system called ammBoost. Our system reduces the amount of on-chain transactions, boosts throughput, and supports blockchain pruning. We devise several techniques to enable layer 2 processing for AMMs while preserving correctness and security of the underlying AMM. We also build a proof-of-concept of ammBoost for a Uniswap-inspired use case to empirically evaluate its performance. Our experiments show that ammBoost decreases the gas cost by 94.53% and the chain growth by at least 80%, and that it can support up to 500x of the daily traffic volume observed for Uniswap in practice.
Expand
Zahra Motaqy, Mohamed E. Najd, Ghada Almashaqbeh
ePrint Report ePrint Report
Cryptocurrencies and blockchain technology provide an innovative model for reshaping digital services. Driven by the movement toward Web 3.0, recent systems started to provide distributed services, such as computation outsourcing or file storage, on top of the currency exchange medium. By allowing anyone to join and collect cryptocurrency payments for serving others, these systems create decentralized markets for trading digital resources. Yet, there is still a big gap between the promise of these markets and their practical viability. Existing initiatives are still early-stage and have already encountered security and efficiency obstacles. At the same time, existing work around promising ideas, specifically sidechains, fall short in exploiting their full potential in addressing these problems.

To bridge this gap, we propose chainBoost, a secure performance booster for decentralized resource markets. It expedites service related operations, reduces the blockchain size, and supports flexible service-payment exchange modalities at low overhead. At its core, chainBoost employs a sidechain, that has a (security and semantic) mutual-dependence with the mainchain, to which the system offloads heavy/frequent operations. To enable it, we develop a novel sidechain architecture composed of temporary and permanent blocks, a block suppression mechanism to prune the sidechain, a syncing protocol to permit arbitrary data exchange between the two chains, and an autorecovery protocol to support robustness and resilience. We analyze the security of chainBoost, and implement a proof-of-concept prototype for a distributed file storage market as a use case. For a market handling around 2000 transactions per round, our experiments show up to 11x improvement in throughput and 94% reduction in confirmation time. They also show that chainBoost can reduce the main blockchain size by about 90%, and that it outperforms comparable optimistic rollup solutions by reducing transaction finality by 99.7%.
Expand
Archisman Ghosh, Md. Abdur Rahman, Debayan Das, Santosh Ghosh, Shreyas Sen
ePrint Report ePrint Report
Mathematically secured cryptographic implementations leak critical information in terms of power, EM emanations, etc. Several circuit-level countermeasures are proposed to hinder side channel leakage at the source. Circuit-level countermeasures (e.g., IVR, STELLAR, WDDL, etc) are often preferred as they are generic and have low overhead. They either dither the voltage randomly or attenuate the meaningful signature at $V_{DD}$ port. Although any digital implementation has two generic ports, namely clock and $V_{DD}$, circuit-level countermeasures primarily focus on $V_{DD}$ port, and countermeasures using the clock are mainly unexplored. System-level clock randomization is ineffective due to post-processing techniques. This work, for the first time, presents clock-based countermeasures by providing a controlled slew that exploits the inherent variability of digital circuits in terms of power consumption and transforms power/EM emanation into a complex function of data and slew. Due to this, minimum traces-to-disclosure (MTD) improves by 100$\times$ with respect to the unprotected one. Moreover, the slewed clock reduces the leaky frequency, and the clock randomization countermeasure is more effective as it becomes more difficult} to post-process in the frequency domain. Clock slew and randomization together have a cumulative effect(1800x) more than the multiplication of individual techniques (100x & 5x respectively). In brief, this paper presents a clock-level generic synthesizable countermeasure technique that improved the minimum-traces-to-disclosure (MTD) by 1800$\times$ and incurs only 11% area overhead, $<3\%$ power overhead (measured) and $<6\%$ performance overhead (measured). Moreover, this can be easily combined with other power-port-based mitigation techniques for enhanced security.
Expand
Alan Li, Qingkai Liang, Mo Dong
ePrint Report ePrint Report
As deep learning is being widely adopted across various domains, ensuring the integrity of models has become increasingly crucial. Despite the recent advances in Zero-Knowledge Machine Learning (ZKML) techniques, proving the inference over large ML models is still prohibitive. To enable practical ZKML, model simplification techniques like pruning and quantization should be applied without hesitation. Contrary to conventional belief, recent development in ML space have demonstrated that these simplification techniques not only condense complex models into forms with sparse, low-bit weight matrices, but also maintain exceptionally high model accuracies that matches its unsimplified counterparts.

While such transformed models seem inherently ZK-friendly, directly applying existing ZK proof frameworks still lead to suboptimal inference proving performance. To make ZKML truly practical, a quantization-and-pruning-aware ZKML framework is needed. In this paper, we propose SpaGKR, a novel sparsity-aware ZKML framework that is proven to surpass capabilities of existing ZKML methods. SpaGKR is a general framework that is widely applicable to any computation structure where sparsity arises. It is designed to be modular - all existing GKR-based ZKML frameworks can be seamlessly integrated with it to get remarkable compounding performance enhancements. We tailor SpaGKR specifically to the most commonly-used neural network structure - the linear layer, and propose the SpaGKR-LS protocol that achieves asymptotically optimal prover time. Notably, when applying SpaGKR-LS to a special series of simplified model - ternary network, it achieves further efficiency gains by additionally leveraging the low-bit nature of model parameters.
Expand
◄ Previous Next ►