Kevin Yeo

CryptoDB

Kevin Yeo

Publications and invited talks

Year

Venue

Title

2025

EUROCRYPT

Plinko: Single-Server PIR with Efficient Updates via Invertible PRFs Abstract

Alexander Hoover Sarvar Patel Giuseppe Persiano Kevin Yeo

We study single-server private information retrieval (PIR) where a client wishes to privately retrieve the $x$-th entry from a database held by a server without revealing the index $x$. In our work, we focus on PIR with client pre-processing where the client may compute hints during an offline phase. The hints are then leveraged during queries to obtain sub-linear online time. We present {\em Plinko} that is the first single-server PIR with client pre-processing that obtains optimal trade-offs between client storage and total (client and server) query time for all parameters. Our scheme uses $t = \tilde{O}(n/r)$ query time for any client storage size $r$. This matches known lower bounds of $r \cdot t = \Omega(n)$ up to logarithmic factors for all parameterizations whereas prior works could only match the lower bound when $r = \tilde{O}(\sqrt{n})$. Moreover, Plinko is also the first {\em updateable} PIR scheme where an entry can be updated in worst-case $\tilde{O}(1)$ time. As our main technical tool, we define the notion of an {\em invertible pseudorandom function} ({\em iPRF}) that generalizes standard PRFs to be equipped with an efficient inversion algorithm. We present a construction of an iPRF from one-way functions where forward evaluation runs in $\tilde{O}(1)$ time and inversion runs in time linear in the inverse set (output) size. Furthermore, our iPRF construction is the first that remains efficient and secure for arbitrary domain and range sizes (including small domains and ranges). In the context of single-server PIR, we show that iPRFs may be used to construct the first hint set representation where finding a hint containing an entry $x$ may be done in $\tilde{O}(1)$ time.

2025

EUROCRYPT

Weakly Super-Invertible Matrices and Constant Communication Dishonest Majority MPC Abstract

Alexander Bienstock Kevin Yeo

In recent years, there has been tremendous progress in improving the concrete communication complexity of dishonest majority MPC. In the sub-optimal corruption threshold setting where $t<(1-\varepsilon)\cdot n$ for some constant $0<\varepsilon\leq 1/2$, Sharing Transformation (Goyal $\textit{et al.}$, CRYPTO'22) and SuperPack (Escudero $\textit{et al.}$, EUROCRYPT'23) presented protocols with information-theoretic online phases requiring $O(1)$ field elements of total communication per multiplication gate. However, Sharing Transformation assumes that their offline phase is instantiated by a trusted party, while SuperPack instantiates their offline phase with large communication of $\Omega(n)$ per multiplication gate assuming oblivious linear evaluation (OLE) correlations. The main bottleneck in instantiating the offline phases of both protocols is generating random packed beaver triples of the form $[\mathbf{a}],[\mathbf{b}],[\mathbf{c}]$, for random $\mathbf{a},\mathbf{b}\in\mathbb{F}^k$, and $\mathbf{c}=\mathbf{a}*\mathbf{b}\in\mathbb{F}^k$, where $k=\Omega(n)$ is the $\textit{packing parameter}$. To address this bottleneck, our main technical contribution is introducing and constructing $\textit{weakly}$ super-invertible matrices, a relaxation of super-invertible matrices in which sub-matrices have high (but not necessarily full) rank. This relaxation allows for matrices with only $\widetilde{O}(n)$ non-zero entries, enabling a first step towards generating packed beaver triples with $\widetilde{O}(1)$ total communication per underlying triple, assuming OLE correlations. As the second (and final) step, we use the efficient $\textit{triple extraction}$ protocol of (Choudhury and Patra, Trans. Inform. Theory '17). We also implement our packed beaver triple protocol and provide experimental results. Our new protocol obtains up to 38% smaller communication and 9% reduction in runtime compared to SuperPack's triple protocol. Additionally, by instantiating SuperPack's offline phase with our new protocol, we obtain up to 16% communication reductions. Finally, we use our packed beaver triple protocol to instantiate the offline phase of Sharing Transformation, yielding a dishonest majority MPC protocol with $\widetilde{O}(|C|)$ total communication across both the offline and online phases.

2025

EUROCRYPT

Fine-Grained Complexity in a World without Cryptography Abstract

Josh Alman Yizhi Huang Kevin Yeo

The study of fine-grained cryptography has proliferated in recent years due to its allure of potentially relying on weaker assumptions compared to standard cryptography. As fine-grained cryptography only requires polynomial gaps between the adversary and honest parties, it seems plausible to build primitives relying upon popular hardness assumptions about problems in $\mathbf{P}$ such as $k$-SUM or Zero-$k$-Clique. The ultimate hope is that fine-grained cryptography could still be viable even if all current cryptographic assumptions are false, such as if $\mathbf{P} = \mathbf{NP}$ or if we live in Pessiland where one-way functions do not exist. In our work, we consider whether this approach is viable by studying fine-grained complexity when all standard cryptographic assumptions are false. As our main result, we show that many popular fine-grained complexity problems are easy to solve in the average-case when one-way functions do not exist. In other words, many candidate hardness assumptions for building fine-grained cryptography are no longer options in Pessiland. As an example, we prove that the average-case $k$-SUM or Zero-$k$-Clique conjectures are false for sufficiently large constant $k$ when no one-way functions exist. The average-case Zero-$k$-Clique assumption was used to build fine-grained key-exchange by Lavigne {\em et al.} [CRYPTO'19]. One can also view the contrapositive of our result as providing an explicit construction of one-way functions assuming $n^{\omega_k(1)}$ average-case hardness of $\ksum$ or $\zkc$ for all constant $k$. We also show that barriers for reductions in fine-grained complexity may be explained by problems in cryptography. First, we show that finding faster algorithms for computing discrete logarithms is equivalent to designing average-case equivalence between $k$-SUM and $k$-CYC (an extension of $k$-SUM to cyclic groups). In particular, finding such a reduction from $k$-CYC to $k$-SUM could potentially lead to breakthrough algorithms for the discrete logarithm, factoring, RSA and quadratic residuosity problems. Finally, we show that discrete logarithms with preprocessing may be reduced to the $k$-CYC-Index problem, and we present faster algorithms for average-case $k$-SUM-Index and $k$-CYC-Index.

2023

EUROCRYPT

Lower Bound Framework for Differentially Private and Oblivious Data Structures Abstract

Giuseppe Persiano Kevin Yeo

In recent years, there has been significant work in studying data structures that provide privacy for the operations that are executed. These primitives aim to guarantee that observable access patterns to physical memory do not reveal substantial information about the queries and updates executed on the data structure. Multiple recent works, including Larsen and Nielsen [Crypto'18], Persiano and Yeo [Eurocrypt'19], Hub{\'{a}}{\v{c}}ek {\em et al.} [TCC'19] and Komargodski and Lin [Crypto'21], have shown that logarithmic overhead is required to support even basic RAM (array) operations for various privacy notions including obliviousness and differential privacy as well as different choices of sizes for RAM blocks $b$ and memory cells $\omega$. We continue along this work and present the first logarithmic lower bounds for differentially private RAMs (DPRAMs) that apply regardless of the sizes of blocks $b$ and cells $\omega$. This is the first logarithmic lower bounds for DPRAMs when blocks are significantly smaller than cells, that is $b \ll \omega$. Furthermore, we present new logarithmic lower bounds for differentially private variants of classical data structure problems including sets, predecessor (successor) and disjoint sets (union-find) for which sub-logarithmic plaintext constructions are known. All our lower bounds also extend to the multiple non-colluding servers setting. We also address an unfortunate issue with this rich line of work where the lower bound techniques are difficult to use and required customized for each new result. To make the techniques more accessible, we generalize our proofs into a framework that reduces proving logarithmic lower bounds to showing that a specific problem satisfies two simple, minimal conditions. We show our framework is easy-to-use as all the lower bounds in our paper utilize the framework and hope our framework will spur more usage of these lower bound techniques.

2023

EUROCRYPT

Lower Bounds for (Batch) PIR with Private Preprocessing Abstract

Kevin Yeo

In this paper, we study (batch) private information retrieval with private preprocessing. Private information retrieval (PIR) is the problem where one or more servers hold a database of $n$ bits and a client wishes to retrieve the $i$-th bit in the database from the server(s). In PIR with private preprocessing (also known as offline-online PIR), the client is able to compute a private $r$-bit hint in an offline stage that may be leveraged to perform retrievals accessing at most $t$ entries. For privacy, the client wishes to hide index $i$ from an adversary that has compromised some of the servers. In the batch PIR setting, the client performs queries to retrieve the contents of multiple entries simultaneously. We present a tight characterization for the trade-offs between hint size $r$ and number of accessed entries $t$ during queries. For any PIR scheme that enables clients to perform batch retrievals of $k$ entries, we prove a lower bound of $tr = \Omega(nk)$ when $r \ge k$. When $r < k$, we prove that $t = \Omega(n)$. Our lower bounds hold when the scheme errs with probability at most $1/15$ and against PPT adversaries that only compromise one out of $\ell$ servers for any $\ell = O(1)$. Our work also closes the multiplicative logarithmic gap for the single query setting $(k = 1)$ as our lower bound matches known constructions. Our lower bounds hold in the model where each database entry is stored without modification but each entry may be replicated arbitrarily. Finally, we show connections between PIR and the online matrix-vector (OMV) conjecture from fine-grained complexity. We present barriers for proving lower bounds for two-server PIR schemes in general computational models as they would immediately imply the OMV conjecture.

2023

CRYPTO

Limits of Breach-Resistant and Snapshot-Oblivious RAMs Abstract

Giuseppe Persiano Kevin Yeo

Oblivious RAMs (ORAMs) are an important cryptographic primitive that enable outsourcing data to a potentially untrusted server while hiding patterns of access to the data. ORAMs provide strong guarantees even in the face of a {\em persistent adversary} that views the transcripts of all operations and resulting memory contents. Unfortunately, the strong guarantees against persistent adversaries comes at the cost of efficiency as ORAMs are known to require $\Omega(\log n)$ overhead. In an attempt to obtain faster constructions, prior works considered security against {\em snapshot adversaries} that only have limited access to operational transcripts and memory. We consider $(s,\ell)$-snapshot adversaries that perform $s$ data breaches and views the transcripts of $\ell$ total queries. Promisingly, Du, Genkin and Grubbs [Crypto'22] presented an ORAM construction with $O(\log \ell)$ overhead protecting against $(1,\ell)$-snapshot adversaries with the transcript of $\ell$ consecutive operations from a single breach. For small values of $\ell$, this outperforms standard ORAMs. In this work, we tackle whether it is possible to further push this construction beyond a single breach. Unfortunately, we show that protecting against even slightly stronger snapshot adversaries becomes difficult. As our main result, we present a $\Omega(\log n)$ lower bound for any ORAM protecting against a $(3,1)$-snapshot adversary that performs three breaches and sees the transcript of only one query. In other words, our lower bound holds even if an adversary observes only memory contents during two breaches while managing to view the transcript of only one query in the other breach. Therefore, we surprisingly show that protecting against a snapshot adversary with three data breaches is as difficult as protecting against a persistent adversary.

2023

CRYPTO

Cuckoo Hashing in Cryptography: Optimal Parameters, Robustness and Applications Abstract

Kevin Yeo

Cuckoo hashing is a powerful primitive that enables storing items using small space with efficient querying. At a high level, cuckoo hashing maps $n$ items into $b$ entries storing at most $\ell$ items such that each item is placed into one of $k$ randomly chosen entries. Additionally, there is an overflow stash that can store at most s items. Many cryptographic primitives rely upon cuckoo hashing to privately embed and query data. It is integral to ensure small failure probability for constructing cuckoo hashing tables as it directly relates to the privacy. As our main result, we present a more query-efficient cuckoo hashing construction using more hash functions. For construction failure probability ε, the query overhead of our scheme is $O(1 + \sqrt{\log(1/\epsilon)/\log n})$. Our scheme has quadratically smaller query overhead than prior works for any target failure probability $\epsilon$. We also prove lower bounds matching our construction. Our improvements come from a new understanding of the locality of cuckoo hashing failures for small sets of items. We also initiate the study of robust cuckoo hashing where the input set may be chosen with knowledge of the hash functions. We present a cuckoo hashing scheme using more hash functions with query overhead $\tilde{O}(log \lambda)$ that is robust against $\poly(\lambda)$ adversaries. Furthermore, we present lower bounds showing that this construction is tight and that extending previous approaches of large stashes or entries cannot obtain robustness except with $\Omega(n)$ query overhead. As applications of our results, we obtain improved constructions for batch codes and PIR. In particular, we present the most efficient explicit batch code and blackbox reduction from single-query PIR to batch PIR.

2021

ASIACRYPT

Efficient Boolean Search over Encrypted Data with Reduced Leakage 📺 Abstract

Sarvar Patel Giuseppe Persiano Joon Young Seo Kevin Yeo

Encrypted multi-maps enable outsourcing the storage of a multi-map to an untrusted server while maintaining the ability to query privately. We focus on encrypted Boolean multi-maps that support arbitrary Boolean queries over the multi-map. Kamara and Moataz [Eurocrypt’17] presented the first encrypted multi-map, BIEX, that supports CNF queries with optimal communication, worst-case sublinear search time and non-trivial leakage. We improve on previous work by presenting a new construction CNFFilter for CNF queries with significantly less leakage than BIEX, while maintaining both optimal communication and worst-case sublinear search time. As a direct consequence our construction shows additional resistance to leakage-abuse attacks in comparison to prior works. For most CNF queries, CNFFilter avoids leaking the result sets for any singleton queries for labels appearing in the CNF query. As an example, for the CNF query of the form (l1 ∨ l2) ∧ l3, our scheme does not leak the result sizes of queries to l1, l2 or l3 individually. On the other hand, BIEX does leak some of this information. This is just an example of the reduced leakage obtained by CNFFilter. The core of CNFFilter is a new filtering algorithm that performs set intersections with significantly less leakage compared to prior works. We implement CNFFilter and show that CNFFilter achieves faster search times and similar communication overhead compared to BIEX at the cost of a small increase in server storage.

2021

TCC

Forward Secret Encrypted RAM: Lower Bounds and Applications 📺 Abstract

Alexander Bienstock Yevgeniy Dodis Kevin Yeo

In this paper, we study forward secret encrypted RAMs (FS eRAMs) which enable clients to outsource the storage of an n-entry array to a server. In the case of a catastrophic attack where both client and server storage are compromised, FS eRAMs guarantee that the adversary may not recover any array entries that were deleted or overwritten prior to the attack. A simple folklore FS eRAM construction with O(logn) overhead has been known for at least two decades. Unfortunately, no progress has been made since then. We show the lack of progress is fundamental by presenting an \Omega(log n) lower bound for FS eRAMs proving that the folklore solution is optimal. To do this, we introduce the symbolic model for proving cryptographic data structures lower bounds that may be of independent interest. Given this limitation, we investigate applications where forward secrecy may be obtained without the additional O(log n) overhead. We show this is possible for oblivious RAMs, memory checkers, and multicast encryption by incorporating the ideas of the folklore FS eRAM solution into carefully chosen constructions of the corresponding primitives.

2020

CRYPTO

Lower Bounds for Encrypted Multi-Maps and Searchable Encryption in the Leakage Cell Probe Model 📺 Abstract

Sarvar Patel Giuseppe Persiano Kevin Yeo

Encrypted multi-maps (EMMs) enable clients to outsource the storage of a multi-map to a potentially untrusted server while maintaining the ability to perform operations in a privacy-preserving manner. EMMs are an important primitive as they are an integral building block for many practical applications such as searchable encryption and encrypted databases. In this work, we formally examine the tradeoffs between privacy and efficiency for EMMs. Currently, all known dynamic EMMs with constant overhead reveal if two operations are performed on the same key or not that we denote as the global key-equality pattern. In our main result, we present strong evidence that the leakage of the global key-equality pattern is inherent for any dynamic EMM construction with $O(1)$ efficiency. In particular, we consider the slightly smaller leakage of decoupled key-equality pattern where leakage of key-equality between update and query operations is decoupled and the adversary only learns whether two operations of the same type are performed on the same key or not. We show that any EMM with at most decoupled key-equality pattern leakage incurs $\Omega(\log n)$ overhead in the leakage cell probe model. This is tight as there exist ORAM-based constructions of EMMs with logarithmic slowdown that leak no more than the decoupled key-equality pattern (and actually, much less). Furthermore, we present stronger lower bounds that encrypted multi-maps leaking at most the decoupled key-equality pattern but are able to perform one of either the update or query operations in the plaintext still require $\Omega(\log n)$ overhead. Finally, we extend our lower bounds to show that dynamic, response-hiding searchable encryption schemes must also incur $\Omega(log n)$ overhead even when one of either the document updates or searches may be performed in the plaintext.

2020

TCC

Lower Bounds for Multi-Server Oblivious RAMs 📺 Abstract

Kasper Green Larsen Mark Simkin Kevin Yeo

In this work, we consider oblivious RAMs (ORAM) in a setting with multiple servers and the adversary may corrupt a subset of the servers. We present an $\Omega(log n)$ overhead lower bound for any k-server ORAM that limits any PPT adversary to distinguishing advantage at most $1/4k$ when only one server is corrupted. In other words, if one insists on negligible distinguishing advantage, then multi-server ORAMs cannot be faster than single-server ORAMs even with polynomially many servers of which only one unknown server is corrupted. Our results apply to ORAMs that may err with probability at most 1/128 as well as scenarios where the adversary corrupts larger subsets of servers. We also extend our lower bounds to other important data structures including oblivious stacks, queues, deques, priority queues and search trees.

2019

EUROCRYPT

Lower Bounds for Differentially Private RAMs 📺 Abstract

Giuseppe Persiano Kevin Yeo

In this work, we study privacy-preserving storage primitives that are suitable for use in data analysis on outsourced databases within the differential privacy framework. The goal in differentially private data analysis is to disclose global properties of a group without compromising any individual’s privacy. Typically, differentially private adversaries only ever learn global properties. For the case of outsourced databases, the adversary also views the patterns of access to data. Oblivious RAM (ORAM) can be used to hide access patterns but ORAM might be excessive as in some settings it could be sufficient to be compatible with differential privacy and only protect the privacy of individual accesses.We consider $$(\epsilon ,\delta )$$(ϵ,δ)-Differentially Private RAM, a weakening of ORAM that only protects individual operations and seems better suited for use in data analysis on outsourced databases. As differentially private RAM has weaker security than ORAM, there is hope that we can bypass the $$\varOmega (\log (nb/c))$$Ω(log(nb/c)) bandwidth lower bounds for ORAM by Larsen and Nielsen [CRYPTO ’18] for storing an array of nb-bit entries and a client with c bits of memory. We answer in the negative and present an $$\varOmega (\log (nb/c))$$Ω(log(nb/c)) bandwidth lower bound for privacy budgets of $$\epsilon = O(1)$$ϵ=O(1) and $$\delta \le 1/3$$δ≤1/3.The information transfer technique used for ORAM lower bounds does not seem adaptable for use with the weaker security guarantees of differential privacy. Instead, we prove our lower bounds by adapting the chronogram technique to our setting. To our knowledge, this is the first work that uses the chronogram technique for lower bounds on privacy-preserving storage primitives.