CryptoDB
Breaking and Fixing Length Leakage in Content-Defined Chunking
Authors: | Kien Tuong Truong, Matteo Scarlata, Simon Phillipp Merz, Felix Günther, Kenny Paterson |
---|---|
Presentation: | Slides |
Abstract: | Most applications that deduplicate data first split the data into smaller blocks, called chunks, using content-defined chunking (CDC). CDC cuts chunks based on a local context window in the data. As a result, chunk boundaries are preserved when the data is changed, which enables significant deduplication efficiency gains across applications dealing with large, redundant datasets, such as backup solutions, software patching systems, and file hosting platforms like IPFS and HuggingFace. However, CDC also introduces a subtle leakage: the length of each chunk leaks information about the data being chunked. This enables fingerprinting attacks, in which adversaries exploit chunk length patterns to infer the presence or structure of specific data. Such attacks threaten confidentiality in scenarios ranging from encrypted backups on untrusted cloud servers to data transmitted over encrypted channels. To address these risks, many systems, mainly in the cloud backup setting, have developed bespoke mitigations that mix a cryptographic key into the chunking process. We demonstrate the ineffectiveness of these mitigations by presenting efficient key recovery attacks that rely solely on a known-plaintext assumption. These attacks entirely circumvent all folklore mitigations except one, re-enabling fingerprinting attacks. To address this, we introduce a formal treatment of Keyed Content-Defined Chunking (KCDC) schemes and propose a provably secure construction that fulfills a strong notion of security. In doing so, we take a step towards making these real-world systems more resilient against leakage. |
Video: | https://youtu.be/fH4xMJDuV5M |
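To make the chunking mechanism in the abstract concrete, here is a minimal, unkeyed CDC sketch in Python. It is an illustration under stated assumptions, not the construction or any of the mitigations analyzed in the talk: the Gear-style rolling hash, the table `GEAR`, the constant `MASK_BITS`, and the functions `cut_points` and `chunk_lengths` are all hypothetical choices made for this example. A cut is placed wherever the low bits of the rolling hash are zero, and those bits depend only on the last few bytes, which plays the role of the local context window.

```python
import random

# Hypothetical toy parameters, chosen only for illustration.
_table_rng = random.Random(0)           # fixed seed so the table is reproducible
GEAR = [_table_rng.getrandbits(32) for _ in range(256)]
MASK_BITS = 6                           # a cut roughly every 2**6 = 64 bytes
MASK = (1 << MASK_BITS) - 1


def cut_points(data: bytes) -> list[int]:
    """Return the end offsets at which the rolling hash triggers a cut.

    The hash is updated as h = (h << 1) + GEAR[byte]. A byte seen k steps
    ago only affects bits k and above, so the low MASK_BITS bits depend
    solely on the last MASK_BITS bytes (the local context window).
    """
    cuts = []
    h = 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        if (h & MASK) == 0:
            cuts.append(i + 1)          # the chunk ends after byte i
    return cuts


def chunk_lengths(data: bytes) -> list[int]:
    """Return the length of every chunk, including the trailing remainder."""
    ends = cut_points(data)
    if data and (not ends or ends[-1] != len(data)):
        ends.append(len(data))
    return [end - start for start, end in zip([0] + ends, ends)]


if __name__ == "__main__":
    rng = random.Random(1)
    original = bytes(rng.getrandbits(8) for _ in range(4096))
    edited = b"X" * 10 + original       # insert 10 bytes at the front

    # Away from the edit, the cut points are the same, only shifted by 10,
    # so the corresponding chunks still deduplicate.
    print(chunk_lengths(original)[-5:])
    print(chunk_lengths(edited)[-5:])
```

Because the cut decision depends only on content, the sequence returned by `chunk_lengths` is a deterministic function of the data. This is the length leakage the abstract describes: an observer who sees only chunk lengths can compare them against the lengths computed for a candidate file. Real chunkers add minimum and maximum chunk sizes and larger windows, which this sketch omits.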
BibTeX
@misc{rwc-2025-35885,
  title={Breaking and Fixing Length Leakage in Content-Defined Chunking},
  note={Video at \url{https://youtu.be/fH4xMJDuV5M}},
  howpublished={Talk given at RWC 2025},
  author={Kien Tuong Truong and Matteo Scarlata and Simon Phillipp Merz and Felix Günther and Kenny Paterson},
  year=2025
}