IACR News item: 18 November 2024

Black-box Collision Attacks on the NeuralHash Perceptual Hash Function

Diane Leblanc-Albarel, Bart Preneel

ePrint Report

Perceptual hash functions map multimedia content that is perceptually close to outputs strings that are identical or similar. They are widely used for the identification of protected copyright and illegal content in information sharing services: a list of undesirable files is hashed with a perceptual hash function and compared, server side, to the hash of the content that is uploaded. Unlike cryptographic hash functions, the design details of perceptual hash functions are typically kept secret. Several governments envisage to extend this detection to end-to-end encrypted services by using Client Side Scanning and local matching against a hashed database. In August 2021, Apple hash published a concrete design for Client Side Scanning based on the NeuralHash perceptual hash function that uses deep learning. There has been a wide criticism of Client Side Scanning based on its disproportionate impact on human rights and risks for function creep and abuse. In addition, several authors have demonstrated that perceptual hash functions are vulnerable to cryptanalysis: it is easy to create false positives and false negatives once the design is known. This paper demonstrates that these designs are vulnerable in a weaker black-box attack model. It is demonstrated that the effective security level of NeuralHash for a realistic set of images is 32 bits rather than 96 bits, implying that finding a collision requires $2^{16}$ steps rather than $2^{48}$. As a consequence, the large scale deployment of NeuralHash would lead to a huge number of false positives, making the system unworkable. It is likely that most current perceptual hash function designs exhibit similar vulnerabilities.

Expand

Additional news items may be found on the IACR news page.

International Association for Cryptologic Research

International Association
for Cryptologic Research