Publicly-Detectable Watermarking for Language Models

CryptoDB

Authors:	Jaiden Fairoze, University of California, Berkeley; Sanjam Garg, University of California, Berkeley; Somesh Jha, University of Wisconsin–Madison; Saeed Mahloujifar, Fundamental Artificial Intelligence Research at Meta; Mohammad Mahmoody, University of Virginia; Mingyuan Wang, New York University Shanghai
Download:	DOI: 10.62056/ahmpdkp10; URL: https://cic.iacr.org/p/1/4/31
Abstract:	We present a publicly-detectable watermarking scheme for LMs: the detection algorithm contains no secret information, and it is executable by anyone. We embed a publicly-verifiable cryptographic signature into LM output using rejection sampling and prove that this produces unforgeable and distortion-free (i.e., undetectable without access to the public key) text output. We make use of error-correction to overcome periods of low entropy, a barrier for all prior watermarking schemes. We implement our scheme and find that our formal claims are met in practice.
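
The rejection-sampling idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' construction: the "language model" is a uniform sampler over a tiny made-up vocabulary, the bit string stands in for a signature (no signing or verification is performed), and the names `VOCAB`, `BLOCK_LEN`, `block_bit`, `embed`, and `extract` are hypothetical. It only shows how resampling blocks of tokens until a public hash matches a target bit lets anyone read the bits back without a secret.

```python
import hashlib
import random

# Hypothetical toy vocabulary; a real scheme samples from an actual LM.
VOCAB = ["the", "a", "model", "text", "signs", "output", "token", "public", "key", "bit"]
BLOCK_LEN = 4  # tokens used to carry one bit (assumed parameter)


def block_bit(tokens):
    """Map a block of tokens to one bit using a public hash (no secret involved)."""
    digest = hashlib.sha256(" ".join(tokens).encode("utf-8")).digest()
    return digest[0] & 1


def sample_block(rng):
    """Stand-in for sampling BLOCK_LEN tokens from a language model."""
    return [rng.choice(VOCAB) for _ in range(BLOCK_LEN)]


def embed(bits, rng, max_tries=1000):
    """Rejection sampling: resample each block until its hash bit equals the target bit."""
    out = []
    for bit in bits:
        for _ in range(max_tries):
            block = sample_block(rng)
            if block_bit(block) == bit:
                out.extend(block)
                break
        else:
            # With too little entropy, no block may match; the abstract says
            # the paper handles such periods with error correction.
            raise RuntimeError("could not embed bit within max_tries")
    return out


def extract(tokens):
    """Anyone can recover the bits by hashing consecutive blocks."""
    return [block_bit(tokens[i:i + BLOCK_LEN]) for i in range(0, len(tokens), BLOCK_LEN)]


if __name__ == "__main__":
    payload = [1, 0, 1, 1, 0, 0, 1, 0]  # placeholder for signature bits
    text = embed(payload, random.Random(0))
    print(" ".join(text))
    print(extract(text) == payload)
```

In this sketch every accepted block is still a sample from the toy sampler, conditioned only on its hash bit. In a full scheme the embedded bits would be a publicly verifiable signature, so detection would amount to extracting the bits and checking them against the public key.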

BibTeX

@article{cic-2025-34924,
  title={Publicly-Detectable Watermarking for Language Models},
  journal={IACR Communications in Cryptology},
  publisher={International Association for Cryptologic Research},
  volume={1},
  number={4},
  url={https://cic.iacr.org/p/1/4/31},
  doi={10.62056/ahmpdkp10},
  author={Jaiden Fairoze and Sanjam Garg and Somesh Jha and Saeed Mahloujifar and Mohammad Mahmoody and Mingyuan Wang},
  year=2025
}

International Association for Cryptologic Research
