CLIDE: Real or Fake?

General and Domain-Specific Zero-shot Detection of Generated Images via Conditional Likelihood, WACV 2026

Abstract

We present CLIDE, a zero-shot method for detecting AI-generated images using a conditional likelihood approximation over CLIP embeddings. Our approach applies a whitening transform to image features and uses the resulting approximately Gaussian embeddings to compute likelihood values that serve as the detection score. The method requires no training and no access to generated images. CLIDE supports both general and domain-specific settings and is designed to remain effective under practical distribution shifts across generators and domains.

CLIDE teaser figure

Key Insights

Conditional Likelihood

Conditional likelihood diagram

We generalize CLIP’s data-driven likelihood approximation from a global score to a conditional likelihood model. Gaussian behavior holds reliably in examined subspaces, not just in the full embedding space.

Zero-shot Detector

Whitening diagram

All existing zero-shot detectors implicitly model the likelihood that an image is real. CLIDE models this likelihood explicitly by operating under a Gaussian model.
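
A minimal sketch of the idea, not the authors' implementation: fit a whitening transform on embeddings of real images, then score any embedding by its log-density under a standard Gaussian in the whitened space. It assumes the CLIP embeddings have already been extracted into NumPy arrays of shape (n, d); the function names `fit_whitening` and `gaussian_log_likelihood`, the PCA-style whitening, and the `eps` regularizer are illustrative assumptions.

```python
import numpy as np

def fit_whitening(real_embeddings, eps=1e-6):
    """Estimate the mean and a PCA whitening matrix from real-image embeddings (n, d)."""
    mean = real_embeddings.mean(axis=0)
    cov = np.cov(real_embeddings, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Scale each principal axis (column of eigvecs) to unit variance.
    # eps is an assumed regularizer that guards against near-zero eigenvalues.
    W = eigvecs / np.sqrt(np.clip(eigvals, 0.0, None) + eps)
    return mean, W

def gaussian_log_likelihood(embeddings, mean, W):
    """Log-density of the whitened embeddings under N(0, I), one value per row."""
    z = (embeddings - mean) @ W
    d = z.shape[1]
    return -0.5 * (np.sum(z ** 2, axis=1) + d * np.log(2.0 * np.pi))
```

The returned value differs from the log-density in the original embedding space only by a constant (the log-determinant of the whitening matrix), so it ranks images identically. How the score is thresholded is not specified here and is left out of the sketch.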

Domain-specific Adaptation

Domain adaptation diagram

CLIDE uses domain-representative real images to compute local statistics, improving performance on practical target domains (e.g., damaged cars, invoices). We empirically validate Gaussian behavior in subspaces using normality tests.
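
The paragraph above can be illustrated with a small sketch under stated assumptions: local statistics are estimated from a set of real-image embeddings from the target domain, and each whitened coordinate is then checked for Gaussian behavior. The specific normality tests used are not named here, so the Jarque-Bera test is swapped in purely for illustration; `local_whiten` and `jarque_bera_pvalues` are hypothetical helper names.

```python
import numpy as np

def local_whiten(domain_real, eps=1e-6):
    """Whiten domain embeddings (n, d) using statistics of that domain only."""
    mean = domain_real.mean(axis=0)
    cov = np.cov(domain_real, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # With fewer examples than dimensions the covariance is rank-deficient;
    # eps (an assumed regularizer) keeps the division finite in that case.
    W = eigvecs / np.sqrt(np.clip(eigvals, 0.0, None) + eps)
    return (domain_real - mean) @ W, mean, W

def jarque_bera_pvalues(z):
    """Per-coordinate Jarque-Bera p-values for samples z of shape (n, d)."""
    n = z.shape[0]
    centered = z - z.mean(axis=0)
    std = centered.std(axis=0)
    skew = np.mean(centered ** 3, axis=0) / std ** 3
    excess_kurt = np.mean(centered ** 4, axis=0) / std ** 4 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # JB is asymptotically chi-squared with 2 degrees of freedom,
    # whose survival function is exp(-x / 2).
    return np.exp(-jb / 2.0)
```

A small p-value for a coordinate suggests a departure from Gaussian behavior along that direction. The test is asymptotic, so p-values from small domain sets should be read with caution.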

Consistent Behavior

Deployment diagram

CLIDE operates consistently across different generators and domains. Other methods exhibit a substantial performance drop in specific domains and suffer from classification flipping, where images generated by different generators fall on opposite sides of the real images in the detection criterion space. In contrast, CLIDE maintains a stable scoring geometry, preventing such flips.

Datasets

In addition to existing datasets of general and artistic images, we introduce new synthetic datasets in two practical domains: damaged cars and invoices. The damaged-car dataset includes images generated by five different generators, while the invoice dataset is generated by a single model. All synthetic datasets are publicly available at clide_synthetic_datasets.

Damaged cars dataset examples

Invoice dataset examples

BibTeX

@inproceedings{betser2026clide,
  title     = {General and Domain-Specific Zero-shot Detection of Generated Images via Conditional Likelihood},
  author    = {Betser, Roy and Hofman, Omer and Vainshtein, Roman and Gilboa, Guy},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year      = {2026},
  eprint    = {2512.05590},
  archivePrefix = {arXiv},
}