FID Score Higher than Expected with Patch-based Diffusion Model

First of all, thank you for sharing your great work! I believe the idea behind your patch-based diffusion model has significant potential for high-resolution medical image generation. Before applying your research to my own data, I decided to quickly reproduce the results using the LSUN-church dataset that you used.

Following your pipeline description, I preprocessed the data, initialized with CLIP embeddings, and trained the model with a patch size of 64. After training for 1M iterations, I further trained the model with latent diffusion to generate images. I then computed the FID between 50K generated images and the real images and obtained 11.4, roughly twice the 5.49 reported in the paper.
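For reference, FID is the Fréchet distance between two Gaussians fitted to Inception features of the real and generated image sets. The sketch below shows only that final distance step in NumPy, assuming the features have already been extracted into arrays of shape (N, D). The function name `frechet_distance` is illustrative and is not taken from the repository or the paper, and this is not necessarily the evaluation script used to produce either number.

```python
import numpy as np


def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two (N, D) feature arrays, D >= 2.

    Hypothetical helper for illustration; feature extraction is assumed done elsewhere.
    """
    # Fit a Gaussian (mean, covariance) to each feature set.
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # positive semi-definite covariances (clipping guards against round-off).
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)
```

Because the result depends on the feature extractor, the image resizing, and the size and choice of the real reference set, two FID values are only comparable when those settings match.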

I used the default settings as described in the README file. Could you share any implementation details, tips, or tricks that might help close the gap to the reported FID?

Thank you for your help!
