

DPO stands for Direct Preference Optimization, the process of fine-tuning a diffusion model on pairs of images where humans have chosen the preferred one. Meihua Dang et al. trained Stable Diffusion 1.5 and Stable Diffusion XL with this method on the Pick-a-Pic v2 dataset, which can be found at https://huggingface.co/datasets/yuvalkirstain/pickapic_v2, and describe the results in their paper at https://huggingface.co/papers/2311.12908.
The DPO-tuned models have been observed to produce higher-quality images than their untuned counterparts, with notably better adherence to the prompt. These LoRAs can bring that prompt adherence to other fine-tuned Stable Diffusion models.
These LoRAs are based on the work of Meihua Dang (https://huggingface.co/mhdang) at
https://huggingface.co/mhdang/dpo-sdxl-text2image-v1 and https://huggingface.co/mhdang/dpo-sd1.5-text2image-v1, licensed under OpenRail++.
They were created with Kohya SS by extracting the difference between the DPO-tuned checkpoints and their OpenRail++-licensed base checkpoints on CivitAI and HuggingFace; a rough sketch of the extraction idea follows the two entries below.
1.5: https://civitai.com/models/240850/sd15-direct-preference-optimization-dpo, extracted against the base model https://huggingface.co/fp16-guy/Stable-Diffusion-v1-5_fp16_cleaned/blob/main/sd_1.5.safetensors.
XL: https://civitai.com/models/238319/sd-xl-dpo-finetune-direct-preference-optimization, extracted against the base model https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors.
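The extraction itself follows the usual LoRA-extraction idea: subtract the base model's weights from the DPO-tuned model's weights and keep a low-rank (SVD) approximation of the difference. The Python sketch below illustrates that principle for a single linear layer; it is only an illustration, not the Kohya SS implementation, and the rank and tensor shapes are arbitrary.

import torch

def extract_lora_from_delta(w_tuned: torch.Tensor, w_base: torch.Tensor, rank: int = 32):
    # The LoRA should capture the weight difference introduced by DPO fine-tuning.
    delta = (w_tuned - w_base).float()
    # A truncated SVD gives the best rank-`rank` approximation of that difference.
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    lora_up = u[:, :rank] * s[:rank]   # (out_features, rank), singular values folded in
    lora_down = vh[:rank, :]           # (rank, in_features)
    return lora_up, lora_down

# Toy example with one hypothetical attention projection weight:
w_base = torch.randn(320, 320)
w_tuned = w_base + 1e-3 * torch.randn(320, 320)
up, down = extract_lora_from_delta(w_tuned, w_base, rank=32)
print((up @ down - (w_tuned - w_base)).abs().max())  # reconstruction error of the rank-32 approximation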
Both LoRAs are also hosted on HuggingFace at https://huggingface.co/benjamin-paine/sd-dpo-offsets/; a minimal usage sketch follows.
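To apply the 1.5 LoRA to another Stable Diffusion 1.5-based checkpoint with diffusers, something like the sketch below should work. The base model ID, LoRA strength, and weight_name filename are placeholders rather than confirmed values; check the repository above for the actual filenames. The XL LoRA is applied the same way with StableDiffusionXLPipeline and an SDXL-based checkpoint.

import torch
from diffusers import StableDiffusionPipeline

# Any SD 1.5-derived checkpoint should work here; this base model ID is only an example.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# Load the DPO offset LoRA. weight_name is a placeholder; use the actual .safetensors
# filename from https://huggingface.co/benjamin-paine/sd-dpo-offsets/
pipe.load_lora_weights(
    "benjamin-paine/sd-dpo-offsets",
    weight_name="sd_15_dpo_lora.safetensors",
)

image = pipe(
    "a watercolor painting of a lighthouse at dawn",
    num_inference_steps=30,
    cross_attention_kwargs={"scale": 0.8},  # LoRA strength; 1.0 applies the full offset
).images[0]
image.save("dpo_lora_example.png")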
1. This model is provided for learning purposes only. Copyright and interpretation rights remain with the original author.
2. If you are the original author of a model, please contact us through our official channels for verification. We protect the rights of all creators.
