https://huggingface.co/alibaba-pai/Wan2.2-Fun-Reward-LoRAs/tree/main
Sample workflow: https://www.runninghub.ai/post/1966777795378655234
We explore the Reward Backpropagation technique [1, 2] to optimize the videos generated by Wan2.2-Fun for better alignment with human preferences. We provide the following pre-trained models (i.e. LoRAs) along with the training script. You can use these LoRAs to enhance the corresponding base model as a plug-in, or train your own reward LoRA.
For more details, please refer to our GitHub repo.
Official MPS reward LoRA (rank=128 and network_alpha=64) for Wan2.2-Fun-A14B-InP (low noise). It was trained with a batch size of 8 for 4,500 steps.
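The rank and network_alpha values above determine how strongly the LoRA delta is applied when merged into the base model: the standard LoRA scaling factor is network_alpha / rank, here 64 / 128 = 0.5. A minimal NumPy sketch of this merge arithmetic (the small matrix sizes are illustrative only, not the actual Wan2.2 layer shapes):

```python
import numpy as np

# LoRA hyperparameters from the model card above.
rank = 128
network_alpha = 64

# Illustrative base-layer weight and low-rank factors
# (real Wan2.2 layers are much larger).
d_out, d_in = 16, 32
rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen base weight
B = rng.standard_normal((d_out, rank))      # LoRA "up" matrix
A = rng.standard_normal((rank, d_in))       # LoRA "down" matrix

# Standard LoRA merge: W' = W + (alpha / rank) * B @ A
scale = network_alpha / rank                # 64 / 128 = 0.5
W_merged = W + scale * (B @ A)

print(scale)            # 0.5
print(W_merged.shape)   # (16, 32)
```

This is why a higher network_alpha at fixed rank strengthens the LoRA's effect on the base model; loaders such as diffusers and PEFT apply this same scaling when the LoRA is attached as a plug-in.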
1. This model is for learning purposes only. Copyright and interpretation rights belong to the original author.
2. If you are the original author of a model, please contact us for authentication through our official channels. We protect the rights of all creators. Click here to verify.
