https://huggingface.co/alibaba-pai/Wan2.2-Fun-Reward-LoRAs/tree/main
Sample workflow: https://www.runninghub.ai/post/1966777795378655234
We explore the Reward Backpropagation technique [1][2] to optimize the videos generated by Wan2.2-Fun for better alignment with human preferences. We provide the following pre-trained models (i.e., LoRAs) along with the training script. You can use these LoRAs as plug-ins to enhance the corresponding base model, or train your own reward LoRA.
For more details, please refer to our GitHub repo.
Official MPS reward LoRA (rank=128 and network_alpha=64) for Wan2.2-Fun-A14B-InP (low noise). It is trained with a batch size of 8 for 4,500 steps.
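To make the rank and network_alpha values concrete, here is a minimal sketch of the conventional LoRA merge, where the low-rank update is scaled by network_alpha / rank before being added to the base weight. It assumes the loader follows that common convention, which this page does not confirm. The helper names (`lora_scale`, `matmul`, `merge_lora`) and the `strength` parameter are hypothetical and not part of any released script.

```python
# Sketch of the conventional LoRA merge:
#   W' = W + strength * (network_alpha / rank) * (lora_up @ lora_down)
# Assumption: the standard alpha/rank scaling applies; all names are illustrative.

def lora_scale(network_alpha: float, rank: int) -> float:
    """Scaling factor applied to the low-rank update."""
    return network_alpha / rank


def matmul(x, y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weight, lora_down, lora_up, network_alpha, rank, strength=1.0):
    """Return weight + strength * (network_alpha / rank) * (lora_up @ lora_down)."""
    scale = strength * lora_scale(network_alpha, rank)
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Values stated for this LoRA: rank=128, network_alpha=64.
print(lora_scale(64, 128))  # 0.5

# Toy rank-1 example on a 2x2 weight, using the same 0.5 scale.
toy = merge_lora(
    weight=[[1.0, 0.0], [0.0, 1.0]],
    lora_down=[[3.0, 4.0]],      # shape (rank, in_features)
    lora_up=[[1.0], [2.0]],      # shape (out_features, rank)
    network_alpha=0.5,
    rank=1,
)
print(toy)  # [[2.5, 2.0], [3.0, 5.0]]
```

Under this convention, the stated settings give an effective scale of 0.5 at strength 1.0, so the LoRA strength set in a workflow multiplies an update that is already halved.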
1. This model is shared for learning and sharing purposes only. Copyright and the right of final interpretation are reserved by the original author.
2. Authors wishing to claim the model: contact SeaArt AI through official channels for authentication. We protect the rights of every author.
