https://huggingface.co/alibaba-pai/Wan2.2-Fun-Reward-LoRAs/tree/main
Sample workflow: https://www.runninghub.ai/post/1966777795378655234
We explore the Reward Backpropagation technique [1, 2] to optimize the videos generated by Wan2.2-Fun for better alignment with human preferences. We provide the following pre-trained models (i.e., LoRAs) along with the training script. You can use these LoRAs as a plug-in to enhance the corresponding base model, or train your own reward LoRA.
For more details, please refer to our GitHub repo.
Official MPS reward LoRA (rank=128 and network_alpha=64) for Wan2.2-Fun-A14B-InP (low noise). It is trained with a batch size of 8 for 4,500 steps.
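As a rough illustration of how a reward LoRA at rank=128 with network_alpha=64 plugs into a base model, the sketch below folds a LoRA update into one weight matrix using the standard merge formula W' = W + (alpha / rank) * up @ down. The tensor shapes and variable names are illustrative assumptions for this sketch, not the actual layout of the Wan2.2-Fun checkpoints; in practice the GitHub repo's loading script handles this per layer.

```python
import numpy as np

def merge_lora(base_weight, lora_down, lora_up, network_alpha, rank):
    """Fold a LoRA update into a base weight: W' = W + (alpha / rank) * up @ down.

    The (alpha / rank) scale is the usual LoRA convention; with rank=128 and
    network_alpha=64 (as for this reward LoRA), the update is scaled by 0.5.
    """
    return base_weight + (network_alpha / rank) * (lora_up @ lora_down)

# Small dummy dimensions for the sketch; real projection layers are much larger.
rank, d_out, d_in = 128, 64, 64
base = np.ones((d_out, d_in))
down = np.random.randn(rank, d_in) * 0.01  # "lora_down" (A) matrix
up = np.zeros((d_out, rank))               # "lora_up" (B) matrix, zero-initialized

merged = merge_lora(base, down, up, network_alpha=64, rank=128)
# With a zero-initialized up matrix the update is zero, so merged equals base.
```

Because the up matrix starts at zero (the common LoRA initialization), a freshly initialized LoRA leaves the base model unchanged; training then learns a low-rank correction on top.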
1. This reposted model is provided for learning and sharing purposes only; its copyright and final right of interpretation remain with the original author.
2. If the original author wishes to claim this model, please contact SeaArt AI (海藝AI) staff through official channels for verification. We are committed to protecting the rights of every creator.
