https://huggingface.co/alibaba-pai/Wan2.2-Fun-Reward-LoRAs/tree/main
Sample workflow: https://www.runninghub.ai/post/1966777795378655234
We explore the Reward Backpropagation technique [1][2] to optimize the videos generated by Wan2.2-Fun for better alignment with human preferences. We provide the following pre-trained models (i.e., LoRAs) along with the training script. You can use these LoRAs as a plug-in to enhance the corresponding base model, or train your own reward LoRA.
For more details, please refer to our GitHub repo.
Official MPS reward LoRA (rank=128, network_alpha=64) for Wan2.2-Fun-A14B-InP (low noise). It was trained with a batch size of 8 for 4,500 steps.
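Below is a minimal sketch of using this LoRA as a plug-in via the diffusers LoRA-loading API. It assumes a diffusers-format Wan 2.2 Fun base checkpoint is available; the base model id, LoRA filename, and adapter weight are illustrative assumptions, and the official GitHub repo remains the reference workflow.

```python
# Minimal sketch (not the official workflow): attach the reward LoRA to a
# diffusers Wan pipeline. The base model id, LoRA filename, and adapter
# weight below are assumptions for illustration only.
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video
from huggingface_hub import hf_hub_download

# Fetch the low-noise MPS reward LoRA weights (filename is assumed).
lora_path = hf_hub_download(
    repo_id="alibaba-pai/Wan2.2-Fun-Reward-LoRAs",
    filename="Wan2.2-Fun-A14B-InP-MPS.safetensors",  # assumed filename
)

# Load a diffusers-format base pipeline (model id is assumed).
pipe = WanPipeline.from_pretrained(
    "alibaba-pai/Wan2.2-Fun-A14B-InP", torch_dtype=torch.bfloat16
)
pipe.load_lora_weights(lora_path, adapter_name="mps_reward")
pipe.set_adapters(["mps_reward"], adapter_weights=[0.7])  # LoRA strength is a tunable choice
pipe.to("cuda")

video = pipe(
    prompt="A corgi running on a sunny beach, cinematic lighting",
    num_frames=81,
    num_inference_steps=40,
).frames[0]
export_to_video(video, "output.mp4", fps=16)
```

In practice, the LoRA strength (here 0.7) can be lowered if the reward LoRA over-stylizes the output, or raised for stronger preference alignment.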
1. The rights to reposted models belong to the original creator.
2. If you are the model's original author and wish to have the model verified, please contact SeaArt.AI staff through official channels. We are committed to protecting the rights of all creators.
