-
Notifications
You must be signed in to change notification settings - Fork 5.9k
[Wan LoRAs] make T2V LoRAs compatible with Wan I2V #11107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@bot /style |
Style fixes have been applied. View the workflow run here. |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @linoytsaban. I've left some comments with changes we discussed here https://huggingface.slack.com/archives/D08GXKYKGAK/p1742301801696229
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Left some comments and thanks for catching the nasty key names.
Let's see some results with this?
num_blocks = len({k.split("blocks.")[1].split(".")[0] for k in state_dict}) | ||
is_i2v_lora = any("k_img" in k for k in state_dict) and any("v_img" in k for k in state_dict) | ||
if not is_i2v_lora: | ||
return state_dict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't perform any extra operation if it's is_i2v_lora
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for loading T2V lora into I2V model, so if it's already I2V lora we return the state dict as-is.
if transformer.config.image_dim is None: | ||
return state_dict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be moved out at the top of this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be slightly faster than checking the keys first, this is checking whether the transformer is I2V. T2V transformer config has image_dim
as None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be not None
tho no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
T2V has transformer.config.image_dim
= None
We have T2V loaded -> we are loading T2V lora -> if transformer.config.image_dim is None
-> no state dict modification required -> return as-is
We have I2V loaded -> we are loading T2V lora -> if transformer.config.image_dim is None
does not match as image_dim
is not None
-> continue with state dict modifications
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're absolutely right🙌🏻
output.29.mp4
It's experimental, we suspect style does not transfer well and will try a motion-oriented lora, however results are comparable to replicate which recently added support for this |
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: hlky <hlky@hlky.ac>
@bot /style |
@bot /style |
https://github.com/huggingface/diffusers/actions/runs/13948751559/job/39042360914 |
Style fixes have been applied. View the workflow run here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sleek!
I would suggest adding a snippet about this in our Wan docs.
Should be good to merge after that!
@sayakpaul Yeah we should definitely have some documentation for it, but I'm thinking maybe in a separate PR to add it somewhere else? I think for other pipelines we didnt include lora examples in the pipeline page so maybe makes more sense as part of LoRA documentation dedicated to videos? |
Yeah I think docs can be added in a follow up as we'll be overhauling all the pipeline doc pages similar to #11112 anyway and we don't have a great example for this right now. |
I respectfully disagree here. Just a snippet and saying that T2V LoRAs are compatible with I2V LoRAs won't be too much. It is quite easy to forget it if we were to wait till the overhaul to complete.
Well, if we don't know with examples if it works as expected, then it could wait until we do I guess. If we're matching the Replicate reference in terms of quality, I am fine with that. |
There are many things that are not explicitly documented. I don't think it should be a blocker to merge, and any documentation should be complete, simply stating that T2V loras are compatible with I2V model is not useful without the example, and we don't have a good example right now, nobody does as it's an experimental thing from the past 1-2 days. If we merge it will allow us and the community to conduct further experimentation/testing without switching branches etc. |
It's a matter of disagreement here, then. I mentioned an example (even if it matches the quality of the Replicate reference) should be fine. Many of the examples we provide in our docs don't always lead to good reasonable results from the get-go. We had similar situations with using Flux Control LoRAs and Turbo LoRAs but we at least added a small section in the Flux docs during the PR.
This can never ideally be a blocker. If a feature is experimental, community understands that and is often okay to switch branches. I would appreciate it if we don't rush into these merges without the little documentation I requested for going forward. |
This PR was authored by @hlky to make T2V Wan LoRAs compatible with Wan I2V 🙌🏻