
Conversation

@sayakpaul
Member

What does this PR do?

Fixes #12535

@ljk1291, could you check this?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ljk1291

ljk1291 commented Oct 25, 2025

The new high noise Lightning LoRA is loading perfectly now, thanks!

I noticed that the new low noise Lightning LoRA still has issues with unexpected keys. It loads with the warnings below, but the quality of the videos suggests it is not loading correctly. The old low noise Lightning LoRA works with it, though. The warning repeats the same six attn2 keys for each of the 40 blocks; condensed here:
Loading adapter weights from state_dict led to unexpected keys found in the model: condition_embedder.image_embedder.ff.net.0.proj.lora_A.light.weight, condition_embedder.image_embedder.ff.net.0.proj.lora_B.light.weight, condition_embedder.image_embedder.ff.net.0.proj.lora_B.light.bias, condition_embedder.image_embedder.ff.net.2.lora_A.light.weight, condition_embedder.image_embedder.ff.net.2.lora_B.light.weight, condition_embedder.image_embedder.ff.net.2.lora_B.light.bias, and, for every block i from 0 to 39: blocks.{i}.attn2.add_k_proj.lora_A.light.weight, blocks.{i}.attn2.add_k_proj.lora_B.light.weight, blocks.{i}.attn2.add_k_proj.lora_B.light.bias, blocks.{i}.attn2.add_v_proj.lora_A.light.weight, blocks.{i}.attn2.add_v_proj.lora_B.light.weight, blocks.{i}.attn2.add_v_proj.lora_B.light.bias.
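
For reference, this is roughly how the checkpoint's keys can be listed to diff against the warning above (a minimal sketch; the filename is a placeholder for the low noise Lightning file):

from safetensors import safe_open

# Print every tensor key in the LoRA checkpoint so it can be compared
# against the unexpected-keys warning above. The path is a placeholder.
with safe_open("low_noise_lightning_lora.safetensors", framework="pt") as f:
    for key in sorted(f.keys()):
        print(key)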

@ljk1291

ljk1291 commented Oct 25, 2025

I also just noticed that the old Wan 2.1 light LoRA is failing to load in Wan 2.2 with this commit; I get the following error:

Cell In[9], line 31
     27 kwargs = {}
     28 kwargs["load_into_transformer_2"] = True
---> 31 pipe.load_lora_weights(
     32     light21_lora,
     33     adapter_name="light"
     34 )
     36 pipe.load_lora_weights(
     37     light21_lora,
     38     adapter_name="light_2", **kwargs
     39 )
     44 pipe.set_adapters(["light", "light_2"], adapter_weights=[1.0, 1.0])

File /usr/local/lib/python3.11/dist-packages/diffusers/loaders/lora_pipeline.py:4066, in WanLoraLoaderMixin.load_lora_weights(self, pretrained_model_name_or_path_or_dict, adapter_name, hotswap, **kwargs)
   4064 # First, ensure that the checkpoint is a compatible one and can be successfully loaded.
   4065 kwargs["return_lora_metadata"] = True
-> 4066 state_dict, metadata = self.lora_state_dict(pretrained_model_name_or_path_or_dict, **kwargs)
   4067 # convert T2V LoRA to I2V LoRA (when loaded to Wan I2V) by adding zeros for the additional (missing) _img layers
   4068 state_dict = self._maybe_expand_t2v_lora_for_i2v(
   4069     transformer=getattr(self, self.transformer_name) if not hasattr(self, "transformer") else self.transformer,
   4070     state_dict=state_dict,
   4071 )

File /usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_validators.py:114, in validate_hf_hub_args.<locals>._inner_fn(*args, **kwargs)
    111 if check_use_auth_token:
    112     kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
--> 114 return fn(*args, **kwargs)

File /usr/local/lib/python3.11/dist-packages/diffusers/loaders/lora_pipeline.py:3980, in WanLoraLoaderMixin.lora_state_dict(cls, pretrained_model_name_or_path_or_dict, **kwargs)
   3965 state_dict, metadata = _fetch_state_dict(
   3966     pretrained_model_name_or_path_or_dict=pretrained_model_name_or_path_or_dict,
   3967     weight_name=weight_name,
   (...)
   3977     allow_pickle=allow_pickle,
   3978 )
   3979 if any(k.startswith("diffusion_model.") for k in state_dict):
-> 3980     state_dict = _convert_non_diffusers_wan_lora_to_diffusers(state_dict)
   3981 elif any(k.startswith("lora_unet_") for k in state_dict):
   3982     state_dict = _convert_musubi_wan_lora_to_diffusers(state_dict)

File /usr/local/lib/python3.11/dist-packages/diffusers/loaders/lora_conversion_utils.py:2003, in _convert_non_diffusers_wan_lora_to_diffusers(state_dict)
   1999 if f"head.head.{lora_down_key}.weight" in state_dict:
   2000     logger.info(
   2001         f"The state dict seems to be have both `head.head.diff` and `head.head.{lora_down_key}.weight` keys, which is unexpected."
   2002     )
-> 2003 converted_state_dict["proj_out.lora_A.weight"] = original_state_dict.pop("head.head.diff")
   2004 down_matrix_head = converted_state_dict["proj_out.lora_A.weight"]
   2005 up_matrix_shape = (down_matrix_head.shape[0], converted_state_dict["proj_out.lora_B.bias"].shape[0])

KeyError: 'head.head.diff'
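
From the traceback, _convert_non_diffusers_wan_lora_to_diffusers pops head.head.diff unconditionally, so any checkpoint without that key raises KeyError. A guard along these lines would avoid the crash (just a sketch, not necessarily the fix that was pushed):

# Sketch: only convert the head diff when the checkpoint actually carries it.
if "head.head.diff" in original_state_dict:
    converted_state_dict["proj_out.lora_A.weight"] = original_state_dict.pop("head.head.diff")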

Code to reproduce:

import torch
from huggingface_hub import hf_hub_download
from diffusers import WanImageToVideoPipeline

model_id = "Wan-AI/Wan2.2-I2V-A14B-Diffusers"

light21_lora = hf_hub_download(repo_id="Kijai/WanVideo_comfy", filename="Lightx2v/lightx2v_I2V_14B_480p_cfg_step_distill_rank128_bf16.safetensors")

pipe = WanImageToVideoPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)


kwargs = {}
kwargs["load_into_transformer_2"] = True


pipe.load_lora_weights(
    light21_lora,
    adapter_name="light"
)

pipe.load_lora_weights(
    light21_lora,
    adapter_name="light_2", **kwargs
)

pipe.set_adapters(["light", "light_2"], adapter_weights=[1.0, 1.0])
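
To double-check what actually got attached, the adapter lists can be printed afterwards (a sketch; get_list_adapters and get_active_adapters come from diffusers' LoRA base mixin):

# Show which adapters each pipeline component received and which are active.
print(pipe.get_list_adapters())    # e.g. {"transformer": ["light"], "transformer_2": ["light_2"]}
print(pipe.get_active_adapters())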

@sayakpaul
Member Author

> I also just noticed that the old Wan 2.1 light LoRA is failing to load in Wan 2.2 with this commit; I get the following error:

I pushed some changes that should help with the loading problem, but even on main the state dict loads with warnings about unexpected keys.

Will check further.

@ljk1291

ljk1291 commented Oct 26, 2025

> > I also just noticed that the old Wan 2.1 light LoRA is failing to load in Wan 2.2 with this commit; I get the following error:
>
> I pushed some changes that should help with the loading problem, but even on main the state dict loads with warnings about unexpected keys.
>
> Will check further.

Yes, that was always an issue with some Wan 2.1 LoRAs loading in Wan 2.2. It still loads and functions correctly, though.

I have just tested them all, and everything seems to be working correctly now, both for the new high/low noise Lightning models and for the old Wan 2.1 LoRAs loading in Wan 2.2 (even with the key warnings). Thanks a lot 🙏

@sayakpaul
Member Author

Thanks for confirming. So, are we good to merge? 👀

@asomoza
Member

asomoza commented Oct 27, 2025

@sayakpaul I did a run with this too, and at first glance it seems to work (I don't see a difference in quality), but the console output is really long because of all the unexpected keys.

I'll do a test using only the low noise model and see if there is a difference in quality.

@asomoza
Member

asomoza commented Oct 27, 2025

This is a hard one for me. I ran the experiment with just the low noise model, just the high noise model, and both together. The quality doesn't suffer, but the motion in the low noise run changed quite a bit, and both together produce slow motion too. I remember reading somewhere that this LoRA fixed the slow motion, but I've also read people saying that it didn't.

I'll test this with ComfyUI sometime soon, but I have to download everything again to use it there and see whether it behaves differently.

In the meantime I think we're good to merge, @sayakpaul, and if I find something later, maybe we can follow up with another PR?

@sayakpaul sayakpaul requested a review from asomoza October 28, 2025 03:02
@sayakpaul sayakpaul merged commit df0e2a4 into main Oct 28, 2025
13 of 15 checks passed
@sayakpaul
Member Author

Failing tests are unrelated.

@sayakpaul sayakpaul deleted the support-wan-latest-lora branch October 28, 2025 03:25