-
Notifications
You must be signed in to change notification settings - Fork 6k
[LoRA] Improve warning messages when LoRA loading becomes a no-op #10187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
# If the serialization format is new (introduced in https://github.com/huggingface/diffusers/pull/2918), | ||
# then the `state_dict` keys should have `cls.unet_name` and/or `cls.text_encoder_name` as | ||
# their prefixes. | ||
keys = list(state_dict.keys()) | ||
only_text_encoder = all(key.startswith(cls.text_encoder_name) for key in keys) | ||
if not only_text_encoder: | ||
# Load the layers corresponding to UNet. | ||
logger.info(f"Loading {cls.unet_name}.") | ||
unet.load_lora_adapter( | ||
state_dict, | ||
prefix=cls.unet_name, | ||
network_alphas=network_alphas, | ||
adapter_name=adapter_name, | ||
_pipeline=_pipeline, | ||
low_cpu_mem_usage=low_cpu_mem_usage, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're handling all of this within the load_lora_adapter()
method now which I think is more appropriate as:
- It takes care of logging when users try to load LoRA into a model via
pipe.load_lora_weights()
. - Users try to load LoRAs directly into a model with the
load_lora_adapter()
method (with something likeunet.load_lora_adapter()
.
Helps to avoid duplication. I have run the integration tests, too and nothing is breaking due to this.
text_encoder_state_dict = {k: v for k, v in state_dict.items() if "text_encoder." in k} | ||
if len(text_encoder_state_dict) > 0: | ||
self.load_lora_into_text_encoder( | ||
text_encoder_state_dict, | ||
network_alphas=network_alphas, | ||
text_encoder=self.text_encoder, | ||
prefix="text_encoder", | ||
lora_scale=self.lora_scale, | ||
adapter_name=adapter_name, | ||
_pipeline=self, | ||
low_cpu_mem_usage=low_cpu_mem_usage, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar philosophy as explained above.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
logger.info(f"Loading {cls.unet_name}.") | ||
unet.load_lora_adapter( | ||
state_dict, | ||
prefix=cls.unet_name, | ||
network_alphas=network_alphas, | ||
adapter_name=adapter_name, | ||
_pipeline=_pipeline, | ||
low_cpu_mem_usage=low_cpu_mem_usage, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case prefix
is not None and no prefix
matched state dict keys are found, we log from the load_lora_adapter()
method.
This way, we cover for both load_lora_weights()
which is pipeline-level and load_lora_adapter()
which is model-level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sayakpaul
Ran the integration tests, too. They are passing. Will merge once the CI is green. |
@sayakpaul This PR broke LoRA loading. It seems like after calling import os
import tempfile
import torch
from peft import LoraConfig
from peft.tuners.lora import LoraLayer
from diffusers import UNet2DConditionModel
from diffusers.utils.testing_utils import floats_tensor
torch_device = 0
class TestIt:
def get_small_unet(self):
# from diffusers UNet2DConditionModelTests
init_dict = {
"block_out_channels": (4, 8),
"norm_num_groups": 4,
"down_block_types": ("CrossAttnDownBlock2D", "DownBlock2D"),
"up_block_types": ("UpBlock2D", "CrossAttnUpBlock2D"),
"cross_attention_dim": 8,
"attention_head_dim": 2,
"out_channels": 4,
"in_channels": 4,
"layers_per_block": 1,
"sample_size": 16,
}
model = UNet2DConditionModel(**init_dict)
return model.to(torch_device)
def test_it(self):
lora_config = LoraConfig(
target_modules=["to_q"],
init_lora_weights=False,
)
unet = self.get_small_unet()
unet.add_adapter(lora_config, adapter_name="adapter0")
num_lora_before = sum(isinstance(m, LoraLayer) for m in unet.modules())
assert num_lora_before > 0
with tempfile.TemporaryDirectory() as tmp_dirname:
unet.save_lora_adapter(tmp_dirname, safe_serialization=True, adapter_name="adapter0")
del unet
torch.manual_seed(0)
unet = self.get_small_unet()
file_name = os.path.join(tmp_dirname, "pytorch_lora_weights.safetensors")
unet.load_lora_adapter(file_name, safe_serialization=True, adapter_name="adapter0")
num_lora_after = sum(isinstance(m, LoraLayer) for m in unet.modules())
# this fails, no LoRA layer is loaded
assert num_lora_after > 0 The previous commit, |
Use:
This is reflected in the tests: diffusers/tests/models/test_modeling_common.py Line 1094 in 733b44a
|
Thanks for the quick reply. Isn't this a backwards incompatible change on a public API? At least for me this was surprising, especially since there is no error message or warning. |
Can you check if you're hitting? diffusers/src/diffusers/loaders/peft.py Line 357 in 733b44a
If not, then there's something wrong that I can look into fixing. |
When I load without
|
Earlier you said
But you are indeed hitting the right message here. Would you suggest turning this into a warning? |
Since
but nothing happens, it would be useful to get a warning. Maybe there is a good reason not to have a warning because there are legitimate use cases, but it feels dangerous to me. As mentioned, everything appears to work on the surface, but the outputs are not what I expect, so this could lead to hard to debug silent errors in code that used to work. Also, I don't find the message to be very helpful, since the solution you provided is not mentioned. But perhaps in the bigger context of things, this is obvious? |
Co-authored-by: SunMarc <marc.sun@hotmail.fr> condition better. support mapping. improvements. [Quantization] Add Quanto backend (#10756) * update * updaet * update * update * update * update * update * update * update * update * update * update * Update docs/source/en/quantization/quanto.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * Update src/diffusers/quantizers/quanto/utils.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * update * update --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> [Single File] Add single file loading for SANA Transformer (#10947) * added support for from_single_file * added diffusers mapping script * added testcase * bug fix * updated tests * corrected code quality * corrected code quality --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> [LoRA] Improve warning messages when LoRA loading becomes a no-op (#10187) * updates * updates * updates * updates * notebooks revert * fix-copies. * seeing * fix * revert * fixes * fixes * fixes * remove print * fix * conflicts ii. * updates * fixes * better filtering of prefix. --------- Co-authored-by: hlky <hlky@hlky.ac> [LoRA] CogView4 (#10981) * update * make fix-copies * update [Tests] improve quantization tests by additionally measuring the inference memory savings (#11021) * memory usage tests * fixes * gguf [`Research Project`] Add AnyText: Multilingual Visual Text Generation And Editing (#8998) * Add initial template * Second template * feat: Add TextEmbeddingModule to AnyTextPipeline * feat: Add AuxiliaryLatentModule template to AnyTextPipeline * Add bert tokenizer from the anytext repo for now * feat: Update AnyTextPipeline's modify_prompt method This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe. * Fill in the `forward` pass of `AuxiliaryLatentModule` * `make style && make quality` * `chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library` * Update error handling to raise and logging * Add `create_glyph_lines` function into `TextEmbeddingModule` * make style * Up * Up * Up * Up * Remove several comments * refactor: Remove ControlNetConditioningEmbedding and update code accordingly * Up * Up * up * refactor: Update AnyTextPipeline to include new optional parameters * up * feat: Add OCR model and its components * chore: Update `TextEmbeddingModule` to include OCR model components and dependencies * chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task * `make style` * refactor: Update `AnyTextPipeline`'s docstring * Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once * simplify * `make style` * Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function * Simplify for now * `make style` * Up * feat: Add scripts to convert AnyText controlnet to diffusers * `make style` * Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule` * make style * Up * Simplify * Up * feat: Add safetensors module for loading model file * Fix device issues * Up * Up * refactor: Simplify * refactor: Simplify code for loading models and handling data types * `make style` * refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule * refactor: Update dtype in embedding_manager.py to match proj.weight * Up * Add attribution and adaptation information to pipeline_anytext.py * Update usage example * Will refactor `controlnet_cond_embedding` initialization * Add `AnyTextControlNetConditioningEmbedding` template * Refactor organization * style * style * Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding` * Follow one-file policy * style * [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel * [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py * [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py * Refactor AnyTextControlNet to use configurable conditioning embedding channels * Complete control net conditioning embedding in AnyTextControlNetModel * up * [FIX] Ensure embeddings use correct device in AnyTextControlNetModel * up * up * style * [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline * [UPDATE] Update example code in anytext.py to use correct font file and improve clarity * down * [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing * update pillow * [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity * [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file * [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency * 🆙 * style * [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py * style * Update examples/research_projects/anytext/README.md Co-authored-by: Aryan <contact.aryanvs@gmail.com> * Remove commented-out image preparation code in AnyTextPipeline * Remove unnecessary blank line in README.md [Quantization] Allow loading TorchAO serialized Tensor objects with torch>=2.6 (#11018) * update * update * update * update * update * update * update * update * update fix: mixture tiling sdxl pipeline - adjust gerating time_ids & embeddings (#11012) small fix on generating time_ids & embeddings [LoRA] support wan i2v loras from the world. (#11025) * support wan i2v loras from the world. * remove copied from. * upates * add lora. Fix SD3 IPAdapter feature extractor (#11027) chore: fix help messages in advanced diffusion examples (#10923) Fix missing **kwargs in lora_pipeline.py (#11011) * Update lora_pipeline.py * Apply style fixes * fix-copies --------- Co-authored-by: hlky <hlky@hlky.ac> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Fix for multi-GPU WAN inference (#10997) Ensure that hidden_state and shift/scale are on the same device when running with multiple GPUs Co-authored-by: Jimmy <39@🇺🇸.com> [Refactor] Clean up import utils boilerplate (#11026) * update * update * update Use `output_size` in `repeat_interleave` (#11030) [hybrid inference 🍯🐝] Add VAE encode (#11017) * [hybrid inference 🍯🐝] Add VAE encode * _toctree: add vae encode * Add endpoints, tests * vae_encode docs * vae encode benchmarks * api reference * changelog * Update docs/source/en/hybrid_inference/overview.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * update --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Wan Pipeline scaling fix, type hint warning, multi generator fix (#11007) * Wan Pipeline scaling fix, type hint warning, multi generator fix * Apply suggestions from code review [LoRA] change to warning from info when notifying the users about a LoRA no-op (#11044) * move to warning. * test related changes. Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline (#10827) * Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> making ```formatted_images``` initialization compact (#10801) compact writing Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Fix aclnnRepeatInterleaveIntWithDim error on NPU for get_1d_rotary_pos_embed (#10820) * get_1d_rotary_pos_embed support npu * Update src/diffusers/models/embeddings.py --------- Co-authored-by: Kai zheng <kaizheng@KaideMacBook-Pro.local> Co-authored-by: hlky <hlky@hlky.ac> Co-authored-by: YiYi Xu <yixu310@gmail.com> [Tests] restrict memory tests for quanto for certain schemes. (#11052) * restrict memory tests for quanto for certain schemes. * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * fixes * style --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> [LoRA] feat: support non-diffusers wan t2v loras. (#11059) feat: support non-diffusers wan t2v loras. [examples/controlnet/train_controlnet_sd3.py] Fixes #11050 - Cast prompt_embeds and pooled_prompt_embeds to weight_dtype to prevent dtype mismatch (#11051) Fix: dtype mismatch of prompt embeddings in sd3 controlnet training Co-authored-by: Andreas Jörg <andreasjoerg@MacBook-Pro-von-Andreas-2.fritz.box> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> reverts accidental change that removes attn_mask in attn. Improves fl… (#11065) reverts accidental change that removes attn_mask in attn. Improves flux ptxla by using flash block sizes. Moves encoding outside the for loop. Co-authored-by: Juan Acevedo <jfacevedo@google.com> Fix deterministic issue when getting pipeline dtype and device (#10696) Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> [Tests] add requires peft decorator. (#11037) * add requires peft decorator. * install peft conditionally. * conditional deps. Co-authored-by: DN6 <dhruv.nair@gmail.com> --------- Co-authored-by: DN6 <dhruv.nair@gmail.com> CogView4 Control Block (#10809) * cogview4 control training --------- Co-authored-by: OleehyO <leehy0357@gmail.com> Co-authored-by: yiyixuxu <yixu310@gmail.com> [CI] pin transformers version for benchmarking. (#11067) pin transformers version for benchmarking. updates Fix Wan I2V Quality (#11087) * fix_wan_i2v_quality * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update pipeline_wan_i2v.py --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: hlky <hlky@hlky.ac> LTX 0.9.5 (#10968) * update --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: hlky <hlky@hlky.ac> make PR GPU tests conditioned on styling. (#11099) Group offloading improvements (#11094) update Fix pipeline_flux_controlnet.py (#11095) * Fix pipeline_flux_controlnet.py * Fix style update readme instructions. (#11096) Co-authored-by: Juan Acevedo <jfacevedo@google.com> Resolve stride mismatch in UNet's ResNet to support Torch DDP (#11098) Modify UNet's ResNet implementation to resolve stride mismatch in Torch's DDP Fix Group offloading behaviour when using streams (#11097) * update * update Quality options in `export_to_video` (#11090) * Quality options in `export_to_video` * make style improve more. add placeholders for docstrings. formatting. smol fix. solidify validation and annotation
* [WIP][LoRA] Implement hot-swapping of LoRA This PR adds the possibility to hot-swap LoRA adapters. It is WIP. Description As of now, users can already load multiple LoRA adapters. They can offload existing adapters or they can unload them (i.e. delete them). However, they cannot "hotswap" adapters yet, i.e. substitute the weights from one LoRA adapter with the weights of another, without the need to create a separate LoRA adapter. Generally, hot-swapping may not appear not super useful but when the model is compiled, it is necessary to prevent recompilation. See huggingface#9279 for more context. Caveats To hot-swap a LoRA adapter for another, these two adapters should target exactly the same layers and the "hyper-parameters" of the two adapters should be identical. For instance, the LoRA alpha has to be the same: Given that we keep the alpha from the first adapter, the LoRA scaling would be incorrect for the second adapter otherwise. Theoretically, we could override the scaling dict with the alpha values derived from the second adapter's config, but changing the dict will trigger a guard for recompilation, defeating the main purpose of the feature. I also found that compilation flags can have an impact on whether this works or not. E.g. when passing "reduce-overhead", there will be errors of the type: > input name: arg861_1. data pointer changed from 139647332027392 to 139647331054592 I don't know enough about compilation to determine whether this is problematic or not. Current state This is obviously WIP right now to collect feedback and discuss which direction to take this. If this PR turns out to be useful, the hot-swapping functions will be added to PEFT itself and can be imported here (or there is a separate copy in diffusers to avoid the need for a min PEFT version to use this feature). Moreover, more tests need to be added to better cover this feature, although we don't necessarily need tests for the hot-swapping functionality itself, since those tests will be added to PEFT. Furthermore, as of now, this is only implemented for the unet. Other pipeline components have yet to implement this feature. Finally, it should be properly documented. I would like to collect feedback on the current state of the PR before putting more time into finalizing it. * Reviewer feedback * Reviewer feedback, adjust test * Fix, doc * Make fix * Fix for possible g++ error * Add test for recompilation w/o hotswapping * Make hotswap work Requires huggingface/peft#2366 More changes to make hotswapping work. Together with the mentioned PEFT PR, the tests pass for me locally. List of changes: - docstring for hotswap - remove code copied from PEFT, import from PEFT now - adjustments to PeftAdapterMixin.load_lora_adapter (unfortunately, some state dict renaming was necessary, LMK if there is a better solution) - adjustments to UNet2DConditionLoadersMixin._process_lora: LMK if this is even necessary or not, I'm unsure what the overall relationship is between this and PeftAdapterMixin.load_lora_adapter - also in UNet2DConditionLoadersMixin._process_lora, I saw that there is no LoRA unloading when loading the adapter fails, so I added it there (in line with what happens in PeftAdapterMixin.load_lora_adapter) - rewritten tests to avoid shelling out, make the test more precise by making sure that the outputs align, parametrize it - also checked the pipeline code mentioned in this comment: huggingface#9453 (comment); when running this inside the with torch._dynamo.config.patch(error_on_recompile=True) context, there is no error, so I think hotswapping is now working with pipelines. * Address reviewer feedback: - Revert deprecated method - Fix PEFT doc link to main - Don't use private function - Clarify magic numbers - Add pipeline test Moreover: - Extend docstrings - Extend existing test for outputs != 0 - Extend existing test for wrong adapter name * Change order of test decorators parameterized.expand seems to ignore skip decorators if added in last place (i.e. innermost decorator). * Split model and pipeline tests Also increase test coverage by also targeting conv2d layers (support of which was added recently on the PEFT PR). * Reviewer feedback: Move decorator to test classes ... instead of having them on each test method. * Apply suggestions from code review Co-authored-by: hlky <hlky@hlky.ac> * Reviewer feedback: version check, TODO comment * Add enable_lora_hotswap method * Reviewer feedback: check _lora_loadable_modules * Revert changes in unet.py * Add possibility to ignore enabled at wrong time * Fix docstrings * Log possible PEFT error, test * Raise helpful error if hotswap not supported I.e. for the text encoder * Formatting * More linter * More ruff * Doc-builder complaint * Update docstring: - mention no text encoder support yet - make it clear that LoRA is meant - mention that same adapter name should be passed * Fix error in docstring * Update more methods with hotswap argument - SDXL - SD3 - Flux No changes were made to load_lora_into_transformer. * Add hotswap argument to load_lora_into_transformer For SD3 and Flux. Use shorter docstring for brevity. * Extend docstrings * Add version guards to tests * Formatting * Fix LoRA loading call to add prefix=None See: huggingface#10187 (comment) * Run make fix-copies * Add hot swap documentation to the docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Benjamin Bossan <benjamin.bossan@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com> Co-authored-by: hlky <hlky@hlky.ac> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* [WIP][LoRA] Implement hot-swapping of LoRA This PR adds the possibility to hot-swap LoRA adapters. It is WIP. Description As of now, users can already load multiple LoRA adapters. They can offload existing adapters or they can unload them (i.e. delete them). However, they cannot "hotswap" adapters yet, i.e. substitute the weights from one LoRA adapter with the weights of another, without the need to create a separate LoRA adapter. Generally, hot-swapping may not appear not super useful but when the model is compiled, it is necessary to prevent recompilation. See #9279 for more context. Caveats To hot-swap a LoRA adapter for another, these two adapters should target exactly the same layers and the "hyper-parameters" of the two adapters should be identical. For instance, the LoRA alpha has to be the same: Given that we keep the alpha from the first adapter, the LoRA scaling would be incorrect for the second adapter otherwise. Theoretically, we could override the scaling dict with the alpha values derived from the second adapter's config, but changing the dict will trigger a guard for recompilation, defeating the main purpose of the feature. I also found that compilation flags can have an impact on whether this works or not. E.g. when passing "reduce-overhead", there will be errors of the type: > input name: arg861_1. data pointer changed from 139647332027392 to 139647331054592 I don't know enough about compilation to determine whether this is problematic or not. Current state This is obviously WIP right now to collect feedback and discuss which direction to take this. If this PR turns out to be useful, the hot-swapping functions will be added to PEFT itself and can be imported here (or there is a separate copy in diffusers to avoid the need for a min PEFT version to use this feature). Moreover, more tests need to be added to better cover this feature, although we don't necessarily need tests for the hot-swapping functionality itself, since those tests will be added to PEFT. Furthermore, as of now, this is only implemented for the unet. Other pipeline components have yet to implement this feature. Finally, it should be properly documented. I would like to collect feedback on the current state of the PR before putting more time into finalizing it. * Reviewer feedback * Reviewer feedback, adjust test * Fix, doc * Make fix * Fix for possible g++ error * Add test for recompilation w/o hotswapping * Make hotswap work Requires huggingface/peft#2366 More changes to make hotswapping work. Together with the mentioned PEFT PR, the tests pass for me locally. List of changes: - docstring for hotswap - remove code copied from PEFT, import from PEFT now - adjustments to PeftAdapterMixin.load_lora_adapter (unfortunately, some state dict renaming was necessary, LMK if there is a better solution) - adjustments to UNet2DConditionLoadersMixin._process_lora: LMK if this is even necessary or not, I'm unsure what the overall relationship is between this and PeftAdapterMixin.load_lora_adapter - also in UNet2DConditionLoadersMixin._process_lora, I saw that there is no LoRA unloading when loading the adapter fails, so I added it there (in line with what happens in PeftAdapterMixin.load_lora_adapter) - rewritten tests to avoid shelling out, make the test more precise by making sure that the outputs align, parametrize it - also checked the pipeline code mentioned in this comment: #9453 (comment); when running this inside the with torch._dynamo.config.patch(error_on_recompile=True) context, there is no error, so I think hotswapping is now working with pipelines. * Address reviewer feedback: - Revert deprecated method - Fix PEFT doc link to main - Don't use private function - Clarify magic numbers - Add pipeline test Moreover: - Extend docstrings - Extend existing test for outputs != 0 - Extend existing test for wrong adapter name * Change order of test decorators parameterized.expand seems to ignore skip decorators if added in last place (i.e. innermost decorator). * Split model and pipeline tests Also increase test coverage by also targeting conv2d layers (support of which was added recently on the PEFT PR). * Reviewer feedback: Move decorator to test classes ... instead of having them on each test method. * Apply suggestions from code review Co-authored-by: hlky <hlky@hlky.ac> * Reviewer feedback: version check, TODO comment * Add enable_lora_hotswap method * Reviewer feedback: check _lora_loadable_modules * Revert changes in unet.py * Add possibility to ignore enabled at wrong time * Fix docstrings * Log possible PEFT error, test * Raise helpful error if hotswap not supported I.e. for the text encoder * Formatting * More linter * More ruff * Doc-builder complaint * Update docstring: - mention no text encoder support yet - make it clear that LoRA is meant - mention that same adapter name should be passed * Fix error in docstring * Update more methods with hotswap argument - SDXL - SD3 - Flux No changes were made to load_lora_into_transformer. * Add hotswap argument to load_lora_into_transformer For SD3 and Flux. Use shorter docstring for brevity. * Extend docstrings * Add version guards to tests * Formatting * Fix LoRA loading call to add prefix=None See: #10187 (comment) * Run make fix-copies * Add hot swap documentation to the docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: hlky <hlky@hlky.ac> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* feat: pipeline-level quant config. Co-authored-by: SunMarc <marc.sun@hotmail.fr> condition better. support mapping. improvements. [Quantization] Add Quanto backend (#10756) * update * updaet * update * update * update * update * update * update * update * update * update * update * Update docs/source/en/quantization/quanto.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * Update src/diffusers/quantizers/quanto/utils.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * update * update --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> [Single File] Add single file loading for SANA Transformer (#10947) * added support for from_single_file * added diffusers mapping script * added testcase * bug fix * updated tests * corrected code quality * corrected code quality --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> [LoRA] Improve warning messages when LoRA loading becomes a no-op (#10187) * updates * updates * updates * updates * notebooks revert * fix-copies. * seeing * fix * revert * fixes * fixes * fixes * remove print * fix * conflicts ii. * updates * fixes * better filtering of prefix. --------- Co-authored-by: hlky <hlky@hlky.ac> [LoRA] CogView4 (#10981) * update * make fix-copies * update [Tests] improve quantization tests by additionally measuring the inference memory savings (#11021) * memory usage tests * fixes * gguf [`Research Project`] Add AnyText: Multilingual Visual Text Generation And Editing (#8998) * Add initial template * Second template * feat: Add TextEmbeddingModule to AnyTextPipeline * feat: Add AuxiliaryLatentModule template to AnyTextPipeline * Add bert tokenizer from the anytext repo for now * feat: Update AnyTextPipeline's modify_prompt method This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe. * Fill in the `forward` pass of `AuxiliaryLatentModule` * `make style && make quality` * `chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library` * Update error handling to raise and logging * Add `create_glyph_lines` function into `TextEmbeddingModule` * make style * Up * Up * Up * Up * Remove several comments * refactor: Remove ControlNetConditioningEmbedding and update code accordingly * Up * Up * up * refactor: Update AnyTextPipeline to include new optional parameters * up * feat: Add OCR model and its components * chore: Update `TextEmbeddingModule` to include OCR model components and dependencies * chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task * `make style` * refactor: Update `AnyTextPipeline`'s docstring * Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once * simplify * `make style` * Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function * Simplify for now * `make style` * Up * feat: Add scripts to convert AnyText controlnet to diffusers * `make style` * Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule` * make style * Up * Simplify * Up * feat: Add safetensors module for loading model file * Fix device issues * Up * Up * refactor: Simplify * refactor: Simplify code for loading models and handling data types * `make style` * refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule * refactor: Update dtype in embedding_manager.py to match proj.weight * Up * Add attribution and adaptation information to pipeline_anytext.py * Update usage example * Will refactor `controlnet_cond_embedding` initialization * Add `AnyTextControlNetConditioningEmbedding` template * Refactor organization * style * style * Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding` * Follow one-file policy * style * [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel * [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py * [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py * Refactor AnyTextControlNet to use configurable conditioning embedding channels * Complete control net conditioning embedding in AnyTextControlNetModel * up * [FIX] Ensure embeddings use correct device in AnyTextControlNetModel * up * up * style * [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline * [UPDATE] Update example code in anytext.py to use correct font file and improve clarity * down * [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing * update pillow * [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity * [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file * [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency * 🆙 * style * [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py * style * Update examples/research_projects/anytext/README.md Co-authored-by: Aryan <contact.aryanvs@gmail.com> * Remove commented-out image preparation code in AnyTextPipeline * Remove unnecessary blank line in README.md [Quantization] Allow loading TorchAO serialized Tensor objects with torch>=2.6 (#11018) * update * update * update * update * update * update * update * update * update fix: mixture tiling sdxl pipeline - adjust gerating time_ids & embeddings (#11012) small fix on generating time_ids & embeddings [LoRA] support wan i2v loras from the world. (#11025) * support wan i2v loras from the world. * remove copied from. * upates * add lora. Fix SD3 IPAdapter feature extractor (#11027) chore: fix help messages in advanced diffusion examples (#10923) Fix missing **kwargs in lora_pipeline.py (#11011) * Update lora_pipeline.py * Apply style fixes * fix-copies --------- Co-authored-by: hlky <hlky@hlky.ac> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Fix for multi-GPU WAN inference (#10997) Ensure that hidden_state and shift/scale are on the same device when running with multiple GPUs Co-authored-by: Jimmy <39@🇺🇸.com> [Refactor] Clean up import utils boilerplate (#11026) * update * update * update Use `output_size` in `repeat_interleave` (#11030) [hybrid inference 🍯🐝] Add VAE encode (#11017) * [hybrid inference 🍯🐝] Add VAE encode * _toctree: add vae encode * Add endpoints, tests * vae_encode docs * vae encode benchmarks * api reference * changelog * Update docs/source/en/hybrid_inference/overview.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * update --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Wan Pipeline scaling fix, type hint warning, multi generator fix (#11007) * Wan Pipeline scaling fix, type hint warning, multi generator fix * Apply suggestions from code review [LoRA] change to warning from info when notifying the users about a LoRA no-op (#11044) * move to warning. * test related changes. Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline (#10827) * Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> making ```formatted_images``` initialization compact (#10801) compact writing Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Fix aclnnRepeatInterleaveIntWithDim error on NPU for get_1d_rotary_pos_embed (#10820) * get_1d_rotary_pos_embed support npu * Update src/diffusers/models/embeddings.py --------- Co-authored-by: Kai zheng <kaizheng@KaideMacBook-Pro.local> Co-authored-by: hlky <hlky@hlky.ac> Co-authored-by: YiYi Xu <yixu310@gmail.com> [Tests] restrict memory tests for quanto for certain schemes. (#11052) * restrict memory tests for quanto for certain schemes. * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * fixes * style --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> [LoRA] feat: support non-diffusers wan t2v loras. (#11059) feat: support non-diffusers wan t2v loras. [examples/controlnet/train_controlnet_sd3.py] Fixes #11050 - Cast prompt_embeds and pooled_prompt_embeds to weight_dtype to prevent dtype mismatch (#11051) Fix: dtype mismatch of prompt embeddings in sd3 controlnet training Co-authored-by: Andreas Jörg <andreasjoerg@MacBook-Pro-von-Andreas-2.fritz.box> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> reverts accidental change that removes attn_mask in attn. Improves fl… (#11065) reverts accidental change that removes attn_mask in attn. Improves flux ptxla by using flash block sizes. Moves encoding outside the for loop. Co-authored-by: Juan Acevedo <jfacevedo@google.com> Fix deterministic issue when getting pipeline dtype and device (#10696) Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> [Tests] add requires peft decorator. (#11037) * add requires peft decorator. * install peft conditionally. * conditional deps. Co-authored-by: DN6 <dhruv.nair@gmail.com> --------- Co-authored-by: DN6 <dhruv.nair@gmail.com> CogView4 Control Block (#10809) * cogview4 control training --------- Co-authored-by: OleehyO <leehy0357@gmail.com> Co-authored-by: yiyixuxu <yixu310@gmail.com> [CI] pin transformers version for benchmarking. (#11067) pin transformers version for benchmarking. updates Fix Wan I2V Quality (#11087) * fix_wan_i2v_quality * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update pipeline_wan_i2v.py --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: hlky <hlky@hlky.ac> LTX 0.9.5 (#10968) * update --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: hlky <hlky@hlky.ac> make PR GPU tests conditioned on styling. (#11099) Group offloading improvements (#11094) update Fix pipeline_flux_controlnet.py (#11095) * Fix pipeline_flux_controlnet.py * Fix style update readme instructions. (#11096) Co-authored-by: Juan Acevedo <jfacevedo@google.com> Resolve stride mismatch in UNet's ResNet to support Torch DDP (#11098) Modify UNet's ResNet implementation to resolve stride mismatch in Torch's DDP Fix Group offloading behaviour when using streams (#11097) * update * update Quality options in `export_to_video` (#11090) * Quality options in `export_to_video` * make style improve more. add placeholders for docstrings. formatting. smol fix. solidify validation and annotation * Revert "feat: pipeline-level quant config." This reverts commit 316ff46. * feat: implement pipeline-level quantization config Co-authored-by: SunMarc <marc@huggingface.co> * update * fixes * fix validation. * add tests and other improvements. * add tests * import quality * remove prints. * add docs. * fixes to docs. * doc fixes. * doc fixes. * add validation to the input quantization_config. * clarify recommendations. * docs * add to ci. * todo. --------- Co-authored-by: SunMarc <marc@huggingface.co>
What does this PR do?
As discussed with Dhruv, this PR improves our logging for LoRAs when the specified state dict isn't an appropriate one, leaving the
load_lora_weights()
,load_lora_into_text_encoder()
, andload_lora_adapter()
methods essentially a no-op.#9950 mentions this problem, and I think it's better to at least let the user know that the specified LoRA state dict is either not a correct one (which we already do) or is an ineffective one (this PR).
This PR also standardizes a bit how we log these kinds of info across our LoRA loading utilities.
Some comments are in-line.