Error when enabling SwanLab #7173

Closed
1 task done
SunflowerColor opened this issue Mar 5, 2025 · 2 comments · Fixed by #7176
Labels
solved This problem has been already solved

Comments

@SunflowerColor

Reminder

  • I have read the above rules and searched the existing issues.

System Info

When I tick the SwanLab option for training, an error occurs. What could be the cause?
The system environment is as follows:
  • `llamafactory` version: 0.9.2.dev0
  • Platform: Windows-11-10.0.26100-SP0
  • Python version: 3.12.8
  • PyTorch version: 2.6.0+cu126 (GPU)
  • Transformers version: 4.49.0
  • Datasets version: 3.2.0
  • Accelerate version: 1.2.1
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA GeForce RTX 4070 Laptop GPU
  • GPU number: 1
  • GPU memory: 8.00GB
  • Bitsandbytes version: 0.45.3
  • vLLM version: 0.7.3

Reproduction

O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 8
\        /    Total batch size = 16 | Total steps = 60
 "-____-"     Number of trainable parameters = 9,232,384
swanlab: swanlab version 0.4.11 is available!  Upgrade: `pip install -U swanlab`
swanlab: Tracking run with swanlab version 0.4.10
swanlab: Run data will be saved locally in D:\Code\myllamafactory\swanlog\run-20250305_225609-a3b1799d
swanlab: 👋 Hi SunflowerColor, welcome to swanlab!
swanlab: Syncing run tiger-4 to the cloud
swanlab: 🌟 Run `swanlab watch D:\Code\myllamafactory\swanlog` to view SwanLab Experiment Dashboard locally
swanlab: 🏠 View project at https://swanlab.cn/@SunflowerColor/SafeKnowlege
swanlab: 🚀 View run at https://swanlab.cn/@SunflowerColor/SafeKnowlege/runs/zdjl66d3kmptlxo5i72gn
swanlab: Error happened while training
swanlab: 🌟 Run `swanlab watch D:\Code\myllamafactory\swanlog` to view SwanLab Experiment Dashboard locally
swanlab: 🏠 View project at https://swanlab.cn/@SunflowerColor/SafeKnowlege
swanlab: 🚀 View run at https://swanlab.cn/@SunflowerColor/SafeKnowlege/runs/zdjl66d3kmptlxo5i72gn
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\40801\AppData\Local\Programs\Python\Python312\Scripts\llamafactory-cli.exe\__main__.py", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "D:\Code\fromgithub\LLaMA-Factory\src\llamafactory\cli.py", line 112, in main
    run_exp()
  File "D:\Code\fromgithub\LLaMA-Factory\src\llamafactory\train\tuner.py", line 93, in run_exp
    _training_function(config={"args": args, "callbacks": callbacks})
  File "D:\Code\fromgithub\LLaMA-Factory\src\llamafactory\train\tuner.py", line 67, in _training_function
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "D:\Code\fromgithub\LLaMA-Factory\src\llamafactory\train\sft\workflow.py", line 102, in run_sft
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\40801\AppData\Local\Programs\Python\Python312\Lib\site-packages\transformers\trainer.py", line 2241, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 213, in _fast_inner_training_loop
  File "C:\Users\40801\AppData\Local\Programs\Python\Python312\Lib\site-packages\transformers\trainer_callback.py", line 507, in on_train_begin
    return self.call_event("on_train_begin", args, state, control)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\40801\AppData\Local\Programs\Python\Python312\Lib\site-packages\transformers\trainer_callback.py", line 557, in call_event
    result = getattr(callback, event)(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\40801\AppData\Local\Programs\Python\Python312\Lib\site-packages\swanlab\integration\transformers.py", line 215, in on_train_begin
    self.setup(args, state, model, **kwargs)
  File "D:\Code\fromgithub\LLaMA-Factory\src\llamafactory\train\trainer_utils.py", line 603, in setup
    swanlab_public_config = self._experiment.get_run().public.json()
                            ^^^^^^^^^^^^^^^^
'SwanLabCallbackExtension' object has no attribute '_experiment'
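The last frame shows a subclass method reading `self._experiment`, an attribute that the installed base callback evidently does not set. The toy sketch below reproduces that failure pattern in isolation. Every class and function name in it is hypothetical and is not taken from swanlab or LLaMA-Factory; it only illustrates how code that relies on a private attribute breaks when a dependency changes its internals, and how a guarded lookup can at least produce a clearer message.

```python
class OldStyleCallback:
    """Hypothetical base callback that keeps its run handle in `_experiment`."""

    def __init__(self):
        self._experiment = {"project": "demo"}


class NewStyleCallback:
    """Hypothetical later release that no longer sets `_experiment`."""

    def __init__(self):
        self.run_info = {"project": "demo"}


def read_private_experiment(callback):
    """Read the private attribute, failing with an explicit hint if it is gone."""
    experiment = getattr(callback, "_experiment", None)
    if experiment is None:
        raise RuntimeError(
            f"{type(callback).__name__} has no '_experiment'; "
            "the installed base library may have changed its internals"
        )
    return experiment


if __name__ == "__main__":
    # Works against the old layout.
    print(read_private_experiment(OldStyleCallback()))
    # Direct attribute access on the new layout raises AttributeError,
    # the same kind of failure as in the traceback above.
    try:
        NewStyleCallback()._experiment
    except AttributeError as exc:
        print("AttributeError:", exc)
```

A guard like this does not fix the incompatibility; it only makes the cause easier to spot than a bare attribute error deep inside a training loop.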

Others

No response

@SunflowerColor SunflowerColor added bug Something isn't working pending This problem is yet to be addressed labels Mar 5, 2025
@hiyouga
Owner

hiyouga commented Mar 5, 2025

@Zeyi-Lin Do you have any thoughts?

@Zeyi-Lin
Contributor

Zeyi-Lin commented Mar 5, 2025

This is caused by changes to SwanLabCallback in swanlab 0.4.10. As a workaround, you can downgrade to version 0.4.9 for now:

pip install swanlab==0.4.9

I will also submit a compatibility PR to LLaMA Factory soon.
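To check whether a given environment needs the downgrade, a small version check along these lines may help. It is only a sketch: the helper names are hypothetical, and the assumption that every release from 0.4.10 onward is affected (until a fix lands on the LLaMA Factory side) is an inference from the comment above, not something confirmed in this thread. Versions are compared as integer tuples because a plain string comparison would rank "0.4.9" above "0.4.10".

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.4.10' into (0, 4, 10), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_affected(version: str, first_bad: str = "0.4.10") -> bool:
    """Assumption: releases at or above `first_bad` show the error."""
    return parse_version(version) >= parse_version(first_bad)


def installed_swanlab_version() -> Optional[str]:
    """Return the installed swanlab version, or None if it is not installed."""
    try:
        return metadata.version("swanlab")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_swanlab_version()
    if found is None:
        print("swanlab is not installed")
    elif is_affected(found):
        print(f"swanlab {found} may be affected; try: pip install swanlab==0.4.9")
    else:
        print(f"swanlab {found} predates the change")
```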

@Zeyi-Lin Zeyi-Lin mentioned this issue Mar 5, 2025
2 tasks
@hiyouga hiyouga added solved This problem has been already solved and removed bug Something isn't working pending This problem is yet to be addressed labels Mar 5, 2025