The fine-tuned model cannot output the special tokens #6345
Comments
> Add the parameter
I added it that way as well, but it had no effect. My config:

```yaml
### model
model_name_or_path: /home/huggingface_down/Qwen/Qwen2.5-7B-Instruct
#deepspeed: /home/MoE_LLM/LLaMA-Factory/examples/deepspeed/ds_z2_config.json
flash_attn: auto

### method
stage: sft

### dataset
dataset: sft_MetaMathQA-balanced-20k-cleaned

### output
output_dir: /home/MoE_LLM/saves_SFT/qwen2-7b/lora3_MetaMathQA-20k_true/

### train
per_device_train_batch_size: 1

### eval
val_size: 0.02
eval_strategy: steps
```
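Before looking at the serving side, it may be worth checking whether the tokenizer saved in the `output_dir` above actually contains the added tokens. A minimal sketch, assuming that path and hypothetical token strings `<label_pos>` / `<label_neg>` (substitute the tokens you actually added):

```python
# Sanity check: does the tokenizer saved with the fine-tuned model know the
# added special tokens? The token strings below are placeholders.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "/home/MoE_LLM/saves_SFT/qwen2-7b/lora3_MetaMathQA-20k_true/"
)

print("additional_special_tokens:", tokenizer.additional_special_tokens)

for tok in ["<label_pos>", "<label_neg>"]:
    tok_id = tokenizer.convert_tokens_to_ids(tok)
    # An unknown token maps to None (when there is no unk token) or to the
    # unk id, which means it was never actually added during fine-tuning.
    print(f"{tok} -> {tok_id}")
```

If the tokens are missing or map to no id here, the problem is on the training/saving side rather than in the API server.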
Task: use a large language model for simple binary text classification.
Add the special_tokens and fine-tune Qwen with the following command:
CUDA_VISIBLE_DEVICES=0,1 llamafactory-cli train examples/train_lora/qwen2_special_token.yaml
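For reference, this is roughly what adding a special token has to accomplish during fine-tuning, sketched with plain transformers calls rather than LLaMA-Factory's own options; the token strings are placeholders, not from the issue:

```python
# Sketch of the two steps a new special token requires: register it with the
# tokenizer, then grow the embedding matrix so the new id has trainable rows.
# If either step is skipped, the model cannot learn to emit the token.
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "/home/huggingface_down/Qwen/Qwen2.5-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto")

num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<label_pos>", "<label_neg>"]}  # placeholders
)
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))
```

Note that with LoRA the base embedding matrix is frozen by default, so newly added token embeddings are only learned if the embedding and lm_head layers are also made trainable (or included in the modules saved alongside the adapter).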
Start the API service with CUDA_VISIBLE_DEVICES=0 API_PORT=8000 llamafactory-cli api examples/inference/qwen2.yaml; the file contents are as follows:
Problem:
After starting the service, requests to the fine-tuned model never yield the newly added special tokens. Is this caused by incorrect training parameters, or by the way the service is launched?
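One way to narrow this down (a sketch under assumptions, not from the thread): run generation locally on the fine-tuned weights and decode the same output with and without skip_special_tokens. If the special token shows up only in the raw decode, the model is generating it and the serving path is stripping it while decoding; in that case check whether your LLaMA-Factory version exposes a skip_special_tokens option for inference.

```python
# Distinguish "the model never generates the token" from "the token is stripped
# when the output is decoded". The path assumes the LoRA weights were merged
# into a standalone model directory; the prompt is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "/home/MoE_LLM/saves_SFT/qwen2-7b/lora3_MetaMathQA-20k_true/"
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Classify: the product arrived broken."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=32)
generated = output[0, input_ids.shape[-1]:]

# With skip_special_tokens=True (a common default in serving code) any
# generated special token is silently removed from the returned text.
print("raw     :", tokenizer.decode(generated, skip_special_tokens=False))
print("stripped:", tokenizer.decode(generated, skip_special_tokens=True))
```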