The fine-tuned model cannot output the special tokens #6345
Comments
> Add the parameter
I added it that way as well, but it had no effect. My config:

```yaml
### model
model_name_or_path: /home/huggingface_down/Qwen/Qwen2.5-7B-Instruct
#deepspeed: /home/MoE_LLM/LLaMA-Factory/examples/deepspeed/ds_z2_config.json
flash_attn: auto

### method
stage: sft

### dataset
dataset: sft_MetaMathQA-balanced-20k-cleaned

### output
output_dir: /home/MoE_LLM/saves_SFT/qwen2-7b/lora3_MetaMathQA-20k_true/

### train
per_device_train_batch_size: 1

### eval
val_size: 0.02
eval_strategy: steps
```
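Before looking at the serving side, it may be worth checking whether the tokenizer saved in the `output_dir` above actually contains the added tokens. A minimal sketch, assuming that path and hypothetical token strings `<label_pos>` / `<label_neg>` (substitute the tokens you actually added):

```python
# Sanity check: does the tokenizer saved with the fine-tuned model know the
# added special tokens? The token strings below are placeholders.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "/home/MoE_LLM/saves_SFT/qwen2-7b/lora3_MetaMathQA-20k_true/"
)

print("additional_special_tokens:", tokenizer.additional_special_tokens)

for tok in ["<label_pos>", "<label_neg>"]:
    tok_id = tokenizer.convert_tokens_to_ids(tok)
    # An unknown token maps to None (when there is no unk token) or to the
    # unk id, which means it was never actually added during fine-tuning.
    print(f"{tok} -> {tok_id}")
```

If the tokens are missing or map to no id here, the problem is on the training/saving side rather than in the API server.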
Task: use a large language model for simple binary text classification.
Add the special_tokens and fine-tune Qwen with the following command:
CUDA_VISIBLE_DEVICES=0,1 llamafactory-cli train examples/train_lora/qwen2_special_token.yaml
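For reference, this is roughly what adding a special token has to accomplish during fine-tuning, sketched with plain transformers calls rather than LLaMA-Factory's own options; the token strings are placeholders, not from the issue:

```python
# Sketch of the two steps a new special token requires: register it with the
# tokenizer, then grow the embedding matrix so the new id has trainable rows.
# If either step is skipped, the model cannot learn to emit the token.
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "/home/huggingface_down/Qwen/Qwen2.5-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto")

num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<label_pos>", "<label_neg>"]}  # placeholders
)
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))
```

Note that with LoRA the base embedding matrix is frozen by default, so newly added token embeddings are only learned if the embedding and lm_head layers are also made trainable (or included in the modules saved alongside the adapter).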
Start the API service with CUDA_VISIBLE_DEVICES=0 API_PORT=8000 llamafactory-cli api examples/inference/qwen2.yaml; the file contents are as follows:
Problem:
After starting the service, requests to the fine-tuned model never yield the newly added special tokens. Is this caused by incorrect training parameters, or by the way the service is launched?
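One way to narrow this down (a sketch under assumptions, not from the thread): run generation locally on the fine-tuned weights and decode the same output with and without skip_special_tokens. If the special token shows up only in the raw decode, the model is generating it and the serving path is stripping it while decoding; in that case check whether your LLaMA-Factory version exposes a skip_special_tokens option for inference.

```python
# Distinguish "the model never generates the token" from "the token is stripped
# when the output is decoded". The path assumes the LoRA weights were merged
# into a standalone model directory; the prompt is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "/home/MoE_LLM/saves_SFT/qwen2-7b/lora3_MetaMathQA-20k_true/"
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Classify: the product arrived broken."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=32)
generated = output[0, input_ids.shape[-1]:]

# With skip_special_tokens=True (a common default in serving code) any
# generated special token is silently removed from the returned text.
print("raw     :", tokenizer.decode(generated, skip_special_tokens=False))
print("stripped:", tokenizer.decode(generated, skip_special_tokens=True))
```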