- It's still in progress ...
- For mainstream models such as chatglm, qwen2, and gpt2, we implement the pre-train + finetune pipeline on top of the original model-architecture files (using only 2000 Question-Answering samples); a finetuning sketch follows the model list below.
- 90% of the model code is copied from the huggingface transformers library, with some modifications and plenty of Chinese comments to help you understand the model architecture.
- baichuan
- chatglm
- chatglm2
- chatglm3
- chatglm4
- dbrx
- gemma
- grok1
- llama2
- mixtral
- moss
- qwen
- qwen2
- gpt2
- t5
- llava
- ViT
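To make the pre-train + finetune flow above concrete, here is a minimal sketch of the supervised finetuning step with the huggingface `transformers` Trainer, assuming a local JSON file of `{"question": ..., "answer": ...}` records; the file name `qa_samples.json`, the base model, and all hyperparameters are illustrative assumptions, not the repo's actual settings:

```python
# Minimal SFT sketch; file name, model choice, and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "Qwen/Qwen2-0.5B"  # assumption: any small causal LM works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assume ~2000 QA records stored as JSON: {"question": ..., "answer": ...}
raw = load_dataset("json", data_files="qa_samples.json", split="train")

def tokenize(example):
    text = f"Question: {example['question']}\nAnswer: {example['answer']}"
    out = tokenizer(text, truncation=True, max_length=512, padding="max_length")
    out["labels"] = out["input_ids"].copy()  # causal LM: predict the next token
    return out

train_ds = raw.map(tokenize, remove_columns=raw.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft_out",
                           per_device_train_batch_size=2,
                           num_train_epochs=1,
                           logging_steps=10),
    train_dataset=train_ds,
)
trainer.train()
```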
- Here we take the `qwen2` folder structure as an example; all the other models are organized very similarly:
```
qwen2
├── README.md
├── configs
│   ├── config.py
│   └── ds_config.json
├── models
│   ├── __init__.py
│   ├── modeling_qwen2.py
│   ├── tokenization_qwen2.py
│   └── configuration_qwen2.py
├── finetune
│   ├── __init__.py
│   └── sft_trainer.py
├── pretrain
│   ├── __init__.py
│   └── pretrain.py
├── evaluation
│   ├── __init__.py
│   └── evaluate.py
├── utils
├── main.py
├── cli_demo.py
└── web_demo.py
```
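Because the modeling files mirror the upstream huggingface layout, you can import them straight from the local `models` package to experiment. A sketch, under the assumption that the class names match the upstream `Qwen2Config` / `Qwen2ForCausalLM` and that you run it from inside the `qwen2` folder:

```python
# Hedged sketch: instantiate the locally reimplemented Qwen2 model with a tiny
# config so it is cheap to build and step through in a debugger.
from models.configuration_qwen2 import Qwen2Config
from models.modeling_qwen2 import Qwen2ForCausalLM

config = Qwen2Config(hidden_size=256, num_hidden_layers=2,
                     num_attention_heads=4, num_key_value_heads=2,
                     intermediate_size=512, vocab_size=151936)
model = Qwen2ForCausalLM(config)  # randomly initialized; fine for reading code
print(sum(p.numel() for p in model.parameters()), "parameters")
```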
- All training-process snapshots and training results are stored in the `运行结果截图` ("run result screenshots") folder.
- `ChatGLM2` only supports `transformers` version 4.41.2; please downgrade with `pip install transformers==4.41.2`.
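A small guard like this (an illustrative snippet, not part of the repo) fails fast when the wrong version is installed:

```python
# Fail fast on an incompatible transformers version (illustrative snippet).
import transformers

REQUIRED = "4.41.2"
if transformers.__version__ != REQUIRED:
    raise RuntimeError(
        f"ChatGLM2 requires transformers=={REQUIRED}, "
        f"found {transformers.__version__}; "
        f"run: pip install transformers=={REQUIRED}"
    )
```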
- `ChatGLM2` is already finished, with comprehensive explanations and comments added to the files.
- The `llama` modeling file is finished and commented, but you cannot run it yet because some APIs in `llama` are too old; we want to focus on the latest models like `qwen`, which rely heavily on the `transformers` library.
- `qwen` is finished with the following content:
  - The `qwen` modeling file is finished and commented.
  - The `qwen` finetuning file `finetune_standard.py` is finished and commented.
  - The `qwen` pretraining file `pretrain.py` is finished and commented.
  - The `qwen` GPTQ quantization file `run_gptq.py` is finished and commented.
  - The `qwen` vLLM implementation file `vllm_wrapper.py` is finished and commented.
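For orientation, the core of a vLLM wrapper like `vllm_wrapper.py` typically reduces to a few calls of vLLM's public API; a sketch (the model path and sampling settings are assumptions):

```python
# Hedged sketch of serving Qwen with vLLM's public API (settings are assumptions).
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen-7B-Chat", trust_remote_code=True)
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=256)
outputs = llm.generate(["What does attention do in a transformer?"], params)
for out in outputs:
    print(out.outputs[0].text)
```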
- If we have time, a tutorial on Beam Search may be added.
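In the meantime, beam search is already available through `transformers`' `generate`; a minimal sketch (the gpt2 checkpoint is just a convenient small example):

```python
# Minimal beam-search sketch via transformers.generate.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The key idea of beam search is", return_tensors="pt")
out = model.generate(**inputs, num_beams=4, num_return_sequences=2,
                     max_new_tokens=40, early_stopping=True,
                     pad_token_id=tok.eos_token_id)
for seq in out:
    print(tok.decode(seq, skip_special_tokens=True))
```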
- First, set up an instance on the AutoDL Cloud Platform.
- Then, make sure to pre-download the model weights (e.g. ChatGLM2-6B from huggingface) to local storage (e.g. `/root/autodl-tmp/models/chatglm2`); a download sketch follows.
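One way to pre-download is via `huggingface_hub` (a sketch; the repo id and target directory mirror the example above):

```python
# Download the ChatGLM2-6B weights into local storage before training/inference.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="THUDM/chatglm2-6b",
                  local_dir="/root/autodl-tmp/models/chatglm2")
```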
- Finally, run:

```bash
cd chatglm/models/chatglm2
python main.py
```
- Please check `model_compare.xml`.