Qwen2VL using nearest neighbour downsampling by default #7125

Closed
1 task done
selflein opened this issue Mar 1, 2025 · 3 comments · Fixed by #7143 or #7181
Labels
solved This problem has been already solved

Comments


selflein commented Mar 1, 2025

Reminder

  • I have read the above rules and searched the existing issues.

System Info

- `llamafactory` version: 0.9.2.dev0
- Platform: Linux-4.15.0-213-generic-x86_64-with-glibc2.27
- Python version: 3.11.9
- PyTorch version: 2.5.1
- Transformers version: 4.49.0
- Datasets version: 3.2.0
- Accelerate version: 0.34.0
- PEFT version: 0.12.0
- TRL version: 0.9.6
- DeepSpeed version: 0.15.4
- vLLM version: 0.7.3.dev0+g0408efc6.d20250208

Reproduction

- Train a Qwen2VL model with any config on a dataset that contains images larger than `image_resolution`
- Images are resized using nearest neighbor interpolation
  • here for Qwen2 specifically
  • and in the base class here

Others

Nearest neighbor downsampling causes a large loss in image quality (thin lines vanish, see the attached image). Qwen2VL generally uses BICUBIC downsampling instead to circumvent this problem.

Image

More generally, it would be great if one could configure `image_resolution` directly and rely on the transformers preprocessor, which is called afterwards anyway.

selflein added the bug (Something isn't working) and pending (This problem is yet to be addressed) labels Mar 1, 2025
Owner

hiyouga commented Mar 3, 2025

We have fixed it in #7143, thank you for reporting.

hiyouga added the solved (This problem has been already solved) label and removed the bug (Something isn't working) and pending (This problem is yet to be addressed) labels Mar 3, 2025
@howardchenhd

"[Qwen2VL] generally uses BICUBIC downsampling instead to circumvent this problem." Does that mean Qwen2VL should use BICUBIC? Right now it seems Qwen2VL still uses nearest neighbor in Qwen2vlPlugin._preprocess_image.

Owner

hiyouga commented Mar 6, 2025

@howardchenhd my bad, I'll fix it soon

@hiyouga hiyouga mentioned this issue Mar 6, 2025