Qwen2VL using nearest neighbour downsampling by default #7125

selflein · 2025-03-01T20:49:10Z

Reminder

I have read the above rules and searched the existing issues.

System Info

- `llamafactory` version: 0.9.2.dev0
- Platform: Linux-4.15.0-213-generic-x86_64-with-glibc2.27
- Python version: 3.11.9
- PyTorch version: 2.5.1
- Transformers version: 4.49.0
- Datasets version: 3.2.0
- Accelerate version: 0.34.0
- PEFT version: 0.12.0
- TRL version: 0.9.6
- DeepSpeed version: 0.15.4
- vLLM version: 0.7.3.dev0+g0408efc6.d20250208

Reproduction

- Train a Qwen2VL model using any config and a dataset that contains images larger than `image_resolution`
- Images are resized using nearest neighbor interpolation

here for Qwen2 specifically
and in the base class here

Others

Nearest neighbor downsampling incurs a large loss in image quality (small lines vanish, see the attached image). Qwen2VL generally uses BICUBIC downsampling instead to circumvent this problem.

More generally, it would be great if one could configure `image_resolution` directly and use the transformers preprocessor, which is called afterwards anyway.


hiyouga · 2025-03-03T16:17:23Z

We have fixed it in #7143, thank you for reporting

howardchenhd · 2025-03-06T02:26:32Z

"[Qwen2VL] generally uses BICUBIC downsampling instead to circumvent this problem." Does that mean Qwen2VL should use BICUBIC? Right now it seems Qwen2VL still uses nearest neighbor in Qwen2vlPlugin._preprocess_image.

hiyouga · 2025-03-06T06:49:05Z

@howardchenhd my bad, I'll fix it soon

selflein added the bug (Something isn't working) and pending (This problem is yet to be addressed) labels Mar 1, 2025

hiyouga mentioned this issue Mar 3, 2025: [data] use bicubic resampler #7143 (merged)

hiyouga closed this as completed in #7143 Mar 3, 2025

hiyouga added the solved (This problem has been already solved) label and removed the bug (Something isn't working) and pending (This problem is yet to be addressed) labels Mar 3, 2025

hiyouga mentioned this issue Mar 4, 2025: [fix] use bicubic resampler for resizing image volcengine/verl#474 (merged)

hiyouga mentioned this issue Mar 6, 2025: [data] fix mm template #7181 (merged)
