Does tgi support image resize for qwen2-vl pipeline? #2920

Open
1 of 4 tasks
AHEADer opened this issue Jan 16, 2025 · 1 comment

AHEADer commented Jan 16, 2025

System Info

I am trying to deploy a fine-tuned Qwen2-VL model with both TGI and vLLM, and I have found that the two frameworks give different results. TGI seems to consume more tokens than vLLM. I checked TGI's code, and it seems to be missing the image resize logic. The Qwen2-VL pipeline resizes each image based on two arguments, max_pixels and min_pixels.
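To make the resize step concrete, here is a minimal sketch of how a pixel budget could be applied. It is not TGI's or vLLM's code; the function name `resize_to_pixel_budget` and the `factor=28` default (a 14-pixel patch with 2x2 merging) are assumptions for illustration, so check them against the model's own config.

```python
import math

def resize_to_pixel_budget(height, width, min_pixels, max_pixels, factor=28):
    """Return (new_height, new_width) as multiples of `factor`.

    Hypothetical helper: scales the image so its area lands inside
    [min_pixels, max_pixels] while roughly keeping the aspect ratio.
    """
    # Snap both sides to the nearest multiple of `factor` (at least one unit).
    h = max(factor, round(height / factor) * factor)
    w = max(factor, round(width / factor) * factor)
    if h * w > max_pixels:
        # Too large: shrink both sides by the same ratio, rounding down.
        scale = math.sqrt((height * width) / max_pixels)
        h = max(factor, math.floor(height / scale / factor) * factor)
        w = max(factor, math.floor(width / scale / factor) * factor)
    elif h * w < min_pixels:
        # Too small: enlarge both sides by the same ratio, rounding up.
        scale = math.sqrt(min_pixels / (height * width))
        h = math.ceil(height * scale / factor) * factor
        w = math.ceil(width * scale / factor) * factor
    return h, w

# Example budget (assumed values): between 256 and 1280 units of 28x28 pixels.
MIN_PIXELS = 256 * 28 * 28
MAX_PIXELS = 1280 * 28 * 28
print(resize_to_pixel_budget(4000, 6000, MIN_PIXELS, MAX_PIXELS))
```

If one framework applies a step like this and the other does not, the same image would produce a different number of vision tokens, which would be consistent with the difference described above.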

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Deploy a Qwen2-VL-7B model on an inference endpoint and upload a large image. This triggers an error saying that the input tokens exceed 32768.
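A rough estimate shows why an unresized image can overflow the limit. The sketch below assumes one vision token per 28x28 pixel block (14-pixel patches merged 2x2) and uses a made-up 6000x8000 image; the helper name `estimate_image_tokens` is hypothetical, and the count ignores text and special tokens.

```python
CONTEXT_LIMIT = 32768  # limit reported in the error

def estimate_image_tokens(height, width, patch_size=14, merge_size=2):
    """Approximate vision tokens for an image that is NOT resized.

    Assumes each token covers a (patch_size * merge_size) square block.
    """
    unit = patch_size * merge_size
    return (height // unit) * (width // unit)

# Hypothetical large upload; the dimensions are for illustration only.
tokens = estimate_image_tokens(6000, 8000)
print(tokens, tokens > CONTEXT_LIMIT)
```

Under these assumptions a single large photo already needs far more tokens than the context allows, before any prompt text is added.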


Expected behavior

The server should resize the image based on preprocessor_config.json (max_pixels and min_pixels) so that a request does not contain too many image tokens.
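As a sketch of what honoring the config would guarantee, the snippet below derives an upper bound on tokens per image from max_pixels. The JSON values are illustrative rather than read from a real deployment, the 28x28 pixels-per-token figure is an assumption, and `max_tokens_per_image` is a hypothetical helper.

```python
import json

# Illustrative config fragment; the values are assumptions for this example.
PREPROCESSOR_CONFIG = '{"min_pixels": 3136, "max_pixels": 12845056}'

PIXELS_PER_TOKEN = 28 * 28  # assumed: 14-pixel patches merged 2x2
CONTEXT_LIMIT = 32768       # limit quoted in the reproduction above

def max_tokens_per_image(config_text):
    """Upper bound on vision tokens if images are capped at max_pixels."""
    config = json.loads(config_text)
    return config["max_pixels"] // PIXELS_PER_TOKEN

bound = max_tokens_per_image(PREPROCESSOR_CONFIG)
print(bound, bound <= CONTEXT_LIMIT)
```

With a cap like this applied server-side, a single image could not exceed the context limit on its own, whatever its original resolution.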
@AHEADer AHEADer closed this as completed Jan 16, 2025
@AHEADer AHEADer changed the title from "Different result when eval with TGI and vLLM, need to know if the preprocessing is right or not" to "Does tgi support image resize for qwen2-vl pipeline?" Jan 22, 2025
@AHEADer AHEADer reopened this Jan 22, 2025
@ashwani-bhat

@AHEADer can you provide the docker command you are using with Qwen2-VL-7B?
