setting token flags still results in console warning #195

Open
endomorphosis opened this issue Jul 28, 2024 · 0 comments

System Info

Intel Gaudi 2 x 8 server

docker run -p 8080:80 -v $volume:/data --runtime=habana -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.1 --model-id $model --sharded true --num-shard 8 --max-input-tokens 4096 --max-total-tokens 8192 --max-batch-prefill-tokens 8242

2024-07-28T02:19:03.072034Z INFO text_generation_launcher: Model supports up to 8192 but tgi will now set its default to 4096 instead. This is to save VRAM by refusing large prompts in order to allow more users on the same hardware. You can increase that size using --max-batch-prefill-tokens=8242 --max-total-tokens=8192 --max-input-tokens=8191.
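For reference, a launch command that follows the sizes the warning itself suggests would look like the line below. This is only a sketch assembled from the log message (8191 / 8192 / 8242 are the values it prints), not a confirmed workaround; the point of this issue is that the warning is emitted even though the token limits are already passed explicitly.

docker run -p 8080:80 -v $volume:/data --runtime=habana -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.1 --model-id $model --sharded true --num-shard 8 --max-input-tokens 8191 --max-total-tokens 8192 --max-batch-prefill-tokens 8242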

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

docker run -p 8080:80 -v $volume:/data --runtime=habana -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.1 --model-id $model --sharded true --num-shard 8 --max-input-tokens 4096 --max-total-tokens 8192 --max-batch-prefill-tokens 8242

Expected behavior

No warning should be emitted, since the token limits are already set explicitly via --max-input-tokens, --max-total-tokens, and --max-batch-prefill-tokens.
