disable kv cache compression for fp vlm #1080

Merged
merged 1 commit into from
Dec 19, 2024

@eaidova eaidova commented Dec 19, 2024

What does this PR do?

Fixes the minicpmv failure with the OpenVINO nightly build. Starting from 2024.6, OpenVINO enables KV cache compression by default, which may impact model accuracy. The runtime cannot predict when compression should be disabled, so we proposed adding a specific hint for such models (by our agreement, this should be done for non-compressed models only). This PR extends that approach to language models that are part of visual language models.
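
As a rough illustration of the hint mechanism described above, here is a minimal sketch. It assumes the hint is stored in the model's rt_info under `["runtime_options", "KV_CACHE_PRECISION"]` with the value `"f16"`, and that the model object exposes a `set_rt_info(value, path)` method like `openvino.Model`. The key path, the value, the helper names, and the `FakeModel` stand-in are assumptions made for illustration and are not quoted from this PR's diff.

```python
from typing import Any, Dict

# Assumed rt_info location and value; not taken from the PR text.
RUNTIME_OPTIONS_KEY = "runtime_options"
KV_CACHE_PRECISION = "KV_CACHE_PRECISION"


def kv_cache_hint(weights_compressed: bool) -> Dict[str, str]:
    """Return the runtime options to attach to a model.

    Only non-compressed (floating-point) models get the hint;
    compressed models keep the runtime default.
    """
    if weights_compressed:
        return {}
    return {KV_CACHE_PRECISION: "f16"}


def add_runtime_options(model: Any, options: Dict[str, str]) -> Any:
    """Write each option under ["runtime_options", <name>] in the model's rt_info."""
    for name, value in options.items():
        model.set_rt_info(value, [RUNTIME_OPTIONS_KEY, name])
    return model


class FakeModel:
    """Hypothetical stand-in that records set_rt_info calls,
    so the sketch runs without OpenVINO installed."""

    def __init__(self) -> None:
        self.rt_info: Dict[tuple, str] = {}

    def set_rt_info(self, value: str, path: list) -> None:
        self.rt_info[tuple(path)] = value


# Usage: tag only the language model part of a floating-point VLM.
language_model = FakeModel()
add_runtime_options(language_model, kv_cache_hint(weights_compressed=False))
print(language_model.rt_info)
```

With a real `openvino.Model` in place of `FakeModel`, the same call pattern would apply, provided the runtime reads the option from that rt_info path.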

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@eaidova eaidova added the openvino-test Trigger OpenVINO slow tests label Dec 19, 2024
@eaidova eaidova force-pushed the ea/enable_rt_info_for_vlm branch from ede24e1 to 33cef0f Compare December 19, 2024 06:39
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

eaidova commented Dec 19, 2024

@echarlaix @IlyasMoutawwakil could you please take a look? The OpenVINO 2024.6 release happened a couple of hours ago, so this minicpmv test failure should be visible on the main branch.

@eaidova eaidova removed the request for review from glegendre01 December 19, 2024 12:46
@nikita-savelyevv nikita-savelyevv mentioned this pull request Dec 19, 2024
3 tasks

@echarlaix echarlaix left a comment

Thanks for the rapid fix @eaidova

@echarlaix echarlaix merged commit 8ef3997 into huggingface:main Dec 19, 2024
20 of 28 checks passed
AlexKoff88 pushed a commit that referenced this pull request Dec 23, 2024
* Support AWQ models

* Add tests

* Add dependencies

* Fix tests

* enable awq export only if ov support it

* fix style (#2)

* disable awq and gptq install for old torch (#3)

* fix style

* disable autogptq and autoawq install for old transformers testing

* separate common quant models patching and gptq (#4)

* disable windows install (#5)

* separate common quant models patching and gptq

* disable awq windows

* skip logits check for quantized models (#6)

* fix test after rebase

* fix testing condition for 2024.6 and unpatch in case if failed

* Fix qwen2-vl tests (#1084)

* Skip private model loading test for external contributors (#1082)

* Fix reshaping unet if timestep is 0d tensor (#1083)

* Disable kv cache compression for fp vlm (#1080)

* add necessary packages in test_openvino_full

* fix code style after rebase (#7)

---------

Co-authored-by: eaidova <[email protected]>
Co-authored-by: Nikita Savelyev <[email protected]>
Co-authored-by: Ella Charlaix <[email protected]>